A STUDY OF THE ANATOMY OF THE INTEGERS VIA LARGE PRIME FACTORS AND
AN APPLICATION TO NUMERICAL FACTORIZATION
By
TODD MOLNAR
A THESIS PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
UNIVERSITY OF FLORIDA
2012
© 2012 Todd Molnar
ACKNOWLEDGMENTS
There are far too many people without whom this thesis could not have been written;
however, I owe special thanks to my parents William and Deborah Molnar and my
brothers Bradley and Andrew Molnar. Without their support this thesis simply would not
have been possible.
I would also like to thank Dr. Peter Sin and Dr. Andrew Rosalsky for their many
useful comments, and in particular I would like to thank my advisor Dr. Krishnaswami
Alladi for his superb guidance and exceptional advice.
TABLE OF CONTENTS

ACKNOWLEDGMENTS

ABSTRACT

CHAPTER

1 INTRODUCTION AND HISTORY

2 THE DISTRIBUTION OF PRIMES AND PRIME FACTORS
2.1 Notation and Preliminary Observations
2.2 The Prime Number Theorem
2.3 The Hardy-Ramanujan Theorem
2.4 Remarks

3 ARITHMETIC FUNCTIONS INVOLVING THE LARGEST PRIME FACTOR
3.1 The Alladi-Erdős Functions
3.2 Ψ(x, y)
3.3 Generalized Alladi-Duality

4 THE KNUTH-PARDO ALGORITHM

REFERENCES

BIOGRAPHICAL SKETCH
Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science
A STUDY OF THE ANATOMY OF THE INTEGERS VIA LARGE PRIME FACTORS AND
AN APPLICATION TO NUMERICAL FACTORIZATION
By
Todd Molnar
December 2012
Chair: Krishnaswami Alladi
Major: Mathematics
This thesis discusses some important aspects of the theory surrounding the
anatomy of the integers (to borrow a term from de Koninck, Granville, and Luca
in [15]), by which I mean the theory of the prime factorization of an integer
and the properties of these prime factors. The anatomy of the integers
is a very dense topic which has attracted the attention of theoretical and applied
mathematicians for many years, due largely to the fact that the questions involved
are often difficult and can be approached from many different angles. Virtually every
branch of mathematics has benefited from the study of the prime factorization of
the integers, including (but certainly not limited to) number theory, combinatorics,
algebra, and ergodic theory. This thesis will concern itself with
both the theoretical and computational aspects of this study, and will use that theory to
understand the algorithmic factorization technique introduced by Knuth and Pardo in
1976.
To fully understand this algorithmic procedure, we will discuss a good deal of theory
related to the distribution of prime numbers and the largest prime factor of an integer.
The thesis is divided into four chapters. Chapter 1 is an introduction which gives a brief
overview of the motivation and rich history surrounding these deep problems. Chapter 2
sketches two different proofs of the Prime Number Theorem, which is an indispensable
tool in addressing problems related to integer factorization, as well as containing
several comments related to (currently) unproven results concerning the distribution of
primes (in particular, the Riemann hypothesis). One of the proofs of the Prime Number
Theorem we have included in this section appears novel, although it undoubtedly could
be deduced by any individual with sufficient knowledge of analytic number theory.
Following the work of Alladi, Erdős, Knuth, and Pardo, Chapter 3 develops the
necessary theoretical results concerning the largest prime factors of an integer. To
this end, we supply proofs of the average value of the Alladi-Erdős functions and of
the average value of the largest prime factor, a proof of a special case of Alladi's
duality principle, and certain density estimates. The proofs presented here related to the
Alladi-Erdős functions differ from their original proofs in that we use the theory of
complex variables, whereas in their original paper Alladi and Erdős derive their results
elementarily. This allows us to improve their estimates for bounded functions and to
show the connection of these estimates with the Riemann hypothesis. The proof
of Alladi’s duality principle is also derived in an analytic fashion, although not in its
most general form. This differs from Alladi’s original treatment of the problem which is
entirely elementary; however, in using analytic techniques we need to impose certain
bounds on the arithmetic functions in question, whereas the elementary approach holds
unconditionally. Further results on the largest prime factors of integers are also included,
but they are due to Knuth and Pardo.
The final section, Chapter 4, formally introduces the Knuth-Pardo factorization
algorithm and includes their proof that, for any given x ≥ 0, the probability that a
random integer between 1 and N has k-th largest prime factor ≤ N^x approaches a
limiting distribution; furthermore, we quote several results from the paper of Knuth and Pardo
that will prove useful for studying their algorithm.
CHAPTER 1
INTRODUCTION AND HISTORY
This thesis will concern itself primarily with questions related to numerical
factorization, that is, the study of the prime decomposition of integers, both from a
theoretical angle and from a practical point of view, namely the numerical factorization
of integers into primes and prime powers and the running time for such a procedure.
We will take the basic factorization algorithm introduced by Knuth and Pardo as a
prototypical example due to its rudimentary nature and intuitive appeal. A study of this
algorithm provides theoretical insights into the multiplicative structure of the integers,
and leads to a useful application of the algorithm itself. Following the work of Knuth and
Pardo [14], their algorithm will be treated in a more formal fashion in Chapter 4; however,
a brief overview of the method will be enlightening as an introduction to factorization
techniques. For a given integer n, we test for a prime divisor of n among the numbers
1 < p ≤ n^{1/2}, and when we find a prime divisor p such that p|n we divide the integer
n by p and repeat the process for m = n/p < n until all prime divisors of n have been
determined, which will occur when we find a prime p|m such that p ≥ √m. It is to
be noted that when the Knuth-Pardo algorithm terminates, it has already determined
the largest prime factor of the integer being factored, and therefore it offers a way to
determine the largest prime factor.
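The procedure just described can be sketched in a few lines of Python (an illustrative sketch only; the formal treatment is deferred to Chapter 4, and the function name is ours):

```python
def factorize(n):
    """Trial-division factorization in the spirit of the Knuth-Pardo
    method: repeatedly strip the smallest prime divisor p of the
    remaining cofactor m; once p*p exceeds m, the cofactor m itself
    is prime and is the largest prime factor of n."""
    factors = []
    m = n
    p = 2
    while p * p <= m:
        if m % p == 0:
            factors.append(p)
            m //= p          # repeat the process for m = n/p
        else:
            p += 1
    if m > 1:
        factors.append(m)    # m is the largest prime factor of n
    return factors
```

For example, factorize(360) returns the prime divisors with multiplicity, ending in the largest prime factor.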
There has been a considerable amount of work done on the distribution of the
prime factors of integers, but this work is scattered in the literature. Also, while there
are many excellent books on factorization and analytic number theory, a systematic
study of the prime factors of integers with the goal to understand their anatomy is not
included in these books. This thesis will focus on developing the necessary theoretical
results on the anatomy of integers using both elementary and analytic number theory
to understand the Knuth-Pardo factorization method [14]. It is therefore the hope of the
author that this thesis will be of independent interest to those who are already familiar
with the published literature, as well as offering a useful introduction to those who are
not.
To fully appreciate the dense theory surrounding the largest prime factor of an
integer, the distribution of the prime numbers, and the algorithmic techniques developed
by Knuth and Pardo, we will take a brief digression in this chapter to discuss the history
and motivation for studying the questions of this thesis.
It is not clear when the classification of integers into the two distinct categories
of being prime or composite was first considered. Prime numbers are those integers
n > 1 such that if n = ab then a = 1 or a = n; composite numbers are those for which
there exist integers a and b such that n = ab with a, b > 1. Note that it is a trivial
fact that there exist infinitely many composite numbers; however, it is an important
and highly non-trivial fact that there exist infinitely many prime numbers. Again, it is
unclear who first proved this observation rigorously as it was made so long ago that
no records survive. However, this result was known to the Greek mathematician Euclid,
who included its proof in Book IX of his textbook The Elements around 300 B.C.,
and therefore this result is typically attributed to Euclid by modern authors [5]. While
previous mathematicians may have been aware of this result, it was Euclid who first
recognized the necessity to systematically develop the theory of numbers and geometry
from basic axioms and it is for this reason that Euclid is considered the father of both
geometry and number theory. Euclid’s proofs are also brilliant in their simplicity; for
example, Euclid’s proof that there exist infinitely many primes easily generalizes to
prove that any infinite commutative ring with unity contains infinitely many prime ideals
(for the proof of this fact see the introductory text [13] by I. Martin Isaacs). Hence,
this basic result of Euclid anticipated the work of later mathematicians by almost
two millennia; but Euclid was also a great disseminator of mathematical knowledge,
and his magnum opus, The Elements, is considered by many to be the single most
influential textbook ever published in the field of mathematics. This can hardly be
overstated: The Elements is the second most published text in history (second only to
the Bible), was used by Arab and European mathematicians throughout the ancient and
medieval world, was one of the first mathematical texts to be set in type (by Venetian
printers in 1482), and is still widely referenced in our modern age [5]. For example, the
physicist Sir Isaac Newton learned geometry from Euclid’s texts as late as 1667, and
the philosophers Immanuel Kant (who died 1804) and Arthur Schopenhauer (who died
in 1860) vehemently defended Euclid’s geometry in their writings. The re-examination
of Euclid’s work in the 18th and 19th centuries in the hands of N. I. Lobachevsky, J.
Bolyai, C. F. Gauss, and B. Riemann led to the discovery of so called non-Euclidean
geometries. Furthermore, it should be noted that although over two thousand years have
elapsed since its original publication, virtually every introductory number
theory textbook still uses Euclid’s proof to show the infinity of prime numbers.
Around about Euclid’s lifetime there lived another mathematician known as
Eratosthenes of Cyrene, who in addition to being a mathematician was also a celebrated
poet, athlete, geographer, and astronomer. Eratosthenes is perhaps best remembered
for being the first person to accurately measured the circumference of the Earth.
Sometime around 270 B.C. Eratosthenes developed what may be considered the first
algorithm for determining the prime numbers less than a given integer. This method,
known as the sieve of Eratosthenes, rests on the following simple observation (which we
will use throughout this thesis): if n = ab and a, b > √n, then n = ab > n, which is a
contradiction; hence, if n = ab and a ≥ b, then a ≥ √n ≥ b. Therefore, if n is composite,
then it must always contain a prime factor less than or equal to √n. Eratosthenes noted
that if one lists all the numbers from 1 to N, circles the prime two and strikes off every
second (i.e. even) number beyond it, then circles the prime three and strikes off every
third number beyond it, and continues this process up to √N, then all the numbers
which have been crossed off are composite. All the numbers which
have not been crossed off are precisely the prime numbers less than or equal to N. This
process is successful precisely because we are eliminating those numbers less than or
equal to N which are composite, i.e. since each of these composite numbers must have
a prime factor p such that p ≤ √N, all of these numbers have been eliminated. Thus, we
have an effective method for determining prime numbers.
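Eratosthenes' procedure translates directly into code; the sketch below (function name ours) strikes off the multiples of each prime p ≤ √N:

```python
def eratosthenes(N):
    """Return all primes <= N. Every composite n <= N has a prime
    factor <= sqrt(N), so striking the multiples of each prime up to
    sqrt(N) eliminates exactly the composites."""
    is_prime = [True] * (N + 1)
    is_prime[0:2] = [False, False]   # 0 and 1 are neither prime nor composite
    p = 2
    while p * p <= N:                # only primes up to sqrt(N) are needed
        if is_prime[p]:
            for multiple in range(p * p, N + 1, p):
                is_prime[multiple] = False   # strike off every p-th number
        p += 1
    return [n for n in range(2, N + 1) if is_prime[n]]
```

For instance, eratosthenes(30) yields [2, 3, 5, 7, 11, 13, 17, 19, 23, 29].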
It is interesting to compare the two methods utilized by Euclid and Eratosthenes
to address the theory of prime numbers. Euclid’s proof is very useful from a theoretical
standpoint, as it always guarantees (within certain bounds) the existence of a prime
number, and in fact a direct application of his proof shows that the number of primes
p ≤ x is greater than log log(x), for x > 2, and a slightly more sophisticated argument
allows us to improve this result to log(x)/2 log(2) (see Propositions 2.4.1 and 2.4.2
in [12]). However, despite its theoretical utility and its ability to derive nontrivial lower
bounds, Euclid’s method does not offer any hope of explicitly computing the prime
numbers p ≤ x. By contrast, Eratosthenes' method allows one to explicitly compute
all primes p ≤ x; nevertheless, at present no sieving procedure (be it the sieve of
Eratosthenes or its more modern generalizations such as the Legendre sieve or Brun’s
sieve) can even prove the infinitude of prime numbers (which is easily proved using
Euclid’s theorem). This highlights a major theme which will follow us for the remainder
of the thesis: theoretical results, however powerful, are often useless when attempting
to answer questions related to the explicit computation of the prime factors of integers.
However, these questions may be answered quite easily by using computational
methods (such as the Sieve of Eratosthenes or the Knuth-Pardo algorithm). Of course,
computational methods rarely address questions related to the distribution of primes;
fortunately, this is precisely where theoretical results show their mettle. Hence, if we truly
wish to understand the prime factorization of an integer we must familiarize ourselves
with the allied disciplines of computational and theoretical number theory.
After the fall of the Ptolemaic Dynasty at the hands of the Roman Empire, the era
of Greek mathematics in which Euclid and Eratosthenes lived came to an end. The
waning of the great Greek Empires brought with it a stagnation in the state of number
theory, and particularly the theory of prime numbers. For many years number theory,
algorithms, and algebra (relying upon personal research and translations of Greek
mathematical works) were the exclusive domain of Eastern mathematicians living in
the Arab world, and it is for this reason that the words algebra and algorithm have
their roots in the Arabic and Persian languages. It was not until after the Middle Ages that
modern western authors, such as P. Fermat and L. Euler, noted how fundamental the
prime numbers were to basic questions of arithmetic, and began to reexamine their
properties. It is often the case in number theory that one studies a given arithmetic
function f (n), where if n = ab and a and b share no common prime factors (i.e. are
relatively prime) then f (n) = f (ab) = f (a)f (b). Arithmetic functions with this property
are called multiplicative functions; and many fundamental arithmetic functions are
indeed multiplicative. Although Euclid essentially proved in antiquity that all integers can
be written uniquely as the product of prime powers, this statement was not rigorously
proved until the 18th century by the mathematician C.F. Gauss [5], and is of such
importance to the theory of numbers that it is often referred to as the Fundamental
Theorem of Arithmetic. Hence, since distinct prime powers are relatively prime, if we
can determine the values of a multiplicative function f at the prime powers then we may
always determine the value of f (n) for any composite n.
This fact alone motivates us to delve deeper into the properties of the prime numbers.
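As a concrete illustration (using Euler's totient φ, a standard multiplicative function; the helper names are ours), the values φ(p^k) = p^k − p^{k−1} at prime powers determine φ(n) for every n:

```python
from collections import Counter

def phi_prime_power(p, k):
    """Euler's totient at a prime power: phi(p^k) = p^k - p^(k-1)."""
    return p**k - p ** (k - 1)

def phi(n):
    """Since phi is multiplicative and distinct prime powers are
    relatively prime, phi(n) is the product of phi over the prime
    powers appearing in the factorization of n."""
    exponents = Counter()
    m, p = n, 2
    while p * p <= m:                # naive trial division, for illustration
        while m % p == 0:
            exponents[p] += 1
            m //= p
        p += 1
    if m > 1:
        exponents[m] += 1
    result = 1
    for q, k in exponents.items():
        result *= phi_prime_power(q, k)
    return result
```

For example, 360 = 2³ · 3² · 5 gives φ(360) = φ(2³)φ(3²)φ(5) = 4 · 6 · 4 = 96.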
An interesting digression follows from not only studying the primes in the ring
Z but by studying primes in a general commutative ring R. Of course, as should be
expected, there are a few technicalities when dealing with R instead of Z. If I is a
submodule over R such that I ⊂ R then I is called an ideal (although R satisfies
this property, it is customary to only look at ideals I ≠ R, which are called proper
ideals). An element u ∈ R is called a unit if u⁻¹ ∈ R, i.e. u is a unit if it is invertible. A
nonunit a ∈ R is said to be irreducible if whenever a = bc, with b, c ∈ R, then either
b or c is a unit (note that this definition of irreducibility was the original definition of
being prime, until later developments in algebra separated the two ideas). A nonunit
a ∈ R is said to be prime if the ideal generated by a, (a) = P, has the property that
if bc ∈ P then either b ∈ P or c ∈ P. Of course, in Z the two concepts of irreducible
elements and prime elements coincide; however, in general it is not necessary that
irreducible elements must be prime. Furthermore, Gauss’s observation that integers
factor uniquely into prime powers does not hold in rings in general, as the example of
6 = (2)(3) = (1 + √−5)(1 − √−5) demonstrates that in Z[√−5] the integer 6 may be
factored into the product of two irreducible elements in two different ways. Nevertheless,
the concept of unique factorization can be generalized to ideals in R by imposing the
condition that such ideals factor uniquely as the product of the powers of prime ideals.
This leads to the concept of Dedekind Domains, which have a central place in the
modern theory of commutative algebra (unfortunately it would take us too far afield to
discuss these algebraic structures in their full generality). Essentially, it is our hope that
the reader will recognize how important Gauss’s theorem on unique factorization is.
It is of further interest to note that while the Greek and Arab mathematicians of
antiquity were well aware of the infinity of prime numbers, and that they were all able
to prove theorems concerning prime numbers, there is no recorded evidence that any
mathematician from this era considered the question of how many prime numbers p ≤ x
there are or, essentially, how the primes are distributed throughout the integers.
Furthermore, there is no reference to the elementarily equivalent question of how large
p_i (the i-th prime number) is. While one can easily speculate as to why these questions were
not asked (namely, with hindsight, we now know these questions are very deep), it
is entirely possible that the ancients' lack of more advanced mathematical techniques
made it nearly impossible for them to address such questions. Also, it is only natural that
the Greeks would have published only what they could prove; hence, while there is no
evidence of anyone trying to prove results concerning the distribution of prime numbers
in antiquity, this should not be taken as evidence that the question was of no interest to
someone in the ancient world. The resolution of this question would occupy the attention
of many mathematicians for several centuries, before ultimately being resolved at the
end of the 19th century.
Let π(x) denote the number of primes less than or equal to x. In the late 18th
century, the French mathematician A.M. Legendre, using extensive numerical evidence,
conjectured that π(x) would be about x/(log(x) − A) for some constant A. He further
conjectured that A ≈ 1.08366.... This constant appears to have been chosen largely so
that the estimate would fit the data available. At around the same time Legendre made
this conjecture, the prodigious German mathematician C.F. Gauss (also using extensive
numerical data) speculated that π(x) would be about
li (x) = ∫_2^x dt/log(t) = x/log(x) + x/log²(x) + · · ·  (1–1)
Note that while

lim_{x→∞} li (x)(log(x) − A)/x = 1
for any value of A, the function li (x) is a far stronger approximation than what was
conjectured by Legendre. Another reason why Gauss’s estimate is superior to that of
Legendre’s is that, had the two individuals had the extensive lists of prime numbers that
we have today, they would have noted that
|li (x) − π(x)| < x/(log(x) − A) − π(x)

for larger values of x [7].
To point out the superiority of Gauss's estimate over that of Legendre's: of the
78,498 primes less than one million, li (x) gives an estimate of 78,628, a difference
of 130, whereas x/(log(x) − A) gives an estimate of 72,372, a difference of about 6,116
(for further results see [7]). For larger values of x this discrepancy only becomes
larger, and by simple numerical estimates it became apparent that li (x) was the better
approximation to π(x). The only problem was that, for all of their brilliance, neither Gauss
nor Legendre could prove this conjecture, which was thereafter referred to as the
Gauss-Legendre Conjecture by many notable authors and later came to be called the
Prime Number Theorem.
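The comparison above is easy to reproduce numerically (a sketch: we approximate the integral in (1–1) by Simpson's rule after the substitution t = e^u, and use the known count π(10^6) = 78,498):

```python
import math

def li(x, steps=10_000):
    """Approximate li(x) = integral from 2 to x of dt/log(t).
    Substituting t = e^u gives the integrand e^u / u on the short
    interval [log 2, log x], where Simpson's rule converges quickly."""
    a, b = math.log(2.0), math.log(x)
    h = (b - a) / steps              # steps must be even for Simpson's rule
    g = lambda u: math.exp(u) / u
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

x = 10**6
print(round(li(x)))            # close to the 78,628 quoted in the text
                               # (conventions for li differ by li(2) ~ 1.05)
print(round(x / math.log(x)))  # a far cruder estimate of pi(10**6) = 78,498
```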
While the proof of the Prime Number Theorem eluded mathematicians for the
majority of the 19th century, its utility could not be ignored. As was stated before, many
useful functions in number theory are multiplicative; therefore, if one could determine
the approximate distribution of the prime numbers then many other questions in number
theory could also be solved. Thus, a proof of the prime number theorem became a sort
of “holy grail” for 19th century mathematicians. The next notable contribution to the
study of π(x) came from the prolific Russian mathematician P.L. Tchebyschev. Around
1850 Tchebyschev proved that
0.89 x/log(x) < π(x) < 1.11 x/log(x),  (1–2)
so that the order of magnitude of π(x) conjectured by Gauss and Legendre was indeed
correct. Tchebyschev also proved that if there exists an A such that the relative
error of π(x) − x/(log(x) − A) is minimized (i.e. if there exists a “best possible” A in
the Gauss-Legendre estimate), then A = 1, disproving Legendre's observation that
A = 1.083.... Furthermore, Tchebyschev demonstrated that if the limit
lim_{x→∞} π(x) log(x)/x = C  (1–3)
exists, then C = 1 (proofs of all of these results can be found in [7]). This result
illustrates the subtleties which arise when attempting to prove the Prime Number
Theorem, as Tchebyschev’s result states that the Gauss-Legendre conjecture is correct
if and only if π(x) log(x)/x approaches a limit. It is tempting to take this result too far
and conclude the Prime Number Theorem, but recall that there is no reason (given the
results known to mathematicians before 1850) that π(x) log(x)/x must approach a limit at
all!
Tchebyschev’s methods were subsequently refined and given sharper bounds in the
years following the publication of his memoir on π(x), with the best known result due to
the mathematician J.J. Sylvester who improved Tchebyschev’s estimate to
0.956 x/log(x) < π(x) < 1.045 x/log(x);
unfortunately, neither the brilliance of Sylvester, nor that of any of his contemporaries,
was capable of improving the Tchebyschev estimate to demonstrate the validity of the
Gauss-Legendre conjecture. This fact was lamented by Sylvester, who concluded his
article improving Tchebyschev’s estimate [28] with the statement that in order to prove
the Prime Number Theorem “...we shall probably have to wait until someone is born into
the world as far surpassing Tchebyschev in insight and penetration as Tchebyschev has
proved himself superior in these qualities to the ordinary run of mankind.”
In many ways the dream of J.J. Sylvester was already realized several decades
earlier by the next major advancement in the theory of prime numbers, which came from
the German mathematician G.B. Riemann. Riemann’s approach to the Prime Number
Theorem differed significantly from that of his predecessors in that he began to unleash
the powerful techniques of complex analysis to answer the question. If we let
ζ(s) = ∑_{n=1}^{∞} 1/n^s,

then a well known identity due to L. Euler shows that for ℜ(s) > 1,

∑_{n=1}^{∞} 1/n^s = ∏_p (1 − 1/p^s)^{−1}.  (1–4)
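Euler's identity (1–4) can be checked numerically for real s > 1; the sketch below (truncation bounds are illustrative) compares both sides at s = 2, where ζ(2) = π²/6:

```python
import math

def primes_up_to(N):
    """Primes <= N via a simple sieve of Eratosthenes."""
    flags = [True] * (N + 1)
    flags[0:2] = [False, False]
    for p in range(2, int(N**0.5) + 1):
        if flags[p]:
            flags[p * p :: p] = [False] * len(range(p * p, N + 1, p))
    return [n for n, is_p in enumerate(flags) if is_p]

def zeta_series(s, terms):
    """Partial sum of the Dirichlet series for zeta(s), real s > 1."""
    return sum(1.0 / n**s for n in range(1, terms + 1))

def euler_product(s, prime_bound):
    """Truncated Euler product over the primes p <= prime_bound."""
    prod = 1.0
    for p in primes_up_to(prime_bound):
        prod *= 1.0 / (1.0 - p ** (-s))
    return prod

# At s = 2 both sides converge to zeta(2) = pi^2 / 6 = 1.6449...
print(zeta_series(2, 10**6))
print(euler_product(2, 10**4))
print(math.pi**2 / 6)
```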
Hence, the connection between the above series and the prime numbers was known
from the time of Euler, and was used to great effect by Euler and Tchebyschev; however,
they appear to have only considered this series as a function of a real variable s ∈ R,
[5]. Riemann’s great insight was that by considering the above series as a function of a
complex variable one could then extend it to a function for any s ∈ C, s ̸= 1 (a technique
which in modern terminology is known as analytic continuation); hence, this function is
now called the Riemann Zeta function in his honor, and following Riemann’s notation is
denoted ζ(s) for s ∈ C, s ̸= 1. Using Mellin inversion (a variation of Fourier’s inversion
technique) Riemann obtained an analytic expression for the function π(x), and gives this
solution as
π(x) + (1/2) π(x^{1/2}) + · · · = li (x) − ∑_{ℑρ>0} [li (x^ρ) + li (x^{1−ρ})] − log(2) + ∫_x^∞ dt/(t(t² − 1) log(t)),  (1–5)
for x > 1, where ρ represents a complex zero of the function ζ(s) [7]. This equality
demonstrates the very important connection between the complex zeros of the zeta
function and the distribution of the prime numbers, in particular, it demonstrates that if
one can show that ℜ(ρ) < 1 for all complex zeros ρ, then the Prime Number Theorem
could be proved. In addition, Riemann speculated that ℜ(ρ) = 1/2 for all complex zeros ρ
of the zeta function. This last statement is the still unproven Riemann Hypothesis, and is
considered by many to be one of the most important unsolved problems in mathematics
(see [7]).
Riemann summarized all of his results concerning the distribution of prime numbers
in his influential 8-page paper Ueber die Anzahl der Primzahlen unter einer gegebenen
Grösse (On the number of primes less than a given magnitude) which he submitted
to the Prussian Academy in 1859 in thanks for his induction there. Riemann states in
its first paragraph that he wishes to share some of his observations with the Academy,
and it is for this reason perhaps that Riemann does not include rigorous proofs of
many of the results which he derives. Sadly, Riemann died shortly after the publication
of this memoir, so we will never truly know if these statements were conjectures or
well-reasoned theorems whose proofs were simply omitted for brevity. Nevertheless,
Riemann’s paper essentially outlined for future generations a way to prove the Prime
Number Theorem, and over the course of the next 40 years several mathematicians
would fill these gaps and resolve the conjecture of Gauss and Legendre.
The last decade of the 19th century saw great strides in the theory of functions of
a complex variable. These results allowed the German mathematician H. von Mangoldt
to rigorously prove identity (1–5) in 1894 [7], as well as several other assertions made
by Riemann concerning the complex zeros of ζ(s). The French mathematician J.
Hadamard also succeeded in proving several necessary results on the function ζ(s) to
resolve some of Riemann’s statements; however, in 1896 the two mathematicians J.
Hadamard and C. de la Vallée Poussin finally demonstrated that ℜ(ρ) < 1, a necessary
result which they used to successfully prove the Prime Number Theorem (for more
on the history and proofs of these theorems see [7]). Of the two proofs, de la Vallée
Poussin’s is the deeper and will be sketched in section 2.2, while Hadamard’s is the
simpler. Hadamard’s proof demonstrates that
π(x) = li (x) + R(x),
where (R(x)/li (x)) → 0 as x → ∞, whereas de la Vallée Poussin’s proof demonstrates
that we may take R(x) to be some function which grows no faster than a constant times
x e^{−C√log(x)}, for some constant C > 0, i.e.

π(x) = li (x) + O(x e^{−C√log(x)}).  (1–6)
De la Vallée Poussin established this bound by showing that if ρ = σ + i τ with σ, τ ∈ R,
then
σ > 1 − c/log(τ),  (1–7)
for some constant c > 0; hence, ζ(s) has no zeros in some region about the line
ℜ(s) = 1. However, if one could prove Riemann's hypothesis that ℜ(ρ) = 1/2, then the
error term could be improved substantially, that is, one may take R(x) to be a function
which grows no faster than a constant times x 1/2 log(x). Therefore, optimizing the
error term in the Prime Number Theorem required knowledge of the complex zeros of
ζ(s) which were not available to mathematicians of the 19th century. Largely from this
motivation, number theorists of the 20th century began to further explore the properties
of functions of a complex variable.
The advent of the 20th century brought with it a vast increase in the applications
and international reach of number theory research. Advances in technology made
communication amongst mathematicians easier, and results could disseminate more quickly than
in the past. One may note that the majority of the work done on the Prime Number
Theorem during the 19th century was accomplished by mathematicians working in
continental Europe. England, at the time, was still stymied in its reverence for the past
(then the cornerstone of the English educational system) and lagged in the theory of
functions of a complex variable. As an example, students at Cambridge University in
1910 would still prefer to use Newton's cumbersome notation for differentiation over
that of Leibniz, and with all due respect to Newton’s mathematical talents, Leibniz’s
notation is clearly superior. However, the first two decades of the 20th century brought
English mathematicians into the continental discussion over the distribution of the prime
numbers, with great effect. During this time G.H. Hardy and J.E. Littlewood advanced
the theory of functions of a complex variable, and even succeeded in giving their own
proofs of the Prime Number Theorem which were far simpler than the originals supplied
by Hadamard and de la Vallée Poussin. They also demonstrated the logical equivalence
of the Prime Number Theorem with the statement that ζ(s) ≠ 0 where ℜ(s) = 1. G.H.
Hardy also succeeded in showing that there are infinitely many zeros ρ of ζ(s) such
that ℜ(ρ) = 1/2, and went on to give more specific estimates of how many zeros of ζ(s)
lie on the line ℜ(s) = 1/2
(for these results, and their extensions, see [7]). To this day
G.H. Hardy’s work is some of the strongest evidence that Riemann’s Hypothesis is true,
although his work does not provide a proof of this statement.
The work of Hardy and Littlewood on functions of a complex variable, while
motivated primarily by questions related to number theory, extends far beyond the
implications of these theorems for arithmetic functions. A classical result due to the
celebrated Norwegian mathematician N.H. Abel states that if
f (z) = ∑_{k=0}^{∞} b_k z^k  (1–8)

is a power series with radius of convergence 1, converging at z = 1, such that all b_k ≥ 0
and

∑_{k=0}^{∞} b_k < +∞,

then

lim_{z→1−} f (z) = ∑_{k=0}^{∞} b_k = f (1).  (1–9)
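Abel's theorem can be illustrated numerically with the hypothetical choice b_k = 1/(k+1)², for which f(1) = ∑ b_k = ζ(2) = π²/6 (a sketch; names ours):

```python
import math

def f(z, terms=200_000):
    """Power series f(z) = sum_{k>=0} b_k z^k with the illustrative
    choice b_k = 1/(k+1)^2: nonnegative coefficients, radius of
    convergence 1, and convergence at z = 1."""
    total, zk = 0.0, 1.0
    for k in range(terms):
        total += zk / (k + 1) ** 2   # add b_k * z^k
        zk *= z                      # running power of z
    return total

f_at_1 = math.pi**2 / 6       # the sum of the b_k, i.e. f(1) = zeta(2)
for z in (0.9, 0.99, 0.999):
    print(z, f(z))            # increases toward f(1) as z -> 1-
```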
This theorem has an analogue for Dirichlet series, which are basic objects in the study
of analytic number theory; therefore, it became apparent that if a Dirichlet series could
be shown to satisfy certain conditions, then Abel’s theorem would imply important
properties about the arithmetic functions generating these Dirichlet series. One irritating
fact for mathematicians who lived during Abel’s time is that the full converse of the above
theorem is not true, and in fact this oversight led even the very talented Abel to publish
some erroneous results concerning infinite series. It would not be until 1897 that a
partial converse to Abel's theorem would be discovered by the Austrian mathematician
A. Tauber. Tauber's theorem states that if we let f(z) be the power series in equation
(1–8) with radius of convergence 1, suppose that there exists an ℓ ∈ C such that

lim_{z→1⁻} f(z) = ℓ,

where 0 ≤ z < 1, and suppose furthermore that

∑_{n≤x} n b_n = o(x),

then

lim_{z→1⁻} f(z) = ∑_{k=0}^∞ b_k = ℓ < +∞.        (1–10)
This theorem, which allows one to solve for the sum of the coefficients of a power series
based solely on the analytic nature of the given power series, led to a new and powerful
analytic approach to answering questions in the theory of numbers (though it should
be noted that Tauber was not a number theorist, and the application of his theorem
to prime number theory would have to wait for the work of later authors). Hardy and
Littlewood, recognizing the utility of this approach, generalized the results of A. Tauber
in order to deduce their own proofs of the prime number theorem; furthermore, they
added a new word to the mathematical lexicon by referring to results of this type as
Tauberian Theorems, in honor of the work of Alfred Tauber. Informally, any result that
deduces properties of a function from the average of that function we will refer to as a
Tauberian theorem; and conversely, any result which deduces properties of the average
of a function from the properties of that function we will refer to as an Abelian theorem.
In general, Tauberian theorems are of greater interest to number theorists than are
Abelian theorems because the latter class of theorems is essentially an exercise (one
is given a function and so one must calculate its average) whereas the former class of
theorems attempts to impose restrictions upon what functions, if any, can possess a
given average.
In 1913 Hardy and Littlewood showed that the results of Tauber’s theorem would
follow from far more general assumptions. In 1931 the Serbian mathematician J.
Karamata succeeded in giving a much simpler proof of the Hardy-Littlewood Tauberian
theorem, and it is for this reason that the Tauberian theorem of Hardy-Littlewood is often
referred to as the Hardy-Littlewood-Karamata theorem. Unfortunately, even using the
Hardy-Littlewood-Karamata theorem it is not particularly straightforward how one can
deduce the prime number theorem in a "simple" manner, and it was not until 1971 that
Littlewood supplied a "quick" proof of the prime number theorem using this method;
however, it should be noted that Littlewood’s proof requires several deep results on the
analytic nature of the Riemann Zeta function (see [29]).
One disadvantage of the Hardy-Littlewood-Karamata theorem is that it can only deal
with singularities of a standard type, i.e. singularities of the form (s − a)^{−b}, where a ∈ R
and b ∈ Z. This deficiency was overcome by the combined work of several prominent
mathematicians. In 1931 the American mathematician N. Wiener and the Japanese
mathematician S. Ikehara (an erstwhile student of Wiener’s) extended the work of Hardy,
Littlewood, and Karamata to include singularities of the type s^{−ω−1}, where ω > −1 is
any real number ([29]). Their approach, in its modern formulation, also owes much to
the work of the English mathematician A. Ingham. It is also interesting to note that the
Tauberian theorem of Wiener-Ikehara-Ingham allows one to deduce the prime number
theorem with only a single minimal assumption, namely, that the Riemann Zeta function
ζ(s) is nonzero for any complex number s such that ℜ(s) = 1; hence, the prime number
theorem and the non-vanishing of ζ(s) for ℜ(s) = 1 are logically equivalent ([29]).
As the non-vanishing of ζ(s) for ℜ(s) = 1 is an entirely complex-analytic property
of ζ(s), this result led many prominent mathematicians (such as G.H. Hardy) to believe
that the prime number theorem could not be deduced without using (either implicitly
or explicitly) the theory of complex variables. In fact, Hardy was so sure that the Prime
Number Theorem could not be proved without the theory of complex functions that
he famously stated that if such an elementary proof could be found, then all of our
number theory textbooks would have to be taken off of the shelves and rewritten. It
therefore came as quite a surprise when, in 1948, P. Erdős and A. Selberg succeeded in
proving the prime number theorem without the use of the Riemann Zeta function or the
properties of complex functions. Their proof, while "elementary" in the sense that it does
not require the use of complex function theory, is not lacking in subtlety; hence, it would
be unwise to confuse "elementary" with "simple" (for the motivated reader, the proof may
be found in [7]).
Around the same time that Tauber, Hardy, and Littlewood were conducting their
investigations into Tauberian theorems, the Finnish mathematician H. Mellin began
considering an integral transform now referred to as the Mellin transform. Note that if we
are given a convergent Dirichlet series
D(s) = ∑_{n=1}^∞ d_n / n^s,

and denote

S(x) = ∑_{n≤x} d_n,

where S(x) = O(x^α); then we may interpret D(s), for ℜ(s) > α, as the Stieltjes integral

D(s) = ∫_{1−}^∞ dS(t)/t^s = s ∫_1^∞ S(t)/t^{s+1} dt,        (1–11)
obtained by using integration by parts (and using the fact that ℜ(s) > α). Identity (1-11)
can also be viewed as a special case of the following much older formula due to Abel,
and which is typically referred to as the Abel summation formula. If
S_N = ∑_{n=0}^N a_n b_n

and

B_n = ∑_{k=0}^n b_k,

then

S_N = a_N B_N − ∑_{n=0}^{N−1} B_n (a_{n+1} − a_n).        (1–12)
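Since (1–12) is a purely algebraic identity, it can be verified directly; the following sketch (our own, with arbitrary random test sequences) checks summation by parts against the direct computation:

```python
import random

def abel_sum(a, b):
    """Compute S_N = sum_{n=0}^{N} a_n b_n via the Abel summation formula (1-12)."""
    N = len(a) - 1
    # B_n = b_0 + ... + b_n (partial sums of the b sequence)
    B, total = [], 0.0
    for bk in b:
        total += bk
        B.append(total)
    # S_N = a_N B_N - sum_{n=0}^{N-1} B_n (a_{n+1} - a_n)
    return a[N] * B[N] - sum(B[n] * (a[n + 1] - a[n]) for n in range(N))

random.seed(1)
a = [random.uniform(-1, 1) for _ in range(50)]
b = [random.uniform(-1, 1) for _ in range(50)]
direct = sum(x * y for x, y in zip(a, b))
print(abs(abel_sum(a, b) - direct))  # agrees up to floating-point rounding
```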
It is interesting to note that (1-11) can be deduced from (1-12). The integral in (1-11),
however, is now a Mellin transform, and it is an astoundingly fortunate property that
many Dirichlet series can easily be viewed as the Mellin transform of certain interesting
arithmetic functions, and it is even more fortunate that the theory of the Mellin transform
has achieved such a level of maturity in our modern age (being a generalization of the
work of several famous mathematicians). As early as 1744 the Swiss mathematician
Leonard Euler considered (in a rather un-rigorous manner) what was essentially a
Mellin transform, and his work was expanded upon by Joseph Louis Lagrange (a
great admirer of Euler’s). The modern theory of integral transforms began in 1785
with the French mathematician and physicist Pierre-Simon marquis de Laplace, who
gave the theory a more solid theoretical basis and made two important observations
regarding integral transforms. Firstly, Laplace showed that by applying a specific
Mellin transform to a given function (called the Laplace transform) one could deduce
important properties about the derivatives of the transformed function, making the
Laplace transform an important tool in the study of differential equations. Laplace’s
second observation was that one could recover the original function from its Laplace
transform by applying another integral transform, which for obvious reasons is called the
inverse Laplace transform. Another important contributor to the theory of such inversion
techniques was the French scientist Jean Baptiste Joseph Fourier, whose best-known
publication Théorie analytique de la chaleur appeared in 1822. The key observation
of Laplace, Fourier, and others was that (subject to certain conditions on the function
being transformed) one could apply an integral transform to a function, deduce useful
properties about that function, and then invert the procedure to recover the original
function; thereby elucidating new properties about the function in question. It is also
interesting to note that this technique was developed largely to answer questions in heat
conduction, wave propagation, celestial mechanics, and probability.
Perhaps owing to the emphasis of integral transforms in the applied sciences, Mellin
was less interested in the number theoretic ramifications of his work than the function
theoretic implications of his inversion procedure. While the application of this inversion
procedure had essentially already been used by Hadamard and de la Vallée-Poussin
in their proofs of the Prime Number Theorem, it was not until the 1908 paper [19], in which
the German mathematician O. Perron applied Mellin's inversion procedure to a general
Dirichlet series, that the very important arithmetic consequences of Mellin's work could
be appreciated. Perron succeeded in deriving a formula (now referred to as Perron's
formula) which equated the partial sums of the coefficients of a given Dirichlet series
with the inverse Mellin transform of this Dirichlet series. More precisely, if a Dirichlet
series given by
D(s) = ∑_{n=1}^∞ a_n / n^s

is convergent for all s ∈ C such that ℜ(s) > σ_c, and κ > max(0, σ_c), then

S∗(x) = (1/2πi) ∫_{κ−i∞}^{κ+i∞} D(s) x^s s^{−1} ds,        (1–13)

where x > 0 and S∗(x) = ∑_{n≤x} a_n if x ∈ R − N, while S∗(x) = ∑_{n<x} a_n + (1/2) a_x if x ∈ N. For
more on the derivation of Perron's formula see Theorem 3.1.7.
Formula (1-13), which we will use throughout the thesis, is a remarkably useful
tool for deriving results in analytic number theory, and is one of the most useful Abelian
theorems currently known (see Chapter II.2 in [29]). In fact, the formula of Perron would
easily supersede most Tauberian theorems if it weren’t for the fact that, in general, it is a
rather difficult task to evaluate the inverse Mellin transform of a Dirichlet series.
In 1953 the Indian mathematician L.G. Sathe succeeded in deriving an asymptotic
formula for πk(x), the number of integers n ≤ x with exactly k distinct prime factors.
While other mathematicians (such as E. Landau) were capable of deriving this asymptotic
for fixed k, Sathe demonstrated that this result (equation (2-32)) holds uniformly in k.
Sathe's original proof of this result uses the principle of mathematical induction, and is
very involved (see [24]); indeed, his original proof spanned over 70 pages and was in fact
so involved that one year later, in 1954,
Sathe saw fit to publish a simplified account of his original proof which, although
shorter than the original, still involves 54 pages of very difficult mathematics. In the
words of the highly influential Norwegian mathematician Atle Selberg (who in fact was
the referee of Sathe’s second paper), ”While the results of Sathe’s paper are very
beautiful and highly interesting, the way the author has proceeded in order to prove
these results is a rather complicated and involved one, and this by necessity since the
proof by induction...presents overwhelming difficulties in keeping track of the estimates
of the remainder terms...” It is nevertheless to Sathe’s credit that he could derive his
results inductively, however complicated his arguments may have been.
Selberg noted that Sathe’s results could be derived, and expanded upon, by
attacking the problem from a more classical approach (i.e. using Mellin’s inversion
theorem, which does not require the use of Sathe’s complicated inductive argument,
to be discussed in section 2.3). In a somewhat unorthodox moment, Selberg authored
a short note, [26], on Sathe’s paper which appeared in the same issue of the Journal
of the Indian Mathematical Society as Sathe’s second paper (and recall that Selberg
was the referee for Sathe’s paper!). Selberg successfully approximated the partial
sums of the coefficients of Dirichlet series of the type F(s) = G(s; z)ζ(s)^z, where ζ(s)
is the Riemann Zeta function, ℜ(s) > 1, z ∈ C is an arbitrary complex number, and
G(s; z) is a function which is analytic in ℜ(s) > 1/2 satisfying rather modest growth
conditions (see Chapter II.5 of [29], or the discussion of the Sathe-Selberg technique in
Section 2.3); moreover, these results either reproved or expanded upon all of Sathe’s
theorems. Sadly, perhaps owing to the length of his own papers, or the brevity of
Selberg’s arguments, Sathe never published another paper in the field of mathematics.
Essentially, the Sathe-Selberg method allows us to treat Dirichlet series with
singularities of the form (s − 1)z , z ∈ C. However, it is often the case in analytic
number theory that we must consider singularities of a different type, such as logarithmic
singularities of the form log(1/(s − 1)). In 1954 the French mathematician H. Delange
expanded upon the work of Sathe and Selberg to provide a theorem which applies
to all Dirichlet series satisfying modest conditions and possessing singularities of the
type (s − 1)^{−ω} log^k(1/(s − 1)), where ω ∈ R and k ∈ Z with k ≥ 0 (see note 5.1 at the
end of Chapter II.5 in [29]). As a result of the combined efforts of Sathe, Selberg, and
Delange, the method utilized to derive results of this type is typically referred to as the
Sathe-Selberg or Selberg-Delange method (see Chapter II.5 The Selberg-Delange
method in [29]). In Theorem 2.2.1 of Chapter II we will state a major generalization of
the Wiener-Ikehara-Ingham theorem first proved by Delange in 1954. In many ways this
result can be viewed as the pinnacle of known Tauberian theorems, as it applies in an
astoundingly general setting (that is, for sufficiently regular Dirichlet series possessing
singularities which are both monomial and logarithmic). In Chapters I and II it will
further be demonstrated that most of our results follow from simple applications of the
Selberg-Delange method in the guise of Theorem 2.2.1.
At the same time that G.H. Hardy and J.E. Littlewood were conducting their
research in analytic number theory, the largely self-taught Indian mathematician S.
Ramanujan drafted a letter to Hardy in an attempt to gain recognition of his work.
Hardy, so overwhelmed with the brilliance of Ramanujan’s work, arranged for the
young mathematician to come work with him at Cambridge. This began a significant
collaboration which inspired, amongst other things, the study of additive arithmetic
functions. Recall that an arithmetic function f (n) is multiplicative if n = ab with a and
b relatively prime implies that f(n) = f(ab) = f(a)f(b). An additive function is defined
analogously: if n = ab with a and b relatively prime, then f is additive if
f(n) = f(ab) = f(a) + f(b). One example of an additive function is log(n). A more
interesting number theoretic example is the function ω(n), which is used to denote the
number of distinct prime divisors of an integer n; similarly, the number of prime power
divisors of an integer n, denoted Ω(n), is also additive. Hardy and Ramanujan proved
that the ratio of the partial sums satisfies

lim_{x→∞} [∑_{n≤x} ω(n)] / [∑_{n≤x} Ω(n)] = 1,
and that (as will be discussed in section 2.3) the averages of both ω(n) and Ω(n) for
n ≤ x have order of magnitude log log(x) (see [11] or [29]). This in itself is a rather
surprising result, as it states that on average the prime factors of an integer occur
only to the first power.
This result was also one of the first results concerning the average value of an additive
arithmetic function.
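The log log(x) average can be observed by brute force. The sieve below is our own illustrative sketch; the cutoff x = 10⁵ is an arbitrary choice, and at this size the averages are still far from their limiting behavior:

```python
import math

def omega_counts(x):
    """Return (sum of omega(n), sum of Omega(n)) for 1 <= n <= x via a sieve."""
    omega = [0] * (x + 1)   # number of distinct prime factors
    Omega = [0] * (x + 1)   # number of prime factors counted with multiplicity
    for p in range(2, x + 1):
        if omega[p] == 0:  # p was never marked, so p is prime
            for m in range(p, x + 1, p):
                omega[m] += 1
                q = m
                while q % p == 0:  # count the full multiplicity of p in m
                    Omega[m] += 1
                    q //= p
    return sum(omega), sum(Omega)

x = 10**5
s_omega, s_Omega = omega_counts(x)
avg = s_omega / x
print(avg, math.log(math.log(x)))  # both are of size log log(x) ~ 2.4
```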
Note that

ω(n) = ∑_{p|n} 1

and that

Ω(n) = ∑_{p^α||n} α,

where p^α||n means that p^α | n but p^{α+1} does not divide n. With these definitions it is only
natural to ask about the similar functions
A∗(n) = ∑_{p|n} p

and

A(n) = ∑_{p^α||n} αp,
which are the sum of the distinct prime divisors of n and the sum of the prime divisors
of n weighted according to multiplicity, respectively. These functions, just like ω(n) and
Ω(n), are also additive and are intimately connected with them. It is therefore
somewhat surprising that over 50 years passed from the time of Hardy and Ramanujan's
work before these functions were studied in earnest. The theory of these functions is
also the story of a collaboration between an eminent European and a young Indian
mathematician, and as these functions have a central role in the following thesis we will
go into some detail about this collaboration, and its ultimate results.
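For concreteness, A(n) and A∗(n) are easy to compute by trial-division factorization; the sketch below (our own, not drawn from the Alladi-Erdős papers) also checks additivity on a coprime pair:

```python
def factorize(n):
    """Trial-division factorization: list of (prime, exponent) pairs."""
    factors = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            e = 0
            while n % d == 0:
                n //= d
                e += 1
            factors.append((d, e))
        d += 1
    if n > 1:
        factors.append((n, 1))  # the remaining cofactor is prime
    return factors

def A_star(n):  # sum of the distinct prime divisors of n
    return sum(p for p, _ in factorize(n))

def A(n):       # sum of the prime divisors of n, weighted by multiplicity
    return sum(p * e for p, e in factorize(n))

print(A(12), A_star(12))  # 12 = 2^2 * 3: A = 2+2+3 = 7, A* = 2+3 = 5
print(A(72), A(8) + A(9))  # additivity: gcd(8, 9) = 1, so A(72) = A(8) + A(9)
```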
The prolific Hungarian mathematician P. Erdős visited Calcutta in 1974 where he
met the theoretical physicist and educator Alladi Ramakrishnan. Ramakrishnan’s son,
Krishnaswami Alladi, had been investigating the functions A(n) and A∗ (n) (defined
below as the Alladi-Erdős functions) independently and had obtained several interesting
results, as well as raising many deeper questions which he was unable to answer.
Erdős, ever eager to collaborate with young mathematicians, rerouted his flight from
Calcutta to Australia to stop in Madras so that he could visit the young K. Alladi, who
was at the time still an undergraduate. After their discussions, many of which were
conducted while walking along the beach, Erdős and Alladi published several papers
which proved, amongst other things, that ∑_{n≤x} A(n) and ∑_{n≤x} A∗(n) both have order of
magnitude π²x²/(12 log(x)) [2] (the Alladi-Erdős collaborative encounter is also nicely recounted
in Chapter 1 of Bruce Schechter’s biography on Erdős entitled: My Brain is Open, [25]).
Note that the results of Alladi and Erdős further validate the observations from the
Hardy-Ramanujan Theorem that most integers do not have large prime power divisors.
Alladi and Erdős’s work also concerned itself with the largest prime factors of an
integer. Let P1(n) be the largest prime factor of n; then clearly

∑_{n≤x} P1(n) ≤ ∑_{n≤x} A(n);        (1–14)

however, it is a fact (first shown by Alladi and Erdős) that ∑_{n≤x} P1(n) also has order
of magnitude π²x²/(12 log(x)), which is somewhat surprising. In fact, this result shows that the
majority of the value of A(n) is accounted for by P1 (n) (see [2]). In a later paper (see [3])
Alladi and Erdős proceed to evaluate the sum
∑_{n≤x} (A(n) − P1(n) − ... − P_{k−1}(n)),

and (rather surprisingly) demonstrate that:

lim_{x→∞} [∑_{n≤x} (A(n) − P1(n) − ... − P_{k−1}(n))] / [∑_{n≤x} P_k(n)] = 1.        (1–15)
This result states that not only is the majority of the contribution of A(n) accounted for by
P1 (n), but the majority of the contribution of A(n) − P1 (n) is accounted for by P2 (n), i.e.
the second largest prime divisor, and so forth.
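The dominance of P1(n) inside A(n) can be glimpsed numerically. The following brute-force sketch is our own; the bound x = 20000 is arbitrary, and since the error term decays only like 1/log(x) the ratio is still visibly below 1 at this size:

```python
def largest_and_sum(n):
    """Return (P1(n), A(n)): largest prime factor and sum of prime factors
    counted with multiplicity, via trial division."""
    total, largest = 0, 0
    d = 2
    while d * d <= n:
        while n % d == 0:
            total += d
            largest = d
            n //= d
        d += 1
    if n > 1:
        total += n
        largest = n
    return largest, total

x = 20000
sum_P1 = sum_A = 0
for n in range(2, x + 1):
    p1, a = largest_and_sum(n)
    sum_P1 += p1
    sum_A += a

ratio = sum_P1 / sum_A
print(ratio)  # below 1, approaching 1 only at speed ~ 1/log(x)
```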
Another useful outcome of the Alladi-Erdős collaboration was that Erdős drew
Alladi's attention to the function Ψ(x, y), which is defined as the number of positive
integers n ≤ x whose prime factors are all at most y, i.e. P1(n) ≤ y. This function,
whose first published appearance is in the work of R. Rankin on
the differences between consecutive prime numbers, has been studied extensively by
many mathematicians. It plays an important role in the Alladi-Erdős papers, and we
will have reason to refer to its basic properties in this thesis as well. While the original
study of the function Ψ(x, y ) is typically attributed to the paper [23] by Rankin (with
contemporaneous investigations being conducted by A. Buchstab [6], K. Dickman [10]
and V. Ramaswami [22]) it is interesting that the study of Ψ(x, y ) (and related functions)
actually appears in the much earlier work of S. Ramanujan. As the story surrounding
this area of Ramanujan’s research is particularly fascinating, we will take a brief pause
to appreciate his life and the rediscovery of his work related to Ψ(x, y ).
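Before turning to that story, here is a direct-count sketch of Ψ(x, y) (our own illustration; n = 1 is counted, since it has no prime factor exceeding y):

```python
def largest_prime_factor(n):
    """Largest prime factor of n, with the convention P1(1) = 1."""
    largest = 1
    d = 2
    while d * d <= n:
        while n % d == 0:
            largest = d
            n //= d
        d += 1
    return max(largest, n) if n > 1 else largest

def psi(x, y):
    """Psi(x, y): number of n <= x all of whose prime factors are at most y."""
    return sum(1 for n in range(1, x + 1) if largest_prime_factor(n) <= y)

print(psi(100, 5))   # counts the 5-smooth numbers up to 100
print(psi(100, 100)) # every n <= 100 qualifies, so this is 100
```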
As was noted previously, Ramanujan was a largely self-taught Indian mathematician
who collaborated later in life with G.H. Hardy. Ramanujan had a tremendous love
for mathematics and would not only study it in his free time, but would also go on to
derive numerous identities which he recorded in his personal papers. Unfortunately,
Ramanujan was very poor, and his poverty made paper a precious commodity, so as
a result Ramanujan would often simply write down a formula without any reference to
how he had arrived at the result or (more importantly) how one could prove his claim.
Ramanujan filled his notes with literally thousands of identities which run the gamut
from simple geometric observations (which can be proven in a page or two) to absurdly
complicated identities whose proof requires hundreds of pages to establish. Sadly,
perhaps due to overwork, living in the foreign climate of Cambridge, England, or as
a result of general ill-health (typically attributed to amoebic dysentery, malnutrition,
or tuberculosis) Ramanujan died in 1920 at the age of 32. After Ramanujan passed
away, many of his personal papers were given to the University of Madras by his wife,
and the University passed a large number of Ramanujan’s notebooks to his friend and
collaborator G.H. Hardy.
After Hardy received Ramanujan’s papers, he and his contemporaries spent years
attempting to prove Ramanujan’s claims, and after Hardy’s death in 1947, many other
mathematicians resumed the work of proving Ramanujan’s identities. Most of the
posthumously published work of Ramanujan was already proven and well-known to
mathematicians by 1976, when the fabled ”lost notebook” of Ramanujan was discovered
by the American mathematician G. Andrews at the Wren Library at Trinity College,
Cambridge. Apparently, sometime between 1934 and 1947 G.H. Hardy handed a
number of Ramanujan’s manuscript pages to the English mathematician G.N. Watson,
who lost these pages in the sea of cluttered mathematical papers which occupied
his office. After Watson’s death, these pages were saved from incineration (but not
obscurity) by the mathematicians J.M. Whittaker and R. Rankin who had them stored
in the Wren Library. They remained there until Andrews caused a sensation in the
mathematical world when he located them in 1976. Andrews’ account of the story is
somewhat less apocryphal, and definitely less swashbuckling; apparently he had a
good idea of where to look for these lost notes in the library, although he was not certain
about their content or exact location.
Amongst the many topics studied by Ramanujan during the last years of his life
was the function Ψ(x, y ), as well as a related function called the Dickman Function.
Ramanujan managed to deduce many results concerning these functions decades
before the individuals for whom they are named; were it not for the misplacement
of his papers, these functions might well be named for him. Ramanujan's mathematical
abilities were truly amazing: of the 1600 or so identities contained in his lost notebook,
its editor B. Berndt speculates that fewer than five are incorrect. As an example of this self-taught
individual's insight, in 1930 K. Dickman re-derived a function related to Ψ(x, y) called the
Dickman function, denoted ρ(u), where u = log(x)/log(y), x ≥ y ≥ 2. The Dickman function is
related to Ψ(x, y) by the fact that, for each fixed u ≥ 1 as x → ∞,

Ψ(x, x^{1/u}) ∼ xρ(u),
and, in addition to this important connection, ρ(u) satisfies some rather surprising
identities. For instance, ρ(u) is continuous at u = 1, differentiable for u > 1, and for u > 1
satisfies the difference-differential equation

uρ′(u) + ρ(u − 1) = 0,

facts which were all known to de Bruijn [9]. De Bruijn used the difference-differential equation
to demonstrate that ρ(u) decreases very rapidly (see section 3.2 for explicit bounds).
Unknown to anyone, many of the results related to the Dickman function had already
been deduced by Ramanujan almost 20 years earlier (for more details concerning the
re-discovery of these interesting identities see [27]).
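On the interval [1, 2] the difference-differential equation, together with ρ(u) = 1 for 0 ≤ u ≤ 1, integrates to the closed form ρ(u) = 1 − log(u). The sketch below (our own; the cutoff x = 30000 is arbitrary, and the bounds are deliberately loose because the convergence of Ψ(x, x^{1/u})/x to ρ(u) is slow) checks both facts numerically:

```python
import math

def rho(u):
    """Dickman's function on [0, 2]: rho = 1 on [0, 1], and on (1, 2]
    the equation u*rho'(u) = -rho(u-1) = -1 integrates to 1 - log(u)."""
    if u <= 1:
        return 1.0
    return 1.0 - math.log(u)

def largest_prime_factor(n):
    largest = 1
    d = 2
    while d * d <= n:
        while n % d == 0:
            largest = d
            n //= d
        d += 1
    return max(largest, n) if n > 1 else largest

x = 30000
y = math.isqrt(x)  # y = x^(1/2), i.e. u = 2
smooth = sum(1 for n in range(1, x + 1) if largest_prime_factor(n) <= y)
# The density tends to rho(2) = 1 - log 2 ~ 0.307 as x grows (slowly).
print(smooth / x, rho(2))
```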
Seven years after Dickman published his work A. Buchstab derived a functional
equation for Ψ(x, y ), and two decades after Buchstab’s work, N.G. de Bruijn succeeded
in significantly improving on the results of Rankin, Ramaswami, and Dickman concerning
this function. For this reason Ψ(x, y ) is often referred to as the Buchstab-de Bruijn function, as opposed to the Ramanujan (or some other more appropriately named) function.
Around the time that Buchstab, Rankin, Dickman, and de Bruijn were investigating
Ψ(x, y ) (a period from about 1930-1960) the world entered what may be termed the
computer age. Computing machines began to outpace human calculating skills to the
point that in a very short period of time mathematicians had access to amounts of data
which were far in excess of what their predecessors had known (or even could have
known). In the present, most of us use the internet, a personal computer, or some sort
of mobile computing device on a daily basis, and it would be difficult for us to imagine
a world where digital communication did not exist. This relatively cheap, reliable, and
instantaneous form of communication has one major problem: simply put, it is difficult
to ensure that a message is transmitted in a secure fashion. To solve this problem,
many computer programmers and mathematicians have developed sophisticated means
of encrypting messages so that only the intended recipient can read the transmitted
message. These encryption techniques have become very sophisticated; however, most
rely upon one simple principle, namely, that it takes a relatively short amount of time for
a computer to multiply a given set of integers but it takes an extraordinarily large amount
of time to factor an integer into prime numbers. For example, some numbers used in
computer encryption take only fractions of a second to multiply, but require billions of
years using our best known factoring algorithms and supercomputers to factor into prime
numbers. Hence, numerical factorization is a very important application of mathematics
to our modern world.
In 1976, while Alladi and Erdős were conducting their theoretical investigations
into the functions A(n) and A∗ (n), D. Knuth and L. T. Pardo published a paper [14]
where they investigated the running time of what is perhaps the simplest way to factor
an integer algorithmically, namely, the algorithmic procedure described in the first
paragraph of this introduction (and we will call this method the Knuth-Pardo factorization
algorithm). This familiar procedure for factoring an integer is taught to most of us in
pre-calculus (albeit not in an algorithmic fashion) and is a very intuitive way to factor
an integer. However, it does raise the question: how fast is this procedure? Knuth and
Pardo essentially answer this question in [14].
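The procedure in question is ordinary trial division. The sketch below is our own illustration (not Knuth and Pardo's published code); it also reports the number of trial divisors attempted, since that count is the running-time quantity analyzed in [14]:

```python
def trial_division(n):
    """Factor n by trial division; return (prime factors with multiplicity,
    number of trial divisors attempted)."""
    factors = []
    tests = 0
    d = 2
    while d * d <= n:
        tests += 1
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains exceeds sqrt(n), so it is prime
    return factors, tests

factors, tests = trial_division(2 * 3 * 3 * 97)
print(factors)  # [2, 3, 3, 97]
```

Note how the cost is driven by the large prime factors: a smooth number is stripped down quickly, while a prime p forces roughly √p trial divisions.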
The connection between Ψ(x, y) and integer factorization algorithms is quite
transparent: if we could guarantee that an integer n ≤ x had all prime factors less
than or equal to y, i.e. P1(n) ≤ y ≤ x, then we need only check for prime divisors of
n which are less than or equal to y ≤ x (which would obviously decrease the number
of trial divisions, and hence the running time, of the Knuth-Pardo algorithm, provided this
y is appreciably smaller than x of course). We will have much more to say about the
connection between Ψ(x, y) and the running time of the Knuth-Pardo algorithm in
the body of the thesis. A similar observation holds if we know the largest prime factor
P1(n) of an integer n: we could simply apply the algorithm to m = n/P1(n) ≤ n,
which can now be factored faster using the Knuth-Pardo algorithm (as m ≤ n). By
letting Pk (n) denote the kth largest prime factor of an integer n, it is fairly clear that
we could decrease the running time of the Knuth-Pardo algorithm even further if we
had knowledge of not only P1(n), but also Pk(n), for k ≥ 2. For this reason Knuth and
Pardo investigate the mean and variance of Pk(n) in their paper theoretically, as well as
supplying extensive tables of how this affects the running time of their algorithm (see
[14]).
CHAPTER 2
THE DISTRIBUTION OF PRIMES AND PRIME FACTORS
2.1 Notation and Preliminary Observations
We begin by proving the classical result that there exist infinitely many primes in
a perhaps unfamiliar fashion. The following proof, due to Paul Erdős [8], is particularly
beautiful in that it not only shows the stronger result that ∑_p 1/p diverges, but uses the
underlying structure (or anatomy) of the integers to do so.

Theorem 2.1.1. ∑_p 1/p diverges.
Proof (Erdős): Let p1, p2, ... be the sequence of primes, listed in increasing order.
Assume, by way of contradiction, that ∑_p 1/p converges. Then there must exist a natural
number k such that

∑_{i≥k+1} 1/p_i < 1/2.

Call the primes p1, p2, ..., pk the small primes and p_{k+1}, p_{k+2}, ... the large primes. Let N
be an arbitrary natural number. Then

∑_{i≥k+1} [N/p_i] < N/2.

Now, let N_B be the number of positive integers m ≤ N which are divisible by at least one
large prime, and let N_S be the number of positive integers n ≤ N which have only small
prime divisors.

First we estimate N_B by noting that [N/p_i] counts the positive integers n ≤ N which are
multiples of p_i. Then

N_B ≤ ∑_{i≥k+1} [N/p_i] < N/2.

Now we estimate N_S by noting that every integer n ≤ N with only small prime factors
may be written in the form n = a_n b_n², where a_n is square-free. Every a_n must therefore be
the product of different small primes; hence, there are at most 2^k different square-free
parts. Furthermore, as b_n ≤ √n ≤ √N, we see that there are at most √N different
square parts, and so

N_S ≤ 2^k √N;

therefore,

N = N_S + N_B < N/2 + 2^k √N,

which holds for any N. Letting, say, N = 2^{2k+2} implies that 2^k √N = 2^k · 2^{k+1} = 2^{2k+1} = N/2.
Then for this choice of N

N = N_S + N_B < N/2 + N/2 = N,

which is the desired contradiction.
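The divergence asserted in Theorem 2.1.1 is extremely slow: the partial sums grow like log log(x). The following sketch (our own; the cutoffs are arbitrary) computes ∑_{p≤x} 1/p with a sieve and compares it with log log(x):

```python
import math

def prime_reciprocal_sum(x):
    """Sum of 1/p over primes p <= x, via the sieve of Eratosthenes."""
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0:2] = b"\x00\x00"
    total = 0.0
    for p in range(2, x + 1):
        if is_prime[p]:
            total += 1.0 / p
            for m in range(p * p, x + 1, p):
                is_prime[m] = 0
    return total

for x in (10**2, 10**3, 10**4, 10**5):
    print(x, prime_reciprocal_sum(x), math.log(math.log(x)))
```

The difference between the two columns stabilizes near a constant (Mertens' constant, about 0.2615), which is a glimpse of Mertens' theorem.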
Let ζ(s) be the Riemann zeta function; as was alluded to in the introduction, this
function plays a central role in analytic number theory. We will state several of the more
important results concerning ζ(s) which will allow us to deduce the later results of this
paper. We begin with the formal definition:
Definition 2.1.2. For ℜ(s) > 1 the Riemann Zeta function is defined as
ζ(s) = ∑_{n=1}^∞ 1/n^s.
The zeta function was first studied by Euler who derived the following product
formula, referred to by some authors as an analytic form of the fundamental theorem of
arithmetic. Euler noted that for s > 1
∑_{n=1}^∞ 1/n^s = ∏_p (1 − 1/p^s)^{−1},        (2–1)
and in fact that identity holds for all complex s such that ℜ(s) > 1, [7]. This product
formula is the key to understanding how one may deduce properties of the prime
numbers from those of ζ(s). By simply taking the logarithm of ζ(s), for ℜ(s) > 1, and
noting that the Taylor series expansion of − log(1 − x) is valid for x ∈ C such that |x| < 1,
we easily derive the following identity,
log ζ(s) = − log ∏_p (1 − 1/p^s) = − ∑_p log(1 − p^{−s})        (2–2)

= ∑_p 1/p^s + ∑_p 1/(2p^{2s}) + ... = ∑_p 1/p^s + h(s),

where h(s) is bounded for ℜ(s) > 1/2, and identity (2–2) holds for ℜ(s) > 1. Hence,
the logarithm of the zeta function essentially gives us the sum of the reciprocals of the
primes, raised to the s power.
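As a quick numerical sanity check of (2–1) (our own sketch, not part of the thesis), the product truncated at p ≤ 10⁴ already matches ζ(2) = π²/6 to roughly the size of the neglected tail:

```python
import math

def is_prime(n):
    """Simple trial-division primality test."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

s = 2
P = 10**4  # truncate the Euler product at P; the tail error here is O(1/P)
product = 1.0
for p in range(2, P + 1):
    if is_prime(p):
        product *= 1.0 / (1.0 - p**-s)

print(product, math.pi**2 / 6)  # truncated product vs zeta(2)
```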
The key to unraveling the asymptotic distribution of the prime numbers will be to
apply some sort of inversion technique to log ζ(s); ideally we would like to evaluate the
finite sums ∑_{p≤x} 1/p^s at s = 1, but s = 1 is not in the domain of absolute convergence of
log ζ(s), so we cannot simply set s = 1 in the above identity and conclude any meaningful
result. Instead, motivated by the applications of Fourier inversion, we may consider
ζ(s) to be a function of a complex variable s ∈ C; then, applying a version of Fourier's
inversion theorem called the Mellin inversion theorem, we may evaluate finite sums such
as ∑_{p≤x} 1. This sum is of such central importance to number theory that it warrants its
own definition as a function:
Definition 2.1.3. We call the sum

π(x) = ∑_{p≤x} 1

the prime counting function; that is, π(x) is the number of primes less than or equal to x,
x ∈ R. Furthermore, define

Π(x) = ∑_{n=1}^∞ π(x^{1/n}) / n
to be the special prime counting function.
Note that, as lim_{n→∞} x^{1/n} = 1 and π(1) = 0, the special prime counting function is actually a
finite sum.
It has been discovered, in the personal papers of Euler (who was the first to note
the connection between the prime numbers and the zeta function), that he attempted to
extend ζ(s) to a function of a complex variable s. Euler, however, was only partially
successful in this attempt ([7]). As an interesting historical anecdote, it is now known
that the Euler-Maclaurin summation formula (which Euler was certainly aware of)
can be used to analytically continue the function ζ(s) to the entire complex plane
(see [30]); unfortunately, this marks one of those rare instances when Euler did not
exhaust his mathematical talents to attack the problem he was addressing. However,
a straightforward application of Stieltjes integration will allow us to analytically continue
ζ(s) beyond the line ℜ(s) = 1. Consider the integral representation
∫
∞
ζ(s) =
1−
d[t]
=s
ts
∫
1
∞
[t]
dt.
t s+1
Upon replacing [t] = t − {t} we then obtain
\[
s \int_1^{\infty} \frac{[t]}{t^{s+1}} \, dt = s \int_1^{\infty} \left( \frac{1}{t^s} - \frac{\{t\}}{t^{s+1}} \right) dt = \frac{s}{s-1} - s \int_1^{\infty} \frac{\{t\}}{t^{s+1}} \, dt, \tag{2–3}
\]

and as {t} is bounded, it can easily be seen that the final integral in (2-3) converges for ℜ(s) > 0. Therefore, (2-3) gives the analytic continuation of the function ζ(s) to the region ℜ(s) > 0, s ≠ 1 [30]. About 50 years after Euler's time, Riemann succeeded
in analytically continuing ζ(s), s ̸= 1, to the entire complex plane, and in his seminal
memoir supplied several proofs of this continuation. Although it will not be necessary for
the purposes of this paper, it is interesting to see these complex analytic properties of
ζ(s). If we let Γ(s) be the gamma function of Euler and set
\[
\xi(s) = \frac{1}{2} s(s-1) \pi^{-s/2} \Gamma\!\left(\frac{s}{2}\right) \zeta(s), \tag{2–4}
\]

then Riemann demonstrated that

\[
\xi(s) = \xi(1 - s), \tag{2–5}
\]

and thus from (2-4) and (2-5) ζ(s) is defined for all complex numbers s such that ℜ(s) < 0 or ℜ(s) > 1. Equation (2-3) supplies the analytic continuation for the remaining complex numbers s such that 0 ≤ ℜ(s) ≤ 1, s ≠ 1; these complex numbers constitute what is referred to as the critical strip of the function ζ(s). For several proofs of (2-3), (2-4), and (2-5) consult [29], [7], and [30].
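To make (2-3) concrete, the sketch below (my own numerical experiment; the function name is hypothetical) evaluates ζ(s) = s/(s−1) − s∫₁^∞ {t}/t^{s+1} dt by integrating each unit interval in closed form, and checks it against ζ(2) = π²/6 and the known value ζ(1/2) ≈ −1.46035:

```python
from math import pi

def zeta_via_continuation(s, N=100_000):
    """zeta(s) = s/(s-1) - s * int_1^inf {t}/t^(s+1) dt, valid for Re(s) > 0, s != 1.
    On [n, n+1) we have {t} = t - n, so each piece integrates in closed form;
    the tail beyond N is approximated using the mean value 1/2 of {t}."""
    integral = 0.0
    for n in range(1, N):
        a, b = n, n + 1
        # int_a^b (t - n) t^(-s-1) dt = [t^(1-s)/(1-s)]_a^b + (n/s)[t^(-s)]_a^b
        integral += (b ** (1 - s) - a ** (1 - s)) / (1 - s) + (n / s) * (b ** (-s) - a ** (-s))
    integral += N ** (-s) / (2 * s)  # tail correction
    return s / (s - 1) - s * integral

print(zeta_via_continuation(2.0), zeta_via_continuation(0.5))
```

Note that s = 0.5 lies outside the half-plane of convergence of the Dirichlet series, so this computation genuinely exercises the continuation (2-3).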
Riemann also stated, without proof, that the function ζ(s) satisfied an infinite
product over the zeros of the zeta function. Although he was correct, this product
representation was not proved rigorously until several decades later as a consequence
of Hadamard’s proof of his more general factorization theorem, and for this reason
identity (2-6) is typically referred to as the Hadamard product. For all s ≠ 1, there exists a constant b such that:

\[
\zeta(s) = \frac{e^{bs}}{2(s-1)\Gamma\!\left(\frac{s}{2}+1\right)} \prod_{\rho} \left(1 - \frac{s}{\rho}\right) e^{s/\rho}, \tag{2–6}
\]

where the ρ are the complex zeros of ζ(s), i.e. the zeros of ζ(s) within the critical strip.
It follows from (2-3), Riemann’s functional equation (2-5), and the properties of Γ(s)
that (s −1)ζ(s) is an analytic function for all s ∈ C; hence, ζ(s) is a meromorphic function
in C with a sole singularity at s = 1. However, in order to answer questions concerning
the prime numbers, we must consider the principal branch of the function log ζ(s), which
by the above observations is now a function of the complex variable s, with a branch point singularity extending to the left of 1 in a zero-free region of ζ(s). Furthermore, by
applying Stieltjes integration we may derive the useful identity for ℜ(s) > 1:
\[
\log \zeta(s) = s \int_2^{\infty} \frac{\Pi(x)}{x^{s+1}} \, dx, \tag{2–7}
\]

which justifies the need to define two different prime counting functions, as Π(x) is more easily handled using analytic techniques than π(x). The two functions are nevertheless very close to one another, as

\[
|\Pi(x) - \pi(x)| \le \sqrt{x} \log^A(x)
\]

for some A ≥ 1. Given that π(x) is of the order of magnitude of x/log(x), this difference is actually quite small.
One difficulty with log ζ(s) as a complex function is that it may have other singularities,
which will correspond to the zeros of ζ(s). For this and other reasons, the zeros of ζ(s)
are of great interest to number theorists. The zeros of the Riemann Zeta function are
among the most intriguing objects in all of mathematics; however, for the purpose of
proving the prime number theorem we need only show that ζ(s) ̸= 0 for ℜ(s) = 1. The
following theorem, first proved by de la Vallée-Poussin, while very interesting, is therefore more than is required for the proof of the Prime Number Theorem. This result, when combined with suitable bounds for ζ′(s)/ζ(s), will be sufficient to prove a much stronger form of the prime number theorem. The next identity follows from taking the logarithmic derivative of equation (2-6), and is valid for all s ∈ C with s ≠ 1, s ≠ −2n − 2 for n ∈ Z⁺, and s ≠ ρ (ρ a complex zero of ζ(s)):

\[
\frac{\zeta'(s)}{\zeta(s)} = \log(2\pi) - \frac{1}{s-1} - \frac{1}{2} \frac{\Gamma'\!\left(\frac{s}{2}+1\right)}{\Gamma\!\left(\frac{s}{2}+1\right)} + \sum_{\rho} \left( \frac{1}{s-\rho} + \frac{1}{\rho} \right). \tag{2–8}
\]
It is clear that equation (2-8) holds for ℜ(s) > 1; that it holds for a general complex s ≠ 1, −2n − 2, ρ follows from Mittag-Leffler's theorem. The restriction that s ≠ −2n − 2
comes from the poles of the logarithmic derivative of the Gamma function.
Theorem 2.1.4. Let s = σ + iτ with σ, τ ∈ R. Then there exists an absolute constant c > 0 such that ζ(s) has no zeros in the region

\[
\sigma \ge 1 - \frac{c}{\log(|\tau| + 2)}.
\]

Proof: For σ > 1 we have

\[
-\Re\!\left( \frac{\zeta'(s)}{\zeta(s)} \right) = \sum_{p,m} \frac{\log(p)}{p^{m\sigma}} \cos(mt \log(p)).
\]

Hence, for σ > 1 and any real γ,

\[
-3 \frac{\zeta'(\sigma)}{\zeta(\sigma)} - 4\Re\!\left( \frac{\zeta'(\sigma + i\gamma)}{\zeta(\sigma + i\gamma)} \right) - \Re\!\left( \frac{\zeta'(\sigma + 2i\gamma)}{\zeta(\sigma + 2i\gamma)} \right) = \sum_{p,m} \frac{\log(p)}{p^{m\sigma}} \left[ 3 + 4\cos(m\gamma \log(p)) + \cos(2m\gamma \log(p)) \right], \tag{2–9}
\]
and by the simple trigonometric identity 3 + 4cos θ + cos 2θ = 2(1 + cos θ)² ≥ 0, it follows that (2-9) is greater than or equal to 0. Now

\[
-\frac{\zeta'(\sigma)}{\zeta(\sigma)} < \frac{1}{\sigma - 1} + O(1), \tag{2–10}
\]

and by (2-8),

\[
-\frac{\zeta'(s)}{\zeta(s)} = O(\log(t)) - \sum_{\rho} \left( \frac{1}{s - \rho} + \frac{1}{\rho} \right),
\]
where ρ = β + i γ runs through the complex zeros of ζ(s). Hence,
\[
-\Re\!\left( \frac{\zeta'(s)}{\zeta(s)} \right) = O(\log(t)) - \sum_{\rho} \left( \frac{\sigma - \beta}{(\sigma - \beta)^2 + (t - \gamma)^2} + \frac{\beta}{\beta^2 + \gamma^2} \right).
\]

Now, as every term in the last sum is positive, it follows that

\[
-\Re\!\left( \frac{\zeta'(s)}{\zeta(s)} \right) \ll \log(t), \tag{2–11}
\]
and, if β + i γ is a particular zero of ζ(s), then
\[
-\Re\!\left( \frac{\zeta'(\sigma + i\gamma)}{\zeta(\sigma + i\gamma)} \right) < O(\log \gamma) - \frac{1}{\sigma - \beta}. \tag{2–12}
\]
Now, from the comments following (2-9), together with (2-10), (2-11), and (2-12), we may obtain the inequality

\[
\frac{3}{\sigma - 1} - \frac{4}{\sigma - \beta} + O(\log \gamma) \ge 0,
\]

or say

\[
\frac{3}{\sigma - 1} - \frac{4}{\sigma - \beta} \ge -c_1 \log \gamma.
\]

Solving for β yields

\[
1 - \beta \ge \frac{1 - (\sigma - 1) c_1 \log \gamma}{\dfrac{3}{\sigma - 1} + c_1 \log \gamma}.
\]

Note that the right hand side is positive if σ − 1 = 1/(2c₁ log γ), and thus

\[
1 - \beta \ge \frac{c_2}{\log \gamma},
\]

which is the desired result.
In particular, Theorem 2.1.4 implies that log ζ(s) is analytic in a region extending beyond the half-plane ℜ(s) ≥ 1, while avoiding the singularity at s = 1. Although one can deduce the prime number theorem from these properties alone, the function log ζ(s) will contain a branch point singularity at s = 1 as well as at every ρ for which ζ(ρ) = 0. In general, it is more difficult, analytically, to handle branch point singularities than it is to analyze a function without such singularities, such as a function with removable singularities (although in this case Theorem 2.2.1 actually applies if we are dealing with a logarithmic or monomial singularity). Therefore, following the classical approach to proving the Prime Number Theorem, we may avoid this difficulty by differentiating log ζ(s) to obtain ζ′(s)/ζ(s), which is a single-valued function whose only singularities are simple poles (as can be seen from equation (2-8)), and hence a meromorphic function in C. This motivates the following definition:
Definition 2.1.5. Let n be an integer with n = p^α, where p is a prime number; then define Λ(n) = log(p). If an integer n is not the power of a prime, then Λ(n) = 0. The function Λ(n) is called the von Mangoldt function. Furthermore, define

\[
\psi(x) = \sum_{n \le x} \Lambda(n),
\]

called the Tchebyschev function.
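For concreteness, here is a small sketch (my own; the helper names are hypothetical) of Λ(n) and ψ(x) computed directly from Definition 2.1.5; it also lets one check the identity Π(x) = ∑_{n≤x} Λ(n)/log(n) of (2-14) below numerically:

```python
from math import log

def mangoldt(n):
    """Lambda(n) = log p if n = p^alpha for a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = None
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            p = d          # smallest prime factor of n
            break
    if p is None:
        return log(n)      # n itself is prime
    m = n
    while m % p == 0:
        m //= p
    return log(p) if m == 1 else 0.0

def psi(x):
    """Tchebyschev's function psi(x) = sum_{n <= x} Lambda(n)."""
    return sum(mangoldt(n) for n in range(2, x + 1))

print(psi(1000) / 1000)   # close to 1, in line with the Prime Number Theorem
```

Since Λ(p^k)/log(p^k) = 1/k, the sum ∑_{n≤x} Λ(n)/log(n) reproduces Π(x) exactly.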
The motivation for these definitions becomes apparent when we consider the Dirichlet series generated by Λ(n); that is, for ℜ(s) > 1,

\[
\sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} = -\frac{\zeta'(s)}{\zeta(s)}, \tag{2–13}
\]

which is simply the negative of the logarithmic derivative of ζ(s). The von Mangoldt function is closely related to the function Π(x), as

\[
\Pi(x) = \sum_{n \le x} \frac{\Lambda(n)}{\log(n)}, \tag{2–14}
\]

and applying formula (1-10) (the Abel summation formula) gives the identity
\[
\Pi(x) = \frac{\psi(x)}{\log(x)} + \int_2^x \frac{\psi(t)}{t \log^2(t)} \, dt, \tag{2–15}
\]

relating the Tchebyschev function ψ(x) to Π(x). Note that from equation (2-15) the ratio Π(x) log(x)/x approaches a limit as x → ∞ if and only if ψ(x)/x approaches the same limit. By a theorem of Tchebyschev's (mentioned in the introduction), if this limit exists it must equal 1. The Prime Number Theorem is the statement that this limit does in fact exist; however, rather than dealing with the function Π(x) directly, we will prove this statement in the elementarily equivalent form

\[
\lim_{x \to \infty} \frac{\psi(x)}{x} = 1. \tag{2–16}
\]
It will be useful to apply Abel summation to the Dirichlet series generated by Λ(n) to obtain the following equalities, which we will use in the proof of the Prime Number Theorem, and which hold for all complex s such that ℜ(s) > 1:

\[
-\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} = s \int_1^{\infty} \frac{\psi(t)}{t^{s+1}} \, dt; \tag{2–17}
\]

furthermore, by Theorem 2.1.4 this function may be analytically continued to a neighborhood of any point on the line ℜ(s) = 1, s ≠ 1 (as the ζ(s) in the denominator will be nonzero).
2.2 The Prime Number Theorem

First I will remind the reader of Landau's big-O and little-o notation. Let g(x) and h(x) > 0 be two real valued functions. If there exists some positive constant C such that for all sufficiently large x

\[
|g(x)| \le C h(x),
\]

then we say that g(x) = O(h(x)), or g(x) ≪ h(x). Similarly, if

\[
\lim_{x \to \infty} \frac{g(x)}{h(x)} = 0,
\]

then we say that g(x) = o(h(x)). If

\[
\lim_{x \to \infty} \frac{g(x)}{h(x)} = 1,
\]

then we say that g(x) and h(x) are asymptotically equal, and denote this by g(x) ∼ h(x). Finally, if there exist positive constants c₁ and c₂ such that

\[
c_1 h(x) \le g(x) \le c_2 h(x),
\]

then we write g(x) ≍ h(x).
We now require only one additional tool to prove the Prime Number Theorem, and that is an upper bound on the function −ζ′(s)/ζ(s) in the zero-free region of ζ(s) supplied by Theorem 2.1.4. Equation (2-11) gives a satisfactory upper bound for the real part of this function; however, it is a fact (which will be necessary to obtain later estimates) that for s = σ + it where σ ≥ 1 − c/log(t) and t ≥ 3:

\[
-\frac{\zeta'(s)}{\zeta(s)} = O(\log(t)), \tag{2–18}
\]

and in fact this result can be improved to O(log^{2/3}(t) (log log(t))^{1/3}) by using the methods of I. M. Vinogradov [30]. We may now proceed to prove the Prime Number Theorem.
The first proof of the "strong form" of the Prime Number Theorem (that is, a proof of the Prime Number Theorem with a nontrivial error term) was supplied by the Baron de la Vallée-Poussin. This classical proof also closely parallels the proof of Theorem 3.1.7 in Section 3.1 (although Theorem 3.1.7 has a branch point singularity, which requires greater care). As this proof is so well known we omit some details; however, a full justification of any comments can be found in [4], [7], [29], or [30].
Theorem 2.2.1. (de la Vallée-Poussin) There exists a constant h > 0 such that

\[
\psi(x) = x + O\!\left( x e^{-h \sqrt{\log(x)}} \right).
\]

Proof: Recall that Perron's formula (equation (1-13)) gives for κ > 1:

\[
\psi(x) = \frac{1}{2\pi i} \int_{\kappa - i\infty}^{\kappa + i\infty} -\frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s} \, ds.
\]
We now deform the straight line contour as follows: fix T and let a = 1 + c/log(T) and b = 1 − c/log(T), with c the constant in Theorem 2.1.4. Then deform the path of integration between a − iT and a + iT to go horizontally from a − iT to b − iT, vertically from b − iT to b + iT, and horizontally from b + iT to a + iT. Taking into account the simple pole at s = 1 with residue 1 of −ζ′(s)/ζ(s), we obtain:

\[
\frac{1}{2\pi i} \int_{a - iT}^{a + iT} -\frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s} \, ds = x + \frac{1}{2\pi i} \left( \int_{a - iT}^{b - iT} + \int_{b - iT}^{b + iT} + \int_{b + iT}^{a + iT} \right) = x + I_1 + I_2 + I_3;
\]

consequently,

\[
\psi(x) = x + \frac{1}{2\pi i} \left( \int_{a - i\infty}^{a - iT} + \int_{a + iT}^{a + i\infty} \right) \left( -\frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s} \right) ds + I_1 + I_2 + I_3 = x + I_4 + I_5 + I_1 + I_2 + I_3 = x + R(x).
\]
It may then be demonstrated using the bounds of equation (2-18) that the following estimates hold:

\[
I_1 + I_3 = O\!\left( \frac{x^a \log(T)}{T} \right), \qquad I_2 = O\!\left( x^b \log^2(T) \right), \qquad I_4 + I_5 = O\!\left( \frac{x^a}{T} \cdot \frac{1}{a-1} + x^{1-a} \log(x) \log(T) \right),
\]

with the dominant terms arising from (x^a log(T))/T and x^b log²(T). To balance these effects we set:

\[
\frac{x^a \log(T)}{T} = x^b \log^2(T),
\]

that is,

\[
a - b = \frac{\log(T)}{\log(x)}.
\]
Solving for T we obtain, for some constant c₁ > 0:

\[
T = e^{c_1 \sqrt{\log(x)}},
\]

thus

\[
R(x) = O\!\left( x^b \log^2(T) \right) = O\!\left( x e^{-c_2 \log^{1/2}(x) + c_3 \log\log(x)} \right) = O\!\left( x e^{-h \sqrt{\log(x)}} \right).
\]

Therefore, we obtain the estimate that for some h > 0, ψ(x) = x + O(x e^{−h log^{1/2}(x)}), and in particular:

\[
\lim_{x \to \infty} \frac{\psi(x)}{x} = 1;
\]

thus ψ(x) ∼ x, implying that π(x) ∼ x/log(x).

Applying partial summation to the results of the above theorem we obtain the immediate corollary:

Corollary 2.2.2.

\[
\pi(x) = \int_2^x \frac{dt}{\log(t)} + O\!\left( x e^{-h \sqrt{\log(x)}} \right).
\]

It is also worth noting that the above corollary, due originally to de la Vallée-Poussin, was the best estimate for π(x) for over thirty years.
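As a numerical illustration of Corollary 2.2.2 (my own sketch; the function names are hypothetical), one can compare a sieve count of π(10⁵) with a Simpson-rule approximation of ∫₂^x dt/log(t); the integral exceeds π(x) slightly at this range, as expected:

```python
from math import log

def primes_upto(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(sieve[p * p :: p])
    return [i for i, flag in enumerate(sieve) if flag]

def li_from_2(x, steps=100_000):
    """Composite Simpson approximation of int_2^x dt/log(t)."""
    if steps % 2:
        steps += 1
    h = (x - 2) / steps
    total = 1 / log(2) + 1 / log(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) / log(2 + i * h)
    return total * h / 3

x = 10 ** 5
pi_x = len(primes_upto(x))   # 9592
approx = li_from_2(x)
print(pi_x, approx)
```

The discrepancy at x = 10⁵ is a few dozen, far smaller than x/log(x) ≈ 8686, in line with the corollary.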
It should be somewhat obvious from the above proof that the error term in the Prime Number Theorem is directly related to how far we may enlarge the zero-free region of ζ(s). Some notable improvements include those of Littlewood, who in 1922 demonstrated that there exists a k > 0 such that ζ(s) ≠ 0 for any s = σ + it where

\[
\sigma > 1 - k \, \frac{\log\log(|t| + 3)}{\log(|t| + 3)}.
\]

This corresponds to an improved estimate of

\[
\pi(x) = \int_2^x \frac{dt}{\log(t)} + O\!\left( x e^{-\alpha \sqrt{\log(x) \log\log(x)}} \right);
\]

the proof can be found in [30], although it is significantly deeper than Theorem 2.1.4.
Still better estimates are known, although the results are very difficult to establish. The best estimate known to date (see [29]) is

\[
\pi(x) = \int_2^x \frac{dt}{\log(t)} + O\!\left( x e^{-\beta \log^{3/5}(x) \log\log^{-1/5}(x)} \right),
\]

and is obtained using the zero-free region of Korobov and Vinogradov. This discussion should give one the impression that improving the zero-free region of ζ(s) appears to be a very difficult problem. For example, it is still unknown whether there exists a single ϵ > 0 such that ζ(s) ≠ 0 for ℜ(s) > 1 − ϵ.

It should also be noted that if θ is an upper bound for the real parts of the zeros of ζ(s), then

\[
\pi(x) = \int_2^x \frac{dt}{\log(t)} + O\!\left( x^{\theta} \log(x) \right).
\]

However, this formula is worthless if θ = 1, and (as the above comments note) thus far every zero-free region which has been established for ζ(s) cannot guarantee that we may choose θ = 1 − ϵ for any ϵ > 0.
Having demonstrated the proof of the Prime Number Theorem along classical
lines, I will now state the 1954 theorem of Delange alluded to in the introduction. This
result itself utilizes very careful estimates of ζ(s) which are much stronger than what is
required to prove the Prime Number Theorem; therefore, Delange's theorem is hardly the easiest way in which one could conclude the Prime Number Theorem from the properties of ζ(s). But Delange's theorem will allow us to evaluate the finite sums of a much broader class of functions than Λ(n), especially additive functions. Delange was actually motivated to establish his general theorem in order to estimate the moments of certain additive functions, and in doing so to give a proof of the Erdős-Kac theorem by the
method of moments. However, we are going to use this theorem to study the moments
of Λ(n). Of particular interest will be the function Λ²(n) which arises in the Selberg
asymptotic formula, used by Erdős and Selberg to give the first elementary proofs
of the prime number theorem. It should also be noted that Delange’s theorem is an
exceedingly deep result which relies upon complicated methods of contour integration,
and can in many ways be viewed as a generalization of the Sathe-Selberg method (to be
outlined in greater detail shortly).
As Delange’s theorem applies to Dirichlet series which may have branch point
singularities (at s=a, say) it is necessary to deform the path of integration beyond the
line ℜ(s) = a in some zero-free region of the series and evaluate so-called Hankel
contours. These contours are formed by the circle |s − a| = r excluding the point
s = a − r , together with the line (δ, a − r ] (for some 0 < δ < a − r + ϵ) traced out
twice, with respective arguments +π and −π. These contours are also intimately related
to the function Γ(s) (see Chapter II.5 in [29] for more on this connection); furthermore,
one may wish to consult Theorem 3.1.8 in Section 3.1, where we explicitly calculate the
Hankel contour of a branch point singularity.
While Delange’s theorem will be an indispensable tool for the remainder of
this thesis, the reader should note that its general applicability follows from difficult
properties (such as the evaluation of Hankel contours around branch-point singularities).
We include this theorem to allow for greater continuity of the arguments in the following theorems, as we will have to evaluate monomial singularities such as 1/(s − 1)², as well as logarithmic singularities such as log(1/(s − 1)). The evaluation of general monomial
singularities was already known to Selberg; it is the evaluation of mixed monomial and
logarithmic singularities which was Delange’s major contribution to the theory. Hence,
we stress that this theorem is a highly nontrivial application of the Selberg-Delange method (for a full proof, and a more thorough discussion of the Selberg-Delange method,
consult Chapters II.5 and II.7 of [29]).
Theorem 2.2.3. Let

\[
F(s) := \sum_{n=1}^{\infty} \frac{b_n}{n^s}
\]

be a Dirichlet series with non-negative coefficients, converging for σ > a > 0. Suppose that F(s) is holomorphic at all points lying on the line σ = a other than s = a and that, in the neighborhood of this point and for σ > a, we have

\[
F(s) = (s - a)^{-\omega - 1} \sum_{j=0}^{q} g_j(s) \log^j\!\left( \frac{1}{s - a} \right) + g(s),
\]

where ω is some real number, g_j(s) and g(s) are functions holomorphic at s = a, the number g_q(a) being nonzero. If ω is a non-negative integer, then we have:

\[
B(x) := \sum_{n \le x} b_n \sim \frac{g_q(a)}{a \, \Gamma(\omega + 1)} \, x^a \log^{\omega}(x) \, (\log\log(x))^q,
\]

where Γ(s) is Euler's Gamma function. If ω = −m − 1 for some non-negative integer m and if q ≥ 1, then:

\[
B(x) \sim (-1)^m m! \, \frac{q \, g_q(a)}{a} \, x^a \log^{-m-1}(x) \, (\log\log(x))^{q-1}.
\]
Note that this theorem applies so long as the function is holomorphic at all points on the line σ = a, s ≠ a. It is a subtle point, but worth noting, that this does not imply that the function need be holomorphic in any open neighborhood of the entire line. For our purposes we will be most interested in applying the Selberg-Delange method to various manifestations of log ζ(s), and at present it is not known for any constant ϵ > 0 whether log ζ(s) is holomorphic in the region ℜ(s) = σ > 1 − ϵ, s ≠ 1; however, it can be shown that log ζ(s) is holomorphic at every point on the line σ = 1, s ≠ 1.

With this theorem in hand, the prime number theorem follows easily from the known properties of ζ(s), the Riemann Zeta Function. As ζ(s) is absolutely convergent in σ > 1, has a simple pole at s = 1 with residue 1, and is holomorphic and nonzero at all points lying on the line σ = 1 for s ≠ 1, we may apply the Selberg-Delange method to −ζ′(s)/ζ(s), which is also an absolutely convergent Dirichlet series in σ > 1, has a simple pole at s = 1 with residue 1, has non-negative coefficients, and is holomorphic at all points lying on the line σ = 1, s ≠ 1 (for a more detailed discussion of these facts see [7] or [29]). This immediately implies the prime number theorem in the form ψ(x) = x + o(x). Of course, as was shown above, there is no need to appeal to such a deep theorem for such a result.
Deriving the prime number theorem via the function ψ(x) is the classical approach
to solving this problem, and was used by both Hadamard and de la Vallée Poussin in
their original proofs of this theorem. In the following discussion, we will take a slightly
different approach from this traditional way of proving the prime number theorem.
This approach is implicit in the work of Hadamard, de la Vallée-Poussin, Selberg, and
Delange (to name a few), although I have never seen the result stressed in the published
literature.
Let ψ₂(x) = ∑_{n≤x} Λ²(n). If we define

\[
\Pi_2(x) = \sum_{n \le x} \frac{\Lambda^2(n)}{\log^2(n)}, \tag{2–19}
\]

then we may view the above sum as a Stieltjes integral, and applying the integration by parts formula yields

\[
\Pi_2(x) = \int_2^x \frac{d\psi_2(t)}{\log^2(t)} = \frac{\psi_2(x)}{\log^2(x)} - \frac{\psi_2(2)}{\log^2(2)} + 2 \int_2^x \frac{\psi_2(t)}{t \log^3(t)} \, dt. \tag{2–20}
\]
The second term in equation (2-20) is easily seen to be

\[
\frac{\psi_2(2)}{\log^2(2)} = \frac{\log^2(2)}{\log^2(2)} = 1.
\]

By a classical result due to Tchebyschev ([7]), ψ(x) = O(x), implying that

\[
\psi_2(x) = \sum_{n \le x} \Lambda^2(n) \le \log(x) \sum_{n \le x} \Lambda(n) = \psi(x) \log(x) = O(x \log(x)).
\]
Using Tchebyschev's estimate we may conclude that the third term in equation (2-20) is thus

\[
2 \int_2^x \frac{\psi_2(t)}{t \log^3(t)} \, dt = O\!\left( \int_2^x \frac{t \log(t)}{t \log^3(t)} \, dt \right) = O\!\left( \int_2^x \frac{dt}{\log^2(t)} \right) = O\!\left( \frac{x}{\log^2(x)} \right), \tag{2–21}
\]

obtained by a further application of the integration by parts formula to ∫₂^x dt/log²(t); hence, taking into account (2-20) and the above estimates we may infer that:
\[
\Pi_2(x) = \frac{\psi_2(x)}{\log^2(x)} + O\!\left( \frac{x}{\log^2(x)} \right) + O(1) = \frac{\psi_2(x)}{\log^2(x)} + o\!\left( \frac{x}{\log(x)} \right). \tag{2–22}
\]
Noting that the special prime counting function Π(x) is very close to Π₂(x) (as Π(x) − Π₂(x) ≤ (√x log(x))/4), if we could show that ψ₂(x) = x log(x) + o(x log(x)) it would follow that

\[
\Pi_2(x) = \frac{x}{\log(x)} + o\!\left( \frac{x}{\log(x)} \right), \quad \text{and hence} \quad \Pi(x) = \frac{x}{\log(x)} + o\!\left( \frac{x}{\log(x)} \right).
\]
We begin by differentiating

\[
-\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} = \frac{1}{s-1} + \cdots
\]

to obtain:

\[
\frac{\zeta''(s)}{\zeta(s)} - \left( \frac{\zeta'(s)}{\zeta(s)} \right)^2 = \sum_{n=1}^{\infty} \frac{\Lambda(n) \log(n)}{n^s} = \frac{1}{(s-1)^2} + O(1). \tag{2–23}
\]

Moreover,

\[
\sum_{n=1}^{\infty} \frac{\Lambda(n) \log(n)}{n^s} = \sum_{n=1}^{\infty} \frac{\Lambda^2(n)}{n^s} + h(s), \tag{2–24}
\]

where h(s) is analytic for ℜ(s) > 1/2.
By applying partial summation to this Dirichlet series for ℜ(s) > 1 we find that:

\[
\frac{\zeta''(s)}{\zeta(s)} - \left( \frac{\zeta'(s)}{\zeta(s)} \right)^2 - h(s) = \sum_{n=1}^{\infty} \frac{\Lambda^2(n)}{n^s} = s \int_2^{\infty} \frac{\psi_2(x)}{x^{s+1}} \, dx, \tag{2–25}
\]
and as the Dirichlet series ζ″(s)/ζ(s) − (ζ′(s)/ζ(s))² has non-negative coefficients, is holomorphic at all points lying on the line ℜ(s) = 1, s ≠ 1, with a sole singularity at s = 1 of multiplicity 2 and leading coefficient 1, we may invoke Theorem 2.2.3 (Delange's Theorem) to conclude that:

\[
\psi_2(x) = x \log(x) + o(x \log(x)); \tag{2–26}
\]
inserting equation (2-26) into equation (2-22) yields

\[
\Pi_2(x) = \frac{x}{\log(x)} + o\!\left( \frac{x}{\log(x)} \right),
\]

and therefore,

\[
\pi(x) = \frac{x}{\log(x)} + o\!\left( \frac{x}{\log(x)} \right),
\]

which is the Prime Number Theorem. It is again stressed that we need not approach the problem using the Selberg-Delange method, as the proof of Theorem 2.2.1 shows that we can evaluate singularities of the type 1/(s − 1), and this method is easily adapted to the singularity 1/(s − 1)² in the above theorem.
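The conclusion ψ₂(x) = x log(x) + o(x log(x)) can also be observed numerically, though convergence is slow. The sketch below (my own, with a smallest-prime-factor sieve; not part of the original argument) computes ψ₂(10⁵)/(10⁵ log 10⁵):

```python
from math import log

def psi2(x):
    """psi_2(x) = sum_{n <= x} Lambda(n)^2, using Lambda(p^k) = log p."""
    spf = list(range(x + 1))            # smallest prime factor sieve
    for p in range(2, int(x ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, x + 1, p):
                if spf[m] == m:
                    spf[m] = p
    total = 0.0
    for n in range(2, x + 1):
        p, m = spf[n], n
        while m % p == 0:
            m //= p
        if m == 1:                      # n is a prime power p^k
            total += log(p) ** 2
    return total

x = 10 ** 5
ratio = psi2(x) / (x * log(x))
print(ratio)   # tends to 1 slowly as x grows
```

At this modest range the ratio is still noticeably below 1, reflecting the secondary terms hidden in the o(x log x).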
2.3 The Hardy-Ramanujan Theorem
The functions ω(n) and Ω(n), which denote the number of distinct prime divisors of n and the number of prime factors of n counted with multiplicity (equivalently, the number of prime powers dividing n), respectively, were first studied by Hardy and Ramanujan in their 1917 paper [11]. Not only are they interesting additive
functions in their own right, but they also motivated the study of probabilistic number
theory and the Alladi-Erdős functions (which will be discussed in the next chapter).
Hardy and Ramanujan demonstrated that

\[
\sum_{n \le x} \omega(n) = x \log\log(x) + c_1 x + O\!\left( \frac{x}{\log(x)} \right) \tag{2–27}
\]

and

\[
\sum_{n \le x} \Omega(n) = x \log\log(x) + c_2 x + O\!\left( \frac{x}{\log(x)} \right), \tag{2–28}
\]

where c₁ = 0.261497... and c₂ = 1.0345653... [29]. All that is needed to prove (2-27) and (2-28) is the fact that

\[
\sum_{p \le x} \frac{1}{p} = \log\log(x) + c_1 + O\!\left( \frac{1}{\log(x)} \right); \tag{2–29}
\]
however, we note that Delange’s Theorem yields (2-27) and (2-28) analytically. With this
method of proof in mind, we will later introduce a companion function to Ω(n) and ω(n)
which, when taken with a well-known result of Hardy and Ramanujan, will give the result
with little effort.
Before proceeding to the Hardy-Ramanujan Theorem, we require one additional result from [29]:

\[
\sum_{n \le x} (\Omega(n) - \omega(n)) \sim C x
\]

for the well-known constant

\[
C = \sum_p \frac{1}{p(p-1)} \approx 0.773156.
\]

Theorem 2.3.1.

\[
\sum_{n \le x} \omega(n) = x \log\log(x) + c_1 x + O\!\left( \frac{x}{\log(x)} \right).
\]
Proof: The classical approach to this theorem is given by the simple observation that:

\[
\sum_{n \le x} \omega(n) = \sum_{n \le x} \sum_{p \mid n} 1 = \sum_{pm \le x} 1 = \sum_{p \le x} \left[ \frac{x}{p} \right] = x \sum_{p \le x} \frac{1}{p} + O(\pi(x)),
\]

and so, by equation (2-29) and Tchebyschev's estimate (or the Prime Number Theorem),

\[
\sum_{n \le x} \omega(n) = x \log\log(x) + c_1 x + O\!\left( \frac{x}{\log(x)} \right).
\]

Therefore, we obtain equation (2-27), and by the comments preceding this theorem we obtain equation (2-28).
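A direct computation (my own sketch, not part of the proof) makes Theorem 2.3.1 tangible: summing ω(n) for n ≤ 10⁵ by an additive sieve and comparing against x log log(x) + c₁x with c₁ ≈ 0.26150:

```python
from math import log

def omega_counts(x):
    """omega[n] = number of distinct prime divisors of n, for 0 <= n <= x,
    built by an additive sieve: when p is reached with omega[p] == 0, p is prime."""
    omega = [0] * (x + 1)
    for p in range(2, x + 1):
        if omega[p] == 0:              # p is prime
            for m in range(p, x + 1, p):
                omega[m] += 1
    return omega

x = 10 ** 5
om = omega_counts(x)
total = sum(om)
prediction = x * (log(log(x)) + 0.26150)
print(total / prediction)   # close to 1; the gap is within the O(x/log x) term
```

The relative deviation at x = 10⁵ is on the order of 1/log(x), consistent with the error term in (2-27).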
Let πₖ(x) denote the number of integers n ≤ x such that ω(n) = k, and let Nₖ(x) denote the number of integers n ≤ x such that Ω(n) = k. By using induction on k, Hardy and Ramanujan succeeded in proving that there exists a constant C such that:

\[
\pi_k(x) = O\!\left( \frac{x (\log\log(x) + C)^{k-1}}{(k-1)! \log(x)} \right) \tag{2–30}
\]

holds uniformly in k. The essence of their proof is the observation that

\[
k \, \pi_{k+1}(x) \le \sum_{p \le \sqrt{x}} \pi_k\!\left( \frac{x}{p} \right),
\]

from which the theorem easily follows by using the Prime Number Theorem as the base case k = 1 and applying induction on k. For the case of Nₖ(x), Hardy and Ramanujan demonstrated that there exists a constant D such that:

\[
N_k(x) = O\!\left( \frac{x (\log\log(x) + D)^{k-1}}{(k-1)! \log(x)} \right) \tag{2–31}
\]

holds uniformly for k ≤ (2 − δ) log log(x), δ > 0. The additional restriction on k in equation (2-31) follows from some subtle aspects concerning the function Ω(n) which will be discussed below.
Hardy and Ramanujan made many more observations in their seminal paper, one of which was to introduce the concept of the normal order of an arithmetic function. An average order is a very natural estimate to seek, for given an arithmetic function f(n) the average order is simply (1/x) ∑_{n≤x} f(n). Furthermore, by summing the function f(n) over n ≤ x we are in effect smoothing out the irregularities of the function which will almost certainly exist; hence, the average order may be influenced by sporadically occurring irregular values. These irregular values can sometimes give misleading results, and in this section we will give a famous example of one such situation. However, Hardy and Ramanujan developed a different type of statistic called the normal order, which is in many ways more natural and informative than the average order, and which we now define.
Definition 2.3.2. We say that an arithmetic function f (n) has normal order g(n) if g(n) is
a monotone arithmetic function such that for any ϵ > 0
|f (n) − g(n)| ≤ ϵ|g(n)|
on a set of integers n having natural density 1.
The observations of Hardy and Ramanujan allow us to obtain interesting probabilistic interpretations of various arithmetic functions. Recall that a random variable X is Poisson with parameter λ > 0 if

\[
P(X = j) = \frac{\lambda^j}{j!} e^{-\lambda}
\]

for j = 0, 1, 2, .... The expected value of a Poisson random variable is given by E[X] = λ; see [21].
Equation (2-30) of Hardy and Ramanujan demonstrates that πₖ(x) may be interpreted as being bounded by a Poisson random variable with parameter λ = log log(x); furthermore, this same equation shows that πₖ(x) is small compared to x when |k − log log(x)| is large. This is because for every δ > 0

\[
\lim_{t \to \infty} e^{-t} \sum_{|k - t| > t^{1/2 + \delta}} \frac{t^k}{k!} = 0
\]

(see [21]), so that (2-30) demonstrates that

\[
\lim_{x \to \infty} \frac{1}{x} \sum_{|k - \log\log(x)| > (\log\log(x))^{1/2 + \delta}} \pi_k(x) = 0, \tag{2–32}
\]

and, hence, that the inequality

\[
|\omega(n) - \log\log(n)| < (\log\log(n))^{1/2 + \delta}
\]

holds on a set of integers with density 1. These observations led Hardy and Ramanujan
to deduce that ω(n) not only has average order log log(n), but also has normal order
log log(n).
In 1934 P. Turán succeeded in proving a stronger result concerning ω(n) than that
derived by Hardy and Ramanujan. This result, known as Turán’s inequality, has further
probabilistic implications for the Hardy-Ramanujan functions and is the subject of the
next theorem.
Theorem 2.3.3. (Turán)

\[
\sum_{n \le x} (\omega(n) - \log\log(x))^2 = O(x \log\log(x)).
\]

Proof: Consider:

\[
\sum_{n \le x} (\omega(n) - \log\log(x))^2 = \sum_{n \le x} \omega^2(n) - 2\log\log(x) \sum_{n \le x} \omega(n) + x(\log\log(x))^2 + O(x \log\log(x)) = \sum_{n \le x} \omega^2(n) - x(\log\log(x))^2 + O(x \log\log(x)).
\]

Now,

\[
\sum_{n \le x} \omega^2(n) = \sum_{n \le x} \left( \sum_{p \mid n} 1 \right)^2 = \sum_{n \le x} \left( \sum_{pq \mid n} 1 + \sum_{p \mid n} 1 \right),
\]

where it is understood that p ≠ q are distinct primes. It is easily seen that

\[
\sum_{n \le x} \sum_{p \mid n} 1 = \sum_{n \le x} \omega(n) = O(x \log\log(x)),
\]

and that

\[
\sum_{pq \le x} \sum_{\substack{n \le x \\ n \equiv 0 \ (pq)}} 1 \le \sum_{pq \le x} \frac{x}{pq},
\]

where the outer sum is over all pq ≤ x and the inner sum is over all n ≤ x divisible by pq. Now, as

\[
\sum_{pq \le x} \frac{x}{pq} \le x \left( \sum_{p \le x} \frac{1}{p} \right) \left( \sum_{q \le x} \frac{1}{q} \right) \le x (\log\log(x))^2 + O(x \log\log(x)),
\]

it follows that

\[
\sum_{n \le x} \omega^2(n) \le x (\log\log(x))^2 + O(x \log\log(x)).
\]

We may therefore conclude that:

\[
\sum_{n \le x} (\omega(n) - \log\log(x))^2 = O(x \log\log(x)),
\]

which is Turán's result.
We may further note the following: if T > 0 is large and N_T(x) denotes the number of integers n ≤ x such that |ω(n) − log log(x)| > T √(log log(x)), that is,

\[
N_T(x) = \sum_{\substack{n \le x \\ |\omega(n) - \log\log(x)| > T \sqrt{\log\log(x)}}} 1,
\]

then

\[
T^2 \log\log(x) \, N_T(x) \le \sum_{\substack{n \le x \\ |\omega(n) - \log\log(x)| > T \sqrt{\log\log(x)}}} (\omega(n) - \log\log(x))^2 = O(x \log\log(x)).
\]

Therefore, we may conclude that

\[
N_T(x) = O\!\left( \frac{x}{T^2} \right),
\]

and in particular that N_T(x)/x → 0 as T → ∞, uniformly in x.
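Turán's inequality is easy to probe numerically. The sketch below (my own experiment, not from the text) computes (1/x)∑_{n≤x}(ω(n) − log log x)² and compares it with log log x; by Theorem 2.3.3 the ratio stays bounded:

```python
from math import log

def omega_counts(x):
    """Additive sieve for omega(n), n <= x."""
    omega = [0] * (x + 1)
    for p in range(2, x + 1):
        if omega[p] == 0:              # p is prime
            for m in range(p, x + 1, p):
                omega[m] += 1
    return omega

x = 10 ** 5
llx = log(log(x))
om = omega_counts(x)
variance_sum = sum((om[n] - llx) ** 2 for n in range(1, x + 1))
ratio = variance_sum / (x * llx)
print(ratio)   # stays bounded as x grows (Theorem 2.3.3)
```

By Chebyshev's inequality this bounded second moment is exactly what forces ω(n) to cluster near log log(n), as in the N_T(x) discussion above.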
We now offer an alternative proof of equations (2-27) and (2-28) via the following companion function:

Definition 2.3.4. Let

\[
w(n) = \sum_{d \mid n} \frac{\Lambda(d)}{\log(d)}; \quad \text{that is,} \quad w(n) = \sum_{p^{\alpha} \mid n} \frac{1}{\alpha}.
\]

With this definition it should be clear that ω(n) ≤ w(n) ≤ Ω(n) for all integers n ≥ 2. With these inequalities we can immediately deduce that:

\[
\sum_{n \le x} w(n) = x \log\log(x) + O(x).
\]
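A short sketch (my own helper names) of this companion function: from the factorization n = ∏ p^a, w(n) is a sum of harmonic numbers H_a, one per prime, which makes the sandwich ω(n) ≤ w(n) ≤ Ω(n) transparent:

```python
def factorize(n):
    """Return {p: alpha} for n = prod p^alpha, by trial division."""
    f, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            f[d] = f.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        f[n] = f.get(n, 0) + 1
    return f

def little_omega(n):
    return len(factorize(n))

def big_omega(n):
    return sum(factorize(n).values())

def w(n):
    """w(n) = sum over prime powers p^alpha dividing n of 1/alpha,
    i.e. a harmonic number H_alpha for each prime in the factorization."""
    return sum(sum(1.0 / a for a in range(1, alpha + 1))
               for alpha in factorize(n).values())

print(little_omega(12), w(12), big_omega(12))   # 12 = 2^2 * 3 gives 2, 2.5, 3
```

Since 1 ≤ H_a ≤ a, each prime contributes between 1 (its count in ω) and a (its count in Ω), proving the inequalities termwise.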
However, we can improve the O-term by noting that w(n) generates the following Dirichlet series:

\[
\zeta(s) \log \zeta(s) = \sum_{n=2}^{\infty} \frac{w(n)}{n^s}.
\]
Recall that in order to prove the Prime Number Theorem it was necessary to show that
ζ(s) ̸= 0 for s = 1 + it; hence, the function log ζ(s) has a sole singularity on the line
ℜ(s) = 1, at s = 1. This was necessary as Delange’s theorem could not be applied to
functions which do not satisfy conditions of holomorphy on some vertical line, and from
the properties of log ζ(s) the function ζ(s) log ζ(s) must be holomorphic at all points on
the line ℜ(s) = 1, save the singularity at s = 1. Of course, as was previously mentioned,
this result follows easily from equation (2-28), and is therefore not as deep a theorem
as the Prime Number Theorem. However, we wanted to see how the Hardy-Ramanujan
results fit within the context of Delange’s Theorem as we will provide a similar treatment
when addressing the Alladi-Erdős functions, which will be developed in the next chapter.
While the above observations are sufficient to apply Delange's Theorem to evaluate the sum ∑_{n≤x} w(n), we could actually prove much stronger results through more elaborate methods. This is because ζ(s) log ζ(s) may be analytically continued for all complex s ≠ 1 in the same zero-free region as ζ(s). However, for our purposes the gain is minor for the additional work that would be involved.
Theorem 2.3.5. There exists a constant c₃ such that

\[
\sum_{n \le x} w(n) = x \log\log(x) + c_3 x + O\!\left( \frac{x}{\log(x)} \right).
\]

Proof: By the above discussion ζ(s) log ζ(s) satisfies the conditions of Delange's theorem; moreover, ζ(s) log ζ(s) has a singularity of the type

\[
\frac{1}{s-1} \log\!\left( \frac{1}{s-1} \right) + \frac{c_3}{s-1} + c_4 \log\!\left( \frac{1}{s-1} \right) + \cdots
\]

for s near 1. Therefore, Delange's Theorem implies that:

\[
\sum_{n \le x} w(n) = x \log\log(x) + c_3 x + O\!\left( \frac{x}{\log(x)} \right),
\]

as was to be demonstrated.
In the above proof it should be noted that while Delange’s powerful theorem allows
us to easily derive our result, his theorem is in many respects invoked unnecessarily. In
particular, the first proof implies that (as was mentioned before) the only real analytic
result necessary is equation (2-29), which is far more easily derived than Delange’s
theorem.
It is a rather amazing fact that many deterministic arithmetic functions can be
interpreted in a probabilistic manner, much like how the prime number theorem can be
considered to be a statement about the arithmetic mean of Λ(n). Furthermore, many of
these probabilistic interpretations shed much more light on various arithmetic functions
than what can be derived analytically or elementarily. The motivation for Delange’s
powerful theorem was supplied by the work of Sathe and Selberg, whose theorems
offer further probabilistic interpretations for the functions πk (x) and Nk (x). Historically,
Edmund Landau derived the estimates

\[
\pi_k(x) \sim \frac{x (\log\log(x))^{k-1}}{(k-1)! \log(x)} \tag{2–33}
\]

and

\[
N_k(x) \sim \frac{x (\log\log(x))^{k-1}}{(k-1)! \log(x)} \tag{2–34}
\]
in 1909 [16], but he only proved these results for fixed k. Hardy and Ramanujan derived
their uniform estimates (2-30) and (2-31) in 1917 [11], yet it would take a further 36
years before Sathe proved uniform estimates in k which were comparable in accuracy
to those of Landau. Sathe no doubt found his inspiration in the papers of Hardy and
Ramanujan and proceeded to prove his results inductively, although it should be
noted that Sathe’s methods are very complicated and his results are difficult to derive.
However, by using Selberg’s argument one may derive in a more classical and natural
fashion the results of Sathe.
To arrive at these results, consider the functions:

\[
F(s, z) = \sum_{n=1}^{\infty} \frac{z^{\omega(n)}}{n^s} = \prod_p \left( 1 + \frac{z}{p^s - 1} \right) = \zeta(s)^z f(s, z) \tag{2–35}
\]

and

\[
G(s, z) = \sum_{n=1}^{\infty} \frac{z^{\Omega(n)}}{n^s} = \prod_p \left( 1 - \frac{z}{p^s} \right)^{-1} = \zeta(s)^z g(s, z), \tag{2–36}
\]

where the function in (2-35) represents an analytic function of s and z for ℜ(s) > 1, and (2-36) represents an analytic function of s and z for ℜ(s) > 1, |z| < 2. When factored in terms of ζ(s)^z, f(s, z) becomes analytic for all z where ℜ(s) > 1/2, and g(s, z) becomes analytic when ℜ(s) > 1/2 and |z| < 2. The additional restriction that |z| < 2 arises from the pole which the prime p = 2 contributes to (2-36).
In order to deduce an asymptotic formula for πk (x), Selberg first applies Perron’s
formula to (2-35)
$$S(z,x) = \sum_{n\le x} z^{\omega(n)} = \frac{1}{2\pi i}\int_{k-i\infty}^{k+i\infty}\zeta(s)^z f(s,z)\,\frac{x^s}{s}\,ds$$
where k > 1, in order to get an estimate of the partial sums of the coefficients of that
series; however, this requires one to deform the straight line contour in Perron’s formula
into a zero-free region for ζ(s) in the strip 1/2 < ℜ(s) < 1. Furthermore, in general ζ(s)z
may contain branch point singularities, which will necessitate taking a Hankel contour
around the singularity at s = 1. After deforming the line of integration, one will quickly
notice that (as in Theorem 3.1.8) the majority of the contribution from the integral comes
from this branch point singularity; therefore, after Selberg demonstrated that the other
contours contribute little to the estimate of S(z, x), he obtained the result that
$$S(z,x) = \sum_{n\le x} z^{\omega(n)} = x\log^{z-1}(x)\,f(1,z)\left(1 + O\left(\frac{1}{\log(x)}\right)\right)$$
which is uniform if z is bounded. In [26] Selberg then solves for πk (x) by applying
Cauchy’s theorem to the sum S(z, x):
$$\pi_k(x) = \frac{1}{2\pi i}\int_{|z|=r} S(z,x)\,\frac{dz}{z^{k+1}}$$
and then optimizing the radius r of the contour. The optimal value turns out to be $r = \frac{k}{\log\log(x)}$, and utilizing this result we may obtain an improvement on Landau's asymptotic
$$\pi_k(x) \sim f\left(1, \frac{k}{\log\log(x)}\right)\frac{x(\log\log(x))^{k-1}}{(k-1)!\,\log(x)}$$
which is valid uniformly for k ≤ M log log(x), M an arbitrarily large positive constant.
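These counting asymptotics can be illustrated numerically. The Python sketch below (illustrative only; it compares against Landau's leading term, not the refined factor f(1, k/log log x)) tabulates π_k(x) = #{n ≤ x : ω(n) = k} for small k:

```python
import math

X = 10**6
omega = [0] * (X + 1)
for p in range(2, X + 1):
    if omega[p] == 0:  # p is prime
        for m in range(p, X + 1, p):
            omega[m] += 1

# pi_k(X) = number of n <= X with exactly k distinct prime factors
counts = {}
for n in range(2, X + 1):
    counts[omega[n]] = counts.get(omega[n], 0) + 1

def landau(k, x):
    # Landau's leading term x (log log x)^{k-1} / ((k-1)! log x)
    L = math.log(math.log(x))
    return x * L**(k - 1) / (math.factorial(k - 1) * math.log(x))

for k in (1, 2, 3):
    print(k, counts[k], round(landau(k, X)))
```

Since log log x grows so slowly, the ratio to the leading term is still far from 1 at x = 10⁶; this slack is precisely what the factor f(1, k/log log x) accounts for.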
We now apply the Sathe-Selberg technique to derive an asymptotic for Nk (x)
which holds uniformly (in some range of k). By applying Perron’s formula in a manner
analogous to the derivation of S(z, x) (i.e. by deforming the path of integration and
evaluating the Hankel contour around the singularity at s = 1) to (2-36) we obtain the
sum
$$T(z,x) = \sum_{n\le x} z^{\Omega(n)} = x\log^{z-1}(x)\,g(1,z)\left(1 + O\left(\frac{1}{\log(x)}\right)\right)$$
which, by utilizing Cauchy’s theorem in a similar manner to the method used to derive
πk (x), gives
$$N_k(x) = \frac{1}{2\pi i}\int_{|z|=r} T(z,x)\,\frac{dz}{z^{k+1}}$$
with the optimal value of the radius $r = \frac{k}{\log\log(x)}$. However, as |z| < 2 this forces k ≤ (2 − δ) log log(x), for any δ > 0. Evaluating this contour integral yields the asymptotic:
$$N_k(x) \sim \frac{x(\log\log(x))^{k-1}}{(k-1)!\,\log(x)},$$
which holds uniformly in k ≤ (2 − δ) log log(x).
In order to study functions of this type in general, Selberg in [26] then derives the
following deep theorem:
Theorem 2.3.6. (Selberg) Let
$$B(s,z) = \sum_{n=1}^{\infty}\frac{b(z,n)}{n^s}$$
for ℜ(s) > 1/2, and let
$$\sum_{n=1}^{\infty}\frac{|b(z,n)|}{n}\,(\log(2n))^{B+3}$$
be uniformly bounded for |z| ≤ B. Furthermore, let
$$B(s,z)\,\zeta(s)^z = \sum_{n=1}^{\infty}\frac{a(z,n)}{n^s}$$
for ℜ(s) > 1. Then, we have
$$A(z,x) = \sum_{n\le x} a(z,n) = \frac{B(1,z)}{\Gamma(z)}\,x\log^{z-1}(x) + O(x\log^{z-2}(x)),$$
uniformly for |z| ≤ B, x ≥ 2.
With this theorem in hand we may not only obtain the above asymptotic values, but
also those for a much wider class of functions.
As was mentioned previously, the average order of a function can sometimes lead to misleading conclusions. To illustrate this phenomenon consider τ(n), the number of divisors of an integer n. It is a well-known fact, first demonstrated by Dirichlet, that
$$\sum_{n\le x}\tau(n) = x\log(x) + (2\gamma - 1)x + O(\sqrt{x}),$$
and we may conclude that the arithmetic mean of τ(n) is log(n). It is therefore tempting to assume that τ(n) will be about the size of log(n) quite often; however, this is completely false. Given the canonical decomposition of an integer $n = \prod_{i=1}^{\omega(n)} p_i^{\alpha_i}$ it is easily seen that
$$2^{\omega(n)} \le \tau(n) = \prod_{i=1}^{\omega(n)}(\alpha_i + 1) \le \prod_{i=1}^{\omega(n)} 2^{\alpha_i} = 2^{\Omega(n)},$$ (2–37)
and from the Hardy-Ramanujan results both ω(n) and Ω(n) have average and normal order log log(n), hence (2-37) implies
$$\tau(n) = (\log(n))^{\log(2)+o(1)}$$
on some subset of the integers having natural density 1. Thus τ(n) is more often than not equal to $(\log(n))^{\log(2)+o(1)}$, and is therefore significantly less than its arithmetic mean on a set of density 1 (in particular τ(n) cannot have its average order equal to its normal order). This fact can now be explained, but it requires the power of the Sathe-Selberg techniques and is therefore much deeper than the earlier results concerning τ(n) described above. It will be shown presently that the sum $\sum_{n\le x}\tau(n)$ is dominated by a small number of integers with many divisors. As
$$\sum_{n\le x}\tau(n) \sim \sum_{n\le x} 2^{\omega(n)+o(1)} \sim \sum_{k\ge 1} 2^k\,\pi_k(x) \sim \sum_{k=1}^{\infty}\frac{2^k\,x(\log\log(x))^{k-1}}{(k-1)!\,\log(x)}$$
and this is an exponential series in 2 log log(x). Thus
$$\sum_{|k-2\log\log(x)| < (\log\log(x))^{1/2+\delta}}\frac{x(2\log\log(x))^{k-1}}{(k-1)!\,\log(x)} \asymp x\log(x).$$ (2–38)
The main contribution in (2-38) comes from the terms for which $|k - 2\log\log(x)| < (\log\log(x))^{1/2+\delta}$. For such values of k,
$$\tau(n) = 2^{k(1+o(1))} = 2^{2\log\log(x)(1+o(1))} = (\log(x))^{2\log(2)+o(1)}$$
and as 2 log(2) > 1, these values are larger than the average order of log(n) for τ (n).
Another way to interpret this is that the average order of τ(n) is thrown off by the small set of integers with the property that ω(n) ∼ 2 log log(n); in light of the fact that the average and normal orders of ω(n) are log log(n), such integers are quite rare. This explains why τ(n) can have an average order which is larger than its normal order.
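Both the inequality (2-37) and the gap between the average and normal orders of τ(n) are visible at small scale. The sketch below (an illustration with a modest cutoff, not a proof) sieves τ(n), ω(n), and Ω(n), then compares the mean of τ(n) with its typical size (log n)^{log 2}:

```python
import math

X = 200000
tau = [0] * (X + 1)    # number of divisors
omega = [0] * (X + 1)  # distinct prime factors
Omega = [0] * (X + 1)  # prime factors counted with multiplicity
for d in range(1, X + 1):
    for m in range(d, X + 1, d):
        tau[m] += 1
for p in range(2, X + 1):
    if omega[p] == 0:  # p is prime
        q = p
        while q <= X:
            for m in range(q, X + 1, q):
                Omega[m] += 1  # one for each prime power p^j dividing m
            q *= p
        for m in range(p, X + 1, p):
            omega[m] += 1

# inequality (2-37): 2^omega(n) <= tau(n) <= 2^Omega(n)
ok = all(2**omega[n] <= tau[n] <= 2**Omega[n] for n in range(2, X + 1))

mean_tau = sum(tau[2:]) / (X - 1)       # about log X, the average order
typical_tau = math.log(X)**math.log(2)  # about (log X)^{log 2}, the normal order
print(ok, mean_tau, typical_tau)
```

The mean is already roughly twice the typical value at this range, as the discussion above predicts.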
To close this section we give one more interesting probabilistic interpretation of the
function ω(n). The Erdős-Kac Theorem states that for all λ ∈ R
$$\lim_{x\to\infty}\frac{1}{x}\,\#\left\{n\le x : \omega(n)\le\log\log(x) + \lambda\sqrt{\log\log(x)}\right\} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\lambda} e^{-u^2/2}\,du.$$
It follows that the function ω(n) behaves as a normally distributed random variable, with
mean and variance log log(n).
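A simulation makes this Gaussian behavior plausible, though the convergence is extremely slow (the scale in the theorem is √(log log x)), so at x = 10⁶ the sketch below can only show qualitative agreement with the normal distribution function:

```python
import math

X = 10**6
omega = [0] * (X + 1)
for p in range(2, X + 1):
    if omega[p] == 0:  # p is prime
        for m in range(p, X + 1, p):
            omega[m] += 1

mu = math.log(math.log(X))

def empirical_cdf(lam):
    # fraction of n <= X with omega(n) <= log log X + lam * sqrt(log log X)
    bound = mu + lam * math.sqrt(mu)
    return sum(1 for n in range(2, X + 1) if omega[n] <= bound) / X

# compare with the normal values 0.16, 0.5, 0.84 at lam = -1, 0, 1
print(empirical_cdf(-1.0), empirical_cdf(0.0), empirical_cdf(1.0))
```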
2.4 Remarks
It should be noted that the prime number theorem is a rather deep theorem, and
while elementary proofs of this theorem exist (that is, proofs which do not require
the use of complex function theory), they are hardly simple. The brevity of the above
arguments proving the Prime Number Theorem (and in particular the very short proof of
the Hardy-Ramanujan Theorem) follows from the fact that most of the particulars of the
theorem are subsumed in Delange’s Theorem (Theorem 2.2.1), which is itself a highly
nontrivial result relying upon the theory of analytic functions. The above proofs do not
use any new techniques from the theory of complex variables; however, I have never
seen any published results on the asymptotic value of the summation of the powers of
the von Mangoldt function:
$$\psi_k(x) = \sum_{n\le x}\Lambda^k(n).$$
The above approach for ψ2 (x) generalizes easily to ψk (x) for k ∈ Z, k > 2, and
only requires one to differentiate log ζ(s) k-times, and then apply Delange’s theorem. It
is easily deduced that any asymptotic result for ψk (x) is equivalent to the prime number
theorem; as asymptotic results concerning π(x) follow by applying partial summation to
the function ψk (x) in the manner outlined above, yielding
$$\psi_k(x) \sim x\log^{k-1}(x).$$
Furthermore, the formula $\psi_2(x) = x\log(x) + x + O\left(\frac{x}{\log(x)}\right)$ allows us to analyze the second moment (the variance) of the function Λ(n). While Λ(n) = 0 unless n is
a prime power, we should expect its variance to be quite large. This fact is in many
ways surprising as the Prime Number Theorem assures us that the mean of Λ(n) is
$$\frac{\psi(x)}{x} = 1 + O\left(\frac{1}{\log(x)}\right),$$
which is quite small. However, by direct calculation this variance is given by
$$\frac{1}{x}\sum_{n\le x}\left(\Lambda(n) - 1 + O\left(\frac{1}{\log(n)}\right)\right)^2 = \frac{1}{x}\sum_{n\le x}\left(\Lambda^2(n) - 2\Lambda(n) + 1 + O\left(\frac{\Lambda(n)}{\log(n)}\right)\right)$$
$$= \frac{1}{x}\left(\psi_2(x) - 2\psi(x) + x + O(\pi(x))\right) = \frac{1}{x}\left(x\log(x) + x - 2x + x + O\left(\frac{x}{\log(x)}\right)\right)$$
$$= \log(x) + O\left(\frac{1}{\log(x)}\right) = \log(x) + o(1).$$
This is far in excess of the average order of Λ(n), which is in many ways to be expected.
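This leading-order behavior of the variance can be checked directly. The sketch below (illustrative; lower-order constants are not tracked, so only the order of magnitude is compared) computes Λ(n) from a smallest-prime-factor sieve:

```python
import math

X = 10**5
# smallest-prime-factor sieve
spf = list(range(X + 1))
for p in range(2, int(X**0.5) + 1):
    if spf[p] == p:
        for m in range(p * p, X + 1, p):
            if spf[m] == m:
                spf[m] = p

def von_mangoldt(n):
    # log p if n is a power of the prime p, and 0 otherwise
    p = spf[n]
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0

lam = [0.0, 0.0] + [von_mangoldt(n) for n in range(2, X + 1)]
mean = sum(lam) / X                            # psi(X)/X, close to 1
variance = sum((v - mean)**2 for v in lam) / X
print(mean, variance, math.log(X))             # variance is of order log X
```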
In closing this section we make several remarks on the still unproven Riemann hypothesis, which would have major consequences for the estimate of π(x). As was mentioned earlier, the error term in the Prime Number Theorem depends on the elusive zero-free region of ζ(s), and Riemann hypothesized that if ζ(ρ) = 0 and 0 < ℜ(ρ) < 1 then ℜ(ρ) = 1/2; this conjecture (if true) would allow us to improve our asymptotic estimate to
$$\psi(x) = x + O(x^{1/2}\log^2(x))$$
and
$$\pi(x) = \mathrm{li}(x) + O(x^{1/2}\log(x))$$
(for proofs of these results and further consequences of Riemann’s Hypothesis see
[7] and [29]). It should be noted that while there is a great deal of evidence to support
Riemann’s conjecture, this conjecture appears to be far beyond the scope of our
present understanding of mathematics. Despite efforts by some of the most talented
mathematicians in history, the Riemann Hypothesis has eluded all attempts to either
prove or disprove its validity.
CHAPTER 3
ARITHMETIC FUNCTIONS INVOLVING THE LARGEST PRIME FACTOR
3.1 The Alladi-Erdős Functions
In this section we will introduce various arithmetic functions and identities which
will be of use later in applications to numerical factorization. To begin, we define the
following functions:
Definition 3.1.1. Let $A^*(n) = \sum_{p|n} p$ be the sum of the distinct prime factors of n, and let $A(n) = \sum_{p^\alpha\|n}\alpha p$ be the sum of the prime factors of n, weighted according to multiplicity (note that $p^\alpha\|n$ means that $p^\alpha|n$ but $p^{\alpha+1}$ does not divide n). We will call these two functions the first and second Alladi-Erdős functions, respectively. Furthermore, we now add a companion function to the two functions just introduced, namely, we define
$$A'(n) = \sum_{p^\alpha|n}\frac{p^\alpha}{\alpha}.$$
In their 1976 paper [2], Alladi and Erdős introduced the functions A(n) and A∗ (n), as
well as demonstrating some of their basic properties. We have introduced the function
A′ (n) as it generates a Dirichlet Series which is very convenient to work with in light
of Delange's Theorem. Firstly, the functions A(n), A∗(n), and A′(n) are all additive and $A^*(n)\le A'(n)$; indeed, as $A'(p^2) = p + \frac{p^2}{2}$ and $A^*(p^2) = p$ we see that $A^*(p^2)\le A'(p^2)$, and it follows by a simple inductive argument that $A^*(n)\le A'(n)$ for all values of $n\in\mathbb{Z}^+$. Other interesting properties follow by considering the asymptotic values of the functions $f(x) = \sum_{n\le x}A(n)$, $f^*(x) = \sum_{n\le x}A^*(n)$, and $f'(x) = \sum_{n\le x}A'(n)$, all of which are equal to
$$\frac{\pi^2 x^2}{12\log(x)} + O\left(\frac{x^2}{\log^2(x)}\right).$$ (3–1)
This asymptotic was also established for the first time by Alladi and Erdős; however, we will improve their asymptotic value to
$$\zeta(2)\int_2^x\frac{t}{\log(t)}\,dt + O(x^2 e^{-C\sqrt{\log(x)}}),$$ (3–2)
(where ζ(2) = π²/6) as well as showing how Riemann's hypothesis would allow us to further improve the error term.
We will now analytically derive the asymptotic of f ′ (x) which can be approached
more easily through analysis than f (x) or f ∗ (x). We begin with a simple lemma:
Lemma 3.1.2. $f'(x) - f^*(x) = O(x^{3/2})$.
Proof: Consider the difference f′(x) − f∗(x), which is easily seen to be equal to:
$$\sum_{n\le x}(A'(n) - A^*(n)) = \sum_{p^2\le x}\frac{p^2}{2}\left[\frac{x}{p^2}\right] + \sum_{p^3\le x}\frac{p^3}{3}\left[\frac{x}{p^3}\right] + \dots + \sum_{p^\alpha\le x}\frac{p^\alpha}{\alpha}\left[\frac{x}{p^\alpha}\right]$$
where $p^\alpha\|n$. Clearly this value is majorized by:
$$\sum_{n\le x}(A'(n) - A^*(n)) \le \sum_{p^2\le x}\frac{x}{2} + \dots + \sum_{p^\alpha\le x}\frac{x}{\alpha} = x\left(\sum_{p\le x^{1/2}}\frac{1}{2} + \dots + \sum_{p\le x^{1/\alpha}}\frac{1}{\alpha}\right)$$
$$= x\left(\frac{x^{1/2}}{\log(x)} + \frac{x^{1/3}}{\log(x)} + \dots + \frac{x^{1/\alpha}}{\log(x)} + \dots\right) \le x\left(\frac{x^{1/2}}{\log(x)} + O\left(\frac{x^{1/2}}{\log^2(x)}\right)\right) = O(x^{3/2})$$
where we have used the fact that $\pi(x) = \frac{x}{\log(x)} + O\left(\frac{x}{\log^2(x)}\right)$, and the fact that $\sum_{n\le\alpha}1 = \alpha \le \frac{\log(x)}{\log(2)} = O(\log(x))$. Thus we have obtained the result of the lemma.
It is interesting to contrast the difference f ′ (x) − f ∗ (x) from the above lemma with
the difference f (x) − f ∗ (x), which is much smaller. It was first proved by Alladi and Erdős
in [2], that
Lemma 3.1.3. (Alladi-Erdős) f (x) − f ∗ (x) = x log log(x) + O(x)
Proof: It is easily seen that the difference f(x) − f∗(x) is given by
$$\sum_{n\le x}(A(n) - A^*(n)) = \sum_{p^2\le x} p\left[\frac{x}{p^2}\right] + \sum_{p^3\le x} p\left[\frac{x}{p^3}\right] + \dots,$$
furthermore
$$\sum_{p^2\le x} p\left[\frac{x}{p^2}\right] = \sum_{p\le\sqrt{x}}\frac{x}{p} + O\left(\sum_{p\le\sqrt{x}} p\right) = x\log\log(x) + O(x)$$
and
$$\sum_{p^i\le x} p\left[\frac{x}{p^i}\right] = x\sum_{p^i\le x}\frac{1}{p^{i-1}} + O\left(\sum_{p\le x^{1/i}} p\right).$$
Therefore, as
$$\sum_{i\ge 3}\sum_{p^i\le x}\frac{x}{p^{i-1}} = O(x),$$
we may conclude that
$$\sum_{n\le x}(A(n) - A^*(n)) = x\log\log(x) + O(x);$$ (3–3)
which is the result of Alladi and Erdős.
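Lemma 3.1.3 is also easy to observe numerically. The following sketch (a rough check at a modest cutoff; the O(x) term is still a visible fraction of the total here) sieves A(n) and A∗(n) directly:

```python
import math

X = 300000
A = [0] * (X + 1)      # sum of prime factors with multiplicity
Astar = [0] * (X + 1)  # sum of distinct prime factors
marked = [False] * (X + 1)
for p in range(2, X + 1):
    if not marked[p]:  # p is prime
        for m in range(p, X + 1, p):
            marked[m] = True
            Astar[m] += p
        q = p
        while q <= X:
            for m in range(q, X + 1, q):
                A[m] += p  # p^a || n contributes a*p in total
            q *= p

diff = sum(A) - sum(Astar)          # f(X) - f*(X)
predicted = X * math.log(math.log(X))
print(diff, round(predicted))       # equal up to the O(x) error
```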
Given the results of the preceding lemmas, the difference between any two of the arithmetic functions f(x), f′(x), and f∗(x) will not exceed $O(x^{3/2}) = o(x^2/\log(x))$. It will be shown later in this chapter that f′(x) is asymptotically equivalent to $\frac{\pi^2 x^2}{12\log(x)}$, implying that the three functions must have the same asymptotic values. It is also interesting to compare how far f′(x) and f(x) each exceed f∗(x). While for our purposes this discrepancy will not be important, Lemma 3.1.3 shows that f(x) − f∗(x) = x log log(x) + O(x), which is a far smaller value than f′(x) − f∗(x) = O(x^{3/2}). This is most easily explained by the fact that f′(x) essentially counts prime powers, whereas the summations of the Alladi-Erdős functions do not. Therefore, it is interesting (and perhaps somewhat counterintuitive) that each
function has the same dominant term for their respective asymptotic values. This is best
explained as saying that the contribution from the primes p which divide an integer more
than once (i.e. p α |n, where α > 1) is quite small compared to the contribution from the
largest prime divisor. To make this discussion more precise, let P1 (n) denote the largest
prime divisor of an integer n. In [2] Alladi and Erdős showed that
$$\sum_{n\le x} P_1(n) = \frac{\zeta(2)}{2}\,\frac{x^2}{\log(x)} + O\left(\frac{x^2}{\log^2(x)}\right),$$ (3–4)
and hence $\sum_{n\le x}P_1(n)$ has the same asymptotic as f′(x). Therefore, the majority of the contribution to the sum $\sum_{n\le x}A(n)$ comes from the largest prime factors of the integers
n ≤ x. This is consistent with Hardy and Ramanujan's theorem that $\sum_{n\le x}(\Omega(n) - \omega(n)) = O(x)$, where ω(n) is the number of distinct prime divisors of an integer n and Ω(n) is the number of prime powers dividing an integer n; furthermore, from equations (2-27) and (2-28) both $\sum_{n\le x}\omega(n)$ and $\sum_{n\le x}\Omega(n)$ are asymptotically x log log(x) + O(x). That ω(n) and Ω(n) are asymptotically very close allows us to surmise (albeit heuristically) that the prime factors of most integers occur square-free. Theorem 3.1.5 below gives an interesting result due to Erdős which supplies further evidence of how the largest prime factor of n is quite dominant and dictates the behavior of a large class of arithmetic functions. These observations further validate the fact that $\sum_{n\le x}P_1(n)$ is asymptotically the same order of magnitude as f(x): as most integers will only be divisible by a prime p once, the higher powers of p will be rare; therefore, they should contribute little to the value of f(x), as they do.
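This dominance is striking even at small scales. The sketch below (illustrative only) sieves P₁(n) and A(n) and computes the fraction of f(x) contributed by the largest prime factors:

```python
X = 300000
P1 = [0] * (X + 1)
for p in range(2, X + 1):
    if P1[p] == 0:  # p is prime
        for m in range(p, X + 1, p):
            P1[m] = p  # larger primes overwrite later, leaving the largest
A = [0] * (X + 1)
for p in range(2, X + 1):
    if P1[p] == p:  # p is prime
        q = p
        while q <= X:
            for m in range(q, X + 1, q):
                A[m] += p
            q *= p

ratio = sum(P1) / sum(A)  # share of f(X) coming from largest prime factors
print(ratio)              # already very close to 1
```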
This last observation raises a very interesting and natural question: let Pm(n) be the mth largest prime factor of an integer n, 1 ≤ m ≤ ω(n) (that is, $P_{m+1}(n) < P_m(n)$ for 1 ≤ m ≤ ω(n) − 1). We have already noted that $\sum_{n\le x}P_1(n)$ accounts for the dominant term in the functions f(x), f∗(x), and f′(x); consequently, it is only natural that we ask whether the sum $\sum_{n\le x}P_2(n)$ accounts for the dominant term in the modified Alladi-Erdös functions:
$$\sum_{n\le x}(A(n) - P_1(n)), \quad \sum_{n\le x}(A'(n) - P_1(n)), \quad \text{and} \quad \sum_{n\le x}(A^*(n) - P_1(n))?$$
This question was posed to Paul Erdös by Krishnaswami Alladi during their first
collaborative encounter, and was proved not long after in a much more general form in
[2]. Their solution to the problem is supplied by equation (1-13), however, we will now
state it formally as a theorem:
Theorem 3.1.4. (Alladi-Erdős) For all integers m ≥ 1, we have:
$$\sum_{n\le x}\left(A(n) - P_1(n) - \dots - P_{m-1}(n)\right) \sim \sum_{n\le x} P_m(n) \sim \frac{k_m\,x^{1+(1/m)}}{\log^m(x)},$$
where $k_m > 0$ is a constant depending only on m, and is a rational multiple of ζ(1 + 1/m), where ζ(s) is the Riemann Zeta function.
Thus, for m ≥ 2 the sums $\sum_{n\le x}P_m(n) = O\left(\frac{x^{1+(1/m)}}{\log^m(x)}\right)$, while asymptotically bounded by f(x), f′(x), and f∗(x), do grow appreciably. Note that for the case m = 1, the above theorem implies that
$$\sum_{n\le x}P_1(n) = r\,\frac{\pi^2 x^2}{6\log(x)} + o\left(\frac{x^2}{\log(x)}\right),$$
where $r\in\mathbb{Q}$, r > 0. In fact, for m = 1 we may take r = 1/2 so that
$$\sum_{n\le x}P_1(n) = \frac{\pi^2 x^2}{12\log(x)} + o\left(\frac{x^2}{\log(x)}\right);$$
however, at present we cannot prove that r = 1/2 by simply appealing to the above theorem. Nevertheless, these observations further validate the previous comments concerning the sum $\sum_{n\le x}P_1(n)$.
To further emphasize the dominance of the largest prime factor of an integer n, we will prove an interesting result first derived by P. Erdös. Recall that if $p_n$ denotes the nth prime number, then Tchebyschev's estimate states that $p_n = O(n\log(n))$; also, a weak form of the well-known theorem due to F. Mertens (1874) states that
$$\prod_{p\le x}\left(1 - \frac{1}{p}\right)^{-1} = O(\log(x)).$$ (3–5)
All of these results can be made more precise, and can be found in [29].
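For concreteness, the product in (3-5) can be computed directly. The sketch below compares it with log(x); the ratio approaches e^γ ≈ 1.781 by Mertens' third theorem, a sharper known form of the bound quoted above:

```python
import math

X = 10**6
is_comp = bytearray(X + 1)
primes = []
for p in range(2, X + 1):
    if not is_comp[p]:
        primes.append(p)
        for m in range(p * p, X + 1, p):
            is_comp[m] = 1

prod = 1.0
for p in primes:
    prod *= 1.0 / (1.0 - 1.0 / p)  # prod_{p <= X} (1 - 1/p)^{-1}

ratio = prod / math.log(X)
print(prod, ratio)  # ratio near e^gamma = 1.7810...
```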
Our next theorem, due to Paul Erdős, gives an interesting (and in many ways
surprising) characterization of an arithmetic function in terms of its largest prime factor:
Theorem 3.1.5. (Erdős) Let f(n) > 0 be a non-decreasing arithmetic function. Then the sum
$$\sum_{n=1}^{\infty}\frac{1}{f(n)\,n}$$
converges if and only if the sum
$$\sum_{n=1}^{\infty}\frac{1}{f(P_1(n))\,n}$$
converges.
Proof: As f(n) is non-decreasing and $P_1(n)\le n$, we have $f(P_1(n))\le f(n)$, thus
$$\sum_{n=1}^{\infty}\frac{1}{f(n)\,n} \le \sum_{n=1}^{\infty}\frac{1}{f(P_1(n))\,n},$$
so that if $\sum_{n=1}^{\infty}\frac{1}{f(P_1(n))\,n}$ converges, $\sum_{n=1}^{\infty}\frac{1}{f(n)\,n}$ must also converge.
To prove the converse, consider the following:
$$\sum_{n\le x}\frac{1}{f(P_1(n))\,n} = \sum_{p\le x}\frac{1}{f(p)}\sum_{\substack{n\le x \\ P_1(n)=p}}\frac{1}{n} \le \sum_{p\le x}\frac{1}{f(p)}\sum_{P_1(m)\le p}\frac{1}{mp} = \sum_{p\le x}\frac{1}{f(p)\,p}\prod_{q\le p}\left(1-\frac{1}{q}\right)^{-1}$$
$$= O\left(\sum_{p\le x}\frac{\log(p)}{f(p)\,p}\right) \le O\left(\sum_{n\le x}\frac{\log(p_n)}{f(p_n)\,p_n}\right)$$
where $p_n$ denotes the nth prime number. Now, by Tchebyschev's estimate,
$$\sum_{n\le x}\frac{\log(p_n)}{f(p_n)\,p_n} = O\left(\sum_{n\le x}\frac{1}{f(p_n)\,n}\right)$$
and as $n\le p_n$ it follows that
$$\sum_{n\le x}\frac{1}{f(p_n)\,n} \le \sum_{n\le x}\frac{1}{f(n)\,n},$$
and taking the limit as x → ∞ implies, by virtue of the above inequalities, that if $\sum_{n=1}^{\infty}\frac{1}{f(n)\,n}$ converges, then $\sum_{n=1}^{\infty}\frac{1}{f(P_1(n))\,n}$ must also converge, and the result follows.
To analytically derive the desired asymptotic bounds for the Alladi-Erdös functions we will consider the Dirichlet series generated by A′(n), which arises naturally from the study of the Riemann Zeta function ζ(s) discussed in Chapter 2. As $A'(n) = \sum_{p^\alpha|n}\frac{p^\alpha}{\alpha} = \sum_{d|n}\frac{\Lambda(d)}{\log(d)}\,d$, where Λ(n) is the von Mangoldt function, it follows that:
$$\sum_{n=1}^{\infty}\frac{A'(n)}{n^{s+1}} = \sum_{n=1}^{\infty}\left(\sum_{d|n}\frac{\Lambda(d)}{\log(d)}\,d\right)\frac{1}{n^{s+1}}$$ (3–6)
$$= \sum_{n=1}^{\infty}\frac{1}{n^{s+1}}\sum_{n=1}^{\infty}\frac{\Lambda(n)}{\log(n)}\,\frac{n}{n^{s+1}} = \zeta(s+1)\log\zeta(s).$$
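The divisor-sum identity for A′(n) used in (3-6) can be verified directly, since Λ(d) vanishes unless d is a prime power. A small sketch:

```python
import math

def factorize(n):
    # trial division; returns a list of (prime, exponent) pairs
    out, p = [], 2
    while p * p <= n:
        a = 0
        while n % p == 0:
            n //= p
            a += 1
        if a:
            out.append((p, a))
        p += 1
    if n > 1:
        out.append((n, 1))
    return out

def A_prime(n):
    # A'(n) = sum of p^j / j over all prime powers p^j dividing n
    return sum(p**j / j for p, a in factorize(n) for j in range(1, a + 1))

def divisor_sum(n):
    # sum_{d | n} (Lambda(d)/log d) * d, with Lambda supported on d = p^j
    total = 0.0
    for p, a in factorize(n):
        for j in range(1, a + 1):
            d = p**j
            total += (math.log(p) / math.log(d)) * d  # equals p^j / j
    return total

for n in (2, 12, 360, 1024, 99991):
    assert abs(A_prime(n) - divisor_sum(n)) < 1e-9
print("identity checked")
```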
Using the facts that ζ(s) is analytic in ℜ(s) > 1, that ζ(1 + it) ≠ 0 for all real t ≠ 0, and that ζ(s) has its sole singularity at s = 1, we may deduce that ζ(s + 1) log ζ(s) is holomorphic at all points lying on the line ℜ(s) = 1, s ≠ 1; therefore, we may apply Delange's theorem to conclude the following:
Theorem 3.1.6. $f'(x) = \sum_{n\le x}A'(n) = \frac{\pi^2 x^2}{12\log(x)} + o\left(\frac{x^2}{\log(x)}\right)$
Proof: As equation (3-6) demonstrates, $\sum_{n=1}^{\infty}\frac{A'(n)}{n^{s+1}} = \zeta(s+1)\log\zeta(s)$. Applying Delange's theorem (Theorem 2.2.1) yields:
$$\sum_{n\le x}\frac{A'(n)}{n} = \zeta(2)\,\frac{x}{\log(x)} + o\left(\frac{x}{\log(x)}\right) = \frac{\pi^2 x}{6\log(x)} + o\left(\frac{x}{\log(x)}\right),$$
and a straightforward application of Abel summation (equation (1-12)) demonstrates that
$$f'(x) = \sum_{n\le x}A'(n) = \frac{\pi^2 x^2}{12\log(x)} + o\left(\frac{x^2}{\log(x)}\right),$$
which is the desired result.
Using Mellin’s inversion theorem we may improve this asymptotic estimate by taking
into account the zero-free region of ζ(s) supplied by Theorem 2.1.3. To improve the
estimate we will not need the full power of Mellin's inversion theorem; in fact, for our purposes the following theorem will suffice:
Theorem 3.1.7. Let k > 1 and x > 0, then the integral
$$\frac{1}{2\pi i}\int_{k-i\infty}^{k+i\infty}\frac{x^{s+1}}{s+1}\,ds$$
equals 1 if x > 1, 1/2 if x = 1, and 0 if 0 < x < 1.
If $D(s) = \sum_{n=1}^{\infty}\frac{d(n)}{n^{s+1}}$ is any Dirichlet series, where $k > \sigma_a \ge 1$ is chosen such that D(s) lies in a domain of absolute convergence, then
$$\frac{1}{2\pi i}\int_{k-i\infty}^{k+i\infty} D(s)\,\frac{x^{s+1}}{s+1}\,ds = \frac{1}{2\pi i}\int_{k-i\infty}^{k+i\infty}\left(\sum_{n=1}^{\infty}\frac{d(n)}{n^{s+1}}\right)\frac{x^{s+1}}{s+1}\,ds$$
$$= \sum_{n\le x} d(n)\,\frac{1}{2\pi i}\int_{k-i\infty}^{k+i\infty}\left(\frac{x}{n}\right)^{s+1}\frac{ds}{s+1} + \sum_{n>x} d(n)\,\frac{1}{2\pi i}\int_{k-i\infty}^{k+i\infty}\left(\frac{x}{n}\right)^{s+1}\frac{ds}{s+1} = \sum_{n\le x} d(n)$$
if x ∈ R − Z. If x ∈ Z then the above integral equals
$$\frac{d(x)}{2} + \sum_{n<x} d(n).$$
Note that the assumption that s is in a domain of absolute convergence is essential in order to justify the interchanging of the infinite sum and the integral.
As was mentioned in the introduction, the first person to apply the above inversion
technique to Dirichlet series was Oskar Perron in his 1908 article [19]. It is for this
reason that the formula equating the sum of the coefficients of a Dirichlet series D(s)
with the inverse Mellin transform of D(s) is often referred to as Perron’s formula.
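The kernel of Theorem 3.1.7 behaves as claimed even under crude numerical truncation. The sketch below (illustrative; the integral is truncated at height T, so the values are only approximate, with error decaying as T grows) evaluates the line integral by the trapezoid rule:

```python
import cmath
import math

def perron_kernel(x, k=2.0, T=400.0, steps=80000):
    # trapezoidal approximation of (1/(2*pi*i)) * integral over Re(s) = k,
    # |Im(s)| <= T, of x^{s+1}/(s+1) ds
    h = 2.0 * T / steps
    logx = math.log(x)
    total = 0.0 + 0.0j
    for j in range(steps + 1):
        s = complex(k, -T + j * h)
        w = 0.5 if j in (0, steps) else 1.0
        total += w * cmath.exp((s + 1) * logx) / (s + 1)
    # ds = i dt, so (1/(2*pi*i)) * int g ds = (1/(2*pi)) * int g dt
    return (total * h).real / (2.0 * math.pi)

print(perron_kernel(2.0), perron_kernel(0.5))  # approximately 1 and 0
```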
Recall that equation (3-6) shows that
$$\sum_{n=1}^{\infty}\frac{A'(n)}{n^{s+1}} = \zeta(s+1)\log\zeta(s),$$
which is now in a form where we may apply Perron’s formula. As the evaluation of
integrals of this form is now classical, we will only sketch the proof of how one may use
Perron’s formula to derive the desired estimate. For the full details of how to evaluate
contour integrals of this form one may consult Chapter 5 of [7], Chapter II.4 of [29], or for
a step by step demonstration see Chapter 2 of [4].
We motivate the discussion with the following somewhat informal argument. As we will be taking a line integral, it follows from Cauchy's theorem that we may deform the line of integration to be a suitable contour. For our purposes we will simply take the classical contour of integration chosen by de la Vallée-Poussin, which requires us to take $\Re(s) = \sigma > 1 - \frac{c}{\log(t)}$ for some constant c > 0. However, for s in this region the
function ζ(s + 1) is analytic, hence bounded, so we should suspect that it does not make
any major contribution to the integral. Thus, the majority of the contribution from the
contour will occur at the logarithmic singularity at s = 1 of ζ(s + 1) log ζ(s), which has
a coefficient of ζ(2); also, the error term in our estimate will be directly related to how
far we may take our contour into the critical strip. As we will not be taking our contour
further than σ = 1/2, and ζ(s + 1) ≤ ζ(3/2) is bounded in this region, the error term will
be bounded by a function times the error term implied by the Prime Number Theorem.
Hence, we should suspect that the integral is closely approximated by:
$$\frac{\zeta(2)}{2\pi i}\int_{k-i\infty}^{k+i\infty}\log\zeta(s)\,\frac{x^{s+1}}{s+1}\,ds = \zeta(2)\sum_{n\le x}\frac{\Lambda(n)}{\log(n)}\,n = \zeta(2)\int_2^x\frac{t}{\log(t)}\,dt + O(x^2 e^{-c\sqrt{\log(x)}})$$
which follows from the Prime Number Theorem. The next theorem is simply the statement that all of these observations are in fact accurate. Furthermore, we remind the reader that the proof of the following theorem is merely a sketch, as many of the details have been omitted.
Theorem 3.1.8. There exists a constant c > 0 such that
$$f'(x) = \zeta(2)\int_2^x\frac{t}{\log(t)}\,dt + O(x^2 e^{-c\sqrt{\log(x)}}).$$
Proof: First, note that for a > 1
$$f'(x) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\zeta(s+1)\log\zeta(s)\,\frac{x^{s+1}}{s+1}\,ds$$
for non-integral x, whereas the integral equals $f'(x-1) + \frac{A'(x)}{2} = f'(x) + o(x^{3/2})$ if $x\in\mathbb{Z}$; thus the choice of an integral or non-integral x > 0 will not affect our estimate. In order
to evaluate this line integral we will deform the path of integration as far into the critical strip, 0 ≤ σ ≤ 1, as possible while avoiding the singularities of the integrand; also, the integrand has a branch cut on the real line which must be handled with care.
Therefore, we choose the contour as follows: first let T be fixed, $b = 1 - \frac{c}{\log(T)}$, and $a = 1 + \frac{c}{\log(T)}$, where c is the constant in de la Vallée-Poussin's zero-free region. The classic path of integration is given by the vertical line from a + i∞ to a + iT, followed by the horizontal line from a + iT to b + iT, then the vertical line from b + iT to b + iϵ. The contour then proceeds around the branch cut avoiding the singularity at s = 1, with a semicircle of radius ϵ and center 1, then follows the horizontal path below the branch cut to b − iϵ. The remainder of the path is merely the contour above the real axis reflected about the real line, that is, the vertical line from b − iϵ to b − iT, the horizontal line from b − iT to a − iT, and the vertical line from a − iT to a − i∞. Note that by de la Vallée-Poussin's theorem and our choice of a and b this contour lies in a domain of analyticity of ζ(s + 1) log ζ(s). Hence, by Cauchy's theorem, we may conclude that
$$f'(x) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\zeta(s+1)\log\zeta(s)\,\frac{x^{s+1}}{s+1}\,ds$$
$$= \frac{1}{2\pi i}\left(\int_{a-i\infty}^{a-iT} + \int_{a-iT}^{b-iT} + \int_{b-iT}^{b-i\epsilon} + \int_{\text{cut}} + \int_{b+i\epsilon}^{b+iT} + \int_{b+iT}^{a+iT} + \int_{a+iT}^{a+i\infty}\right)\zeta(s+1)\log\zeta(s)\,\frac{x^{s+1}}{s+1}\,ds$$
$$= D(x) + E(x),$$
where D(x) denotes the dominant term in our asymptotic estimate and E(x) denotes the error term, i.e. E(x) = o(D(x)).
It is a well-known fact that the error term in the Prime Number Theorem is directly
related to how far we may move the path of integration of log ζ(s) into the critical strip,
and as our integrand ζ(s + 1) log ζ(s) is very similar to the function evaluated in the
classical proofs of the prime number theorem it is not difficult to justify that the error
term of f ′ (x) is also related to how far we may deform our contour into the critical strip.
In fact, for our purposes the error term E (x) will be supplied by the contours which are
not around the branch point (for a justification of this fact and further details see [4]).
Specifically,
$$E(x) = \frac{1}{2\pi i}\left(\int_{a-i\infty}^{a-iT} + \int_{a-iT}^{b-iT} + \int_{b-iT}^{b-i\epsilon} + \int_{b+i\epsilon}^{b+iT} + \int_{b+iT}^{a+iT} + \int_{a+iT}^{a+i\infty}\right)\zeta(s+1)\log\zeta(s)\,\frac{x^{s+1}}{s+1}\,ds.$$
For ℜ(s) = σ > 0 the function ζ(s + 1) is maximized on the real line, and as the path of integration does not penetrate as far as σ = 1/2 into the critical strip, we may conclude that
$$E(x) \le \frac{\zeta(3/2)}{2\pi}\left|\left(\int_{a-i\infty}^{a-iT} + \int_{a-iT}^{b-iT} + \int_{b-iT}^{b-i\epsilon} + \int_{b+i\epsilon}^{b+iT} + \int_{b+iT}^{a+iT} + \int_{a+iT}^{a+i\infty}\right)\log\zeta(s)\,\frac{x^{s+1}}{s+1}\,ds\right|.$$
From here it is a straightforward, though detailed, process to estimate E(x). However, we may side-step the issue of directly evaluating the six contours by noting that
$$\sum_{p^\alpha\le x}\frac{p^\alpha}{\alpha} = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\log\zeta(s)\,\frac{x^{s+1}}{s+1}\,ds,$$
then, choosing the same contour as before, it follows from the Prime Number Theorem that $\sum_{p^\alpha\le x}\frac{p^\alpha}{\alpha} = \int_2^x\frac{t}{\log(t)}\,dt + O(x^2 e^{-c\sqrt{\log(x)}})$. Thus, the error term is given by
$$\frac{1}{2\pi i}\left(\int_{a-i\infty}^{a-iT} + \int_{a-iT}^{b-iT} + \int_{b-iT}^{b-i\epsilon} + \int_{b+i\epsilon}^{b+iT} + \int_{b+iT}^{a+iT} + \int_{a+iT}^{a+i\infty}\right)\log\zeta(s)\,\frac{x^{s+1}}{s+1}\,ds = O(x^2 e^{-c\sqrt{\log(x)}}),$$
which differs from the contour integral we wish to evaluate only by the bounded factor ζ(s + 1). Therefore $E(x) \le O(\zeta(3/2)\,x^2 e^{-c\sqrt{\log(x)}}) = O(x^2 e^{-c\sqrt{\log(x)}})$. Note that the implicit constant in the O-term is far from optimal; however, for our purposes we need only show that such a constant exists.
The final step necessary to establish Theorem 3.1.8 is to evaluate the integral along the branch cut, which we now proceed to do:
$$\frac{1}{2\pi i}\int_{\text{cut}}\zeta(s+1)\log\zeta(s)\,\frac{x^{s+1}}{s+1}\,ds$$
$$= \frac{1}{2\pi i}\int_{\text{cut}}\zeta(s+1)\log((s-1)\zeta(s))\,\frac{x^{s+1}}{s+1}\,ds - \frac{1}{2\pi i}\int_{\text{cut}}\zeta(s+1)\log(s-1)\,\frac{x^{s+1}}{s+1}\,ds$$
The first integral is zero because ζ(s + 1) log((s − 1)ζ(s)) is regular and single-valued along the cut; hence, the integrand is regular, and the integral along the upper side cancels the integral along the lower side. To evaluate the second integral we make the substitution $s - 1 = \epsilon e^{i\theta}$ for −π < θ < π. With this substitution log(s − 1) = log(ϵ) + iθ, and therefore the value of log(s − 1) along the lower portion of the branch cut differs from the value of log(s − 1) along the upper portion of the branch cut by 2πi. Letting γ be a semicircle of radius ϵ centered at s = 1, the above integral reduces to evaluating:
$$\frac{1}{2\pi i}\int_{\text{cut}}\zeta(s+1)\log(s-1)\,\frac{x^{s+1}}{s+1}\,ds = \frac{1}{2\pi i}\int_b^1\zeta(2+\epsilon)\left(\log(\epsilon) - i\pi\right)\frac{x^{s+1}}{s+1}\,ds$$ (3–7)
$$+\ \frac{1}{2\pi i}\int_1^b\zeta(2+\epsilon)\left(\log(\epsilon) + i\pi\right)\frac{x^{s+1}}{s+1}\,ds + \frac{1}{2\pi i}\int_\gamma\zeta(2+\epsilon)\left(\log(\epsilon) + i\theta\right)\frac{x^{s+1}}{s+1}\,ds.$$
Now, we need only evaluate the three integrals in (3-7); letting ϵ → 0, the third integral is easily seen to be
$$\frac{1}{2\pi i}\int_\gamma\zeta(2+\epsilon)\left(\log(\epsilon) + i\theta\right)\frac{x^{s+1}}{s+1}\,ds = O\left(\left(|\log(\epsilon)| + \pi\right)x^{2+\epsilon}\,2\pi\epsilon\right) = o(1).$$
To obtain the desired value for D(x) we note that,
$$\frac{1}{2\pi i}\int_b^1\zeta(2+\epsilon)\left(\log(\epsilon) - i\pi\right)\frac{x^{s+1}}{s+1}\,ds - \frac{1}{2\pi i}\int_b^1\zeta(2+\epsilon)\left(\log(\epsilon) + i\pi\right)\frac{x^{s+1}}{s+1}\,ds$$
$$= -\zeta(2+\epsilon)\int_b^1\frac{x^{s+1}}{s+1}\,ds = -\zeta(2)\int_b^1\frac{x^{s+1}}{s+1}\,ds$$
as ϵ → 0. Thus we are left with evaluating the integral,
$$\zeta(2)\int_b^1\frac{x^{s+1}}{s+1}\,ds.$$ (3–8)
Letting $u^2 = x^{s+1}$ in (3-8) gives,
$$\zeta(2)\int_b^1\frac{x^{s+1}}{s+1}\,ds = \zeta(2)\int_{x^{(b+1)/2}}^{x}\frac{u}{\log(u)}\,du = \zeta(2)\int_2^x\frac{u}{\log(u)}\,du - \zeta(2)\int_2^{x^{(b+1)/2}}\frac{u}{\log(u)}\,du;$$
furthermore, as log(u) > log(2) for $2\le u\le x^{(b+1)/2}$, we see that
$$\zeta(2)\int_2^{x^{(b+1)/2}}\frac{u}{\log(u)}\,du = O(x^{b+1})$$
which will be absorbed into the error term $E(x) = O(x^2 e^{-c\sqrt{\log(x)}})$. Therefore $D(x) = \zeta(2)\int_2^x\frac{t}{\log(t)}\,dt$, and as f′(x) = D(x) + E(x) we have our desired estimate:
$$f'(x) = \zeta(2)\int_2^x\frac{t}{\log(t)}\,dt + O(x^2 e^{-c\sqrt{\log(x)}})$$ (3–9)
which completes the proof.
We reiterate that the above proof has omitted several important details, specifically,
the explicit evaluation of the contours which are away from the branch point. However,
the reader need not worry about the rigor of the above proof as integrals of this form
are now classical in analytic number theory, and can be found in most texts on the
topic. In fact, in 1903, using a variant of the above integral representation, Edmund
Landau succeeded in giving his own proof of the Prime Number Theorem by using
the very same contour chosen above, and it is this proof which Landau included in
his classic text [16] (a text which achieved so much fame that mathematicians such
as G.H. Hardy simply referred to it as the Handbuch). The evaluation of each of these
contours is not terribly difficult, although there are several particulars which must be
taken into account. For example, log ζ(s) is a multiple-valued function with a singularity
at s = 1, and this necessitates taking a branch of the logarithm which in turn makes the
contour integration more difficult. Furthermore, some of the contours are not absolutely
convergent, which further complicates their evaluation. In the above proof we avoided
these difficulties by noting that the contours away from the branch point correspond to
the error term in the Prime Number Theorem, and hence can be derived in a simple
fashion from this observation. However, it should be mentioned that the error term in the
Prime Number Theorem arises from the evaluation of these contours. In essence, we have
not avoided the task of evaluating these contours, but rather, we have invoked a theorem
which allows us to avoid their explicit computation.
Furthermore, note that by applying integration by parts to the integral in (3-9),
$$f'(x) = \zeta(2)\int_2^x\frac{t}{\log(t)}\,dt = \frac{\zeta(2)}{2}\,\frac{x^2}{\log(x)} + O\left(\frac{x^2}{\log^2(x)}\right),$$
we may re-verify our previous estimate of f′(x).
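As a sanity check on this main term, the sketch below computes f′(x) exactly by summing over prime powers and compares it with π²x²/(12 log x); the cutoff is modest, so agreement only within the O(x²/log²x) error can be expected:

```python
import math

X = 200000
is_comp = bytearray(X + 1)
fprime = 0.0
for p in range(2, X + 1):
    if not is_comp[p]:  # p is prime
        for m in range(p * p, X + 1, p):
            is_comp[m] = 1
        q, a = p, 1
        while q <= X:
            # every multiple of q = p^a contributes q/a to some A'(n),
            # so f'(X) = sum over prime powers of (q/a) * floor(X/q)
            fprime += (q / a) * (X // q)
            q *= p
            a += 1

main_term = (math.pi**2 / 12) * X**2 / math.log(X)
print(fprime / main_term)  # tends to 1 as X grows
```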
Assuming the Riemann Hypothesis, all of the complex-valued singularities of log ζ(s) will have ℜ(s) = 1/2. If one makes this assumption when evaluating the above contour integral, then we may improve our estimate to:
$$f'(x) = \zeta(2)\int_2^x\frac{t}{\log(t)}\,dt + O(x^{3/2}\log(x)),$$
which is the best possible estimate using these methods. As f′(x) − f(x) = O(x^{3/2}) and f′(x) − f∗(x) = O(x^{3/2}) this also improves the error terms of f(x) and f∗(x).
Furthermore, it should be noted that improving the error term in the asymptotic estimate of f′(x) (or in the estimate of f(x) or f∗(x)) to O(x^{3/2} log(x)) is equivalent to the Riemann hypothesis. Recall that Theorem 3.1.4 above, due to Alladi and Erdös, demonstrates that $\sum_{n\le x}P_m(n) = o(x^{3/2})$ for all m ≥ 2. This shows that sums of this form are far smaller than even the best error terms for f(x), f′(x), and f∗(x), which is to be expected, as their combined sum (in many ways) will determine this error term.
As was already stated, Alladi and Erdős derived the asymptotic results
$$f(x) = \frac{\pi^2 x^2}{12\log(x)} + O\left(\frac{x^2}{\log^2(x)}\right)$$
and
$$\sum_{n\le x}P_1(n) = \frac{\pi^2 x^2}{12\log(x)} + O\left(\frac{x^2}{\log^2(x)}\right)$$
in [2] using entirely elementary methods. Further results of this nature were also derived
by Knuth and Pardo who, using only elementary methods, derived the asymptotic values
for the mean and standard deviation of the largest prime factor of n, P1 (n). The following
theorem is a restatement of this result, which we have included to provide a clearer
picture of the functions under discussion. The emphasis of Knuth and Pardo in [14] is
their algorithmic process, and for this reason they only sketch how one may derive their
theorem. Of course, we wish to emphasize the mathematical aspects of their paper, so
while the proof below is essentially due to Knuth and Pardo, we include certain details to
make their proof more rigorous.
Using Knuth and Pardo's notation, let $\Phi(t)$ be the probability that $P_1(n) \le t$ when
$n$ is in the range $1 \le n \le N$. This function can be identified with the Dickman function
$\rho(u)$ (see Definition 3.2.5) discussed in the first section, and which will be discussed at
greater length in Section 3.2. In fact, if one sets $u = \frac{\log(N)}{\log(t)}$, then for $1 \le u \le 2$ we have
$\rho(u) = \Phi(t)$. In [14] the authors demonstrate that:
$$\Phi(t) = 1 - \log\left(\frac{\log(N)}{\log(t)}\right) + \frac{1}{\log(N)}\int_1^{N/t}\frac{\{u\}\,du}{u^2} + O\left(\frac{1}{\log^2(N)}\right)$$
for $\sqrt{N} \le t \le N$. Now, let
$$E_k(P_1(n)) = \int_1^N t^k\,d\Phi(t) = \Phi(N)N^k - \Phi(1) - k\int_1^N \Phi(t)\,t^{k-1}\,dt,$$
that is, $E_k(P_1(n))$ is the $k$th moment of $P_1(n)$. Note that as the above integral from 1 to
$\sqrt{N}$ is $O\left(N^{k/2}\int_1^{\sqrt N} d\Phi(t)\right) = O(N^{k/2})$, it will be absorbed into the error term below.
Theorem 3.1.9. $E_k(P_1(n)) = \dfrac{\zeta(k+1)}{k+1}\,\dfrac{N^k}{\log(N)} + O\left(\dfrac{N^k}{\log^2(N)}\right)$.
Proof: Ignoring the integral from 1 to $\sqrt{N}$ we are left with
$$\int_{\sqrt N}^{N} t^k\,d\Phi(t) = \Phi(N)N^k - \Phi(\sqrt N)N^{k/2} - k\int_{\sqrt N}^{N}\Phi(t)\,t^{k-1}\,dt$$
$$= \int_{\sqrt N}^{N} t^k\,d\left(1 + \log\log(t) - \log\log(N) + \frac{1}{\log(N)}\int_1^{N/t}\frac{\{u\}\,du}{u^2} + O\left(\frac{1}{\log^2(N)}\right)\right)$$
$$= \int_{\sqrt N}^{N} t^k\,d(\log\log(t)) + \frac{1}{\log(N)}\int_1^{\sqrt N}\left(\frac{N}{v}\right)^k d\!\int_1^{v}\frac{\{u\}\,du}{u^2} + O\left(\frac{N^k}{\log^2(N)}\right)$$
by replacing $t$ by $N/v$ in the second integral. The $O$-estimate is justified by the simple
observation that if $\int_a^b f(t)\,dg(t)$ and $\int_a^b f(t)\,dh(t)$ exist, where $h(t) = O(g(t))$, and where
both $f$ and $g$ are positive monotone functions on $[a, b]$, then $\int_a^b f(t)\,dh(t) = O\left(\int_a^b f(t)\,dg(t)\right)$. For the first integral we have:
$$\int_{\sqrt N}^{N}\frac{t^{k-1}\,dt}{\log(t)} = N^k\int_1^{\sqrt N}\frac{dv}{v^{k+1}(\log(N)-\log(v))}$$
$$= \frac{N^k}{\log(N)}\left(\int_1^{\sqrt N}\frac{dv}{v^{k+1}} + \int_1^{\sqrt N}\frac{\log(v)\,dv}{v^{k+1}(\log(N)-\log(v))}\right) = \frac{N^k}{k\log(N)} + O\left(\frac{N^k}{\log^2(N)}\right).$$
Note that the second integral is $-N^k/\log(N)$ times the integral $\int_1^{\sqrt N}\{v\}\,dv/v^{k+2}$, which
is within $O(N^{-(k+1)/2})$ of
$$\int_1^{\infty}\frac{\{v\}\,dv}{v^{k+2}} = \sum_{j\ge1}\int_j^{j+1}\frac{(v-j)\,dv}{v^{k+2}} = \sum_{j\ge1}\left(\frac{1}{k}\left(\frac{1}{j^k}-\frac{1}{(j+1)^k}\right) - \frac{j}{k+1}\left(\frac{1}{j^{k+1}}-\frac{1}{(j+1)^{k+1}}\right)\right)$$
$$= \sum_{j\ge1}\left(\frac{1}{k(k+1)}\left(\frac{1}{j^k}-\frac{1}{(j+1)^k}\right) - \frac{1}{k+1}\cdot\frac{1}{(j+1)^{k+1}}\right) = \frac{1}{k(k+1)} - \frac{1}{k+1}\left(\zeta(k+1)-1\right) = \frac{1}{k} - \frac{\zeta(k+1)}{k+1}.$$
Hence, we have shown that
$$E_k(P_1(n)) = \frac{\zeta(k+1)}{k+1}\,\frac{N^k}{\log(N)} + O\left(\frac{N^k}{\log^2(N)}\right),$$
which is the desired result.
This verifies the result of Alladi and Erdős that the first moment (the mean) of
$P_1(n)$ is asymptotically $\frac{\pi^2}{12}\,\frac{N}{\log(N)}$. Furthermore, the above theorem demonstrates
that $P_1(n)$ has the asymptotic standard deviation $\sqrt{\frac{\zeta(3)}{3}}\,\frac{N}{\sqrt{\log(N)}}$ (within a factor
of $1 + O(1/\log(N))$). In [14] Knuth and Pardo also make the important observation
that the ratio of the standard deviation to the mean diverges as $N \to \infty$. This will be
important for the analysis of their factorization algorithm (in Chapter 4), as it shows
that a traditional "mean and variance" approach is unsuitable when dealing with such
factorization algorithms.
3.2 Ψ(x, y )
It will be very informative to consider the most frequently occurring value for the
largest prime factor P1 (n) of a given integer n, n ≤ x. This value cannot be obtained
from the above estimates which give the average of the largest prime factors of integers
n ≤ x. It turns out that the most frequently occurring value for the largest prime factor of
an integer n ≤ x is far smaller than f (x)/x, owing to a small number of integers with very
large prime factors which influence the estimate for f (x)/x (and make it much larger). In
order to do this analysis we must introduce the function Ψ(x, y ):
Definition 3.2.1. The Buchstab-de Bruijn function $\Psi(x, y)$ is defined to be the number of
positive integers $n \le x$ such that $P_1(n) \le y$. That is, $\Psi(x, y)$ is the number of integers
$n \le x$ all of whose prime factors are at most $y$.
As the value $\frac{\log(x)}{\log(y)}$ plays an important role in the behavior of this function, it is
customary and convenient to define
$$u := \frac{\log(x)}{\log(y)},$$
provided $x \ge y \ge 2$; for the remainder of this section any reference to $u$ will correspond
to this definition. As was mentioned in the introduction, Ψ(x, y ) was first studied (in
isolation) by S. Ramanujan and (publicly) by R. Rankin in [23], who used it to investigate
the differences between prime numbers. However, Rankin’s results only pertained to the
function Ψ(x, y ) in a rather limited range for the value of y . One year prior to Rankin’s
publication A.A. Buchstab derived a very general recurrence formula in [6] to evaluate
recurrences which commonly arise in sieve theory. In its most general form this formula
can quickly become complicated to the point of losing much of its usefulness, but as it
pertains to Ψ(x, y ) the formula is far less daunting. The following identity will be referred
to as Buchstab’s recurrence relation (the proof is fairly easy and can be found in [29]):
for x ≥ 1, y > 0, we have
$$\Psi(x, y) = 1 + \sum_{p\le y}\Psi\left(\frac{x}{p}, p\right), \tag{3–10}$$
where, as usual, $p$ runs over the prime numbers less than or equal to $y$. It was a recurrence
of this sort (though in a different guise) which was discovered by Ramanujan, and whose
rediscovery is recounted in the interesting article [27]. Iterating this recurrence gives us
the following theorem, known as Buchstab’s identity, whose proof can also be found in
[29]. Thus (3-10) implies:
Theorem 3.2.2. For x ≥ 1, z ≥ y > 0, we have
$$\Psi(x, y) = \Psi(x, z) - \sum_{y < p \le z}\Psi\left(\frac{x}{p}, p\right).$$
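Both recurrences are exact counting identities, so they can be checked directly by brute force; the sketch below (with our own hypothetical helper names) verifies (3-10) and Theorem 3.2.2 for small $x$, $y$, $z$:

```python
def largest_pf(n):
    # largest prime factor of n, with largest_pf(1) = 1 by convention
    best, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            best, n = p, n // p
        p += 1
    return n if n > 1 else best

def psi(x, y):
    # Psi(x, y): number of n <= x all of whose prime factors are <= y
    return sum(1 for n in range(1, x + 1) if largest_pf(n) <= y)

x, y, z = 2000, 13, 31
primes_y = [p for p in range(2, y + 1) if largest_pf(p) == p]
primes_yz = [p for p in range(y + 1, z + 1) if largest_pf(p) == p]

# Buchstab's recurrence relation (3-10)
assert psi(x, y) == 1 + sum(psi(x // p, p) for p in primes_y)

# Buchstab's identity (Theorem 3.2.2)
assert psi(x, y) == psi(x, z) - sum(psi(x // p, p) for p in primes_yz)
```

Note that $\Psi(x/p, p) = \Psi(\lfloor x/p\rfloor, p)$, so integer division loses nothing.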
Up to this point there is no doubt some confusion about the naming convention
of Ψ(x, y ) which we now hope to clarify; in 1951 N.G. de Bruijn exploited Buchstab’s
identity to significantly improve the range of y for which Ψ(x, y ) would satisfy an
asymptotic formula. Moreover, de Bruijn’s results all held uniformly, making Ψ(x, y )
into a rather convenient function to work with (for a given range of y ). At last we have a
good historical justification for calling Ψ(x, y ) the Buchstab-de Bruijn function!
The Buchstab-de Bruijn function has been the subject of much research in recent
years, and we will only state some of the properties of Ψ(x, y ) necessary to answer the
above question concerning the most frequently occurring value for the largest prime
factor of an integer n.
For our purposes we only require an asymptotic estimate for $\Psi(x, y)$ given by de
Bruijn which holds for large values of $y$, namely, Theorem 3.2.3 (see [9]).
Theorem 3.2.3. If $x \ge y \ge 2$ then
$$\Psi(x, y) = O\left(xe^{-u/2}\right)$$
holds uniformly.
It is worth noting that the value $u$ can vary with $x$ in the above theorem, which is one of
the major reasons for the superiority of de Bruijn's result over those of his predecessors.
However, this result can be strengthened: de Bruijn demonstrated in [9] that if
$x > 0$, $y \ge 2$, where $\log^2(x) \le y \le x$ and $u$ is defined as before, then
$$\Psi(x, y) < x\log^2(y)\,e^{-u\log(u+3)-u\log\log(u+3)+O(u)}. \tag{3–11}$$
This estimate is obtained by noting that the sum over all the integers $n$ such that
$P_1(n) \le y$ satisfies an Euler product given by
$$\sum_{P_1(n)\le y}\frac{1}{n^s} = \prod_{p\le y}\left(1-\frac{1}{p^s}\right)^{-1}.$$
Then, for $\kappa > 0$ we may apply Perron's formula to obtain
$$\Psi(x, y) = \frac{1}{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty}\prod_{p\le y}\left(1-\frac{1}{p^s}\right)^{-1}\frac{x^s}{s}\,ds,$$
which de Bruijn then evaluates, from which he obtains (3-11).
For our purposes we only require the uniform estimate for Ψ(x, y ) given by Theorem
3.2.3, which holds for large values of y .
We will now study the Dickman function, a function which frequently arises in the
study of Ψ(x, y ) and which itself satisfies many interesting and useful equations. A study
of the Dickman function, and in particular the rate of decay of this function, will help us to
better understand the asymptotic relation in Theorem 3.2.3.
Definition 3.2.4. Let $u = \frac{\log(x)}{\log(y)}$ be such that $2 \le u \le 3$; the Dickman function is defined as
$$\rho(u) := 1 - \log(u) + \int_2^u \log(v-1)\,\frac{dv}{v}.$$
It may not be immediately clear how the Dickman function is related to the
Buchstab-de Bruijn function; however, in the proof of Theorem 3.1.9 what Knuth
and Pardo refer to as $\Phi(t)$ is essentially $\rho(u)$ with $u = \frac{\log(N)}{\log(t)}$. The following theorem
makes the connection between $\rho(u)$ and $\Psi(x, y)$ more explicit (see [29], [9], and [27]).
Theorem 3.2.5. For $x \ge y \ge 2$, and fixed $u = \frac{\log(x)}{\log(y)}$, we have
$$\lim_{x\to\infty}\frac{\Psi(x, x^{1/u})}{x} = \rho(u).$$
Hence, the above theorem demonstrates that for fixed $u$,
$$\Psi(x, x^{1/u}) \sim \rho(u)\,x.$$
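The limit $\rho(u)$ is easy to compute numerically. The sketch below (a rough Euler-method approximation of our own, not taken from the cited sources) integrates the difference-differential equation $u\rho'(u) + \rho(u-1) = 0$ stated in (3-12) below, and checks the result against $\rho(2) = 1 - \log(2)$ and against the integral formula of Definition 3.2.4 at $u = 3$:

```python
import math

def dickman(u, h=2e-4):
    # Euler integration of u * rho'(u) = -rho(u - 1), with rho = 1 on [0, 1]
    if u <= 1:
        return 1.0
    n = int(round((u - 1) / h))
    vals = [1.0]  # vals[i] approximates rho(1 + i*h)
    for i in range(n):
        t = 1 + i * h
        rho_back = 1.0 if t - 1 <= 1 else vals[int((t - 2) / h)]
        vals.append(vals[-1] - h * rho_back / t)
    return vals[-1]

# rho(2) = 1 - log(2), from Definition 3.2.4
assert abs(dickman(2.0) - (1 - math.log(2))) < 5e-3

# rho(3) against 1 - log(3) + integral_2^3 log(v - 1) dv / v (Riemann sum)
steps = 20000
integral = sum(math.log(1 + i / steps) / ((2 + i / steps) * steps)
               for i in range(steps))
assert abs(dickman(3.0) - (1 - math.log(3) + integral)) < 5e-3
```

The two checks agreeing illustrates that the definition on $[2, 3]$ and the recurrence below describe the same function.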
It is a fact that $\rho(u)$ is uniquely determined by the initial condition $\rho(u) = 1$ for $0 \le u \le 1$
and the recurrence
$$\rho(u) = \rho(k) - \int_k^u \rho(v-1)\,\frac{dv}{v}$$
for $k < u \le k+1$. This property is deduced by applying Buchstab's recurrence to
the function $\Psi(x, y)$, and it implies that the Dickman function satisfies the
following difference-differential equation
$$u\rho'(u) + \rho(u-1) = 0, \tag{3–12}$$
provided $u > 1$. Furthermore, the above properties of $\rho(u)$ demonstrate that the
Dickman function is differentiable for $u > 1$, and using Theorems 3.2.3 and 3.2.5 de
Bruijn showed that for $u > 3$
$$\rho(u) = e^{-u\log(u) - u\log\log(u) + O(u)}. \tag{3–13}$$
We are now in a position to answer the motivating question of this section.
Theorem 3.2.6. For all $\epsilon > 0$, the most frequently occurring value of $P_1(n)$ for $n \le x$
lies between $e^{(1-\epsilon)\sqrt{\log(x)\log\log(x)}}$ and $e^{(1+\epsilon)\sqrt{\log(x)\log\log(x)}}$.
Proof: The number of integers $m \le x$ with the property that $P_1(m) = p$ is given by
$\Psi\left(\frac{x}{p}, p\right)$, so we need to maximize this function. From equation (3-13) we may conclude
that
$$\Psi\left(\frac{x}{p}, p\right) = \frac{x}{p}\,e^{-\frac{\log(x/p)}{\log(p)}+O(u)} = \frac{x}{p}\,e^{-\frac{\log(x)}{\log(p)}+1+o(1)};$$
hence, maximizing the function $\frac{x}{t}e^{-\frac{\log(x)}{\log(t)}+1}$ over $t$ will give a rough estimate of the most
frequently occurring size of the largest prime factor of $n$, for $n \le x$. This is a simple
optimization problem; the greatest value of the above expression occurs when
$t = e^{\sqrt{\log(x)\log\log(x)}}$, and inserting this into the expression yields $e^{(1+o(1))\sqrt{\log(x)\log\log(x)}}$,
from which we may conclude the desired result.
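The convergence here is very slow, so a small-scale experiment only loosely tracks the theorem; still, it already shows the mode of $P_1(n)$ sitting far below both $\sqrt{x}$ and the mean $\sim \pi^2 x/(12\log x)$. This sketch (our own, not from the cited sources) tabulates $P_1(n)$ for $n \le 10^5$:

```python
import math
from collections import Counter

N = 100000
P1 = [0] * (N + 1)  # sieve of largest prime factors
for p in range(2, N + 1):
    if P1[p] == 0:  # p is prime
        for m in range(p, N + 1, p):
            P1[m] = p

mode, count = Counter(P1[2:]).most_common(1)[0]
predicted = math.exp(math.sqrt(math.log(N) * math.log(math.log(N))))
print(mode, round(predicted))  # empirical mode vs e^sqrt(log x log log x)
```

At this size the empirical mode is of the same rough order as the prediction, while both are orders of magnitude below the mean of $P_1(n)$.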
The function Ψ(x, y ) was generalized by Knuth and Pardo in [14], to Ψk (x, y ) =
$|\{n \le x : P_k(n) \le y\}|$. Clearly $\Psi_1(x, y) = \Psi(x, y)$, and there are several similarities
between $\Psi(x, y)$ and $\Psi_k(x, y)$; in particular, one may study these functions inductively. If
$\alpha > 0$ then
$$\Psi_k(x^\alpha, x) = \rho_k(\alpha)x^\alpha + O\left(\frac{x^\alpha}{\log(x^\alpha)}\right) \tag{3–14}$$
where the $\rho_k(\alpha)$ are functions analogous to the Dickman function in the case $k = 1$.
Knuth and Pardo demonstrate that the $\rho_k(\alpha)$ satisfy recurrence relations similar to that
of $\rho(u)$. If $\alpha > 1$ and $k \ge 1$ then
$$\rho_k(\alpha) = 1 - \int_1^\alpha\left(\rho_k(t-1) - \rho_{k-1}(t-1)\right)\frac{dt}{t};$$
for $0 < \alpha \le 1$ and $k \ge 1$,
$$\rho_k(\alpha) = 1;$$
and if $\alpha \le 0$ or $k = 0$,
$$\rho_k(\alpha) = 0.$$
Furthermore, Knuth and Pardo demonstrate the following important asymptotic
results: for $k = 2$ there exist constants $c_0, c_1, \dots, c_{r-1}$ such that
$$\rho_2(\alpha) = e^\gamma\left(\frac{c_0}{\alpha} + \frac{c_1}{\alpha^2} + \dots + \frac{c_{r-1}}{\alpha^r}\right) + O(\alpha^{-r-1}), \tag{3–15}$$
and for $k \ge 3$ we have
$$\rho_k(\alpha) = \frac{e^\gamma\log^{k-2}(\alpha)}{\alpha\,(k-2)!} + O\left(\frac{\log^{k-3}(\alpha)}{\alpha}\right). \tag{3–16}$$
Equations (3-14), (3-15), and (3-16) show that there is a dramatic difference between
the functions $\Psi_k(x, y)$ for $k \ge 2$ and $\Psi_1(x, y)$; in particular, $\Psi_k(x, y)$ grows much
more slowly for $k = 1$ than it does for $k \ge 2$. As a simple example of this difference,
consider the functions $\Psi_1(x, 2)$ and $\Psi_2(x, 2)$. As there are no primes $p \le 2$ except
$p = 2$, it follows that $\Psi_1(x, 2)$ is simply a count of the powers of 2 less
than or equal to $x$; hence, $\Psi_1(x, 2) = \left[\frac{\log(x)}{\log(2)}\right]$. However, notice that every number of
the form $2p$ with $p \le \frac{x}{2}$ will be counted by the function $\Psi_2(x, 2)$. By the prime number
theorem this implies that there are about $\frac{x}{2\log(x)}$ such numbers of the form $2p \le x$;
thus, $\frac{x}{2\log(x)} \le \Psi_2(x, 2)$. Although this is a simple example, it is clear that $\Psi_1(x, 2)$ grows far
more slowly as a function of $x$ than does $\Psi_2(x, 2)$.
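This comparison can be checked directly. In the brute-force count below (helper names are ours), note that $n = 1$ is also counted, hence the $+1$ against the stated formula:

```python
import math

def prime_factors_mult(n):
    # prime factors of n with multiplicity, in non-decreasing order
    out, p = [], 2
    while p * p <= n:
        while n % p == 0:
            out.append(p)
            n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out

def Pk(n, k):
    f = prime_factors_mult(n)
    return f[-k] if len(f) >= k else 1  # convention: Pk(n) = 1 if too few factors

x = 5000
psi1 = sum(1 for n in range(1, x + 1) if Pk(n, 1) <= 2)
psi2 = sum(1 for n in range(1, x + 1) if Pk(n, 2) <= 2)

assert psi1 == int(math.log(x) / math.log(2)) + 1  # powers of 2 up to x, plus n = 1
assert psi2 >= x / (2 * math.log(x))               # at least the numbers 2p <= x
```

Here $P_k$ is taken with multiplicity, so $\Psi_2(x, 2)$ also counts primes, powers of 2, and every $2^a p \le x$.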
3.3 Generalized Alladi-Duality
There is an interesting duality between the kth largest and the kth smallest
prime factors of an integer n, first noted by K. Alladi in [1], which will be of use in later
observations concerning numerical factorization. Whereas Alladi’s treatment is entirely
elementary, and holds for all arithmetic functions, the following proof will demonstrate
Alladi’s duality analytically. We also note that although Alladi’s proof of the duality in
[1] generalizes to give the following result, he only supplies a proof of the principle for
the special case of k = 1 (which is the duality amongst the largest and smallest prime
factors of n). The only detriment to the analytic approach is that we must place bounds
on the arithmetic functions being discussed to ensure the convergence of the Dirichlet
series which they generate; however, this is a relatively minor restriction, and is in fact
equivalent to the statement that the Dirichlet series generated by the arithmetic function
has an abscissa of convergence $\sigma_a \ne +\infty$. Furthermore, as a bonus, we will derive a
new representation for the function ζ(s).
Let g(n) be an arithmetic function such that g(1) = 0 and let pk (n) and Pk (n)
denote the kth smallest and kth largest prime factors of n, respectively (while these two
values may coincide, the hope is that with this definition the notation will not cause any
confusion). Furthermore, let µ(n) be Möbius’s number theoretic function. In this section,
unless otherwise indicated, sums are to be taken for all n ≥ 2.
Lemma 3.3.1. (Alladi)
$$\sum_{d|n}\mu(d)\,g(P_k(d)) = (-1)^k\binom{\omega(n)-1}{k-1}g(p_1(n)),$$
$$\sum_{d|n}\mu(d)\,g(p_k(d)) = (-1)^k\binom{\omega(n)-1}{k-1}g(P_1(n)),$$
$$\sum_{d|n}\mu(d)\binom{\omega(d)-1}{k-1}g(P_1(d)) = (-1)^k g(p_k(n)),$$
$$\sum_{d|n}\mu(d)\binom{\omega(d)-1}{k-1}g(p_1(d)) = (-1)^k g(P_k(n));$$
in particular,
$$\sum_{d|n}\mu(d)\,g(P_1(d)) = -g(p_1(n))$$
and
$$\sum_{d|n}\mu(d)\,g(p_1(d)) = -g(P_1(n)).$$
Proof: Consider the Dirichlet series generated by $\mu(n)z^{\omega(n)}g(p_1(n))$, for $|z| < 1$, and the
easy identity
$$E(z; s) = \sum_{n=1}^{\infty}\frac{\mu(n)z^{\omega(n)}g(p_1(n))}{n^s} = -\sum_p\frac{g(p)}{p^s}\prod_{q>p}\left(1-\frac{z}{q^s}\right), \tag{3–17}$$
subject only to the restriction that $g(n)$ grows at a rate for which the Dirichlet series under
consideration converges for all $s$ such that $\Re(s) > \sigma_a$. This identity is valid for all $s \in \mathbb{C}$,
$\Re(s) > \sigma_a$, where $\sigma_a$ is the abscissa of absolute convergence of the Dirichlet series
$\sum_{n=1}^{\infty}\frac{\mu(n)g(p_1(n))}{n^s}$, provided $|z| \le 1$. Therefore,


$$\frac{1}{(k-1)!}\frac{\partial^k}{\partial z^k}\,\zeta(s)E(z; s) = \sum_{n=1}^{\infty}\left(\sum_{d|n}\mu(d)\binom{\omega(d)-1}{k-1}z^{\omega(d)-k}g(p_1(d))\right)\frac{1}{n^s}. \tag{3–18}$$
However,
$$\zeta(s)E(z; s) = -\left[\sum_p\frac{g(p)}{p^s}\prod_{q>p}\left(1-\frac{z}{q^s}\right)\right]\prod_r\left(1-\frac{1}{r^s}\right)^{-1}$$
$$= -\sum_p\frac{g(p)}{p^s}\prod_{r\le p}\left(1-\frac{1}{r^s}\right)^{-1}\prod_{q>p}\left(1-\frac{z}{q^s}\right)\left(1-\frac{1}{q^s}\right)^{-1} \tag{3–19}$$
$$= -\sum_p\frac{g(p)}{p^s}\prod_{r\le p}\left(1-\frac{1}{r^s}\right)^{-1}\sum_{p_1(m)>p}\frac{(-z)^{\omega(m)-1}}{m^s};$$
hence,
$$\lim_{z\to1^-}\frac{1}{(k-1)!}\frac{\partial^k}{\partial z^k}\,\zeta(s)E(z; s) = (-1)^k\sum_p\frac{g(p)}{p^s}\prod_{r\le p}\left(1-\frac{1}{r^s}\right)^{-1}\sum_{\omega(m)\ge k-1,\; p_1(m)>p}\frac{1}{m^s} \tag{3–20}$$
$$= (-1)^k\sum_{n=1}^{\infty}\frac{g(P_k(n))}{n^s}.$$
By equating (3-18) and (3-20) we see that
$$(-1)^k\sum_{n=1}^{\infty}\frac{g(P_k(n))}{n^s} = \sum_{n=1}^{\infty}\left(\sum_{d|n}\mu(d)\binom{\omega(d)-1}{k-1}g(p_1(d))\right)\frac{1}{n^s}.$$
Now, from the uniqueness of Dirichlet series we may equate the corresponding
coefficients in the above equality to conclude that
$$(-1)^k g(P_k(n)) = \sum_{d|n}\mu(d)\binom{\omega(d)-1}{k-1}g(p_1(d)),$$
proving the fourth identity.
Similarly, consider the Dirichlet series generated by $\mu(n)z^{\omega(n)}g(P_1(n))$:
$$D(z; s) := \sum_{n=1}^{\infty}\frac{\mu(n)z^{\omega(n)}g(P_1(n))}{n^s} = -\sum_p\frac{g(p)}{p^s}\prod_{q<p}\left(1-\frac{z}{q^s}\right); \tag{3–21}$$
then,
$$\frac{1}{(k-1)!}\frac{\partial^k}{\partial z^k}\,\zeta(s)D(z; s) = \sum_{n=1}^{\infty}\left(\sum_{d|n}\mu(d)\binom{\omega(d)-1}{k-1}z^{\omega(d)-k}g(P_1(d))\right)\frac{1}{n^s}. \tag{3–22}$$
However,
$$\zeta(s)D(z; s) = -\sum_p\frac{g(p)}{p^s}\prod_{r\ge p}\left(1-\frac{1}{r^s}\right)^{-1}\prod_{q<p}\left(1-\frac{z}{q^s}\right)\left(1-\frac{1}{q^s}\right)^{-1} \tag{3–23}$$
$$= -\sum_p\frac{g(p)}{p^s}\prod_{r\ge p}\left(1-\frac{1}{r^s}\right)^{-1}\sum_{P_1(m)<p}\frac{(-z)^{\omega(m)-1}}{m^s},$$
and taking the limit as $z \to 1^-$, by (3-21), (3-22), and (3-23) we have
$$\lim_{z\to1^-}\frac{1}{(k-1)!}\frac{\partial^k}{\partial z^k}\,\zeta(s)D(z; s) = (-1)^k\sum_p\frac{g(p)}{p^s}\prod_{r\ge p}\left(1-\frac{1}{r^s}\right)^{-1}\sum_{\omega(m)\ge k-1,\; P_1(m)<p}\frac{1}{m^s} \tag{3–24}$$
$$= (-1)^k\sum_{n=1}^{\infty}\frac{g(p_k(n))}{n^s}.$$
Now, equating the two Dirichlet series given by (3-23) (after differentiating and taking the
limit) and (3-24):
$$(-1)^k\sum_{n=1}^{\infty}\frac{g(p_k(n))}{n^s} = \sum_{n=1}^{\infty}\left(\sum_{d|n}\mu(d)\binom{\omega(d)-1}{k-1}g(P_1(d))\right)\frac{1}{n^s},$$
from which we may deduce that
$$\sum_{d|n}\mu(d)\binom{\omega(d)-1}{k-1}g(P_1(d)) = (-1)^k g(p_k(n)),$$
which proves the third identity. The first two identities follow by applying Möbius
inversion to the last two identities; as this process is relatively straightforward, we
will only carry out the inversion procedure for the third identity. Note that as $\mu(n)$ is
supported on the square-free integers, we need only carry out the inversion procedure
over square-free numbers; hence, $\mu(n/d) = \mu(n)\mu(d)$, as this holds generally for
square-free $n$. Then, as
$$\sum_{d|n}\mu(d)\binom{\omega(d)-1}{k-1}g(P_1(d)) = (-1)^k g(p_k(n)),$$
we may invert to obtain
$$\mu(n)\binom{\omega(n)-1}{k-1}g(P_1(n)) = \sum_{d|n}\mu(n/d)(-1)^k g(p_k(d)) = \mu(n)(-1)^k\sum_{d|n}\mu(d)\,g(p_k(d));$$
thus,
$$(-1)^k\binom{\omega(n)-1}{k-1}g(P_1(n)) = \sum_{d|n}\mu(d)\,g(p_k(d)),$$
which is the second identity. The first identity follows similarly, proving the lemma.
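Since the identities are exact, they can be verified directly for small $n$. The sketch below (our own, with an arbitrary test function $g$ satisfying $g(1) = 0$ and hypothetical helper names) checks the fourth identity of Lemma 3.3.1, with $\omega$ counting distinct prime factors and $P_k(n)$ the $k$th largest distinct prime factor:

```python
from math import comb

def distinct_primes(n):
    # distinct prime factors of n, in increasing order
    ps, p = [], 2
    while p * p <= n:
        if n % p == 0:
            ps.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        ps.append(n)
    return ps

def mu(n):
    # Moebius function
    m, count, p = n, 0, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0
            count += 1
        p += 1
    if m > 1:
        count += 1
    return (-1) ** count

def g(p):
    return 0 if p == 1 else p * p + 1  # arbitrary test function with g(1) = 0

for n in range(2, 300):
    ps = distinct_primes(n)
    for k in range(1, len(ps) + 1):
        lhs = sum(
            mu(d) * comb(len(distinct_primes(d)) - 1, k - 1) * g(distinct_primes(d)[0])
            for d in range(2, n + 1)
            if n % d == 0 and mu(d) != 0
        )
        assert lhs == (-1) ** k * g(ps[-k])  # fourth identity of Lemma 3.3.1
```

The other identities can be checked the same way by swapping $p_1$, $P_1$, $p_k$, and $P_k$.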
The next lemma can also be viewed as a generalization of Alladi's duality principle
for subsets $\Omega \subset P$, where $P$ denotes the set of all prime numbers. Note that as we have
already proved the Alladi duality identities for all $n$, the following lemma follows trivially if
we consider only sums which are taken over the integers $n \ge 2$ such that if a prime $p|n$
then $p \in \Omega$.
Lemma 3.3.2.
$$-g(p_1(n)) = \sum_{d|n}\mu(d)\,g(P_1(d))$$
and
$$-g(P_1(n)) = \sum_{d|n}\mu(d)\,g(p_1(d)),$$
where it is understood that $g(n)$ is a bounded function (in the sense of Lemma 3.3.1)
and the sums are taken over the integers $n \ge 2$ such that if $p|n$ then $p \in \Omega$.
From Lemma 3.3.2 and the properties of $\zeta(s)$ we may estimate sums of the form
$\sum_{n\le x,\,P_1(n)\in\Omega} g(p_1(n))$ using Theorem 2.2.1 (Delange's theorem). Thus,
$$\sum_{n\le x,\,P_1(n)\in\Omega} 1 = Cx + o(x)$$
if and only if
$$\sum_{p_1(n)\in\Omega}\frac{\mu(n)g(p_1(n))}{n} = -C < +\infty.$$
Now consider
$$\sum_{p_1(n)\in\Omega}\frac{\mu(n)}{n^s} = -\frac{1}{\zeta(s)}\sum_{p\in\Omega}\frac{1}{p^s}\prod_{q\le p}\left(1-\frac{1}{q^s}\right)^{-1} = -\frac{1}{\zeta(s)}\sum_{P_1(n)\in\Omega}\frac{1}{n^s} \tag{3–25}$$
for $s \in \mathbb{C}$, $\Re(s) > 1$. From Theorem 2.2.1 it follows that
$$\sum_{n\le x,\,p_1(n)\in\Omega} 1 = e_1 x + o(x)$$
if and only if
$$-e_1 = \sum_{P_1(n)\in\Omega}\frac{\mu(n)}{n},$$
and
$$\sum_{n\le x,\,P_1(n)\in\Omega} 1 = d_1 x + o(x)$$
if and only if
$$-d_1 = \sum_{p_1(n)\in\Omega}\frac{\mu(n)}{n}.$$
In particular, if $\Omega = P$ then the above identities become
$$\frac{1}{\zeta(s)} - 1 = -\sum_p\frac{1}{p^s}\prod_{q<p}\left(1-\frac{1}{q^s}\right)$$
or, equivalently,
$$1 - \frac{1}{\zeta(s)} = \sum_p\frac{1}{p^s}\prod_{q<p}\left(1-\frac{1}{q^s}\right).$$
From Tchebyschev's estimate $p_n = O(n\log(n))$, where $p_n$ denotes the $n$th prime
number, and the fact that $\lim_{s\to1^+}\frac{1}{\zeta(s)} = 0$, we see that
$$\sum_{n=2}^{\infty}\frac{1}{n\log(n)}\prod_{q<p_n}\left(1-\frac{1}{q}\right) < +\infty, \tag{3–26}$$
and as a result of (3-26), $\prod_{q<p_n}\left(1-\frac{1}{q}\right) \ll \frac{1}{\log\log(n)}$, which is a weak form of Mertens'
theorem. If we use Mertens' theorem directly (see [29]):
$$\prod_{q<p_n}\left(1-\frac{1}{q}\right) = O\left(\frac{1}{\log(n)}\right),$$
then the convergence of
$$\sum_n\frac{1}{p_n^s}\prod_{q<p_n}\left(1-\frac{1}{q^s}\right) = 1 - \frac{1}{\zeta(s)}$$
as $s \to 1^+$ implies that $p_n \gg n\log\log(n)$; equivalently, $\pi(x) \ll \frac{x}{\log\log(x)}$. Of
course, this result is much weaker than the prime number theorem derived in Chapter 2.
However, for such a nontrivial result its derivation is remarkably simple.
However, for such a nontrivial result its derivation is remarkably simple.
It is worth mentioning the result of Alladi [1] that if we allow our set Ω in Lemma
3.3.2 to be the set of primes p for which p ≡ l(m) and then apply this lemma for any
l, m ∈ Z, (l, m) = 1,
∑
p1 (n)≡l(m)
µ(n)
1
1
=−
= − lim+
s→1 ζ(s)
n
ϕ(n)
∑
P1 (n)≡l(m)
1
ns
where, again, it is understood that the sum is taken for all n ≥ 2; hence, for s near 1,
∑
P1 (n)≡l(m)
1
ζ(s)
∼
.
s
n
ϕ(m)
We further remark that
$$\sum_{p_1(n)\equiv l\,(m)}\frac{\mu(n)}{n} = -\frac{1}{\phi(m)} \tag{3–27}$$
follows as a consequence of the prime number theorem for arithmetic progressions; see
[1]. That is, if $\pi_{l,m}(x)$ denotes the number of primes $p \le x$ such that $p \equiv l \pmod{m}$, then the
prime number theorem for arithmetic progressions states that
$$\pi_{l,m}(x) = \frac{1}{\phi(m)}\frac{x}{\log(x)} + O\left(\frac{x}{\log^2(x)}\right).$$
It is speculated by Alladi in [1], and by this author, that equation (3-27) is elementarily
equivalent to the prime number theorem for arithmetic progressions. However, at
present this is only a conjecture.
This completes Chapter 3 and the theoretical results necessary to analyze the
factorization algorithm of Knuth and Pardo.
CHAPTER 4
THE KNUTH-PARDO ALGORITHM
The algorithm introduced by Knuth and Pardo in their 1976 paper [14] is perhaps
the simplest way to factor an integer algorithmically. The essential idea rests in
attempting to divide an integer $n$ by successive trial prime divisors 2, 3, 5, ...,
"casting out" each factor that is discovered, and repeating the process. As
this process will discover all prime divisors less than $\sqrt{n}$, the algorithm will terminate
when the trial divisors exceed the square root of the remaining un-factored part.
The reason why we may restrict our attention to factors less than $\sqrt{n}$ is that if $n$ is
composite, then it must have a prime divisor $\le \sqrt{n}$; or, stated another way, half of the
divisors of an integer $n$ must be $\le \sqrt{n}$, with the other half of the divisors being obtained
by evaluating $n/d$, for $d|n$, $d \le \sqrt{n}$. Also, if $p > \sqrt{n}$, and $p|n$, then $p^2$ will not divide $n$.
The speed of this algorithm is intimately related to the size of the prime factors of $n$.
If $n$ is itself a prime number then there will be approximately $\sqrt{n}$ trial divisions (of course,
we do not know a priori whether or not $n$ is prime); whereas, if $n$ is a power of 2 (i.e.
$n = 2^a$ for some positive integer $a$) the number of trial divisions will be $O(\log(n))$. Knuth
and Pardo consider a random integer $n$ and determine the approximate probability that
the number of trial divisions is $\le n^x$ with $0 \le x \le 1/2$. They then demonstrate that the
number of trial divisions will be $\le n^{0.35}$ about half of the time (for these results see [14]).
Knuth and Pardo reach their conclusion by analyzing the kth largest prime factor
of an integer, and then determine the running time of their algorithm by the size of
the largest two prime factors. We will now introduce their standard "divide and factor"
algorithm.
For n ≥ 2 let n = p1 (n)p2 (n)...pt (n)m, where p1 (n), ..., pt (n) are prime numbers
listed in non-decreasing order and m ≥ d with all prime factors of m greater than or
equal to d, i.e. pi (m) ≥ d. It is understood that we have such a list of the prime numbers
which are relevant to the algorithm, and hence we do not need to add more time to the
algorithm by determining the trial prime divisors. What follows is an informal ALGOL-like
description of the algorithm, supplied in [14] by the authors:
t := 0; m := n; d := 2;
while d^2 <= m do
begin {increase d or decrease m}
    if d | m then
        begin
            t := t + 1;
            p_t(n) := d;
            m := m/d
        end
    else d := d + 1
end;
t := t + 1; p_t(n) := m; m := 1; d := 1
If we denote by $D$ the number of trial divisions performed and by $T$ the number of
prime factors of $n$ (counting multiplicity), then the above algorithm's while-test requires
approximately $D + 1$ operations, the if-test requires $D$ operations, the begin-block
requires $T - 1$ operations, and the else-branch requires $D - T + 1$ operations.
Knuth and Pardo remark in [14] that their algorithm can be refined in several simple
ways by avoiding large numbers of non-prime divisors; for instance, once $d > 3$ we may
consider only trial divisors of the form $6k \pm 1$. This has the effect of dividing the number of
trial divisions performed, D, by a constant. They further comment that the analysis of the
simple case applies to more complicated settings with only minor variations.
Let $P_k(n)$ be the $k$th largest prime factor of $n$; therefore, $P_k(n) = p_{T+1-k}(n)$ after
the algorithm terminates, with $1 \le k \le T$. If $n$ has fewer than $k$ prime factors then let
$P_k(n) = 1$, and for convenience let $P_0(n) = \infty$. Knuth and Pardo observe in [14] that
the while-loop in the algorithm can terminate in three different ways depending upon the
final inputs into the loop:
Case 1: If n < 4 then D = 0, as d = 2 implies that d 2 > n; hence, the algorithm will
terminate.
Case 2: If $n \ge 4$ and the $D$th trial division succeeds, then the final trial division
was by $d = P_2(n)$, where $d^2 > P_1(n)$.
As $d$ is initially 2, the operation $d := d + 1$ is performed $D - T + 1$ times, and hence
$D - T + 1 = P_2(n) - 2$, or $D = P_2(n) + T - 3$, for $P_2(n)^2 > P_1(n)$.
Case 3: If $n \ge 4$ and the $D$th trial division fails, then the final trial division was by $d$,
where $P_2(n) \le d$, $d^2 < P_1(n)$, and $(d+1)^2 > P_1(n)$. Thus:
$$D = \left\lceil\sqrt{P_1(n)}\,\right\rceil + T - 3, \tag{4–1}$$
where $P_2(n)^2 < P_1(n)$.
Note that in all three cases we have
$$D = \max\left(P_2(n),\ \left\lceil\sqrt{P_1(n)}\,\right\rceil\right) + T - 3. \tag{4–2}$$
We will now present their largely heuristic derivation that the limiting distribution of
the $k$th largest prime factor exists; that is, that the limiting probability that the $k$th largest
prime factor of an integer is $\le N^x$ exists. In their paper it is Knuth and Pardo's desire to
analyze $D$, and to that end they analyze the distributions of $P_1(n)$ and $P_2(n)$. In fact,
they go a good deal further and consider the distributions of $P_k(n)$ in general.
Let $\text{Prob}_k(x, N) = |\{n : 1 \le n \le N,\ P_k(n) \le N^x\}|$, where $x \ge 0$. Hence,
$\text{Prob}_k(x, N)/N$ is the probability that a random integer between 1 and $N$ will have $k$th
largest prime factor $\le N^x$. In [14] Knuth and Pardo demonstrate that
$\lim_{N\to\infty}\frac{\text{Prob}_k(x, N)}{N} = F_k(x)$ exists, and derive various properties of the function $F_k(x)$. The following is their
heuristic argument that $F_k(x)$ exists, analogous to the exposition by Dickman in [10]:
Theorem 4.0.3. The limit $\lim_{N\to\infty}\frac{\text{Prob}_k(x, N)}{N} = F_k(x)$ exists.
"Proof:" Consider $\text{Prob}_k(t + dt, N) - \text{Prob}_k(t, N)$, the number of $n \le N$ where $N^t \le$
$P_k(n) \le N^{t+dt}$, where $dt$ is small. To count the number of such $n$ we take all primes such
that $N^t \le p \le N^{t+dt}$ and multiply by all numbers $m \le N^{1-t}$ such that $P_k(m) \le p$ and
$P_{k-1}(m) \ge p$.
If $n = mp$ then $n \le N^{1+dt}$ and $P_k(n) = p$; conversely, every $n \le N$ with $N^t \le P_k(n) \le N^{t+dt}$ will have the form $n = mp$ for the above stated $p$ and $m$.
Note that the number of such $m \le N^{1-t}$ such that $P_k(m) \le p$ is approximately
$\text{Prob}_k\left(\frac{t}{1-t}, N^{1-t}\right)$; the unwanted subset consisting of those $m$ such that $P_{k-1}(m) < p$ has
approximately $\text{Prob}_{k-1}\left(\frac{t}{1-t}, N^{1-t}\right)$ members. It follows that the number of such $m$ with
$mp \le N$, $P_k(m) \le p$, and $P_{k-1}(m) \ge p$ is
$$\text{Prob}_k\left(\frac{t}{1-t}, N^{1-t}\right) - \text{Prob}_{k-1}\left(\frac{t}{1-t}, N^{1-t}\right). \tag{4–3}$$
Ignoring second-order terms gives:
$$\text{Prob}_k(t+dt, N) - \text{Prob}_k(t, N) \approx \left[\pi(N^{t+dt}) - \pi(N^t)\right]\left[\text{Prob}_k\left(\frac{t}{1-t}, N^{1-t}\right) - \text{Prob}_{k-1}\left(\frac{t}{1-t}, N^{1-t}\right)\right] \tag{4–4}$$
where $\pi(x)$ is the function counting the number of primes $p \le x$. By the Prime Number
Theorem, $\pi(x) = \frac{x}{\log(x)} + O\left(\frac{x}{\log^2(x)}\right)$; hence, using $\pi(N^{t+dt}) - \pi(N^t) \approx N^t\,\frac{dt}{t}$ in (4-4)
yields
$$\frac{\text{Prob}_k(t+dt, N) - \text{Prob}_k(t, N)}{N} \approx \frac{dt}{t}\left(\frac{\text{Prob}_k\left(\frac{t}{1-t}, N^{1-t}\right)}{N^{1-t}} - \frac{\text{Prob}_{k-1}\left(\frac{t}{1-t}, N^{1-t}\right)}{N^{1-t}}\right) \tag{4–5}$$
and as $N \to \infty$ equation (4-5) gives the differential equation:
$$F_k'(t)\,dt \approx \frac{dt}{t}\left(F_k\left(\frac{t}{1-t}\right) - F_{k-1}\left(\frac{t}{1-t}\right)\right). \tag{4–6}$$
As $F_k(0) = 0$ we may integrate equation (4-6) to obtain:
$$F_k(x) = \int_0^x\left(F_k\left(\frac{t}{1-t}\right) - F_{k-1}\left(\frac{t}{1-t}\right)\right)\frac{dt}{t}. \tag{4–7}$$
According to the convention $P_0(n) = \infty$ we define $F_0(x) = 0$, for all $x$. We must also
have $F_k(x) = 1$ for $x \ge 1$, $k \ge 1$.
Note that equation (4-7), together with these initial conditions, uniquely determines
$F_k(x)$ for $0 \le x \le 1$, and as we also have the equation
$$F_k(x) = 1 - \int_x^1\left(F_k\left(\frac{t}{1-t}\right) - F_{k-1}\left(\frac{t}{1-t}\right)\right)\frac{dt}{t} \tag{4–8}$$
for $0 \le x \le 1$, $F_k(x)$ is also uniquely defined in terms of its values at points $> x$. Hence
the limit is well-defined and therefore exists.
We now return to the generalized functions Ψk (x, y ) to better enable us to
understand the Knuth-Pardo algorithm. In their paper, Knuth and Pardo use the kth
moment calculated in Theorem 3.1.9 to deduce many useful properties about Ψk (x, y )
and, consequently, to derive better results on Pk (n). It should be clear that
$$\text{Prob}_k(x, N) = \Psi_k(N, N^x),$$
so that
$$\Psi_k(N, N^x) \sim F_k(x)\,N.$$
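This relation can be illustrated numerically at $k = 1$, $x = 1/2$, where $F_1(1/2) = \rho(2) = 1 - \log(2) \approx 0.3069$; the sketch below (our own) compares $\Psi(N, \sqrt{N})/N$ against this limit:

```python
import math

N = 100000
y = math.isqrt(N)  # N^{1/2}, i.e. x = 1/2 and u = 2

P1 = [0] * (N + 1)  # sieve of largest prime factors
for p in range(2, N + 1):
    if P1[p] == 0:
        for m in range(p, N + 1, p):
            P1[m] = p

smooth = 1 + sum(1 for n in range(2, N + 1) if P1[n] <= y)  # n = 1 included
print(smooth / N, 1 - math.log(2))
```

The empirical ratio sits somewhat above $1 - \log(2)$ at this size, consistent with the lower-order terms in de Bruijn's expansions.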
By analyzing the values $E_k(P_1(n))$ (as in Theorem 3.1.9) Knuth and Pardo go on to show
that
$$\Psi_k(x^\alpha, x) = \rho_k(\alpha)x^\alpha + O\left(\frac{x^\alpha}{\log(x)}\right)$$
for all fixed $\alpha > 1$.
To close the discussion concerning this factorization algorithm, we will make several
remarks about the model. For one thing, the model is largely probabilistic in that it
considers a random $n$ between 1 and $N$, and probabilities of relations such as $P_k(n) \le N^x$;
however, the authors note in [14] that from an intuitive standpoint it may be more natural
to ask for the probability of a relation such as $P_k(n) \le n^x$ without considering $N$.
Furthermore, they comment that it is quite easy to convert from the one model to the
other, as most numbers between 1 and $N$ are large.
To make this discussion more precise, the authors of [14] consider the number of
integers $n$, $\frac{1}{2}N \le n \le N$, such that $P_k(n) \le N^x$. This is
$$\text{Prob}_k(x, N) - \text{Prob}_k\left(x, \frac{N}{2}\right) = \frac{1}{2}NF_k(x) + O\left(N/\log(N)\right),$$
which easily follows from $\text{Prob}_k(x, N) = NF_k(x) + O(N/\log(N))$. Furthermore, consider
how many of these $n$ have $n^x < P_k(n) \le N^x$. The latter relation implies that
$N^x \ge P_k(n) > \left(\frac{1}{2}N\right)^x = N^{x - \frac{\log(2)}{\log(N)}}$, and $F_k\left(x - \frac{\log(2)}{\log(N)}\right) = F_k(x) + O(1/\log(N))$, as $F_k(x)$ is differentiable. Hence, the number of such $n$ is at most
$$\text{Prob}_k(x, N) - \text{Prob}_k\left(x - \frac{\log(2)}{\log(N)}, N\right) = O\left(\frac{N}{\log(N)}\right), \tag{4–9}$$
where the constant implied by the $O$-term in (4-9) is independent of $x$ in a bounded
region about $x$. Hence, we have shown that $\frac{1}{2}NF_k(x) + O\left(\frac{N}{\log(N)}\right)$ of the integers $n$ with
$\frac{1}{2}N \le n \le N$ satisfy $P_k(n) \le n^x$. Therefore, if $Q_k(x, N)$ denotes the total number of $n \le N$
such that $P_k(n) \le n^x$, we have:
$$Q_k(x, N) = \sum_{1\le j\le \log_2\log(N)}\frac{N}{2^j}\left(F_k(x) + O\left(\log\left(\frac{N}{2^j}\right)^{-1}\right)\right) + O\left(\frac{N}{\log(N)}\right) = NF_k(x) + O\left(\frac{N}{\log(N)}\right), \tag{4–10}$$
by dividing the range $1 \le n \le N$ into $\log_2\log(N)$ dyadic parts $\frac{N}{2^j} \le n \le \frac{N}{2^{j-1}}$.
Define the probability of a statement $S(n)$ about the positive integer $n$ by the
formula:
$$\Pr(S(n)) = \lim_{N\to\infty}\frac{1}{N}\left|\{n : n \le N \text{ s.t. } S(n) \text{ is true}\}\right| \tag{4–11}$$
when the limit exists.
Hence we may conclude that $\Pr(P_k(n) \le n^x) = F_k(x)$, for all fixed $x$. Another
important observation concerning the theoretical model used in this paper is that results
were stated in terms of the probability that the number of operations performed is $\le N^x$
(or $n^x$). Typically it is customary to describe the average number of operations of an
algorithm in terms of mean values and standard deviations; however, this approach
appears to be particularly uninformative for this algorithm. Knuth and Pardo comment
in [14] that this phenomenon is apparent when considering the average number of
operations performed over all $n \le N$, which will be relatively near the worst case
$n^{1/2}$; however, in more than 70 percent of the cases the actual number of operations
performed will be less than $n^{0.4}$. Another reason why the typical mean-variance
approach is uninformative is because, as was noted in Theorem 3.1.4 of Chapter 3,
the ratio of the standard deviation of the $k$th prime factor to its mean is a
divergent quantity.
Another point worth noting is that (as the name suggests) the Knuth-Pardo
algorithm is a very simple algorithm, and in recent years more efficient algorithms have
been introduced (such as the elliptic curve method [17] and the quadratic sieve method [20])
which can factor integers $n$ with running time $e^{(1+o(1))\sqrt{\log(n)\log\log(n)}}$. These
algorithms render much of our analysis superfluous, as it will take far fewer steps to
factor an integer completely using these methods than by utilizing the Knuth-Pardo
algorithm.
In closing, we note that there are some interesting avenues for future research
using the results of this thesis. Alladi in [1] showed that for l and m relatively prime, we
have equation (3-27), which is the identity
107
∑
p1 (n)≡l(m)
µ(n)
1
=−
n
ϕ(m)
where, as in section 3.3, the sum is to be taken over all n ≥ 2. This identity follows as
a consequence of the Prime Number Theorem for Arithmetic Progressions, the case
k = 1 of Lemma 3.3.1, and some results on Ψ(x, y). Now that Knuth and Pardo have
supplied us with results on the more general functions Ψk(x, y), it would be interesting
to see whether further results would follow if we could take similar sums over those
integers n ≥ 2 such that pk(n) ≡ l (mod m). However, as was noted in section 3.2, there is a
significant difference in the behavior of Ψ1(x, y) and Ψk(x, y) when k ≥ 2. In particular,
Ψ1(x, y) = O(xe^{−u}) decays exponentially in u, whereas

Ψk(x, y) ∼ ρk(u)x,

and for k ≥ 2 equations (3-14) and (3-15) show that the functions ρk(u) do not decay
nearly as rapidly. It would be interesting to see what consequences this would have for
sums similar to (3-27).
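As a small illustration of the first asymptotic, recall that for 1 ≤ u ≤ 2 the Dickman function is ρ(u) = 1 − log u, so the first-order approximation Ψ(x, y) ≈ ρ(u)x can be checked by brute force for small x (a numerical sketch, not a statement from the analysis above):

```python
import math

def smooth_count(x, y):
    """Psi(x, y): the number of integers 2 <= n <= x all of whose prime
    factors are <= y (n = 1 excluded), via a largest-prime-factor sieve."""
    largest = [0] * (x + 1)          # largest prime factor of each n
    for p in range(2, x + 1):
        if largest[p] == 0:          # p is prime
            for m in range(p, x + 1, p):
                largest[m] = p       # later (larger) primes overwrite
    return sum(1 for n in range(2, x + 1) if largest[n] <= y)

# For u = log x / log y = 2 we have rho(2) = 1 - log 2, so to first
# order Psi(x, x**(1/2)) is roughly (1 - log 2) * x.
x, y = 10**4, 10**2
approx = (1 - math.log(2)) * x       # about 3069
```

At this scale the exact count is noticeably larger than ρ(2)x, since the approximation is only first-order and the secondary terms, of size roughly x/log x, are not yet negligible.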
REFERENCES
[1] Krishnaswami Alladi, Duality between Prime Factors and an Application to the
Prime Number Theorem for Arithmetic Progressions, Journal of Number Theory,
vol. 9 (1977), p. 436–451.
[2] K. Alladi, P. Erdős, On an Additive Arithmetic Function, Pacific Journal of
Mathematics, vol. 71 (1977), no. 2, p. 275–294.
[3] K. Alladi, P. Erdős, On the Asymptotic Behavior of Large Prime Factors of Integers,
Pacific Journal of Mathematics, vol. 82 (1979), no. 2, p. 295-315.
[4] Raymond Ayoub, An Introduction to the Analytic Theory of Numbers,
Mathematical Surveys, no. 10, 1963.
[5] Carl B. Boyer, A History of Mathematics, John Wiley and Sons, 1968.
[6] A.A. Buchstab, An asymptotic estimation of a general number-theoretic function,
Mat. Sbornik, vol. 2 (1937), no. 44, p. 1239-1246.
[7] H. M. Edwards, Riemann’s Zeta Function, Dover Publications, Inc., 2001.
[8] P. Erdős, Über die Reihe ∑ 1/p, Mathematica, Zutphen B 7 (1938), p. 1-2.
[9] N.G. de Bruijn, On the number of positive integers ≤ x and free of prime factors
> y, Indag. Math., vol. 13 (1951), p. 50-60.
[10] Karl Dickman, On the frequency of numbers containing prime factors of a certain
relative magnitude, Ark. Mat., Astronomi och Fysic 22A (1930), 10, p. 1-14.
[11] G.H. Hardy, S. Ramanujan, On the normal number of prime factors of a number n,
Quarterly Journal of Mathematics, Oxford, vol. 48 (1917), p. 76-92.
[12] Kenneth Ireland, Michael Rosen, A Classical Introduction to Modern Number
Theory, Springer-Verlag, 1990.
[13] I. Martin Isaacs, Algebra, a Graduate Course, Brooks/Cole Publishing Company,
1994.
[14] Donald E. Knuth, Luis Trabb Pardo, Analysis of a Simple Factorization Algorithm,
Theoretical Computer Science, vol. 3 (1976), no. 3, p. 321–348.
[15] Jean-Marie De Koninck, Andrew Granville, and Florian Luca (editors), Anatomy of
Integers, American Mathematical Society, 2008.
[16] Edmund Landau, Handbuch der Lehre von der Verteilung der Primzahlen, Leipzig:
Teubner, 1909. Reprinted by Chelsea, 1953.
[17] H.W. Lenstra Jr., Factoring integers with elliptic curves, Annals of Mathematics,
vol. 126 (1987), issue 3, p. 649-673.
[18] Hans von Mangoldt, Zu Riemanns Abhandlung "Ueber die Anzahl der Primzahlen
unter einer gegebenen Grösse", Journal für die reine und angewandte Mathematik,
Bd. 114 (1895), p. 225-305.
[19] Oskar Perron, Zur Theorie der Dirichletschen Reihen, Journal für die reine und
angewandte Mathematik, BD. 134 (1908), 95-143.
[20] Carl Pomerance, The quadratic sieve factoring algorithm, Advances in Cryptology,
Proc. Eurocrypt ’84. Lecture Notes in Computer Science. Springer-Verlag, 1985,
p. 169-182.
[21] Jean Jacod and Philip Protter, Probability Essentials, second edition,
Springer-Verlag, 2004.
[22] V. Ramaswami, The number of positive integers < x and free of prime divisors
> x^c, and a problem of S.S. Pillai, Duke Math. J., vol. 16 (1949), p. 99-109.
[23] R. Rankin, The difference between consecutive prime numbers, J. London Math.
Soc., vol. 13 (1938), p. 242-247.
[24] L.G. Sathe, On a problem of Hardy on the distribution of integers having a given
number of prime factors I-IV, J. Indian Math. Soc., vol. 17 (1953), p. 63-82, 83-141;
vol. 18 (1954), p. 27-42, 43-81.
[25] Bruce Schechter, My brain is open: the mathematical journeys of Paul Erdős,
Simon and Schuster, Inc., 1998.
[26] A. Selberg, Note on a paper by L.G. Sathe, J. Indian Math. Soc., vol. 18 (1954), p.
83-87.
[27] Kannan Soundararajan, An asymptotic expansion related to the Dickman function,
Ramanujan Journal, vol. 29 (2012) (to appear).
[28] J.J. Sylvester, On Tchebycheff's theorem of the totality of prime numbers
comprised within given limits, Amer. J. Math., vol. 4 (1881), p. 230-247.
[29] Gérald Tenenbaum, Introduction to analytic and probabilistic number theory,
Cambridge University Press, 1995.
[30] E.C. Titchmarsh, The Theory of the Riemann Zeta-function, Oxford, Clarendon
Press, 1951.
BIOGRAPHICAL SKETCH
Todd Molnar graduated from the University of Delaware in 2007 with a B.S. in
mathematics and economics. He has been a graduate student at the University of
Florida since 2008.