The Prime Number Theorem (PNT)

F. Michel

OpenStax-CNX module m12764, Version 1.1: Apr 25, 2005. This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 1.0 (http://creativecommons.org/licenses/by/1.0).
Abstract

The Prime Number Theorem is of key interest to number theorists owing to the importance of Riemann's work on the subject. But it can also be viewed as a fitting function that approximates the distribution of the prime numbers. We provide such a "pedestrian" viewpoint and development using elementary mathematics.
The famous mathematician Bernhard Riemann did some novel and exciting work in 1859 on the old question of how many primes there are below a given number, which is usually written

π(n)

This function is equivalent to asking, "what is the value of the k-th prime for any given k?" The latter is actually a more specific question, since an arbitrary n will in general fall between primes. Indeed, the notation is most curious, because π refers to the number of primes up to n, not to any prime per se, while n actually refers to the values of the primes (since π changes by unity every time n passes through a prime value). Thus if n is a prime, it is the π-th prime, pretty much the inverse of defining a function

p(n)

where p is the value of the n-th prime. We will encounter this inversion as a practical matter later. However, it is a testament to Riemann's genius that he derived closed expressions that gave the exact number of primes. For example, the 4th prime is 7, but if you choose n = 8, 9, 10, Riemann's formula will give the same answer (4) for each of these values of n. For this reason, the notation seems reversed, but actually makes sense.
The PNT as a theorem asserts that the following interpolation formulas, which attempt to smoothly fit onto the trends in the distribution of the primes, are asymptotically exact.

An early estimate was that π(n) was approximately

n/ln(n)

This estimate systematically underestimates the number of primes, except possibly for staggeringly large values of n. As a theorem it would state that in the limit of n going to infinity, the ratio of π(n) to n/ln(n) is unity. The theorem is of particular interest to mathematicians in that it follows from an as-yet unproven conjecture by Riemann on the distribution of zeros of the zeta function, a complicated topic which is not required for our simple analysis here.
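As a quick illustration, one can compare this crude estimate against an actual prime count; a minimal sketch in MATLAB, assuming the built-in primes function is available:

n = 1e6;                      %any convenient upper limit
estimate = n/log(n);          %log() is the natural logarithm in MATLAB
actual = numel(primes(n));    %pi(n), the number of primes up to n
fprintf('n/ln(n) = %.0f, pi(n) = %d\n', estimate, actual)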
An improved approximation is Li(n),
where Li, the log integral function, is defined to be

Li(n) = ∫_2^n dt / ln(t)
This estimate is substantially better, though it systematically overestimates the number of primes (again until one gets to staggeringly large values). Notice that the first approximation results from considering the log term to be slowly varying and taking it out of the integral to get the (crude) estimate n/ln(n). The log integral function diverges at n = 1, but this has no relevance to its applications at large n.
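The log integral is also easy to evaluate numerically; a minimal sketch, assuming MATLAB's integral and primes functions are available:

n = 1e6;
Li = integral(@(t) 1./log(t), 2, n);   %the log integral from 2 to n
ppi = numel(primes(n));                %actual count of primes up to n
fprintf('Li(n) = %.1f, pi(n) = %d, Li - pi = %.1f\n', Li, ppi, Li - ppi)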
Some large values (from Derbyshire, p. 116 [1]).

n        π(n)                  Li(n) − π(n)
10^8     5,761,455             754
10^9     50,847,534            1,701
10^10    455,052,511           3,104
10^11    4,118,054,813         11,588
10^12    37,607,912,018        38,263
10^13    346,065,536,839       108,971
10^14    3,204,941,750,802     314,890

Table 1
Anyone used to examining errors for trends would see that the steady increase in the difference Li − π, by about a factor of 3 per decade in n, suggests that there is a scaling correction that needs to be applied.
1 Calculating the primes
The method of choice to calculate a list of primes is to use the "sieve" of Eratosthenes. Here one starts with a list of the natural numbers. Then, starting with 2, one removes every even number. The next number is 3, so now one removes every third number (half of these have already been removed by the 2). Here it is easiest to simply replace the values with zero, rather than literally removing the numbers. Now the next (non-zero) number is 5, and we set every fifth number to zero. Then 7, and then 11. By the time 11 is reached, all remaining non-zero numbers below 121 (i.e., 11 squared) are prime. The alternative of testing each successive number to see if a smaller prime divides it would be hopelessly inefficient.

Here is a simple MATLAB program (easily translated into the language of your choice):
%make-prime routine using sieve
N = 100;               %largest possible prime in list
rpr = linspace(1,N,N); %starting list = all numbers up to N
nextp = 2;             %next prime to remove from list (this is ALSO its location)
for j = 2:N
    if (nextp*nextp) < N        %quit once all remaining nonzeros are primes
        for k = 2:N             %starts at 2 to preserve the prime itself
            if nextp*k <= N     %don't exceed length of vector
                rpr(nextp*k) = 0;   %the sieve!
            end
        end
        for n = 1:N
            if rpr(nextp+n) ~= 0       %run up list until first nonzero value
                nextp = rpr(nextp+n);  %update nextp
                break                  %stop looking further, otherwise exit at N
            end
        end
    end                %closes the test on nextp*nextp
end                    %closes the outer loop over j
primes = [];           %start with empty vector
for n = 2:length(rpr)  %now list the prime values only, excluding "1"
    if (rpr(n) ~= 0)
        primes = [primes, rpr(n)];  %add to existing vector of primes
    end
end
Here one simply chooses the value of N, and after executing the program, the vector "primes" will contain the list 2, 3, 5, 7, 11, 13, ..., 97 (if N = 100).
Note that the "sieve" itself never removes "1", and this number has all of the properties of a prime (not divisible by any number besides 1 and itself). But it has no use in the unique factoring of natural (whole) numbers into primes, and so it is excluded.
This program has been checked against the list given in Abramowitz and Stegun (the primes up to 99,991, which are roughly the first 10,000 primes).
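One can also spot-check the output against MATLAB's own prime generator, if it is available; a minimal sketch (note that the script's output vector is also named "primes", which shadows the built-in function):

mylist = primes;            %copy the sieve output
clear primes                %unshadow the built-in function
isequal(mylist, primes(N))  %should return 1 (true) for the same N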
2 Derivation of a PNT
The sieve suggests a simple way of estimating the distribution of primes. After any given prime, there will be a density of non-zero numbers remaining. For example, after 2, half the remaining numbers will be non-zero. After the 3, one removes 1/3 of the numbers, but half of those are already gone, so we get 1/2 − 1/6 = 1/3 left. Since there are an infinite number of numbers left, it is easier to think in terms of remaining blocks of numbers; after removing 2 and 3, we have blocks of 2 × 3 = 6, and in any arbitrary block of 6 consecutive numbers beyond the 3, there will be exactly 2 non-zero numbers. If we go to 5, the blocks become 2 × 3 × 5 = 30, and there will remain exactly 8 non-zero numbers in any block of 30 numbers beyond 5 (a quick numerical check is sketched below). It is easy to verify that this density evolves quite systematically. After each successive prime p, the average density per block drops by exactly a factor of (p − 1)/p.
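A minimal sketch of that block-of-30 check in MATLAB (the particular block 31:60 is an arbitrary choice):

block = 31:60;                                   %any block of 30 beyond 5
nnz(mod(block,2) & mod(block,3) & mod(block,5))  %count survivors of the 2, 3, 5 sieve; returns 8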
This density then corresponds to a mean distance between primes, since the next prime is chosen from the first non-zero number in the following block, and we can guess this distance from the mean density. In particular, after each successive prime p, the density decreases by a factor of (p − 1)/p and the mean distance correspondingly increases by a factor of p/(p − 1). Let us call this mean distance after the k-th prime ∆k, while the k-th prime itself we would call Pk.
It follows then that the next prime will be approximately

Pk+1 = Pk + ∆k

but given this new prime, the mean step size increases to

∆k+1 = ∆k · Pk+1/(Pk+1 − 1)
But now we are all done. If we choose the first prime (P1) and the distance to the next prime (∆1), we can simply iterate these relationships indefinitely. Indeed, we know these values:

P1 = 2
and
∆1 = 1
A MATLAB program would simply be
%recursion on steps to generate prime number counts
N = 9500;      %estimate the first N primes (by index)
pr(1) = 2;     %initial prime
del(1) = 1;    %initial step
for k = 1:N
    pr(k+1) = pr(k) + del(k);
    del(k+1) = del(k)*pr(k+1)/(pr(k+1)-1);
end
We chose N = 9500 because this value is available on the tabulated lists, or, if you created the list of primes yourself, you should get primes(9500) = 98,947. In contrast, the above program gives 102,014, which is too large by only about 3%.
This result is rather astonishingly good given that the initial primes and steps between primes are
anything but regular.
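A minimal sketch of the comparison, assuming the pr vector from the program above and MATLAB's built-in primes function:

p = primes(100000);          %more than the first 9,500 primes
fprintf('estimate: %.0f  actual: %d\n', pr(9500), p(9500))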
3 Obtaining the PNT Results
It is easy to rearrange the equations as difference equations and then write them as differential equations. First we write

Pk+1 − Pk = ∆k

and

∆k+1 − ∆k = ∆k / (Pk+1 − 1)
For large k, we can approximate these as

dP(k)/dk = ∆(k)

and

d∆(k)/dk = ∆(k)/P(k)
One might worry about simplifying Pk+1 − 1 to P (k), but the "corrections" are tiny, and unimportant
as we will show.
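As a check on the continuum approximation, these two equations can also be integrated numerically; a minimal sketch, assuming MATLAB's ode45 solver:

rhs = @(k, y) [y(2); y(2)/y(1)];        %y(1) = P(k), y(2) = Delta(k)
[k, y] = ode45(rhs, [1 9500], [2; 1]);  %same starting values as the iteration
y(end, 1)                               %continuum estimate near k = 9500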
The ratio is

dP/d∆ = P

which integrates to give

P = P1 e^(∆ − ∆1)
which then gives (using dk = dP/∆ and ∆ = ∆1 + log(P/P1))

∫_1^N dk = ∫_{P1}^{P} dP / (∆1 + log(P/P1))
Notice that this is in fact the log integral function of the improved PNT. However, because of the curious
choice of notation, the natural symbols for π (here N ) and n (here P ) are reversed.
Notice that the ∆1 in the denominator can be absorbed into the log(P/P1), and the rescaling of P moves these terms to the limits on the integral. Effectively, this simply rescales Li(n).
Accordingly, we can choose a ∆1 such that the curve passes through the primes instead of always being off to one side. Such a rescaling makes perfect sense given that the PNT is just a fit to the primes and such a fit has to be global. Thus the fit is obliged neither to pass exactly through the prime 2 nor to have the initial step size be exactly 1. Somehow the PNT derivation of Li(n) has effectively slipped in an implicit assumption that ∆1 = log P1 (e.g., ∆1 = log 2 = 0.6931...).
In fact, choosing

∆1 = 0.625

provides a better fit that, at large k, gives an estimate for the value of the primes that oscillates about the correct values, being off a few percent to either side. This oscillation shows that there is little point in trying to do "better" (e.g., trying to solve a more exact differential equation above; in fact we just iterated instead). A significant improvement could be gotten by letting ∆1 be a weak function of k.
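To try such a rescaling, one only needs to change the initial step in the Section 2 program; a minimal sketch:

N = 9500;
pr(1) = 2;  del(1) = 0.625;  %rescaled initial step
for k = 1:N
    pr(k+1) = pr(k) + del(k);
    del(k+1) = del(k)*pr(k+1)/(pr(k+1)-1);
end
pr(9500)  %compare with the tabulated value primes(9500) = 98,947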
Notice that we have not proven the PNT; we have just derived the same asymptotic functions cited in the proofs. One can create a number of interesting relationships using these approximate expressions and the PNT itself.
For example, if we integrate the log integral function between two (large) successive primes, the result should be just unity (i.e., one new prime). Since the log Pk denominator will hardly change, the integral itself will be approximately

1 = (Pk+1 − Pk) / log Pk    (1)

which can be rewritten as the recursion relation

Pk+1 = Pk + log Pk    (2)
which must be one of the simplest possible recursion relations for primes. If we start it with the obviously
simplest choice P1 = 2, we get Figure 1.
Figure 1: The first thousand or so primes, plotting Pk against k. The uppermost (ragged) line shows the actual primes, the dashed line just below it is Li(k), and the solid line is the simple recursion relation Pk+1 = Pk + log Pk starting with P1 = 2.
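A minimal sketch of the comparison in Figure 1 (assuming MATLAB's built-in primes function; the Li(k) curve is omitted):

M = 1000;                      %how many primes to generate and compare
P = zeros(1, M);  P(1) = 2;    %the simplest starting choice
for k = 1:M-1
    P(k+1) = P(k) + log(P(k)); %the recursion, natural logarithm
end
actual = primes(8000);         %more than the first 1,000 primes
plot(1:M, actual(1:M), 1:M, P) %ragged actual primes vs. smooth recursion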
Another step is to replace Pk+1 − Pk with ∆k, namely the mean distance between primes. This gives us

lim (k→∞) log Pk · (1 − 1/P1)(1 − 1/P2) · · · (1 − 1/Pk) = const
and the result for the first million primes is plotted in Figure 2. The constant is actually e^(−γ), where γ is Euler's constant, 0.5772157..., and the exponential is 0.561459..., and we can see the convergence to this value. This result was derived in 1874 and is known as Mertens' theorem (but the proof was based on Riemann's (1859) paper and not on a causal idea of a "mean distance between primes"). The above product form is a favorite of number theorists because its inverse is a special case of the zeta function, namely the value of that function at unity, which happens to be an infinity that just cancels the infinity in the limit of Pk (in the sense of the above limit).
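A minimal sketch of the quantity plotted in Figure 2, assuming MATLAB's built-in primes function:

p = primes(2000000);                  %a long list of primes
vals = log(p) .* cumprod(1 - 1./p);   %log(Pk) times (1 - 1/P1)...(1 - 1/Pk)
semilogx(p, vals)                     %log horizontal axis, as in Figure 2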
Figure 2: Plot of log Pk divided by ∆k. If we were to plot this function on an ordinary linear scale, it would look like a step function starting at 0.346 and almost immediately jumping to what looks like the asymptotic value, and then being constant out to a value of almost 100,000. So we have expanded the vertical scale to show the trend and plotted the horizontal scale as a log plot to show how the "fluctuations" die off with increasing size of the primes. A linear plot suppresses that trend because almost all of the points to the right of the origin would then correspond to "large" primes.
In Mertens' theorem we have used the "natural" definition for ∆1 of 1/2, whereas when we used it in the recursion relation for primes we instead defined ∆1 = 1, since that was the actual spacing between the starting prime 2 and the next prime, 3. Alternatively, to agree with the log integral expression we would have had to define ∆1 = log 2. In recursion relations there is often such a freedom of choice, as well as a "natural" choice.
References

[1] J. Derbyshire. Prime Obsession. Joseph Henry Press, 2003.