Introduction to Number Theory
“Elementary Diophantine Approximation”
B.O. Stratmann
Contents.
1. Quick Review on Continued Fractions
2. Elementary Diophantine Approximations
2.1 Hurwitz’s Theorem
2.2 Lagrange and Markov spectra
2.3 Badly approximable numbers
3. Metrical Diophantine Approximations
3.1 The Borel-Cantelli Lemma
3.2 Metrical Diophantine approximations
4. A first Trip through the Zoo of Prime Numbers
1. Quick Review on Continued Fractions
Every irrational number ξ can be approximated by a sequence of rationals $p_n/q_n$ which are ‘good approximations’ in the sense that there exists a constant $c > 0$ such that
\[ \left| \xi - \frac{p_n}{q_n} \right| < \frac{c}{q_n^2} \quad \text{for all } n \in \mathbb{N}. \]
The rationals $p_n/q_n$ are called convergents (or ‘approximants’). For
\[ \xi = [a_0; a_1, a_2, \ldots] = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}}, \]
they are given by
\[ \frac{p_n}{q_n} = [a_0; a_1, a_2, \ldots, a_n] \]
(we shall always assume that a0 ≥ 0 and ai+1 ≥ 1 for all i ∈ N). There are useful
formulas which relate these quantities.
For $n \in \mathbb{N}$ we have (with $p_{-1} = q_0 = 1$, $q_{-1} = 0$ and $p_0 = a_0$)
• $p_{n+1} = a_{n+1} p_n + p_{n-1}$;
• $q_{n+1} = a_{n+1} q_n + q_{n-1}$;
• $q_n p_{n-1} - p_n q_{n-1} = (-1)^n$.
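The recurrences above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `convergents` is an assumption made here, and the input is a finite list of partial quotients $a_0, a_1, \ldots, a_n$.

```python
def convergents(partial_quotients):
    """Return the convergents (p_n, q_n) of [a_0; a_1, a_2, ...] as a list of pairs."""
    p_prev, q_prev = 1, 0            # p_{-1} = 1, q_{-1} = 0
    p, q = partial_quotients[0], 1   # p_0 = a_0, q_0 = 1
    result = [(p, q)]
    for a in partial_quotients[1:]:
        # p_{n+1} = a_{n+1} p_n + p_{n-1}, and likewise for q
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result


# First partial quotients of sqrt(2) = [1; 2, 2, 2, ...]
conv = convergents([1, 2, 2, 2, 2])
print(conv)  # [(1, 1), (3, 2), (7, 5), (17, 12), (41, 29)]

# Check the identity q_n p_{n-1} - p_n q_{n-1} = (-1)^n for n >= 1
for n in range(1, len(conv)):
    p_n, q_n = conv[n]
    p_m, q_m = conv[n - 1]
    assert q_n * p_m - p_n * q_m == (-1) ** n
```

Working with integer pairs rather than floating-point quotients keeps the determinant identity exact.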
Definition 1.1 For $\xi = [a_0; a_1, a_2, \ldots]$ and $n \in \mathbb{N}$, let $r_n$ and $s_n$ be defined as follows.
\[ r_n := [a_n; a_{n+1}, a_{n+2}, \ldots] \quad \text{and} \quad s_n := \frac{q_{n-1}}{q_n}. \]
For these quantities the following holds (for $n \in \mathbb{N}$).
• $r_n = a_n + \dfrac{1}{r_{n+1}}$;
• Since $q_{n+1} = a_{n+1} q_n + q_{n-1}$, we have for the ratio
\[ \frac{q_{n+1}}{q_n} = a_{n+1} + \frac{1}{q_n/q_{n-1}}. \]
Clearly, this process may be continued until $q_1/q_0 = a_1$ is reached. Therefore,
\[ s_{n+1} = \frac{1}{[a_{n+1}; a_n, \ldots, a_1]}. \]
Theorem 1.2 For ξ = [a0 ; a1 , a2 , . . .] and n ∈ N, we have
\[ \xi = \frac{p_n r_{n+1} + p_{n-1}}{q_n r_{n+1} + q_{n-1}}. \]
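Proof: (by induction) For $n = 0$ we have
\[ \frac{p_0 r_1 + p_{-1}}{q_0 r_1 + q_{-1}} = \frac{a_0 r_1 + 1}{r_1} = a_0 + \frac{1}{r_1} = \xi. \]
Now assume that the statement is true for $n$. Then
\[ \xi = \frac{p_n r_{n+1} + p_{n-1}}{q_n r_{n+1} + q_{n-1}}
 = \frac{p_n \left(a_{n+1} + \frac{1}{r_{n+2}}\right) + p_{n-1}}{q_n \left(a_{n+1} + \frac{1}{r_{n+2}}\right) + q_{n-1}}
 = \frac{p_n a_{n+1} r_{n+2} + p_n + p_{n-1} r_{n+2}}{q_n a_{n+1} r_{n+2} + q_n + q_{n-1} r_{n+2}} \]
\[ = \frac{(p_n a_{n+1} + p_{n-1}) r_{n+2} + p_n}{(q_n a_{n+1} + q_{n-1}) r_{n+2} + q_n}
 = \frac{p_{n+1} r_{n+2} + p_n}{q_{n+1} r_{n+2} + q_n}. \]
□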
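Corollary 1.3
\[ \left| \xi - \frac{p_n}{q_n} \right| = \frac{1}{q_n^2 (r_{n+1} + s_n)} \quad \text{for all } n \in \mathbb{N}. \]
Proof:
\[ \left| \xi - \frac{p_n}{q_n} \right|
 = \left| \frac{p_n r_{n+1} + p_{n-1}}{q_n r_{n+1} + q_{n-1}} - \frac{p_n}{q_n} \right|
 = \left| \frac{p_n q_n r_{n+1} + p_{n-1} q_n - p_n q_n r_{n+1} - p_n q_{n-1}}{(q_n r_{n+1} + q_{n-1}) q_n} \right| \]
\[ = \frac{|q_n p_{n-1} - p_n q_{n-1}|}{q_n^2 (r_{n+1} + s_n)} = \frac{1}{q_n^2 (r_{n+1} + s_n)}. \]
□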
2. Elementary Diophantine Approximations
2.1 Hurwitz’s Theorem
Theorem 2.1 For all irrationals $\xi = [a_0; a_1, a_2, \ldots]$ and for all $n \in \mathbb{N}$, we have that
\[ \left| \xi - \frac{p_i}{q_i} \right| \le \frac{1}{2 q_i^2} \]
is fulfilled for at least one element $i \in \{n, n+1\}$.
Proof: By way of contradiction, assume that the statement in the theorem is false. This means that
\[ \left| \xi - \frac{p_i}{q_i} \right| > \frac{1}{2 q_i^2} \]
holds simultaneously for $i = n$ and $i = n + 1$. Since $\left| \xi - \frac{p_i}{q_i} \right| = \frac{1}{q_i^2 (r_{i+1} + s_i)}$, this is equivalent to
\[ r_{i+1} + s_i < 2 \quad \text{for } i = n, n+1. \]
(a) For $i = n$ we get $2 > r_{n+1} + s_n = a_{n+1} + \frac{1}{r_{n+2}} + s_n$, and hence,
\[ \frac{1}{r_{n+2}} < 2 - (a_{n+1} + s_n) = 2 - \frac{1}{s_{n+1}}. \]
(b) For $i = n + 1$ we get
\[ r_{n+2} < 2 - s_{n+1}. \]
Combining (a) and (b), we derive $1 < 4 - 2 s_{n+1} - 2 s_{n+1}^{-1} + 1$, and hence $0 < 2 - s_{n+1} - s_{n+1}^{-1}$, implying
\[ 0 > (s_{n+1} - 1)^2, \]
and hence we derive a contradiction. □
Theorem 2.2 For all irrationals $\xi = [a_0; a_1, a_2, \ldots]$ and for all $n \in \mathbb{N}$, we have that
\[ \left| \xi - \frac{p_i}{q_i} \right| \le \frac{1}{\sqrt{5}\, q_i^2} \]
is fulfilled for at least one element $i \in \{n, n+1, n+2\}$.
Note, the number $1/\sqrt{5}$ is called the Hurwitz number.
Proof: As in the proof of the previous theorem (with 2 replaced by $\sqrt{5}$), assume by way of contradiction that for each $i \in \{n, n+1, n+2\}$ we have
\[ r_{i+1} + s_i < \sqrt{5}. \]
Proceeding for $i = n$ and $i = n + 1$ as in (a) and (b) in the previous proof, we derive
\[ s_{n+1}^2 - \sqrt{5}\, s_{n+1} + 1 < 0. \qquad (1) \]
Analogously, for $i = n + 1$ and $i = n + 2$, we get
\[ s_{n+2}^2 - \sqrt{5}\, s_{n+2} + 1 < 0. \qquad (2) \]
By the quadratic formula, (1) and (2) give (with $\gamma := \frac{\sqrt{5}+1}{2}$ and $\gamma^* := \frac{\sqrt{5}-1}{2}$)
\[ \gamma^* < s_i < \gamma \quad \text{for } i = n+1, n+2. \qquad (3) \]
Using this, we get
\[ s_{n+2} = \frac{1}{a_{n+2} + s_{n+1}} \le \frac{1}{1 + s_{n+1}} < \frac{1}{1 + \gamma^*} = \gamma^*, \]
which contradicts (3). □
Theorem 2.3 (Hurwitz’s theorem) For the golden mean $\gamma := \frac{\sqrt{5}+1}{2} = [1; 1, 1, 1, \ldots]$ we have that
\[ \left| \gamma - \frac{p_n}{q_n} \right| \le \frac{C}{q_n^2} \]
is satisfied for at most finitely many reduced $p_n/q_n$ if and only if $C < \frac{1}{\sqrt{5}}$.
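Proof: First note that $r_n = \gamma$ for all $n \in \mathbb{N}$. Secondly, note that
\[ s_n^{-1} = [a_n; a_{n-1}, \ldots, a_1] = \gamma + \left( [a_n; a_{n-1}, \ldots, a_1] - [a_n; a_{n-1}, \ldots] \right) = \gamma + \delta_n, \]
where for $\delta_n$ we have that $\lim_{n\to\infty} \delta_n = 0$. Hence, it follows
\[ s_n = \frac{1}{\gamma + \delta_n} = \frac{1}{\gamma} + \left( \frac{1}{\gamma + \delta_n} - \frac{1}{\gamma} \right) = \frac{1}{\gamma} + \frac{-\delta_n}{\gamma^2 + \gamma \delta_n} = \frac{1}{\gamma} + \varepsilon_n, \]
where for $\varepsilon_n$ we have that $\lim_{n\to\infty} \varepsilon_n = 0$. These two observations then give
\[ r_{n+1} + s_n = \gamma + \frac{1}{\gamma} + \varepsilon_n = \sqrt{5} + \varepsilon_n \to \sqrt{5} \quad (\text{for } n \to \infty). \]
Now, if $C < \frac{1}{\sqrt{5}}$ is given, say $C = \frac{1}{\sqrt{5} + \rho}$ for some fixed $\rho > 0$, then
\[ \left| \gamma - \frac{p_n}{q_n} \right| = \frac{1}{q_n^2 (r_{n+1} + s_n)} = \frac{1}{q_n^2 (\sqrt{5} + \varepsilon_n)} \le \frac{1}{q_n^2 (\sqrt{5} + \rho)}, \]
where the latter inequality can be fulfilled only for finitely many $n$ (due to the fact that $\sqrt{5} + \rho \le \sqrt{5} + \varepsilon_n$ can be satisfied for at most finitely many $n$). □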
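The limit $r_{n+1} + s_n \to \sqrt{5}$ from the proof can be observed numerically. The following is a small sketch, assuming ordinary double-precision floats are accurate enough for the first fifteen convergents; the names `GOLDEN` and `scaled_errors` are chosen here for illustration only.

```python
import math

GOLDEN = (math.sqrt(5) + 1) / 2


def scaled_errors(count):
    """Return q_n^2 * |gamma - p_n/q_n| for the first `count` convergents of gamma."""
    p_prev, q_prev = 1, 0   # p_{-1}, q_{-1}
    p, q = 1, 1             # p_0 = a_0 = 1, q_0 = 1
    values = []
    for _ in range(count):
        values.append(q * q * abs(GOLDEN - p / q))
        # all partial quotients of the golden mean are equal to 1
        p, p_prev = p + p_prev, p
        q, q_prev = q + q_prev, q
    return values


errors = scaled_errors(15)
for n, value in enumerate(errors):
    print(n, round(value, 6))
print("1/sqrt(5) =", round(1 / math.sqrt(5), 6))
```

The printed values oscillate around $1/\sqrt{5}$ and approach it, which is consistent with the statement that any constant $C < 1/\sqrt{5}$ is undercut only finitely often.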
Corollary 2.4 For each irrational number ξ, the inequality
\[ \left| \xi - \frac{p_n}{q_n} \right| \le \frac{K}{q_n^2} \]
is fulfilled for infinitely many reduced $p_n/q_n$ as long as $K \ge \frac{1}{\sqrt{5}}$.
2.2 Lagrange and Markov Spectra
Definition 2.5
• Let $c$ denote some positive real number. An irrational ξ is called c-approximable if and only if
\[ \left| \xi - \frac{p_n}{q_n} \right| < \frac{c}{q_n^2} \]
is satisfied for infinitely many reduced $p_n/q_n$.
• To each irrational number ξ we associate a non-negative real number ν(ξ), defined by
\[ \nu(\xi) := \inf\{c > 0 : \xi \text{ is } c\text{-approximable}\}. \]
• Two irrational numbers ξ, η are called equivalent (and we write ξ ∼ η) if and only if there exist $k, l \in \mathbb{N}$ such that $r_k(\xi) = r_l(\eta)$ (i.e. eventually the continued fraction expansions of ξ and η coincide).
Lemma 2.6 Let ξ, η be irrational. If ξ ∼ η, then ν(ξ) = ν(η).
Proof: Let ξ, η be irrational such that ξ ∼ η. Then there exist $k, l \in \mathbb{N}$ such that $r_{k+i}(\xi) = r_{l+i}(\eta)$ for all $i \in \mathbb{N}$. Without loss of generality, assume that $l \ge k$. Then ξ and η must be of the form
\[ \xi = [a_0; a_1, \ldots, a_k, c_1, c_2, c_3, \ldots] \quad \text{and} \quad \eta = [b_0; b_1, \ldots, b_k, b_{k+1}, \ldots, b_l, c_1, c_2, c_3, \ldots]. \]
In order to prove the assertion of the lemma, it is sufficient to show that
\[ \left| \frac{1}{r_{l+n}(\eta) + s_{l+n-1}(\eta)} - \frac{1}{r_{k+n}(\xi) + s_{k+n-1}(\xi)} \right| \to 0 \quad \text{for } n \to \infty. \]
For this it is sufficient to show that
\[ \left| r_{k+n}(\xi) + s_{k+n-1}(\xi) - \left( r_{l+n}(\eta) + s_{l+n-1}(\eta) \right) \right| \to 0 \quad \text{for } n \to \infty. \]
But this follows, since
\[ \left| r_{k+n}(\xi) + s_{k+n-1}(\xi) - \left( r_{l+n}(\eta) + s_{l+n-1}(\eta) \right) \right| = \left| s_{k+n-1}(\xi) - s_{l+n-1}(\eta) \right| \]
\[ = \left| \frac{1}{[c_{n-1}; \ldots, c_1, a_k, \ldots, a_1]} - \frac{1}{[c_{n-1}; \ldots, c_1, b_l, \ldots, b_1]} \right| \to 0 \quad (\text{for } n \to \infty). \]
□
Definition 2.7 An irrational ξ ∼ γ is called a noble number (i.e. the continued fraction expansion of a noble number has from some stage onward exclusively 1’s as its entries).
Corollary 2.8
• For each irrational number ξ we have that $\nu(\xi) \le \frac{1}{\sqrt{5}}$.
• A number ξ is a noble number if and only if $\nu(\xi) = \frac{1}{\sqrt{5}}$.
Theorem 2.9 Let $N$ be some fixed positive integer. If $\xi = [a_0; a_1, a_2, \ldots]$ is irrational such that for some $n \in \mathbb{N}$ we have that
\[ \left| \xi - \frac{p_i}{q_i} \right| > \frac{1}{\sqrt{N^2 + 4}\; q_i^2} \]
is fulfilled for all $i \in \{n, n+1, n+2\}$, then it follows that $a_{n+2} < N$.
Proof: We proceed as in the proofs of the first two theorems of the section (with 2, resp. $\sqrt{5}$, now replaced by $\sqrt{N^2 + 4}$). In this way, considering $i = n$ and $i = n + 1$, we derive
\[ s_{n+1}^2 - \sqrt{N^2 + 4}\; s_{n+1} + 1 < 0. \]
And also, by considering $i = n + 1$ and $i = n + 2$, we derive along the same lines
\[ s_{n+2}^2 - \sqrt{N^2 + 4}\; s_{n+2} + 1 < 0. \]
Then, using the quadratic formula, we obtain
\[ \frac{\sqrt{N^2 + 4} - N}{2} < s_i, \qquad s_i^{-1} < \frac{\sqrt{N^2 + 4} + N}{2} \quad \text{for } i = n+1, n+2. \]
Using this, we then have
\[ a_{n+2} = s_{n+1} + a_{n+2} - s_{n+1} = s_{n+2}^{-1} - s_{n+1} < \frac{\sqrt{N^2 + 4} + N}{2} - \frac{\sqrt{N^2 + 4} - N}{2} = N. \]
□
Corollary 2.10 For each irrational number ξ and for every $N \in \mathbb{N}$, exactly one of the following two alternatives occurs.
Either:
\[ \left| \xi - \frac{p_n}{q_n} \right| \le \frac{1}{\sqrt{N^2 + 4}\; q_n^2} \]
is fulfilled for infinitely many $p_n/q_n$ (or, in other words, $\nu(\xi) \le 1/\sqrt{N^2 + 4}$),
Or: There exists a number $n_0 > 0$ (depending on $N$ and ξ) such that
\[ a_n < N \quad \text{for all } n \ge n_0 \]
(or, in other words, $\xi \in B_N$; see Definition 2.18).
Corollary 2.11 For each non-noble irrational number ξ we have that
\[ \left| \xi - \frac{p_n}{q_n} \right| \le \frac{1}{2\sqrt{2}\; q_n^2} \]
is fulfilled for infinitely many reduced $p_n/q_n$. (Or, in other words, for each non-noble number ξ we have $\nu(\xi) \le 1/(2\sqrt{2})$.)
In fact, by means of similar ideas as in the proof of Hurwitz’s theorem (Theorem 2.3), one derives that
\[ \nu(\xi) = \frac{1}{2\sqrt{2}} \quad \text{if and only if} \quad \xi \sim \sqrt{2} \; (= [1; 2, 2, 2, \ldots]). \]
Proof: This follows immediately, since if $\xi = [a_0; a_1, \ldots]$ is non-noble then we have $a_n \ge 2$ for infinitely many $n$. Hence, by Theorem 2.9, we have
\[ \left| \xi - \frac{p}{q} \right| \le \frac{1}{\sqrt{8}\; q^2} \]
for infinitely many reduced $p/q$, which implies that $\nu(\xi) \le 1/(2\sqrt{2})$. □
Lemma 2.12 Let $\xi = [a_0; a_1, a_2, \ldots]$ be an irrational number such that ν(ξ) is neither equal to $\frac{1}{\sqrt{5}}$ nor to $\frac{1}{2\sqrt{2}}$, but such that $\xi \sim [b_0; b_1, b_2, \ldots]$ with $b_i \le 2$ for all $i \in \mathbb{N}$. It then follows that
\[ \nu(\xi) \le \frac{6}{17}. \]
Proof: Without loss of generality we can assume that there are infinitely many 1’s and 2’s in $[b_0; b_1, b_2, \ldots]$ (since otherwise ν(ξ) would be equal to either $1/\sqrt{5}$ or $1/(2\sqrt{2})$). Hence there are infinitely many values $n$ such that $a_n = 1$ and $a_{n+1} = 2$. For these $n$, we have
\[ r_{n+1} + s_n = [a_{n+1}; a_{n+2}, \ldots] + \frac{1}{[a_n; \ldots, a_1]} \ge 2 + \cfrac{1}{2 + \cfrac{1}{1}} + \cfrac{1}{1 + \cfrac{1}{1}} = \frac{7}{3} + \frac{1}{2} = \frac{17}{6}. \]
It follows that $\nu(\xi) \le \frac{6}{17}$. □
Lemma 2.13 If $\xi = [a_0; a_1, \ldots]$ is irrational such that $a_n \ge 3$ for infinitely many $n$, then $\nu(\xi) \le \frac{1}{\sqrt{13}}$.
Proof: By Theorem 2.9 we have that if $a_n \ge 3$ for infinitely many $n$, then
\[ \left| \xi - \frac{p_{n-2}}{q_{n-2}} \right| \le \frac{1}{\sqrt{3^2 + 4}\; q_{n-2}^2} = \frac{1}{\sqrt{13}\; q_{n-2}^2} \]
must hold for infinitely many $n$. Hence, $\nu(\xi) \le \frac{1}{\sqrt{13}}$. □
Definition 2.14
• The set of numbers
\[ L := \{\nu(\xi) : \xi \text{ is irrational}\} \]
is called Lagrange spectrum.
• The set of numbers
\[ M := L \cap \left( \frac{1}{3}, \frac{1}{\sqrt{5}} \right] \]
is called Markov spectrum.
Note, in some books one finds a slightly different use of the term Markov spectrum. Also note that since $\frac{1}{3} > \frac{1}{\sqrt{13}}$, we have by Lemma 2.13 that irrational numbers ξ whose value ν(ξ) lies in the Markov spectrum, that is ξ with ν(ξ) > 1/3, must be equivalent to irrational numbers whose continued fraction expansions contain exclusively 1’s and 2’s.
As an immediate consequence of Hurwitz’s Theorem (Theorem 2.3), we obtain the following theorem.
Theorem 2.15
\[ L \subset \left[ 0, \frac{1}{\sqrt{5}} \right]. \]
Proposition 2.16 For an irrational number ξ we have that ν(ξ) ∈ M if and only
if ξ ∼ [a0 ; a1 , a2 , . . .], for [a0 ; a1 , a2 , . . .] such that an ≤ 2 for all n ∈ N.
One can say much more about the structure of the Markov spectrum. It has the following very interesting properties. The proof of the next theorem is slightly more involved and will be omitted.
Theorem 2.17 The Markov spectrum $M$ consists of a countable set of numbers in $(1/3, 1/\sqrt{5}]$, and these numbers accumulate only at the value $\frac{1}{3}$.
In fact, much more can be said about the Markov and the Lagrange spectrum.
Nevertheless, there are still plenty of fascinating open problems concerning these
spectra. We now list a few known results about them. Some of these we have
already obtained.
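• Each number in the Markov spectrum is of the form $1/\sqrt{9 - \frac{4}{m^2}}$, where $m$ is a positive integer solution of the equation $m^2 + k^2 + l^2 = 3mkl$, for $k$ and $l$ some positive integers. It is known that there are infinitely many solutions $m$ of this equation. The first numbers in the Markov spectrum are
\[ \frac{1}{\sqrt{5}} = \frac{1}{\sqrt{9 - \frac{4}{1^2}}}, \quad
 \frac{1}{2\sqrt{2}} = \frac{1}{\sqrt{9 - \frac{4}{2^2}}}, \quad
 \frac{5}{\sqrt{221}} = \frac{1}{\sqrt{9 - \frac{4}{5^2}}}, \quad
 \frac{1}{\sqrt{9 - \frac{4}{13^2}}}, \quad
 \frac{1}{\sqrt{9 - \frac{4}{29^2}}}, \quad
 \frac{1}{\sqrt{9 - \frac{4}{34^2}}}, \]
\[ \frac{1}{\sqrt{9 - \frac{4}{89^2}}}, \quad
 \frac{1}{\sqrt{9 - \frac{4}{169^2}}}, \quad
 \frac{1}{\sqrt{9 - \frac{4}{194^2}}}, \quad
 \frac{1}{\sqrt{9 - \frac{4}{233^2}}}, \quad
 \frac{1}{\sqrt{9 - \frac{4}{433^2}}}, \quad \ldots \]
Note that since $1/\sqrt{9 - \frac{4}{m^2}}$ tends to 1/3 (for $m$ tending to infinity), it is clear that the Markov spectrum accumulates at 1/3.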
• We have that $\nu(x) \ge \frac{1}{\sqrt{12}}$ if and only if $x$ is equivalent to a number whose continued fraction expansion contains exclusively 1’s and 2’s.
• In the interval $\left( \frac{1}{\sqrt{13}}, \frac{1}{\sqrt{12}} \right)$ the Lagrange spectrum is empty. That is
\[ L \cap \left( \frac{1}{\sqrt{13}}, \frac{1}{\sqrt{12}} \right) = \emptyset. \]
• Let $f$ be the so-called Freiman number, which is given by
\[ f := \frac{491993569}{2221564096 + 283748\sqrt{462}}. \]
One then knows that in the interval $[0, f)$ the Lagrange spectrum is continuous. This means that for every $c \in [0, f)$ there exists an irrational number $x$ such that $\nu(x) = c$.
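The Markov numbers $m$ appearing in the first item of this list can be enumerated mechanically. The sketch below relies on a standard fact that is not stated in the notes: if $(k, l, m)$ solves $m^2 + k^2 + l^2 = 3mkl$, then replacing one entry $x$ by $3yz - x$ (with $y, z$ the other two entries) gives another solution. The function name `markov_numbers` and the bound 500 are illustrative choices.

```python
import math


def markov_numbers(limit):
    """Return all Markov numbers m <= limit by walking the tree of solution triples."""
    found = set()
    seen = set()
    stack = [(1, 1, 1)]
    while stack:
        triple = tuple(sorted(stack.pop()))
        if triple in seen or triple[2] > limit:
            continue
        seen.add(triple)
        found.update(triple)
        k, l, m = triple
        # every visited triple really solves the Markov equation
        assert k * k + l * l + m * m == 3 * k * l * m
        # replacing one entry x by 3*y*z - x gives a neighbouring solution
        stack.append((l, m, 3 * l * m - k))
        stack.append((k, m, 3 * k * m - l))
        stack.append((k, l, 3 * k * l - m))
    return sorted(found)


for m in markov_numbers(500):
    print(m, 1 / math.sqrt(9 - 4 / (m * m)))
```

The second printed column decreases towards 1/3, in line with the accumulation point of the Markov spectrum.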
2.3 Badly Approximable Numbers
Definition 2.18 For $N \in \mathbb{N}$ define
\[ B_N := \{\xi = [a_0; a_1, a_2, \ldots] \text{ irrational} : \exists\, n_0 > 0 \text{ such that } a_n < N \;\; \forall\, n \ge n_0\}. \]
The set of badly approximable numbers $B$ is then defined as
\[ B := \bigcup_{N > 0} B_N = \{\xi \text{ irrational} : \exists\, N > 0 \text{ such that } \xi \in B_N\}. \]
In other words, $\xi \in B_N$ if and only if ξ ∼ η, for some $\eta = [b_0; b_1, \ldots]$ with $b_i < N$ for all $i \in \mathbb{N}$. Furthermore, $\xi \in B$ if and only if there exists $M \in \mathbb{N}$ such that $\xi \in B_M$.
The following lemma clarifies why the elements in $B$ are called ‘badly approximable’.
Lemma 2.19
• If ξ is an irrational number such that $\xi \notin B_N$ for some $N \in \mathbb{N}$, then
\[ \left| \xi - \frac{p_n}{q_n} \right| \le \frac{1}{\sqrt{N^2 + 4}\; q_n^2} \]
is fulfilled for infinitely many reduced $p_n/q_n$ (i.e. $\nu(\xi) \le 1/\sqrt{N^2 + 4}$).
• For each $\xi \in B$ there exists a constant $C > 0$ such that for all $n \in \mathbb{N}$ we have
\[ \left| \xi - \frac{p_n}{q_n} \right| > \frac{C}{q_n^2}. \]
Proof: The first part is an immediate consequence of Theorem 2.9. For the second part, consider $\xi = [a_0; a_1, \ldots] \in B$. Then there exist numbers $M$ and $m_0$ such that $a_n < M$ for all $n \ge m_0$. Using this, we derive $r_{n+1} + s_n < M + 1 + 1 = M + 2$, and hence
\[ \left| \xi - \frac{p_n}{q_n} \right| > \frac{1}{(M + 2)\, q_n^2} \quad \text{for all } n \ge m_0. \]
For $n < m_0$ we have that there exists a number $c_n > 0$ such that
\[ \left| \xi - \frac{p_n}{q_n} \right| > \frac{c_n}{q_n^2}. \]
If we define $C := \min\{1/(M + 2), c_0, c_1, \ldots, c_{m_0 - 1}\}$ (i.e. $C$ is the smallest number in this finite set of numbers), then the result follows. □
3. Metrical Diophantine Approximations
In this section we restrict the investigations to the unit interval I := [0, 1).
3.1 The Borel-Cantelli Lemma
Definition 3.1 A set Σ of subsets of I is called a σ-algebra of I if the following
conditions are satisfied.
• I ∈ Σ;
• If $A \in \Sigma$, then $A^c \in \Sigma$ (where $A^c := I \setminus A$ denotes the complement of $A$ in $I$);
• $\bigcup_{n \in \mathbb{N}} A_n \in \Sigma$ for all sequences $(A_n)$ with $A_n \in \Sigma$ (for all $n \in \mathbb{N}$).
Definition 3.2 The Borel-σ-algebra Σ0 of I is the smallest σ-algebra of I which
contains all possible intervals of I of the form [x, y) (for 0 ≤ x < y < 1).
The elements of Σ0 are called Borel sets.
Definition 3.3 Each element in Σ0 can be measured by the Lebesgue measure λ in
I. In particular, if A is an interval (i.e. A = [x, y) for some 0 ≤ x < y < 1), then
λ(A) is just the ‘length’ of that interval (i.e. λ(A) = λ([x, y)) = y − x).
Properties:
• λ(I) = 1;
• λ(A) ≥ 0 for all A ∈ Σ0 ;
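• $\lambda\left( \bigcup_{n \in \mathbb{N}} A_n \right) = \sum_{n \in \mathbb{N}} \lambda(A_n)$ for every sequence $(A_n)$ of pairwise disjoint elements $A_n \in \Sigma_0$ (i.e. $A_i \cap A_j = \emptyset$ for all $i \ne j$).
• For $A \in \Sigma_0$ we have:
$\lambda(A) = 0$ if and only if for all $\varepsilon > 0$ there exists a sequence $(A_n)$ of elements $A_n \in \Sigma_0$ such that
\[ A \subset \bigcup_{n \in \mathbb{N}} A_n \quad \text{and} \quad \sum_{n \in \mathbb{N}} \lambda(A_n) < \varepsilon. \]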
Note, every countable set in I is of zero λ-measure.
More general, in order to find out if a given Borel set is of zero λ-measure, the
following theorem is often helpful.
Theorem 3.4 (Borel-Cantelli lemma)
If $(A_n)$ is a sequence of elements $A_n \in \Sigma_0$ such that $\sum_{n \in \mathbb{N}} \lambda(A_n) < \infty$, then we have
\[ \lambda(A_\infty) = 0, \]
where the lim sup-set $A_\infty$ is defined by
\[ A_\infty := \{\xi \in I : \xi \in A_n \text{ for infinitely many } n\}. \]
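Proof: The convergence of $\sum_{n \in \mathbb{N}} \lambda(A_n)$ implies that for each $\varepsilon > 0$ there exists an integer $n_0$ such that
\[ \sum_{n \ge n_0} \lambda(A_n) < \varepsilon. \]
Now note that by definition of $A_\infty$, we have that
\[ A_\infty \subset \bigcup_{n \ge n_0} A_n. \]
Hence, it follows that
\[ \lambda(A_\infty) \le \lambda\left( \bigcup_{n \ge n_0} A_n \right) \le \sum_{n \ge n_0} \lambda(A_n) < \varepsilon. \]
Since $\varepsilon > 0$ was arbitrary, we obtain $\lambda(A_\infty) = 0$. □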
3.2 Metrical Diophantine Approximations
Definition 3.5 Let $a_1, \ldots, a_n \in \mathbb{N} \setminus \{0\}$ be given. The n-cylinder $I(a_1, \ldots, a_n)$ (also called ‘fundamental interval of order n’) is defined by (here we use the common notation $[x_1, x_2, \ldots] := [0; x_1, x_2, \ldots]$)
\[ I(a_1, \ldots, a_n) := \{\xi = [x_1, x_2, x_3, \ldots] \in I \text{ irrational} : x_i = a_i \text{ for all } 1 \le i \le n\}. \]
Properties:
• For every $\xi \in I(a_1, \ldots, a_n)$ we have
\[ \xi = \frac{p_n r_{n+1}(\xi) + p_{n-1}}{q_n r_{n+1}(\xi) + q_{n-1}}, \]
where $p_n, p_{n-1}, q_n, q_{n-1}$ are fixed (depending only on $a_1, \ldots, a_n$).
• $I(a_1, \ldots, a_n)$ consists of the irrational numbers in the interval with endpoints
\[ \frac{p_n}{q_n} < \frac{p_n + p_{n-1}}{q_n + q_{n-1}} \;\; \text{for } n \text{ even}, \qquad \frac{p_n + p_{n-1}}{q_n + q_{n-1}} < \frac{p_n}{q_n} \;\; \text{for } n \text{ odd}. \]
•
\[ \lambda(I(a_1, \ldots, a_n)) = \frac{1}{q_n^2 (1 + s_n)}. \]
Proof: These properties are immediate consequences of the following. By Theorem 1.2, we have
\[ \xi = \frac{p_n r_{n+1}(\xi) + p_{n-1}}{q_n r_{n+1}(\xi) + q_{n-1}} = \frac{p_n + p_{n-1}/r_{n+1}(\xi)}{q_n + q_{n-1}/r_{n+1}(\xi)}. \]
Since $1 \le r_{n+1}(\xi)$ and since $r_{n+1}(\xi)$ can get arbitrarily large if ξ varies, we see that
\[ \lambda(I(a_1, \ldots, a_n)) = \left| \frac{p_n + p_{n-1}}{q_n + q_{n-1}} - \frac{p_n}{q_n} \right| = \frac{|p_n q_n + p_{n-1} q_n - p_n q_n - q_{n-1} p_n|}{q_n^2 (1 + s_n)} \]
\[ = \frac{|p_{n-1} q_n - q_{n-1} p_n|}{q_n^2 (1 + s_n)} = \frac{1}{q_n^2 (1 + s_n)}. \]
Furthermore, observe that
\[ \frac{p_n}{q_n} < \frac{p_n + p_{n-1}}{q_n + q_{n-1}} \quad \text{if and only if} \quad p_n q_{n-1} - q_n p_{n-1} < 0. \]
But we know (since $q_n p_{n-1} - p_n q_{n-1} = (-1)^n$) that the left-hand side of the latter inequality is equal to $-1$ if and only if $n$ is even. □
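The formula for the length of a fundamental interval can be checked with exact rational arithmetic. The following sketch uses the recurrences for $p_n, q_n$ with $a_0 = 0$; the function name `cylinder` and its return convention (left endpoint, right endpoint, predicted length) are assumptions made for this illustration.

```python
from fractions import Fraction


def cylinder(partial_quotients):
    """Endpoints of I(a_1, ..., a_n) and the length predicted by 1/(q_n^2 (1 + s_n))."""
    p_prev, q_prev = 1, 0   # p_{-1}, q_{-1}
    p, q = 0, 1             # p_0 = a_0 = 0, q_0 = 1
    for a in partial_quotients:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
    end_a = Fraction(p, q)                    # limit r_{n+1} -> infinity
    end_b = Fraction(p + p_prev, q + q_prev)  # value at r_{n+1} = 1
    s_n = Fraction(q_prev, q)
    predicted = 1 / (q * q * (1 + s_n))
    return min(end_a, end_b), max(end_a, end_b), predicted


low, high, predicted = cylinder([1, 2])
print(low, high, high - low, predicted)  # 2/3 3/4 1/12 1/12
```

Using `Fraction` avoids any rounding, so the comparison between the measured length and the predicted length is an exact equality.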
For the next theorem, recall the definition of the set of badly approximable irrational
numbers (Definition 2.18).
Theorem 3.6 For $B' := B \cap I$ we have
\[ \lambda(B') = 0. \]
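Proof: For $n, N \in \mathbb{N}$ we define the sets
\[ A_N := \{\xi = [a_1, a_2, \ldots] \in I \text{ irrational} : a_i < N \;\; \forall\, i \in \mathbb{N}\}, \qquad A := \bigcup_{N \in \mathbb{N}} A_N, \]
\[ A_N^{(n)} := \{\xi = [a_1, a_2, \ldots] \in I \text{ irrational} : a_i < N \;\; \forall\, i \in \{1, \ldots, n\}\}. \]
We want to show that $\lambda(A) = 0$. For this, since $A_N \subset A_N^{(n)}$, it is sufficient to show that $\lim_{n \to \infty} \lambda(A_N^{(n)}) = 0$, and this is what we are now going to prove.
Note that $A_N^{(n+1)} \subset A_N^{(n)}$, and that each $A_N^{(n+1)}$ can be written as a union of disjoint fundamental intervals as follows
\[ A_N^{(n+1)} = \bigcup_{\substack{(a_1, \ldots, a_{n+1}): \\ a_i < N,\, i = 1, \ldots, n+1}} I(a_1, \ldots, a_n, a_{n+1}) = \bigcup_{\substack{(a_1, \ldots, a_n): \\ a_i < N,\, i = 1, \ldots, n}} \; \bigcup_{k:\, k < N} I(a_1, \ldots, a_n, k). \]
For fixed $(a_1, \ldots, a_n)$, we now calculate the Lebesgue measure of $\bigcup_{k:\, k < N} I(a_1, \ldots, a_n, k)$ as follows.
\[ \lambda\left( \bigcup_{1 \le k < N} I(a_1, \ldots, a_n, k) \right) = \left| \frac{p_n + p_{n-1}}{q_n + q_{n-1}} - \frac{p_n N + p_{n-1}}{q_n N + q_{n-1}} \right| = \ldots = \frac{N - 1}{q_n^2 (1 + s_n)(N + s_n)} \]
\[ < \frac{N - 1}{q_n^2\, N\, (1 + s_n)} = \left( 1 - \frac{1}{N} \right) \lambda(I(a_1, \ldots, a_n)). \]
Using the latter estimate, we get
\[ \lambda(A_N^{(n+1)}) = \lambda\left( \bigcup_{\substack{(a_1, \ldots, a_n): \\ a_i < N,\, i = 1, \ldots, n}} \; \bigcup_{k:\, k < N} I(a_1, \ldots, a_n, k) \right) = \sum_{\substack{(a_1, \ldots, a_n): \\ a_i < N,\, i = 1, \ldots, n}} \lambda\left( \bigcup_{k:\, k < N} I(a_1, \ldots, a_n, k) \right) \]
\[ \le \left( 1 - \frac{1}{N} \right) \sum_{\substack{(a_1, \ldots, a_n): \\ a_i < N,\, i = 1, \ldots, n}} \lambda(I(a_1, \ldots, a_n)) = \left( 1 - \frac{1}{N} \right) \lambda(A_N^{(n)}). \]
Applying this estimate $n$ times, we derive
\[ \lambda(A_N^{(n+1)}) \le \left( 1 - \frac{1}{N} \right) \lambda(A_N^{(n)}) \le \left( 1 - \frac{1}{N} \right)^2 \lambda(A_N^{(n-1)}) \le \ldots \le \left( 1 - \frac{1}{N} \right)^n \lambda(A_N^{(1)}), \]
which then implies
\[ \lambda(A_N^{(n+1)}) \to 0 \quad \text{for } n \to \infty. \]
From this we obtain that (since $A_N \subset A_N^{(n+1)}$)
\[ \lambda(A_N) = 0 \quad \text{for all } N \in \mathbb{N}, \]
and hence, since
\[ \lambda(A) = \lambda\left( \bigcup_{N \in \mathbb{N}} A_N \right) \le \sum_{N \in \mathbb{N}} \lambda(A_N) = 0, \]
we obtain the desired result
\[ \lambda(A) = 0. \]
Finally, observe that $\xi \in B'$ if and only if $\xi \in A$, from which we derive
\[ \lambda(B') = 0. \]
□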
By inspection of the proof of the previous theorem, we find that we in fact proved slightly more than we actually formulated in the theorem. Namely, we have seen that the following is true.
Corollary 3.7 For $B'_N := B_N \cap I$ we have
\[ \lambda(B'_N) = 0 \quad \text{for all } N \in \mathbb{N}. \]
Also, combining the previous theorem and Lemma 2.19, we immediately obtain the following result.
Corollary 3.8
\[ \lambda\left( \left\{ \xi \in I \text{ irrational} : \exists\, C > 0 \text{ such that } \left| \xi - \frac{p}{q} \right| > \frac{C}{q^2} \text{ for all } \frac{p}{q} \right\} \right) = 0. \]
We have now seen that the set of badly approximable numbers does not contribute
to sets of irrational numbers of positive Lebesgue measure. Hence, if we want to
investigate sets of positive measure, then we have to look for irrationals which are
more rapidly approximated by their approximants than it is the case for badly
approximable irrationals. The contra-positive of the following theorem gives a first
indication of how an irrational number has to look like in order to have a chance to
contribute to positive Lebesgue measure. In particular, the theorem specifies how
fast the an (ξ) have to increase at least such that ξ has a chance to contribute to
positive Lebesgue measure.
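Theorem 3.9 If $\varphi : \mathbb{N} \to \mathbb{R}^+$ is a function such that $\sum_{n=1}^{\infty} 1/\varphi(n)$ diverges, then
\[ \lambda(B_\varphi) = 0, \]
where $B_\varphi := \{\xi = [a_1, a_2, \ldots] \in I \text{ irrational} : a_n < \varphi(n) \;\; \forall\, n \in \mathbb{N}\}$.
Note: A good choice for $\varphi$ would be $\varphi(n) = n \log(n)$ (recall that $\sum_{n} \frac{1}{n \log(n)}$ diverges).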
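Proof: The proof is basically the same as the proof of the previous theorem. As before, we obtain that
\[ \lambda\left( \bigcup_{k:\, k < \varphi(n+1)} I(a_1, \ldots, a_n, k) \right) < \left( 1 - \frac{1}{\varphi(n+1)} \right) \lambda(I(a_1, \ldots, a_n)). \]
Hence, with $B_\varphi^{(n)} := \{\xi = [a_1, a_2, \ldots] \in I \text{ irrational} : a_i < \varphi(i) \;\; \forall\, i \in \{1, \ldots, n\}\}$, we get
\[ \lambda(B_\varphi^{(n+1)}) < \left( 1 - \frac{1}{\varphi(n+1)} \right) \lambda(B_\varphi^{(n)}) < \ldots < \prod_{k=1}^{n} \left( 1 - \frac{1}{\varphi(k+1)} \right) \lambda(B_\varphi^{(1)}). \]
Using the fact that $1 - x < e^{-x}$ for each $0 < x < 1$, we can continue as follows.
\[ \lambda(B_\varphi^{(n+1)}) < e^{-\sum_{k=1}^{n} \frac{1}{\varphi(k+1)}} \, \lambda(B_\varphi^{(1)}), \]
which implies (since $\sum_{k=1}^{n} 1/\varphi(k+1)$ gets arbitrarily large, due to the divergence condition in the theorem)
\[ \lambda(B_\varphi^{(n+1)}) \to 0 \quad \text{for } n \to \infty, \]
and hence (since $B_\varphi \subset B_\varphi^{(n+1)}$ for all $n$),
\[ \lambda(B_\varphi) = 0. \]
□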
Note that with the special choice of φ, that is φ(n) = n log(n), an immediate consequence of the previous theorem is (for this essentially consider the complement of
Bφ in I)
λ ({ξ = [a1 , a2 , . . .] ∈ I irrational : an ≥ n log(n) for infinitely many n ∈ N}) = 1.
In contrast to the previous theorem, we now investigate how fast the an (ξ) can increase at most such that ξ has a chance to contribute to positive Lebesgue measure.
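Theorem 3.10 If $\phi : \mathbb{N} \to \mathbb{R}^+$ is a function such that $\sum_{n=1}^{\infty} 1/\phi(n)$ converges, then
\[ \lambda(W_\phi) = 0, \]
where $W_\phi := \{\xi = [a_1, a_2, \ldots] \in I \text{ irrational} : a_n > \phi(n) \text{ for infinitely many } n\}$.
Note: A good choice for $\phi$ would be $\phi(n) = n (\log(n))^{1+\varepsilon}$, for any fixed $\varepsilon > 0$ (recall that $\sum_{n} \frac{1}{n (\log(n))^{1+\varepsilon}}$ converges, for every $\varepsilon > 0$).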
Proof: We have that
\[ \lambda\left( \bigcup_{k:\, k \ge \phi(n+1)} I(a_1, \ldots, a_n, k) \right) = \left| \frac{p_n \phi(n+1) + p_{n-1}}{q_n \phi(n+1) + q_{n-1}} - \frac{p_n}{q_n} \right| = \ldots \]
\[ = \frac{1}{q_n^2 (1 + s_n)} \cdot \frac{1 + s_n}{\phi(n+1) + s_n} < \frac{2}{\phi(n+1)} \, \lambda(I(a_1, \ldots, a_n)). \]
Hence, with $W_\phi^{(n)} := \{\xi = [a_1, a_2, \ldots] \in I \text{ irrational} : a_n > \phi(n)\}$, we get
\[ \lambda(W_\phi^{(n+1)}) = \lambda\left( \bigcup_{(a_1, \ldots, a_n)} \; \bigcup_{k:\, k \ge \phi(n+1)} I(a_1, \ldots, a_n, k) \right) < \frac{2}{\phi(n+1)} \sum_{(a_1, \ldots, a_n)} \lambda(I(a_1, \ldots, a_n)) \le \frac{2}{\phi(n+1)}. \]
Now, an application of the Borel-Cantelli lemma (Theorem 3.4) finishes the proof. □
Note that with the special choice of $\phi$, that is $\phi(n) = n (\log(n))^{1+\varepsilon}$, an immediate consequence of the previous theorem is (for this essentially consider the complement of $W_\phi$ in $I$) that for each $\varepsilon > 0$,
\[ \lambda\left( \{\xi = [a_1, a_2, \ldots] \in I \text{ irrational} : \exists\, n_0 \text{ such that } a_n < n (\log(n))^{1+\varepsilon} \;\; \forall\, n \ge n_0\} \right) = 1. \]
Combining this with the remark after Theorem 3.9, we hence have that the continued fraction expansion of an irrational number $\xi = [a_1, a_2, \ldots]$ which contributes to a set of full Lebesgue measure has the property that for each $\varepsilon > 0$ we have
\[ a_n > n \log(n) \text{ for infinitely many } n, \quad \text{whereas} \quad a_n < n (\log(n))^{1+\varepsilon} \text{ eventually.} \]
Finally, we mention the following important theorem (without proof). In this theorem we use the notion of a (α, β)-Khintchine function, by which we mean the
following.
• A (α, β)-Khintchine function ψ : R+ → R+ is a non-increasing function which
is not ‘decreasing too rapidly’, in the sense that there exist positive numbers
α < 1 and β ≤ 1 such that for all x ∈ R+ we have that ψ(x) ≥ βψ(αx).
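Theorem 3.11 (Khintchine’s theorem)
For ψ an (α, β)-Khintchine function let
\[ K_\psi := \left\{ \xi \in I : \left| \xi - \frac{p_n}{q_n} \right| < \frac{\psi(q_n)}{q_n^2} \text{ is fulfilled for infinitely many } n \right\}. \]
Then the following holds.
(i) $\lambda(K_\psi) = 0$ if and only if $\sum_{n \in \mathbb{N}} \psi(\alpha^n)$ converges.
(ii) $\lambda(K_\psi) = 1$ if and only if $\sum_{n \in \mathbb{N}} \psi(\alpha^n)$ diverges.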
Remark: In case (i), a good choice for the function ψ would be $\psi(x) = (\log(x))^{-(1+\varepsilon)}$ (for any $\varepsilon > 0$). And in case (ii), a good choice for the function ψ would be $\psi(x) = (\log(x))^{-1}$.
With these choices, we then obtain that for ξ from a set of full λ-measure we have that the two inequalities
\[ \frac{1}{q_n^2 (\log(q_n))^{1+\varepsilon}} < \left| \xi - \frac{p_n}{q_n} \right| < \frac{1}{q_n^2 \log(q_n)} \]
are fulfilled simultaneously for infinitely many $p_n/q_n$ (more precisely, the left-hand inequality is fulfilled even for all $p_n/q_n$ apart from finitely many exceptions).
4. A first Trip through the Zoo of Prime Numbers
Definition 4.1 A positive integer $p \ne 1$ is called a prime number if $p$ is divisible only by 1 and $p$.
Theorem 4.2 (Unique prime factorisation theorem) Every positive integer $N \ne 1$ is either a prime number or can be written uniquely as a product of prime numbers.
Proof: This is left as an exercise (use complete mathematical induction). □
It’s a very old fact (Euclid 325-265 B.C., in Book IX of the Elements) that the set of
primes is infinite. Presumably, one of the first rigorous proofs of this fact was given
by Euler.
Theorem 4.3 (Euler) There are infinitely many prime numbers.
Proof: Assume (by way of contradiction) that there are only finitely many prime
numbers. Then let P := {p1 , p2 , . . . , pn } be the set of all these primes, ordered such
that pi < pi+1 for all i = 1, . . . , n − 1. Consider
q := p1 · p2 · ... · pn + 1.
Since $q > p_n$, it follows that $q \notin P$, and hence $q$ is not a prime number. Since every number has a unique prime factorisation, it follows that there exist $q_1, \ldots, q_k \in P$ such that
\[ q = q_1 \cdot q_2 \cdot \ldots \cdot q_k. \]
Combining this with the definition of $q$, it follows that
\[ p_1 \cdot p_2 \cdot \ldots \cdot p_n + 1 = q_1 \cdot q_2 \cdot \ldots \cdot q_k. \]
Since in the product $p_1 \cdot p_2 \cdot \ldots \cdot p_n$ every prime number appears exactly once as a factor, we must have that one of these, say $p_j$, is equal to $q_1$. That is, $q_1 = p_j$ for some $j \in \{1, \ldots, n\}$. By dividing the above equality by $q_1$, we obtain
\[ p_1 \cdot p_2 \cdot \ldots \cdot p_{j-1} \cdot p_{j+1} \cdot \ldots \cdot p_n + \frac{1}{q_1} = q_2 \cdot \ldots \cdot q_k. \]
Since here the right-hand side is an element of $\mathbb{N}$ whereas the left-hand side is not, this gives a contradiction. □
The Sieve of Eratosthenes.
This ‘sieve’ is a method for finding all prime numbers up to some given number $N \in \mathbb{N}$. The method is as follows.
1. Write down a list consisting of all numbers from 2 up to $N$.
2. $p_1$ (= 2) is the first prime number, and hence stays on the list. Then remove all proper multiples of 2 from the list.
3. $p_2$ (= 3) is the next remaining number (following $p_1$), and hence stays on the list. Then remove all proper multiples of $p_2$ from the list.
4. $p_3$ (= 5) is the next remaining number (following $p_2$), and hence stays on the list. Then remove all proper multiples of $p_3$ from the list.
5. $p_4$ is the next remaining number (following $p_3$) (of course $p_4 = 7$), and hence stays on the list. Then remove all proper multiples of $p_4$ from the list.
..................
The ‘sieve’ ends once we have reached for the first time a number, say $p_n$, for which $p_n^2 > N$. The numbers which then remain on the list are exactly the prime numbers between 2 and $N$.
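The steps above can be sketched as follows. This is one straightforward way to write the sieve, with the function name `sieve_of_eratosthenes` chosen here; it stops as soon as $p^2 > N$, as described in the text.

```python
def sieve_of_eratosthenes(limit):
    """Return the list of all primes p with 2 <= p <= limit."""
    if limit < 2:
        return []
    on_list = [True] * (limit + 1)   # on_list[n] is True while n is still on the list
    on_list[0] = on_list[1] = False
    p = 2
    while p * p <= limit:            # the sieve ends once p^2 > limit
        if on_list[p]:
            # p stays on the list, all its proper multiples are removed
            for multiple in range(2 * p, limit + 1, p):
                on_list[multiple] = False
        p += 1
    return [n for n in range(2, limit + 1) if on_list[n]]


print(sieve_of_eratosthenes(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```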
Lemma 4.4 There are arbitrarily large gaps in the sequence of prime numbers. In other words, for each arbitrarily large number $N \in \mathbb{N}$ there exists a number $n \in \mathbb{N}$ such that there are no prime numbers among the $N$ consecutive numbers $n + 2, \ldots, n + (N + 1)$.
Proof: Let $N \in \mathbb{N}$ be given. Then define $n := (N + 1)!$, and observe the following.
• $n + 1$ might be a prime number;
• $n + 2 = (N + 1)! + 2$ is divisible by 2, and hence not a prime number;
• $n + 3 = (N + 1)! + 3$ is divisible by 3, and hence not a prime number;
• $n + 4 = (N + 1)! + 4$ is divisible by 4, and hence not a prime number;
• ...
• $n + (N + 1) = (N + 1)! + (N + 1)$ is divisible by $(N + 1)$, and hence not a prime number.
Therefore, we now have that all the $N$ numbers between $n + 2$ and $n + (N + 1)$ are not prime numbers. □
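The construction in the proof is easy to test for small $N$. The sketch below builds the run $(N+1)! + 2, \ldots, (N+1)! + (N+1)$ and checks both the divisibility claims and, by trial division, that no member is prime; the names `composite_run` and `is_prime` are illustrative.

```python
import math


def is_prime(n):
    """Trial-division primality test, adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def composite_run(gap):
    """Return the numbers (gap+1)! + k for k = 2, ..., gap + 1."""
    n = math.factorial(gap + 1)
    return [n + k for k in range(2, gap + 2)]


run = composite_run(6)
print(run)  # [5042, 5043, 5044, 5045, 5046, 5047]
for k, value in zip(range(2, 8), run):
    assert value % k == 0 and not is_prime(value)
```

The factorial grows very quickly, so this construction shows existence of gaps rather than where the first gap of a given length occurs.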
Definition 4.5 The ‘prime number counting function’ π is defined for each $N \in \mathbb{N}$ by
\[ \pi(N) := \#\{p : p \text{ is a prime number, and } p \le N\}. \]
As a first little estimate we obtain the following. Note that an immediate consequence of this lemma is that there are infinitely many primes, and hence the lemma gives an alternative proof of Euler’s theorem.
Lemma 4.6
\[ \pi(N) > \frac{\log N}{2 \log 2} \quad \text{for all } N \in \mathbb{N} \setminus \{1\}. \]
Proof: Let us first remark the following. By the unique prime factorisation theorem we have that every positive integer $n \ne 1$ can be written as the product of a square number and a product of distinct prime numbers. To see this, note that for each $n \ne 1$ there have to be prime numbers $p_1, \ldots, p_k, p_{k+1}, \ldots, p_{k+l}$, as well as $k$ odd numbers $m_1, \ldots, m_k$ and $l$ even numbers $m_{k+1}, \ldots, m_{k+l}$, such that (note, since the $m_1, \ldots, m_k$ are odd, we have that $\frac{m_i - 1}{2}$ is either equal to zero or a positive integer (for $i = 1, \ldots, k$))
\[ n = \prod_{i=1}^{k+l} p_i^{m_i} = \prod_{i=1}^{k} p_i^{m_i} \cdot \prod_{i=k+1}^{k+l} p_i^{m_i} = p_1 \cdot \ldots \cdot p_k \cdot \prod_{i=1}^{k} p_i^{m_i - 1} \cdot \prod_{i=k+1}^{k+l} p_i^{m_i} \]
\[ = p_1 \cdot \ldots \cdot p_k \cdot \left( \prod_{i=1}^{k} p_i^{\frac{m_i - 1}{2}} \cdot \prod_{i=k+1}^{k+l} p_i^{\frac{m_i}{2}} \right)^2. \]
Now, let $N \in \mathbb{N} \setminus \{1\}$ be given, and consider all integers between 2 and $N$. We give an upper bound for the number of ways one can possibly write the numbers between 2 and $N$ in the form of a product of a square number and a product of distinct prime numbers.
• Question: How many distinct squares are there among the numbers between 2 and $N$?
Answer: Less than $\sqrt{N}$.
• Question: How many distinct products of distinct prime numbers are there at most among the numbers between 2 and $N$?
Answer: Less than
\[ \binom{\pi(N)}{1} + \binom{\pi(N)}{2} + \binom{\pi(N)}{3} + \ldots + \binom{\pi(N)}{\pi(N) - 1} + 1 = 2^{\pi(N)} - 1 < 2^{\pi(N)}. \]
(Recall that the ‘binomial coefficients’ $\binom{n}{k}$ are defined by $\binom{n}{k} = \frac{n!}{(n-k)! \cdot k!}$.)
Combining these two estimates, we obtain
\[ N < \sqrt{N}\; 2^{\pi(N)}. \]
Solving this for $\pi(N)$ then gives the result. □
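The bound of Lemma 4.6 is very weak compared with the true size of $\pi(N)$, which a short computation makes visible. The following sketch counts primes by trial division; the names `prime_count` and `lower_bound` are chosen here for illustration.

```python
import math


def prime_count(limit):
    """Return pi(limit), the number of primes p <= limit, by trial division."""
    count = 0
    for n in range(2, limit + 1):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def lower_bound(limit):
    """The bound log(N) / (2 log 2) from Lemma 4.6."""
    return math.log(limit) / (2 * math.log(2))


for N in (2, 10, 100, 1000):
    print(N, prime_count(N), round(lower_bound(N), 3))
```

For example, at $N = 1000$ the bound is just under 5, while the count itself is 168.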
Already Euclid knew that the set of primes is infinite, and a much more recent and famous result (by Jacques Hadamard (1865-1963) and C.-J. de la Vallée-Poussin (1866-1962)) shows that the density of primes is ruled by the following law. This law had already been conjectured before by Gauss. Since the proof of this law is rather involved (and strictly speaking this result is not a purely number theoretical result, since most proofs make heavy use of complex analysis), here we can only state this very important law.
Theorem 4.7
\[ \lim_{N \to \infty} \frac{\pi(N)}{N / \log N} = 1. \]
Getting a more exact figure for the function π is presumably one of the most important problems in mathematics. Here, a very big step forward would be to verify the Riemann Hypothesis (given that it is true). Here we use li to denote the function which is given by
\[ \mathrm{li}(N) = \int_2^N \frac{1}{\log x} \, dx. \]
Note that $\lim_{N \to \infty} \mathrm{li}(N) / (N / \log(N)) = 1$.
• Riemann Hypothesis:
There exists a constant $C > 0$ such that for all $N$ sufficiently large
\[ \mathrm{li}(N) - C N^{\frac{1}{2} + \varepsilon} \le \pi(N) \le \mathrm{li}(N) + C N^{\frac{1}{2} + \varepsilon} \quad \text{for all } \varepsilon > 0. \]
The Riemann Hypothesis is Problem 8 on Hilbert’s famous list of 23 problems (Paris,
1900). In the meanwhile, mathematicians found numerous ways to state the Riemann Hypothesis in equivalent forms which come in completely different disguises.
Let us give one of these.
Definition 4.8 For n ∈ N let
\[ F_n := \left\{ \frac{p}{q} \in [0, 1] : 0 \le p < q \le n, \text{ and } p, q \text{ are coprime} \right\}. \]
If we assume that the set Fn = {f1 , . . . , fkn } is ordered such that f1 < f2 < . . . < fkn ,
then Fn is called the Farey sequence of order n. Note that in here kn denotes the
number of elements in Fn .
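A Farey sequence in the sense of Definition 4.8 can be generated directly from the definition. The sketch below follows the condition $0 \le p < q \le n$ exactly as stated there, so the value 1 is not included; the function name `farey` is an illustrative choice.

```python
from fractions import Fraction
from math import gcd


def farey(n):
    """Return the ordered list of reduced fractions p/q with 0 <= p < q <= n."""
    return sorted(
        Fraction(p, q)
        for q in range(1, n + 1)
        for p in range(0, q)
        if gcd(p, q) == 1
    )


sequence = farey(5)
print([str(f) for f in sequence])
print("k_5 =", len(sequence))  # k_5 = 10
```

Filtering on `gcd(p, q) == 1` ensures every fraction is listed once, in lowest terms; in particular 0 appears only as 0/1.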
In the 1920’s Franel and Landau found the following elementary way of formulating the Riemann Hypothesis.
• Riemann Hypothesis:
There exists a constant $C_0 > 0$ such that for all $n$ sufficiently large
\[ n^{\frac{1}{2} + \varepsilon} - C_0 \le \sum_{i=1}^{k_n} \left| f_i - \frac{i - 1}{k_n} \right| \le n^{\frac{1}{2} + \varepsilon} + C_0 \quad \text{for all } \varepsilon > 0. \]
Finally, note that the original statement of the Riemann Hypothesis was formulated in terms of the Riemann zeta-function ζ(z) which is given by
\[ \zeta(z) := \sum_{n=1}^{\infty} \frac{1}{n^z} \quad \text{for } z \in \mathbb{C}. \]
For real values $z$ this series is called the harmonic series ζ(s), that is
\[ \zeta(s) := \sum_{n=1}^{\infty} \frac{1}{n^s} \quad \text{for } s \in \mathbb{R}. \]
Of course, this series converges if and only if s > 1 (note for complex values the
situation is by no means as simple as this!). Nevertheless, to compute the actual
values of this series for particular s > 1 is usually a problem. It can be done for
instance for even numbers s. Here we have the following result of Euler.
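Theorem 4.9 (Euler)
\[ \zeta(2n) = (-1)^{n-1} \frac{(2\pi)^{2n}}{2 (2n)!} B_{2n} \quad \text{for all } n \in \mathbb{N}, \]
where the $B_k$ are the Bernoulli numbers given by
\[ \frac{z}{e^z - 1} = \sum_{k=0}^{\infty} \frac{B_k}{k!} z^k. \]
For instance, we have
\[ \zeta(2) = \frac{\pi^2}{6}, \quad \zeta(4) = \frac{\pi^4}{90}, \quad \zeta(6) = \frac{\pi^6}{945}. \]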
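Euler's formula can be evaluated exactly once the Bernoulli numbers are known. The sketch below computes them from the recurrence $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ (for $m \ge 1$), a standard consequence of the generating function above that is not derived in these notes; the function names are illustrative. It returns the rational factor $c$ in $\zeta(2n) = c \cdot \pi^{2n}$.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] for z/(e^z - 1) = sum_k B_k z^k / k!."""
    b = [Fraction(1)]
    for m in range(1, count):
        # sum_{j=0}^{m} C(m+1, j) B_j = 0 determines B_m from the earlier values
        acc = sum(comb(m + 1, j) * b[j] for j in range(m))
        b.append(-acc / (m + 1))
    return b


def zeta_even_coefficient(n):
    """Return the rational c with zeta(2n) = c * pi^(2n), following Theorem 4.9."""
    b = bernoulli_numbers(2 * n + 1)
    return (-1) ** (n - 1) * Fraction(2 ** (2 * n), 2 * factorial(2 * n)) * b[2 * n]


for n in (1, 2, 3):
    print(2 * n, zeta_even_coefficient(n))  # 1/6, 1/90, 1/945
```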
For odd numbers s things are far less well-understood. It was a mathematical
sensation, when in 1978 Apéry proved
ζ(3) is irrational.
Also, for instance, one knows (this is a result of Zudilin)
One of ζ(5), ζ(7), ζ(9), ζ(11) has to be an irrational number.
Furthermore, it is known that (this is a result of Rivoal)
ζ(2n + 1) is irrational for infinitely many n ∈ N.
Twin Primes.
Definition 4.10 A couple of primes (p, q) are said to be twin primes if q = p + 2.
Except for the couple (2, 3), this is clearly the smallest possible distance between two
primes.
For example (3, 5), (5, 7), (11, 13), (17, 19), (29, 31), ..., (419, 421), ... are twin primes.
So far the following is not known.
• Conjecture: There are infinitely many twin primes.
Based on heuristic considerations, G. H. Hardy (1877-1947) and J. E. Littlewood
(1885-1977) developed a law (the twin prime conjecture (1922)) to estimate the
density of twin primes.
The prime number theorem says that the probability that a number $N$ is prime is about $1/\log(N)$. Therefore, if the probability that $N + 2$ is prime were independent of the probability that $N$ is prime, we should have, for the twin prime counting function $\pi_2(N) := \#\{(p, q) : (p, q) \text{ are twin primes, and } p, q \le N\}$, the approximation
\[ \lim_{N \to \infty} \frac{\pi_2(N)}{N / (\log(N))^2} = 1. \]
A more careful analysis shows that this is too simple. In fact we have the following more accurate conjecture.
• Twin prime conjecture:
\[ \lim_{N \to \infty} \frac{\pi_2(N)}{N / (\log(N))^2} = 2 C_2, \]
where $C_2 = 0.6601618158468695739278121100145\ldots$ is the twin prime constant.
The following rather remarkable result from 1919 is due to the Norwegian mathematician V. Brun (1885-1978). Although it is not known if there are infinitely many
twin prime numbers, Brun showed that the sum of the inverses of all twin primes
converges to a constant.
Theorem 4.11
\[ \sum_{(p, q) \text{ twin primes}} \left( \frac{1}{p} + \frac{1}{q} \right) = B_2, \]
where $B_2 = 1.902160583104\ldots$ is the Brun constant.
A few more Conjectures.
There are still loads of other (old and new) unsolved problems concerning prime
numbers. Here is just a very tiny list of some of them.
1. Are there infinitely many primes of the form $N^2 + 1$? (Dirichlet proved that every arithmetic progression $(a + bn)_{n \in \mathbb{N}}$ with $a, b$ coprime contains infinitely many primes.)
2. Is there always a prime between $N^2$ and $(N + 1)^2$? (The fact that there is always a prime between $N$ and $2N$ was proved by Chebyshev.)
3. $N^2 - N + 41$ is prime for $0 \le N \le 40$. Are there infinitely many primes of this form? The same question applies to $N^2 - 79N + 1601$ which is prime for $0 \le N \le 79$.
4. Are there infinitely many primes of the form $\left( \prod_{p \text{ prime},\, p \le N} p \right) + 1$?
5. Are there infinitely many primes of the form $\left( \prod_{p \text{ prime},\, p \le N} p \right) - 1$?
6. Are there infinitely many primes of the form $N! + 1$?
7. Are there infinitely many primes of the form $N! - 1$?
8. If $p$ is a prime number, is $2^p - 1$ then never divisible by the square of a prime number?
9. Does the Fibonacci sequence contain infinitely many prime numbers?
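The two claims in item 3 about prime values of quadratic polynomials can be confirmed by direct computation. The sketch below uses plain trial division; the helper name `is_prime` and the variable names are illustrative.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the small values checked here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


first = [N * N - N + 41 for N in range(0, 41)]          # 0 <= N <= 40
second = [N * N - 79 * N + 1601 for N in range(0, 80)]  # 0 <= N <= 79

print(all(is_prime(v) for v in first))    # True
print(all(is_prime(v) for v in second))   # True
# The runs stop right after the stated ranges: both next values equal 41^2.
print(41 * 41 - 41 + 41, 80 * 80 - 79 * 80 + 1601)  # 1681 1681
```

The second polynomial takes the same values as the first after the shift $N \mapsto N - 40$, which is why both runs end at $41^2 = 1681$.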