Introduction to Number Theory
“Elementary Diophantine Approximation”
B.O. Stratmann

Contents.
1. Quick Review on Continued Fractions
2. Elementary Diophantine Approximations
   2.1 Hurwitz’s Theorem
   2.2 Lagrange and Markov spectra
   2.3 Badly approximable numbers
3. Metrical Diophantine Approximations
   3.1 The Borel-Cantelli Lemma
   3.2 Metrical Diophantine approximations
4. A first Trip through the Zoo of Prime Numbers

1 Quick Review on Continued Fractions

Every irrational number $\xi$ can be approximated by a sequence of rationals $p_n/q_n$ which are ‘good approximations’ in the sense that there exists a constant $c > 0$ such that
\[ \left| \xi - \frac{p_n}{q_n} \right| < \frac{c}{q_n^2} \quad \text{for all } n \in \mathbb{N}. \]
The rationals $p_n/q_n$ are called convergents (or ‘approximants’). For
\[ \xi = [a_0; a_1, a_2, \ldots] = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}}, \]
they are given by $p_n/q_n = [a_0; a_1, a_2, \ldots, a_n]$ (we shall always assume that $a_0 \ge 0$ and $a_{i+1} \ge 1$ for all $i \in \mathbb{N}$).

There are useful formulas which relate these quantities. For $n \in \mathbb{N}$ we have (with $p_{-1} = q_0 = 1$, $q_{-1} = 0$ and $p_0 = a_0$)
• $p_{n+1} = a_{n+1} p_n + p_{n-1}$;
• $q_{n+1} = a_{n+1} q_n + q_{n-1}$;
• $q_n p_{n-1} - p_n q_{n-1} = (-1)^n$.

Definition 1.1 For $\xi = [a_0; a_1, a_2, \ldots]$ and $n \in \mathbb{N}$, let $r_n$ and $s_n$ be defined as follows.
\[ r_n := [a_n; a_{n+1}, a_{n+2}, \ldots] \quad \text{and} \quad s_n := \frac{q_{n-1}}{q_n}. \]

For these quantities the following holds (for $n \in \mathbb{N}$).
• $r_n = a_n + \dfrac{1}{r_{n+1}}$;
• Since $q_{n+1} = a_{n+1} q_n + q_{n-1}$, we have for the ratio
\[ \frac{q_{n+1}}{q_n} = a_{n+1} + \frac{1}{q_n/q_{n-1}}. \]
Clearly, this process may be continued until $q_1/q_0 = a_1$ is reached. Therefore,
\[ s_{n+1} = \frac{1}{[a_{n+1}; a_n, \ldots, a_1]}. \]

Theorem 1.2 For $\xi = [a_0; a_1, a_2, \ldots]$ and $n \in \mathbb{N}$, we have
\[ \xi = \frac{p_n r_{n+1} + p_{n-1}}{q_n r_{n+1} + q_{n-1}}. \]

Proof: (by induction) For $n = 0$ we have
\[ \frac{p_0 r_1 + p_{-1}}{q_0 r_1 + q_{-1}} = \frac{a_0 r_1 + 1}{r_1} = a_0 + \frac{1}{r_1} = \xi. \]
Now assume that the statement is true for $n$. Then, using $r_{n+1} = a_{n+1} + 1/r_{n+2}$,
\[ \xi = \frac{p_n r_{n+1} + p_{n-1}}{q_n r_{n+1} + q_{n-1}} = \frac{p_n \left( a_{n+1} + \frac{1}{r_{n+2}} \right) + p_{n-1}}{q_n \left( a_{n+1} + \frac{1}{r_{n+2}} \right) + q_{n-1}} = \frac{p_n a_{n+1} r_{n+2} + p_n + p_{n-1} r_{n+2}}{q_n a_{n+1} r_{n+2} + q_n + q_{n-1} r_{n+2}} \]
\[ = \frac{(p_n a_{n+1} + p_{n-1}) r_{n+2} + p_n}{(q_n a_{n+1} + q_{n-1}) r_{n+2} + q_n} = \frac{p_{n+1} r_{n+2} + p_n}{q_{n+1} r_{n+2} + q_n}. \]
□

Corollary 1.3
\[ \left| \xi - \frac{p_n}{q_n} \right| = \frac{1}{q_n^2 (r_{n+1} + s_n)} \quad \text{for all } n \in \mathbb{N}. \]

Proof:
\[ \left| \xi - \frac{p_n}{q_n} \right| = \left| \frac{p_n r_{n+1} + p_{n-1}}{q_n r_{n+1} + q_{n-1}} - \frac{p_n}{q_n} \right| = \left| \frac{p_n q_n r_{n+1} + p_{n-1} q_n - p_n q_n r_{n+1} - p_n q_{n-1}}{(q_n r_{n+1} + q_{n-1}) q_n} \right| \]
\[ = \frac{|q_n p_{n-1} - p_n q_{n-1}|}{q_n^2 (r_{n+1} + s_n)} = \frac{1}{q_n^2 (r_{n+1} + s_n)}. \]
□

2 Elementary Diophantine Approximations

2.1 Hurwitz’s Theorem

Theorem 2.1 For all irrationals $\xi = [a_0; a_1, a_2, \ldots]$ and for all $n \in \mathbb{N}$, we have that
\[ \left| \xi - \frac{p_i}{q_i} \right| \le \frac{1}{2 q_i^2} \]
is fulfilled for at least one element $i \in \{n, n+1\}$.

Proof: By way of contradiction, assume that the statement in the theorem is false. This means that
\[ \left| \xi - \frac{p_i}{q_i} \right| > \frac{1}{2 q_i^2} \]
holds simultaneously for $i = n$ and $i = n+1$. Since $\left| \xi - \frac{p_i}{q_i} \right| = \frac{1}{q_i^2 (r_{i+1} + s_i)}$, this is equivalent to
\[ r_{i+1} + s_i < 2 \quad \text{for } i = n, n+1. \]
(a) For $i = n$ we get $2 > r_{n+1} + s_n = a_{n+1} + \frac{1}{r_{n+2}} + s_n$, and hence (recall that $a_{n+1} + s_n = 1/s_{n+1}$),
\[ \frac{1}{r_{n+2}} < 2 - (a_{n+1} + s_n) = 2 - \frac{1}{s_{n+1}}. \]
(b) For $i = n+1$ we get $r_{n+2} < 2 - s_{n+1}$.
Multiplying (a) and (b) (both left-hand sides are positive), we derive
\[ 1 < (2 - s_{n+1}) \left( 2 - s_{n+1}^{-1} \right) = 4 - 2 s_{n+1} - 2 s_{n+1}^{-1} + 1, \]
and hence $0 < 2 - s_{n+1} - s_{n+1}^{-1}$. Multiplying by $s_{n+1} > 0$, this implies
\[ 0 > (s_{n+1} - 1)^2, \]
which is a contradiction. □

Theorem 2.2 For all irrationals $\xi = [a_0; a_1, a_2, \ldots]$ and for all $n \in \mathbb{N}$, we have that
\[ \left| \xi - \frac{p_i}{q_i} \right| \le \frac{1}{\sqrt{5}\, q_i^2} \]
is fulfilled for at least one element $i \in \{n, n+1, n+2\}$. Note, the number $\frac{1}{\sqrt{5}}$ is called the Hurwitz number.

Proof: As in the proof of the previous theorem (with 2 replaced by $\sqrt{5}$), assume by way of contradiction that for each $i \in \{n, n+1, n+2\}$ we have
\[ r_{i+1} + s_i < \sqrt{5}. \]
Proceeding for $i = n$ and $i = n+1$ as in (a) and (b) in the previous proof, we derive
\[ s_{n+1}^2 - \sqrt{5}\, s_{n+1} + 1 < 0. \quad (1) \]
Analogously, for $i = n+1$ and $i = n+2$, we get
\[ s_{n+2}^2 - \sqrt{5}\, s_{n+2} + 1 < 0. \quad (2) \]
By the quadratic formula, (1) and (2) give (with $\gamma := \frac{\sqrt{5}+1}{2}$ and $\gamma^* := \frac{\sqrt{5}-1}{2}$)
\[ \gamma^* < s_i < \gamma \quad \text{for } i = n+1, n+2. \quad (3) \]
Using this, we get
\[ s_{n+2} = \frac{1}{a_{n+2} + s_{n+1}} \le \frac{1}{1 + s_{n+1}} < \frac{1}{1 + \gamma^*} = \gamma^*, \]
which contradicts (3). □
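The recursion for $p_n, q_n$ and the statements of Theorems 2.1 and 2.2 can be checked numerically. The following is only a sketch, not part of the original notes: the function names `convergents`, `tail_value` and `r_plus_s` are hypothetical, and $r_{n+1}$ is approximated in floating point from a long but finite list of partial quotients. By Corollary 1.3, the inequality $|\xi - p_i/q_i| \le 1/(K q_i^2)$ is the same as $r_{i+1} + s_i \ge K$, so the script simply inspects the values $r_{n+1} + s_n$ for the golden mean $[1; 1, 1, \ldots]$.

```python
from math import sqrt


def convergents(a):
    """Yield (p_n, q_n) for n = 0, 1, ... from the partial quotients a."""
    p_prev, q_prev = 1, 0      # p_{-1} = 1, q_{-1} = 0
    p, q = a[0], 1             # p_0 = a_0, q_0 = 1
    yield p, q
    for a_next in a[1:]:
        p, p_prev = a_next * p + p_prev, p
        q, q_prev = a_next * q + q_prev, q
        yield p, q


def tail_value(a, n):
    """Approximate r_n = [a_n; a_{n+1}, ...] using only the finite list a."""
    r = float(a[-1])
    for x in reversed(a[n:-1]):
        r = x + 1.0 / r
    return r


def r_plus_s(a, count):
    """Return the list of r_{n+1} + s_n for n = 0, ..., count - 1."""
    qs = [q for _, q in convergents(a[:count])]
    values = []
    for n in range(count):
        s_n = qs[n - 1] / qs[n] if n > 0 else 0.0   # s_0 = q_{-1}/q_0 = 0
        values.append(tail_value(a, n + 1) + s_n)
    return values


golden = [1] * 80                      # finite stand-in for [1; 1, 1, ...]
values = r_plus_s(golden, 20)
# Theorem 2.1: among any two consecutive n, some r_{n+1} + s_n is >= 2.
ok_2 = all(max(values[n:n + 2]) >= 2 for n in range(len(values) - 1))
# Theorem 2.2: among any three consecutive n, some r_{n+1} + s_n is >= sqrt(5).
ok_sqrt5 = all(max(values[n:n + 3]) >= sqrt(5) for n in range(len(values) - 2))
print(ok_2, ok_sqrt5)
```

The check is restricted to the first 20 indices so that the truncation of the continued fraction and floating-point rounding stay far below the gaps being compared.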
Theorem 2.3 (Hurwitz’s theorem) For the golden mean $\gamma := \frac{\sqrt{5}+1}{2} = [1; 1, 1, 1, \ldots]$ we have that
\[ \left| \gamma - \frac{p_n}{q_n} \right| \le \frac{C}{q_n^2} \]
is satisfied for at most finitely many reduced $p_n/q_n$ if and only if $C < \frac{1}{\sqrt{5}}$.

Proof: First note that $r_n = \gamma$ for all $n \in \mathbb{N}$. Secondly, note that
\[ s_n^{-1} = [a_n; a_{n-1}, \ldots, a_1] = \gamma + \left( [a_n; a_{n-1}, \ldots, a_1] - [a_n; a_{n-1}, \ldots] \right) = \gamma + \delta_n, \]
where for $\delta_n$ we have that $\lim_{n \to \infty} \delta_n = 0$. Hence, it follows
\[ s_n = \frac{1}{\gamma + \delta_n} = \frac{1}{\gamma} + \frac{1}{\gamma + \delta_n} - \frac{1}{\gamma} = \frac{1}{\gamma} + \frac{-\delta_n}{\gamma^2 + \gamma \delta_n} = \frac{1}{\gamma} + \epsilon_n, \]
where for $\epsilon_n$ we have that $\lim_{n \to \infty} \epsilon_n = 0$. These two observations then give
\[ r_{n+1} + s_n = \gamma + \frac{1}{\gamma} + \epsilon_n = \sqrt{5} + \epsilon_n \to \sqrt{5} \quad (\text{for } n \to \infty). \]
Now, if $C < \frac{1}{\sqrt{5}}$ is given, say $C = \frac{1}{\sqrt{5} + \rho}$ for some fixed $\rho > 0$, then
\[ \left| \gamma - \frac{p_n}{q_n} \right| = \frac{1}{q_n^2 (r_{n+1} + s_n)} = \frac{1}{q_n^2 (\sqrt{5} + \epsilon_n)} \le \frac{1}{q_n^2 (\sqrt{5} + \rho)}, \]
where the latter inequality can be fulfilled only for finitely many $n$ (due to the fact that $\sqrt{5} + \rho \le \sqrt{5} + \epsilon_n$ can be satisfied for at most finitely many $n$). Conversely, if $C \ge \frac{1}{\sqrt{5}}$, then Theorem 2.2 shows that the inequality is satisfied for infinitely many $n$. □

Corollary 2.4 For each irrational number $\xi$, the inequality
\[ \left| \xi - \frac{p_n}{q_n} \right| \le \frac{K}{q_n^2} \]
is fulfilled for infinitely many reduced $p_n/q_n$ as long as $K \ge \frac{1}{\sqrt{5}}$.

2.2 Lagrange and Markov Spectra

Definition 2.5
• Let $c$ denote some positive real number. An irrational $\xi$ is called $c$-approximable if and only if
\[ \left| \xi - \frac{p_n}{q_n} \right| < \frac{c}{q_n^2} \]
is satisfied for infinitely many reduced $p_n/q_n$.
• To each irrational number $\xi$ we associate a non-negative real number $\nu(\xi)$, defined by
\[ \nu(\xi) := \inf\{ c > 0 : \xi \text{ is } c\text{-approximable} \}. \]
• Two irrational numbers $\xi, \eta$ are called equivalent (and we write $\xi \sim \eta$) if and only if there exist $k, l \in \mathbb{N}$ such that $r_k(\xi) = r_l(\eta)$ (i.e. eventually the continued fraction expansions of $\xi$ and $\eta$ coincide).

Lemma 2.6 Let $\xi, \eta$ be irrational. If $\xi \sim \eta$, then $\nu(\xi) = \nu(\eta)$.

Proof: Let $\xi, \eta$ be irrational such that $\xi \sim \eta$. Then there exist $k, l \in \mathbb{N}$ such that $r_{k+i}(\xi) = r_{l+i}(\eta)$ for all $i \in \mathbb{N}$. Without loss of generality, assume that $l \ge k$. Then $\xi$ and $\eta$ must be of the form
\[ \xi = [a_0; a_1, \ldots, a_k, c_1, c_2, c_3, \ldots] \quad \text{and} \quad \eta = [b_0; b_1, \ldots, b_k, b_{k+1}, \ldots, b_l, c_1, c_2, c_3, \ldots]. \]
In order to prove the assertion of the lemma, it is sufficient to show that
\[ \left| \frac{1}{r_{k+n}(\xi) + s_{k+n-1}(\xi)} - \frac{1}{r_{l+n}(\eta) + s_{l+n-1}(\eta)} \right| \to 0 \quad \text{for } n \to \infty. \]
For this it is sufficient to show that
\[ \left| r_{k+n}(\xi) + s_{k+n-1}(\xi) - \left( r_{l+n}(\eta) + s_{l+n-1}(\eta) \right) \right| \to 0 \quad \text{for } n \to \infty. \]
But this follows, since $r_{k+n}(\xi) = r_{l+n}(\eta)$ and hence
\[ \left| r_{k+n}(\xi) + s_{k+n-1}(\xi) - \left( r_{l+n}(\eta) + s_{l+n-1}(\eta) \right) \right| = \left| s_{k+n-1}(\xi) - s_{l+n-1}(\eta) \right| \]
\[ = \left| \frac{1}{[c_{n-1}; \ldots, c_1, a_k, \ldots, a_1]} - \frac{1}{[c_{n-1}; \ldots, c_1, b_l, \ldots, b_1]} \right| \to 0 \quad (\text{for } n \to \infty). \]
□

Definition 2.7 An irrational $\xi \sim \gamma$ is called a noble number (i.e. the continued fraction expansion of a noble number has from some stage onward exclusively 1’s as its entries).

Corollary 2.8
• For each irrational number $\xi$ we have that $\nu(\xi) \le \frac{1}{\sqrt{5}}$.
• A number $\xi$ is a noble number if and only if $\nu(\xi) = \frac{1}{\sqrt{5}}$.

Theorem 2.9 Let $N$ be some fixed positive integer. If $\xi = [a_0; a_1, a_2, \ldots]$ is irrational such that for some $n \in \mathbb{N}$ we have that
\[ \left| \xi - \frac{p_i}{q_i} \right| > \frac{1}{\sqrt{N^2 + 4}\; q_i^2} \]
is fulfilled for all $i \in \{n, n+1, n+2\}$, then it follows that $a_{n+2} < N$.

Proof: We proceed as in the proofs of the first two theorems of the section (with 2, resp. $\sqrt{5}$, now replaced by $\sqrt{N^2+4}$). In this way, considering $i = n$ and $i = n+1$, we derive
\[ s_{n+1}^2 - \sqrt{N^2+4}\; s_{n+1} + 1 < 0. \]
And also, by considering $i = n+1$ and $i = n+2$, we derive along the same lines
\[ s_{n+2}^2 - \sqrt{N^2+4}\; s_{n+2} + 1 < 0. \]
Then, using the quadratic formula, we obtain
\[ \frac{\sqrt{N^2+4} - N}{2} < s_i, \quad s_i^{-1} < \frac{\sqrt{N^2+4} + N}{2} \quad \text{for } i = n+1, n+2. \]
Using this, we then have
\[ a_{n+2} = s_{n+1} + a_{n+2} - s_{n+1} = s_{n+2}^{-1} - s_{n+1} < \frac{\sqrt{N^2+4} + N}{2} - \frac{\sqrt{N^2+4} - N}{2} = N. \]
□

Corollary 2.10 For each irrational number $\xi$ and for every $N \in \mathbb{N}$, at least one of the following two alternatives occurs.
Either:
\[ \left| \xi - \frac{p_n}{q_n} \right| \le \frac{1}{\sqrt{N^2+4}\; q_n^2} \]
is fulfilled for infinitely many $p_n/q_n$ (or in other words, $\nu(\xi) \le 1/\sqrt{N^2+4}$),
Or: There exists a number $n_0 > 0$ (depending on $N$ and $\xi$) such that $a_n < N$ for all $n \ge n_0$ (or in other words, $\xi \in B_N$ (see Definition 2.18)).
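The quantity $\nu(\xi)$ can be explored numerically for eventually periodic continued fractions. By Corollary 1.3 we have $q_n^2 |\xi - p_n/q_n| = 1/(r_{n+1} + s_n)$, so $\nu(\xi)$ is the lower limit of this expression over the convergents. The sketch below is an illustration under assumptions, not part of the original notes: the function names are hypothetical, the continued fraction is truncated, and the lower limit is replaced by a minimum over a finite range of indices.

```python
from math import sqrt


def tail_value(a, n):
    """Approximate r_n = [a_n; a_{n+1}, ...] using only the finite list a."""
    r = float(a[-1])
    for x in reversed(a[n:-1]):
        r = x + 1.0 / r
    return r


def nu_estimate(prefix, period, n_terms=60, tail=60):
    """Estimate nu(xi) for xi = [prefix, period, period, ...].

    prefix is the list [a_0, ..., a_k]; period is repeated afterwards.
    The estimate is 1 / max(r_{n+1} + s_n) over n_terms/2 <= n < n_terms.
    """
    a = list(prefix)
    while len(a) < n_terms + tail:
        a.extend(period)
    q_prev, q = 0, 1                      # q_{-1} = 0, q_0 = 1
    sums = []
    for n in range(n_terms):
        sums.append(tail_value(a, n + 1) + q_prev / q)   # r_{n+1} + s_n
        q_prev, q = q, a[n + 1] * q + q_prev             # advance to q_{n+1}
    return 1.0 / max(sums[n_terms // 2:])


print(nu_estimate([1], [1]), 1 / sqrt(5))        # golden mean
print(nu_estimate([7, 3, 1, 4], [1]))            # a noble number, cf. Lemma 2.6
print(nu_estimate([1], [2]))                     # sqrt(2) = [1; 2, 2, 2, ...]
```

The second call illustrates Lemma 2.6: changing finitely many initial entries does not change the estimate, up to numerical error.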
Corollary 2.11 For each non-noble irrational number $\xi$ we have that
\[ \left| \xi - \frac{p_n}{q_n} \right| \le \frac{1}{2\sqrt{2}\; q_n^2} \]
is fulfilled for infinitely many reduced $p_n/q_n$. (Or in other words, for each non-noble number $\xi$ we have $\nu(\xi) \le 1/(2\sqrt{2})$.) In fact, by means of similar ideas as in the proof of Hurwitz’s theorem (Theorem 2.3), one derives that
\[ \nu(\xi) = \frac{1}{2\sqrt{2}} \quad \text{if and only if} \quad \xi \sim \sqrt{2} \; (= [1; 2, 2, 2, \ldots]). \]

Proof: This follows immediately, since if $\xi = [a_0; a_1, \ldots]$ is non-noble then we have $a_n \ge 2$ for infinitely many $n$. Hence, by Theorem 2.9 (with $N = 2$), we have
\[ \left| \xi - \frac{p}{q} \right| \le \frac{1}{\sqrt{8}\; q^2} \]
for infinitely many reduced $p/q$, which implies that $\nu(\xi) \le 1/(2\sqrt{2})$. □

Lemma 2.12 Let $\xi = [a_0; a_1, a_2, \ldots]$ be an irrational number such that $\nu(\xi)$ is neither equal to $\frac{1}{\sqrt{5}}$ nor to $\frac{1}{2\sqrt{2}}$, but such that $\xi \sim [b_0; b_1, b_2, \ldots]$ with $b_i \le 2$ for all $i \in \mathbb{N}$. It then follows that
\[ \nu(\xi) \le \frac{6}{17}. \]

Proof: Without loss of generality we can assume that there are infinitely many 1’s and infinitely many 2’s in $[b_0; b_1, b_2, \ldots]$ (since otherwise $\xi$ would be equivalent to either $\gamma$ or $\sqrt{2}$, and hence $\nu(\xi)$ would be equal to either $1/\sqrt{5}$ or $1/(2\sqrt{2})$). Hence there are infinitely many values $n$ such that $a_n = 1$ and $a_{n+1} = 2$. For these $n$ (large enough that all later entries are at most 2), we have
\[ r_{n+1} + s_n = [a_{n+1}; a_{n+2}, \ldots] + \frac{1}{[a_n; \ldots, a_1]} \ge 2 + \cfrac{1}{2 + \cfrac{1}{1}} + \cfrac{1}{1 + \cfrac{1}{1}} = \frac{7}{3} + \frac{1}{2} = \frac{17}{6}. \]
It follows that $\nu(\xi) \le \frac{6}{17}$. □

Lemma 2.13 If $\xi = [a_0; a_1, \ldots]$ is irrational such that $a_n \ge 3$ for infinitely many $n$, then $\nu(\xi) \le \frac{1}{\sqrt{13}}$.

Proof: By Theorem 2.9 (with $N = 3$) we have that if $a_n \ge 3$ for infinitely many $n$, then for each of these $n$ the inequality
\[ \left| \xi - \frac{p_i}{q_i} \right| \le \frac{1}{\sqrt{3^2 + 4}\; q_i^2} = \frac{1}{\sqrt{13}\; q_i^2} \]
must hold for at least one $i \in \{n-2, n-1, n\}$, and therefore for infinitely many $i$. Hence, $\nu(\xi) \le \frac{1}{\sqrt{13}}$. □

Definition 2.14
• The set of numbers $L := \{ \nu(\xi) : \xi \text{ is irrational} \}$ is called the Lagrange spectrum.
• The set of numbers $M := L \cap \left( \frac{1}{3}, \frac{1}{\sqrt{5}} \right]$ is called the Markov spectrum.

Note, in some books one finds a slightly different use of the term Markov spectrum.
Also note that since $\frac{1}{3} > \frac{1}{\sqrt{13}}$, we have by Lemma 2.13 that irrational numbers $\xi$ whose value $\nu(\xi)$ lies in the Markov spectrum, that is $\xi$ with $\nu(\xi) > 1/3$, must have the property that they are equivalent to irrational numbers whose continued fraction expansions contain exclusively 1’s and 2’s.

As an immediate consequence of Hurwitz’s Theorem (Theorem 2.3), we obtain the following theorem.

Theorem 2.15
\[ L \subset \left[ 0, \frac{1}{\sqrt{5}} \right]. \]

Proposition 2.16 For an irrational number $\xi$ we have that $\nu(\xi) \in M$ if and only if $\xi \sim [a_0; a_1, a_2, \ldots]$, for $[a_0; a_1, a_2, \ldots]$ such that $a_n \le 2$ for all $n \in \mathbb{N}$.

One can say much more about the structure of the Markov spectrum. It has the following very interesting properties. The proof of this theorem is slightly more involved and will be omitted.

Theorem 2.17 The Markov spectrum $M$ consists of a countable set of numbers in $(1/3, 1/\sqrt{5}]$, and these numbers accumulate only at the value $\frac{1}{3}$.

In fact, much more can be said about the Markov and the Lagrange spectrum. Nevertheless, there are still plenty of fascinating open problems concerning these spectra. We now list a few known results about them. Some of these we have already obtained.

• Each number in the Markov spectrum is of the form $1 \big/ \sqrt{9 - \frac{4}{m^2}}$, where $m$ is a positive integer solution of the equation $m^2 + k^2 + l^2 = 3mkl$, for $k$ and $l$ some positive integers. It is known that there are infinitely many solutions $m$ of this equation. Among the first numbers in the Markov spectrum are
\[ \frac{1}{\sqrt{5}} = \frac{1}{\sqrt{9 - \frac{4}{1^2}}}, \quad \frac{1}{2\sqrt{2}} = \frac{1}{\sqrt{9 - \frac{4}{2^2}}}, \quad \frac{5}{\sqrt{221}} = \frac{1}{\sqrt{9 - \frac{4}{5^2}}}, \quad \frac{1}{\sqrt{9 - \frac{4}{13^2}}}, \quad \frac{1}{\sqrt{9 - \frac{4}{29^2}}}, \]
\[ \frac{1}{\sqrt{9 - \frac{4}{34^2}}}, \quad \frac{1}{\sqrt{9 - \frac{4}{89^2}}}, \quad \frac{1}{\sqrt{9 - \frac{4}{194^2}}}, \quad \frac{1}{\sqrt{9 - \frac{4}{433^2}}}, \quad \ldots \]
Note that since $1 \big/ \sqrt{9 - \frac{4}{m^2}}$ tends to $1/3$ (for $m$ tending to infinity), it is clear that the Markov spectrum accumulates at $1/3$.
• We have that $\nu(x) \ge \frac{1}{\sqrt{12}}$ if and only if $x$ is equivalent to a number whose continued fraction expansion contains exclusively 1’s and 2’s.
• In the interval $\left( \frac{1}{\sqrt{13}}, \frac{1}{\sqrt{12}} \right)$ the Lagrange spectrum is empty. That is
\[ L \cap \left( \frac{1}{\sqrt{13}}, \frac{1}{\sqrt{12}} \right) = \emptyset. \]
• Let $f$ be the so-called Freiman number, which is given by
\[ f := \frac{491993569}{2221564096 + 283748 \sqrt{462}}. \]
One then knows that in the interval $[0, f)$ the Lagrange spectrum is continuous. This means that for every $c \in [0, f)$ there exists an irrational number $x$ such that $\nu(x) = c$.

2.3 Badly Approximable Numbers

Definition 2.18 For $N \in \mathbb{N}$ define
\[ B_N := \{ \xi = [a_0; a_1, a_2, \ldots] \text{ irrational} : \exists\, n_0 > 0 \text{ such that } a_n < N \ \forall\, n \ge n_0 \}. \]
The set of badly approximable numbers $B$ is then defined as
\[ B := \bigcup_{N > 0} B_N = \{ \xi \text{ irrational} : \exists\, N > 0 \text{ such that } \xi \in B_N \}. \]

In other words, $\xi \in B_N$ if and only if $\xi \sim \eta$, for some $\eta = [b_0; b_1, \ldots]$ with $b_i < N$ for all $i \ge 1$. Furthermore, $\xi \in B$ if and only if there exists $M \in \mathbb{N}$ such that $\xi \in B_M$.

The following lemma clarifies why the elements in $B$ are called ‘badly approximable’.

Lemma 2.19
• If $\xi$ is an irrational number such that $\xi \notin B_N$ for some $N \in \mathbb{N}$, then
\[ \left| \xi - \frac{p_n}{q_n} \right| \le \frac{1}{\sqrt{N^2+4}\; q_n^2} \]
is fulfilled for infinitely many reduced $p_n/q_n$ (i.e. $\nu(\xi) \le 1/\sqrt{N^2+4}$).
• For each $\xi \in B$ there exists a constant $C > 0$ such that for all $n \in \mathbb{N}$ we have
\[ \left| \xi - \frac{p_n}{q_n} \right| > \frac{C}{q_n^2}. \]

Proof: The first part is an immediate consequence of Theorem 2.9. For the second part, consider $\xi = [a_0; a_1, \ldots] \in B$. Then there exist numbers $M$ and $m_0$ such that $a_n < M$ for all $n \ge m_0$. Using this, we derive $r_{n+1} + s_n < M + 1 + 1 = M + 2$, and hence
\[ \left| \xi - \frac{p_n}{q_n} \right| > \frac{1}{(M+2)\, q_n^2} \quad \text{for all } n \ge m_0. \]
For $n < m_0$ we have that there exists a number $c_n > 0$ such that
\[ \left| \xi - \frac{p_n}{q_n} \right| > \frac{c_n}{q_n^2}. \]
If we define $C := \min\{ 1/(M+2), c_0, c_1, \ldots, c_{m_0 - 1} \}$ (i.e. $C$ is the smallest number in this finite set of numbers), then the result follows. □

3 Metrical Diophantine Approximations

In this section we restrict the investigations to the unit interval $I := [0, 1)$.

3.1 The Borel-Cantelli Lemma

Definition 3.1 A set $\Sigma$ of subsets of $I$ is called a $\sigma$-algebra of $I$ if the following conditions are satisfied.
• $I \in \Sigma$;
• If $A \in \Sigma$, then $A^c \in \Sigma$ (where $A^c := I \setminus A$ denotes the complement of $A$ in $I$);
• $\bigcup_{n \in \mathbb{N}} A_n \in \Sigma$ for all sequences $(A_n)$ with $A_n \in \Sigma$ (for all $n \in \mathbb{N}$).

Definition 3.2 The Borel-$\sigma$-algebra $\Sigma_0$ of $I$ is the smallest $\sigma$-algebra of $I$ which contains all possible intervals of $I$ of the form $[x, y)$ (for $0 \le x < y < 1$). The elements of $\Sigma_0$ are called Borel sets.

Definition 3.3 Each element in $\Sigma_0$ can be measured by the Lebesgue measure $\lambda$ in $I$. In particular, if $A$ is an interval (i.e. $A = [x, y)$ for some $0 \le x < y < 1$), then $\lambda(A)$ is just the ‘length’ of that interval (i.e. $\lambda(A) = \lambda([x, y)) = y - x$).

Properties:
• $\lambda(I) = 1$;
• $\lambda(A) \ge 0$ for all $A \in \Sigma_0$;
• $\lambda\left( \bigcup_{n \in \mathbb{N}} A_n \right) = \sum_{n \in \mathbb{N}} \lambda(A_n)$ for every sequence $(A_n)$ of pairwise disjoint elements $A_n \in \Sigma_0$ (i.e. $A_i \cap A_j = \emptyset \ \forall\, i \ne j$);
• For $A \in \Sigma_0$ we have: $\lambda(A) = 0$ if and only if for all $\epsilon > 0$ there exists a sequence $(A_n)$ of elements $A_n \in \Sigma_0$ such that
\[ A \subset \bigcup_{n \in \mathbb{N}} A_n \quad \text{and} \quad \sum_{n \in \mathbb{N}} \lambda(A_n) < \epsilon. \]

Note, every countable set in $I$ is of zero $\lambda$-measure. More generally, in order to find out if a given Borel set is of zero $\lambda$-measure, the following theorem is often helpful.

Theorem 3.4 (Borel-Cantelli lemma) If $(A_n)$ is a sequence of elements $A_n \in \Sigma_0$ such that $\sum_{n \in \mathbb{N}} \lambda(A_n) < \infty$, then we have
\[ \lambda(A_\infty) = 0, \]
where the lim sup-set $A_\infty$ is defined by $A_\infty := \{ \xi \in I : \xi \in A_n \text{ for infinitely many } n \}$.

Proof: The convergence of $\sum_{n \in \mathbb{N}} \lambda(A_n)$ implies that for each $\epsilon > 0$ there exists an integer $n_0$ such that
\[ \sum_{n \ge n_0} \lambda(A_n) < \epsilon. \]
Now note that by definition of $A_\infty$, we have that
\[ A_\infty \subset \bigcup_{n \ge n_0} A_n. \]
Hence, it follows that
\[ \lambda(A_\infty) \le \lambda\left( \bigcup_{n \ge n_0} A_n \right) \le \sum_{n \ge n_0} \lambda(A_n) < \epsilon. \]
Since $\epsilon > 0$ was arbitrary, this gives $\lambda(A_\infty) = 0$. □

3.2 Metrical Diophantine Approximations

Definition 3.5 Let $a_1, \ldots, a_n \in \mathbb{N} \setminus \{0\}$ be given. The $n$-cylinder $I(a_1, \ldots, a_n)$ (also called ‘fundamental interval of order $n$’) is defined by (here we use the common notation $[x_1, x_2, \ldots] := [0; x_1, x_2, \ldots]$)
\[ I(a_1, \ldots, a_n) := \{ \xi = [x_1, x_2, x_3, \ldots] \in I \text{ irrational} : x_i = a_i \text{ for all } 1 \le i \le n \}. \]

Properties:
• For every $\xi \in I(a_1, \ldots, a_n)$ we have
\[ \xi = \frac{p_n r_{n+1}(\xi) + p_{n-1}}{q_n r_{n+1}(\xi) + q_{n-1}}, \]
where $p_n, p_{n-1}, q_n, q_{n-1}$ are fixed (depending only on $a_1, \ldots, a_n$).
• $I(a_1, \ldots, a_n)$ is the set of irrational numbers in the interval
\[ \begin{cases} \left[ \dfrac{p_n}{q_n}, \dfrac{p_n + p_{n-1}}{q_n + q_{n-1}} \right) & \text{for } n \text{ even}, \\[2ex] \left( \dfrac{p_n + p_{n-1}}{q_n + q_{n-1}}, \dfrac{p_n}{q_n} \right] & \text{for } n \text{ odd}. \end{cases} \]
• $\lambda(I(a_1, \ldots, a_n)) = \dfrac{1}{q_n^2 (1 + s_n)}$.

Proof: These properties are immediate consequences of the following. By Theorem 1.2, we have
\[ \xi = \frac{p_n r_{n+1}(\xi) + p_{n-1}}{q_n r_{n+1}(\xi) + q_{n-1}} = \frac{p_n + p_{n-1}/r_{n+1}(\xi)}{q_n + q_{n-1}/r_{n+1}(\xi)}. \]
Since $1 \le r_{n+1}(\xi)$ and since $r_{n+1}(\xi)$ can get arbitrarily large if $\xi$ varies, we see that
\[ \lambda(I(a_1, \ldots, a_n)) = \left| \frac{p_n + p_{n-1}}{q_n + q_{n-1}} - \frac{p_n}{q_n} \right| = \frac{|p_n q_n + p_{n-1} q_n - p_n q_n - q_{n-1} p_n|}{q_n^2 (1 + s_n)} = \frac{|p_{n-1} q_n - q_{n-1} p_n|}{q_n^2 (1 + s_n)} = \frac{1}{q_n^2 (1 + s_n)}. \]
Furthermore, observe that
\[ \frac{p_n}{q_n} < \frac{p_n + p_{n-1}}{q_n + q_{n-1}} \quad \text{if and only if} \quad p_n q_{n-1} - q_n p_{n-1} < 0. \]
But we know (since $q_n p_{n-1} - p_n q_{n-1} = (-1)^n$) that the left hand side of the latter inequality is equal to $-1$ if and only if $n$ is even. □

For the next theorem, recall the definition of the set of badly approximable irrational numbers (Definition 2.18).

Theorem 3.6 For $B' := B \cap I$ we have $\lambda(B') = 0$.

Proof: For $n, N \in \mathbb{N}$ we define the sets
\[ A_N := \{ \xi = [a_1, a_2, \ldots] \in I \text{ irrational} : a_i < N \ \forall\, i \in \mathbb{N} \}, \quad A := \bigcup_{N \in \mathbb{N}} A_N, \]
\[ A_N^{(n)} := \{ \xi = [a_1, a_2, \ldots] \in I \text{ irrational} : a_i < N \ \forall\, i \in \{1, \ldots, n\} \}. \]
We want to show that $\lambda(A) = 0$. For this, since $A_N \subset A_N^{(n)}$, it is sufficient to show that $\lim_{n \to \infty} \lambda(A_N^{(n)}) = 0$, and this is what we are now going to prove.

Note that $A_N^{(n+1)} \subset A_N^{(n)}$, and that each $A_N^{(n+1)}$ can be written as a union of disjoint fundamental intervals as follows
\[ A_N^{(n+1)} = \bigcup_{\substack{(a_1, \ldots, a_{n+1}): \\ a_i < N,\ i = 1, \ldots, n+1}} I(a_1, \ldots, a_n, a_{n+1}) = \bigcup_{\substack{(a_1, \ldots, a_n): \\ a_i < N,\ i = 1, \ldots, n}} \ \bigcup_{k:\ k < N} I(a_1, \ldots, a_n, k). \]
For fixed $(a_1, \ldots, a_n)$, we now calculate the Lebesgue measure of $\bigcup_{k:\ k < N} I(a_1, \ldots, a_n, k)$ as follows.
\[ \lambda\left( \bigcup_{1 \le k < N} I(a_1, \ldots, a_n, k) \right) = \left| \frac{p_n + p_{n-1}}{q_n + q_{n-1}} - \frac{p_n N + p_{n-1}}{q_n N + q_{n-1}} \right| = \ldots = \frac{N - 1}{q_n^2 (1 + s_n)(N + s_n)} \]
\[ < \frac{N - 1}{q_n^2 N (1 + s_n)} = \left( 1 - \frac{1}{N} \right) \lambda(I(a_1, \ldots, a_n)). \]
Using the latter estimate, we get
\[ \lambda(A_N^{(n+1)}) = \lambda\left( \bigcup_{\substack{(a_1, \ldots, a_n): \\ a_i < N,\ i = 1, \ldots, n}} \ \bigcup_{k:\ k < N} I(a_1, \ldots, a_n, k) \right) = \sum_{\substack{(a_1, \ldots, a_n): \\ a_i < N,\ i = 1, \ldots, n}} \lambda\left( \bigcup_{k:\ k < N} I(a_1, \ldots, a_n, k) \right) \]
\[ \le \left( 1 - \frac{1}{N} \right) \sum_{\substack{(a_1, \ldots, a_n): \\ a_i < N,\ i = 1, \ldots, n}} \lambda(I(a_1, \ldots, a_n)) = \left( 1 - \frac{1}{N} \right) \lambda(A_N^{(n)}). \]
Applying this estimate $n$ times, we derive
\[ \lambda(A_N^{(n+1)}) \le \left( 1 - \frac{1}{N} \right) \lambda(A_N^{(n)}) \le \left( 1 - \frac{1}{N} \right)^2 \lambda(A_N^{(n-1)}) \le \ldots \le \left( 1 - \frac{1}{N} \right)^n \lambda(A_N^{(1)}), \]
which then implies
\[ \lambda(A_N^{(n+1)}) \to 0 \quad \text{for } n \to \infty. \]
From this we obtain that (since $A_N \subset A_N^{(n+1)}$) $\lambda(A_N) = 0$ for all $N \in \mathbb{N}$, and hence, since
\[ \lambda(A) = \lambda\left( \bigcup_{N \in \mathbb{N}} A_N \right) \le \sum_{N \in \mathbb{N}} \lambda(A_N) = 0, \]
we obtain the desired result $\lambda(A) = 0$. Finally, observe that $\xi \in B'$ if and only if $\xi \in A$ (a sequence of partial quotients is eventually bounded if and only if it is bounded), from which we derive $\lambda(B') = 0$. □

By inspection of the proof of the previous theorem, we find that we in fact proved slightly more than we actually formulated in the theorem. Namely, we have seen that the following is true.

Corollary 3.7 For $B'_N := B_N \cap I$ we have
\[ \lambda(B'_N) = 0 \quad \text{for all } N \in \mathbb{N}. \]

Also, combining the previous theorem and Lemma 2.19, we immediately obtain the following result.

Corollary 3.8
\[ \lambda\left( \left\{ \xi \in I \text{ irrational} : \exists\, C > 0 \text{ such that } \left| \xi - \frac{p}{q} \right| > \frac{C}{q^2} \text{ for all } \frac{p}{q} \right\} \right) = 0. \]

We have now seen that the set of badly approximable numbers does not contribute to sets of irrational numbers of positive Lebesgue measure. Hence, if we want to investigate sets of positive measure, then we have to look for irrationals which are approximated more rapidly by their approximants than is the case for badly approximable irrationals. The contrapositive of the following theorem gives a first indication of what an irrational number has to look like in order to have a chance to contribute to positive Lebesgue measure. In particular, the theorem specifies how fast the $a_n(\xi)$ have to increase at least such that $\xi$ has a chance to contribute to positive Lebesgue measure.
Theorem 3.9 If $\phi : \mathbb{N} \to \mathbb{R}^+$ is a function such that $\sum_{n=1}^{\infty} 1/\phi(n)$ diverges, then
\[ \lambda(B_\phi) = 0, \]
where $B_\phi := \{ \xi = [a_1, a_2, \ldots] \in I \text{ irrational} : a_n < \phi(n) \ \forall\, n \in \mathbb{N} \}$.

Note: A good choice for $\phi$ would be $\phi(n) = n \log(n)$ for $n \ge 2$ (recall that $\sum_{n=2}^{\infty} \frac{1}{n \log(n)}$ diverges).

Proof: The proof is basically the same as the proof of the previous theorem. As before, we obtain that
\[ \lambda\left( \bigcup_{k:\ k < \phi(n+1)} I(a_1, \ldots, a_n, k) \right) < \left( 1 - \frac{1}{\phi(n+1)} \right) \lambda(I(a_1, \ldots, a_n)). \]
Hence, with $B_\phi^{(n)} := \{ \xi = [a_1, a_2, \ldots] \in I \text{ irrational} : a_i < \phi(i) \ \forall\, i \in \{1, \ldots, n\} \}$, we get
\[ \lambda(B_\phi^{(n+1)}) < \left( 1 - \frac{1}{\phi(n+1)} \right) \lambda(B_\phi^{(n)}) < \ldots < \prod_{k=1}^{n} \left( 1 - \frac{1}{\phi(k+1)} \right) \lambda(B_\phi^{(1)}). \]
Using the fact that $1 - x < e^{-x}$ for each $0 < x < 1$, we can continue as follows.
\[ \lambda(B_\phi^{(n+1)}) < e^{-\sum_{k=1}^{n} \frac{1}{\phi(k+1)}} \, \lambda(B_\phi^{(1)}), \]
which implies (since $\sum_{k=1}^{n} 1/\phi(k+1)$ gets arbitrarily large, due to the divergence condition in the theorem)
\[ \lambda(B_\phi^{(n+1)}) \to 0 \quad \text{for } n \to \infty, \]
and hence (since $B_\phi \subset B_\phi^{(n+1)}$ for all $n$),
\[ \lambda(B_\phi) = 0. \]
□

Note that with the special choice of $\phi$, that is $\phi(n) = n \log(n)$, an immediate consequence of the previous theorem is (for this essentially consider the complement of $B_\phi$ in $I$)
\[ \lambda\left( \{ \xi = [a_1, a_2, \ldots] \in I \text{ irrational} : a_n \ge n \log(n) \text{ for infinitely many } n \in \mathbb{N} \} \right) = 1. \]

In contrast to the previous theorem, we now investigate how fast the $a_n(\xi)$ can increase at most such that $\xi$ has a chance to contribute to positive Lebesgue measure.

Theorem 3.10 If $\varphi : \mathbb{N} \to \mathbb{R}^+$ is a function such that $\sum_{n=1}^{\infty} 1/\varphi(n)$ converges, then
\[ \lambda(W_\varphi) = 0, \]
where $W_\varphi := \{ \xi = [a_1, a_2, \ldots] \in I \text{ irrational} : a_n > \varphi(n) \text{ for infinitely many } n \}$.

Note: A good choice for $\varphi$ would be $\varphi(n) = n (\log(n))^{1+\epsilon}$ for $n \ge 2$, for any fixed $\epsilon > 0$ (recall that $\sum_{n=2}^{\infty} \frac{1}{n (\log(n))^{1+\epsilon}}$ converges, for every $\epsilon > 0$).

Proof: We have that
\[ \lambda\left( \bigcup_{k:\ k \ge \varphi(n+1)} I(a_1, \ldots, a_n, k) \right) = \left| \frac{p_n}{q_n} - \frac{p_n \varphi(n+1) + p_{n-1}}{q_n \varphi(n+1) + q_{n-1}} \right| = \ldots = \frac{1}{q_n^2 (1 + s_n)} \cdot \frac{1 + s_n}{\varphi(n+1) + s_n} < \frac{2}{\varphi(n+1)} \, \lambda(I(a_1, \ldots, a_n)). \]
Hence, with $W_\varphi^{(n)} := \{ \xi = [a_1, a_2, \ldots] \in I \text{ irrational} : a_n > \varphi(n) \}$, we get
\[ \lambda(W_\varphi^{(n+1)}) \le \lambda\left( \bigcup_{(a_1, \ldots, a_n)} \ \bigcup_{k:\ k \ge \varphi(n+1)} I(a_1, \ldots, a_n, k) \right) < \sum_{(a_1, \ldots, a_n)} \frac{2}{\varphi(n+1)} \, \lambda(I(a_1, \ldots, a_n)) \le \frac{2}{\varphi(n+1)}. \]
Since $\sum_n 2/\varphi(n+1)$ converges and $W_\varphi$ is the lim sup-set of the sets $W_\varphi^{(n)}$, an application of the Borel-Cantelli lemma (Theorem 3.4) finishes the proof. □

Note that with the special choice of $\varphi$, that is $\varphi(n) = n (\log(n))^{1+\epsilon}$, an immediate consequence of the previous theorem is (for this essentially consider the complement of $W_\varphi$ in $I$) that for each $\epsilon > 0$,
\[ \lambda\left( \{ \xi = [a_1, a_2, \ldots] \in I \text{ irrational} : \exists\, n_0 \text{ such that } a_n < n (\log(n))^{1+\epsilon} \ \forall\, n \ge n_0 \} \right) = 1. \]
Combining this with the remark after Theorem 3.9, we hence have that the continued fraction expansion of an irrational number $\xi = [a_1, a_2, \ldots]$ which contributes to a set of full Lebesgue measure has the property that for each $\epsilon > 0$ we have $a_n \ge n \log(n)$ for infinitely many $n$, whereas $a_n < n (\log(n))^{1+\epsilon}$ eventually.

Finally, we mention the following important theorem (without proof). In this theorem we use the notion of an $(\alpha, \beta)$-Khintchine function, by which we mean the following.
• An $(\alpha, \beta)$-Khintchine function $\psi : \mathbb{R}^+ \to \mathbb{R}^+$ is a non-increasing function which is not ‘decreasing too rapidly’, in the sense that there exist positive numbers $\alpha < 1$ and $\beta \le 1$ such that for all $x \in \mathbb{R}^+$ we have that $\psi(x) \ge \beta \psi(\alpha x)$.

Theorem 3.11 (Khintchine’s theorem) For $\psi$ an $(\alpha, \beta)$-Khintchine function let
\[ K_\psi := \left\{ \xi \in I : \left| \xi - \frac{p_n}{q_n} \right| < \frac{\psi(q_n)}{q_n^2} \text{ is fulfilled for infinitely many } n \right\}. \]
Then the following holds.
(i) $\lambda(K_\psi) = 0$ if and only if $\sum_{n \in \mathbb{N}} \psi(\alpha^{-n})$ converges.
(ii) $\lambda(K_\psi) = 1$ if and only if $\sum_{n \in \mathbb{N}} \psi(\alpha^{-n})$ diverges.

Remark: In case (i), a good choice for the function $\psi$ would be $\psi(x) = (\log(x))^{-(1+\epsilon)}$ (for any $\epsilon > 0$). And in case (ii), a good choice for the function $\psi$ would be $\psi(x) = (\log(x))^{-1}$.
With these choices, we then obtain that for $\xi$ from a set of full $\lambda$-measure we have that the two inequalities
\[ \frac{1}{q_n^2 (\log(q_n))^{1+\epsilon}} < \left| \xi - \frac{p_n}{q_n} \right| < \frac{1}{q_n^2 \log(q_n)} \]
are fulfilled simultaneously for infinitely many $p_n/q_n$ (more precisely, the left-hand inequality is fulfilled even for all $p_n/q_n$ apart from finitely many exceptions).

4 A first Trip through the Zoo of Prime Numbers

Definition 4.1 A positive integer $p \ne 1$ is called a prime number if $p$ is divisible only by 1 and $p$.

Theorem 4.2 (Unique prime factorisation theorem) Every positive integer $N \ne 1$ is either a prime number or can be written uniquely (up to the order of the factors) as a product of prime numbers.

Proof: This is left as an exercise (use complete mathematical induction). □

It’s a very old fact (Euclid 325-265 B.C., in Book IX of the Elements) that the set of primes is infinite. Presumably, one of the first rigorous proofs of this fact was given by Euler.

Theorem 4.3 (Euler) There are infinitely many prime numbers.

Proof: Assume (by way of contradiction) that there are only finitely many prime numbers. Then let $P := \{p_1, p_2, \ldots, p_n\}$ be the set of all these primes, ordered such that $p_i < p_{i+1}$ for all $i = 1, \ldots, n-1$. Consider
\[ q := p_1 \cdot p_2 \cdot \ldots \cdot p_n + 1. \]
Since $q > p_n$, it follows that $q \notin P$, and hence $q$ is not a prime number. Since every number has a unique prime factorisation, it follows that there exist $q_1, \ldots, q_k \in P$ such that
\[ q = q_1 \cdot q_2 \cdot \ldots \cdot q_k. \]
Combining this with the definition of $q$, it follows that
\[ p_1 \cdot p_2 \cdot \ldots \cdot p_n + 1 = q_1 \cdot q_2 \cdot \ldots \cdot q_k. \]
Since in the product $p_1 \cdot p_2 \cdot \ldots \cdot p_n$ every prime number appears exactly once as a factor, we must have that one of these, say $p_j$, is equal to $q_1$. That is, $q_1 = p_j$ for some $j \in \{1, \ldots, n\}$. By dividing the above equality by $q_1$, we obtain
\[ p_1 \cdot p_2 \cdot \ldots \cdot p_{j-1} \cdot p_{j+1} \cdot \ldots \cdot p_n + \frac{1}{q_1} = q_2 \cdot \ldots \cdot q_k. \]
Since here the right hand side is an element of $\mathbb{N}$ whereas the left hand side is not, this gives a contradiction. □

The Sieve of Eratosthenes.
This ‘sieve’ represents a method for finding all prime numbers less than or equal to some given number $N \in \mathbb{N}$. The method is as follows.

1. Write down a list consisting of all numbers from 2 up to $N$.
2. $p_1 \, (= 2)$ is the first prime number, and hence stays on the list. Then remove all proper multiples of 2 from the list.
3. $p_2 \, (= 3)$ is the next number remaining on the list (following $p_1$), and hence stays on the list. Then remove all proper multiples of $p_2$ from the list.
4. $p_3 \, (= 5)$ is the next number remaining on the list (following $p_2$), and hence stays on the list. Then remove all proper multiples of $p_3$ from the list.
5. $p_4$ is the next number remaining on the list (following $p_3$) (of course $p_4 = 7$), and hence stays on the list. Then remove all proper multiples of $p_4$ from the list.
..................

The ‘sieve’ ends once we have reached for the first time a number, say $p_n$, for which $p_n^2 > N$. The numbers which then remain on the list are exactly the prime numbers between 2 and $N$.

Lemma 4.4 There are arbitrarily large gaps in the sequence of prime numbers. Or in other words, for each arbitrarily large number $N \in \mathbb{N}$ there exists a number $n \in \mathbb{N}$ such that there are no prime numbers among the $N$ consecutive numbers $n + 2, n + 3, \ldots, n + (N+1)$.

Proof: Let $N \in \mathbb{N}$ be given. Then define $n := (N+1)!$, and observe the following
• $n + 1$ might be a prime number;
• $n + 2 = (N+1)! + 2$ is divisible by 2, and hence not a prime number;
• $n + 3 = (N+1)! + 3$ is divisible by 3, and hence not a prime number;
• $n + 4 = (N+1)! + 4$ is divisible by 4, and hence not a prime number;
. . .
• $n + (N+1) = (N+1)! + (N+1)$ is divisible by $(N+1)$, and hence not a prime number.
Therefore, we now have that all the $N$ numbers between $n + 2$ and $n + (N+1)$ are not prime numbers. □

Definition 4.5 The ‘prime number counting function’ $\pi$ is defined for each $N \in \mathbb{N}$ by
\[ \pi(N) := \# \{ p : p \text{ is a prime number, and } p \le N \}. \]

As a first little estimate we obtain the following lemma. Note that an immediate consequence of this lemma is that there are infinitely many primes, and hence the lemma gives an alternative proof of Euler’s theorem.
Lemma 4.6
\[ \pi(N) > \frac{\log N}{2 \log 2} \quad \text{for all } N \in \mathbb{N} \setminus \{1\}. \]

Proof: Let us first remark the following. By the unique prime factorisation theorem we have that every positive integer $n \ne 1$ can be written as the product of a square number and a product of distinct prime numbers. To see this, note that for each $n \ne 1$ there have to be prime numbers $p_1, \ldots, p_k, p_{k+1}, \ldots, p_{k+l}$, as well as $k$ odd numbers $m_1, \ldots, m_k$ and $l$ even numbers $m_{k+1}, \ldots, m_{k+l}$, such that (note, since the $m_1, \ldots, m_k$ are odd, we have that $\frac{m_i - 1}{2}$ is either equal to zero or a positive integer (for $i = 1, \ldots, k$))
\[ n = \prod_{i=1}^{k+l} p_i^{m_i} = \prod_{i=1}^{k} p_i^{m_i} \cdot \prod_{i=k+1}^{k+l} p_i^{m_i} = p_1 \cdot \ldots \cdot p_k \cdot \prod_{i=1}^{k} p_i^{m_i - 1} \cdot \prod_{i=k+1}^{k+l} p_i^{m_i} \]
\[ = p_1 \cdot \ldots \cdot p_k \cdot \left( \prod_{i=1}^{k} p_i^{\frac{m_i - 1}{2}} \cdot \prod_{i=k+1}^{k+l} p_i^{\frac{m_i}{2}} \right)^2. \]
Now, let $N \in \mathbb{N} \setminus \{1\}$ be given, and consider all integers between 2 and $N$. We give an upper bound for the number of ways one can possibly write the numbers between 2 and $N$ in the form of a product of a square number and a product of distinct prime numbers.
• Question: How many distinct squares are there among the numbers between 2 and $N$?
Answer: Less than $\sqrt{N}$.
• Question: How many distinct products of distinct prime numbers are there at most among the numbers between 2 and $N$?
Answer: Less than
\[ \binom{\pi(N)}{1} + \binom{\pi(N)}{2} + \binom{\pi(N)}{3} + \ldots + \binom{\pi(N)}{\pi(N) - 1} + 1 = 2^{\pi(N)} - 1 < 2^{\pi(N)}. \]
(Recall that the ‘binomial coefficients’ $\binom{n}{k}$ are defined by $\binom{n}{k} = \frac{n!}{(n-k)! \cdot k!}$.)
Combining these two estimates, we obtain
\[ N < \sqrt{N} \; 2^{\pi(N)}. \]
Solving this for $\pi(N)$ then gives the result. □

Already Euclid knew that the set of primes is infinite, and a much more recent and famous result (by Jacques Hadamard (1865-1963) and C.-J. de la Vallée-Poussin (1866-1962)) shows that the density of primes is ruled by the following law. This law had already been conjectured before by Gauss.
Since the proof of this law is rather involved (and strictly speaking this result is not a purely number theoretical result, since most proofs make heavy use of complex analysis), here we can only state this very important law.

Theorem 4.7
\[ \lim_{N \to \infty} \frac{\pi(N)}{N / \log N} = 1. \]

Getting a more exact figure for the function $\pi$ is presumably one of the most important problems in mathematics. Here, a very big step forward would be to verify the Riemann Hypothesis (given that it is true). Here we use li to denote the function which is given by
\[ \mathrm{li}(N) = \int_2^N \frac{1}{\log x} \, dx. \]
Note that $\lim_{N \to \infty} \mathrm{li}(N) / (N / \log(N)) = 1$.

• Riemann Hypothesis: For every $\epsilon > 0$ there exists a constant $C > 0$ such that for all $N$ sufficiently large
\[ \mathrm{li}(N) - C N^{\frac{1}{2} + \epsilon} \le \pi(N) \le \mathrm{li}(N) + C N^{\frac{1}{2} + \epsilon}. \]

The Riemann Hypothesis is Problem 8 on Hilbert’s famous list of 23 problems (Paris, 1900). In the meantime, mathematicians have found numerous ways to state the Riemann Hypothesis in equivalent forms which come in completely different disguises. Let us give one of these.

Definition 4.8 For $n \in \mathbb{N}$ let
\[ F_n := \left\{ \frac{p}{q} \in [0, 1] : 0 \le p < q \le n, \text{ and } p, q \text{ are coprime} \right\}. \]
If we assume that the set $F_n = \{f_1, \ldots, f_{k_n}\}$ is ordered such that $f_1 < f_2 < \ldots < f_{k_n}$, then $F_n$ is called the Farey sequence of order $n$. Note that here $k_n$ denotes the number of elements in $F_n$.

In the 1920’s Franel and Landau found the following elementary way of formulating the Riemann Hypothesis.

• Riemann Hypothesis: For every $\epsilon > 0$ there exists a constant $C_0 > 0$ such that for all $n$ sufficiently large
\[ \sum_{i=1}^{k_n} \left| f_i - \frac{i - 1}{k_n} \right| \le C_0 \, n^{\frac{1}{2} + \epsilon}. \]

Finally, note that the original statement of the Riemann Hypothesis was formulated in terms of the Riemann zeta-function $\zeta(z)$ which is given by
\[ \zeta(z) := \sum_{n=1}^{\infty} \frac{1}{n^z} \quad \text{for } z \in \mathbb{C}. \]
For real values $z$ this series is called the harmonic series $\zeta(s)$, that is
\[ \zeta(s) := \sum_{n=1}^{\infty} \frac{1}{n^s} \quad \text{for } s \in \mathbb{R}. \]
Of course, this series converges if and only if $s > 1$ (note for complex values the situation is by no means as simple as this!).
Nevertheless, to compute the actual values of this series for particular $s > 1$ is usually a problem. It can be done for instance for even numbers $s$. Here we have the following result of Euler.

Theorem 4.9 (Euler)
\[ \zeta(2n) = \frac{(-1)^{n-1} (2\pi)^{2n} B_{2n}}{2 (2n)!} \quad \text{for all } n \in \mathbb{N}, \]
where the $B_k$ are the Bernoulli numbers given by
\[ \frac{z}{e^z - 1} = \sum_{k=0}^{\infty} \frac{B_k}{k!} z^k. \]
For instance, we have
\[ \zeta(2) = \frac{\pi^2}{6}, \quad \zeta(4) = \frac{\pi^4}{90}, \quad \zeta(6) = \frac{\pi^6}{945}. \]

For odd numbers $s$ things are far less well-understood. It was a mathematical sensation when in 1978 Apéry proved that
$\zeta(3)$ is irrational.
Also, for instance one knows (this is a result of Zudilin) that
one of $\zeta(5), \zeta(7), \zeta(9), \zeta(11)$ has to be an irrational number.
Furthermore, it is known that (this is a result of Rivoal)
$\zeta(2n+1)$ is irrational for infinitely many $n \in \mathbb{N}$.

Twin Primes.

Definition 4.10 A couple of primes $(p, q)$ are said to be twin primes if $q = p + 2$.

Except for the couple (2, 3), this is clearly the smallest possible distance between two primes. For example
(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), ..., (419, 421), ...
are twin primes. So far the following is not known.

• Conjecture: There are infinitely many twin primes.

Based on heuristic considerations, G. H. Hardy (1877-1947) and J. E. Littlewood (1885-1977) developed a law (the twin prime conjecture (1922)) to estimate the density of twin primes. The prime number theorem says that the probability that a number $N$ is prime is about $1/\log(N)$. Therefore, if the probability that $N + 2$ is prime were independent of the probability that $N$ is prime, we should have the following approximation for the twin prime counting function $\pi_2(N) := \# \{ (p, q) : (p, q) \text{ are twin primes, and } p, q \le N \}$,
\[ \lim_{N \to \infty} \frac{\pi_2(N)}{N / (\log(N))^2} = 1. \]
A more careful analysis shows that this is too simple. In fact we have the following more accurate conjecture.

• Twin prime conjecture:
\[ \lim_{N \to \infty} \frac{\pi_2(N)}{N / (\log(N))^2} = 2 C_2, \]
where $C_2 = 0.6601618158468695739278121100145\ldots$ is the twin prime constant.
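The sieve of Eratosthenes described earlier and the twin prime counting function $\pi_2$ are easy to experiment with. The following is a minimal sketch, not part of the original notes: the function names `sieve` and `twin_prime_count` are hypothetical, and the constant `C2` is the twin prime constant truncated to ten decimals.

```python
from math import log


def sieve(limit):
    """Sieve of Eratosthenes: return all primes p <= limit (limit >= 2)."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:              # stop once p^2 exceeds the limit
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


def twin_prime_count(limit):
    """pi_2(limit): number of twin prime pairs (p, p + 2) with p + 2 <= limit."""
    primes = set(sieve(limit))
    return sum(1 for p in primes if p + 2 in primes)


C2 = 0.6601618158   # twin prime constant, truncated

if __name__ == "__main__":
    for N in (10**3, 10**4, 10**5):
        ratio = twin_prime_count(N) / (N / log(N) ** 2)
        print(N, twin_prime_count(N), round(ratio, 3), "vs 2*C2 =", round(2 * C2, 3))
```

For $N$ in this small range the ratio is still visibly larger than $2C_2$, so such an experiment only illustrates the definitions; it does not show how fast (or whether) the conjectured limit is approached.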
The following rather remarkable result from 1919 is due to the Norwegian mathematician V. Brun (1885-1978). Although it is not known if there are infinitely many twin prime numbers, Brun showed that the sum of the inverses of all twin primes converges to a constant.

Theorem 4.11
\[ \sum_{(p, q) \text{ twin primes}} \left( \frac{1}{p} + \frac{1}{q} \right) = B_2, \]
where $B_2 = 1.902160583104\ldots$ is the Brun constant.

A few more Conjectures.

There are still loads of other (old and new) unsolved problems concerning prime numbers. Here is just a very tiny list of some of them.

1. Are there infinitely many primes of the form $N^2 + 1$? (Dirichlet proved that every arithmetic progression $(a + bn)_{n \in \mathbb{N}}$ with $a, b$ coprime contains infinitely many primes.)
2. Is there always a prime between $N^2$ and $(N+1)^2$? (The fact that there is always a prime between $N$ and $2N$ was proved by Chebyshev.)
3. $N^2 - N + 41$ is prime for $0 \le N \le 40$. Are there infinitely many primes of this form? The same question applies to $N^2 - 79N + 1601$, which is prime for $0 \le N \le 79$.
4. Are there infinitely many primes of the form $\prod_{p \text{ prime},\ p \le N} p + 1$?
5. Are there infinitely many primes of the form $\prod_{p \text{ prime},\ p \le N} p - 1$?
6. Are there infinitely many primes of the form $N! + 1$?
7. Are there infinitely many primes of the form $N! - 1$?
8. If $p$ is a prime number, is $2^p - 1$ then never divisible by the square of a prime number?
9. Does the Fibonacci sequence contain infinitely many prime numbers?
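The finite claims in item 3 of the list above can be verified directly. The sketch below is an illustration only: the helper names `is_prime` and `prime_run` are hypothetical, and primality is tested by plain trial division, which is adequate for values of this size.

```python
def is_prime(n):
    """Trial division; fine for the small values tested here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def prime_run(f, start=0):
    """Largest M such that f(N) is prime for all start <= N <= M.

    Returns start - 1 if f(start) is already not prime.
    """
    n = start
    while is_prime(f(n)):
        n += 1
    return n - 1


if __name__ == "__main__":
    print(prime_run(lambda n: n * n - n + 41))
    print(prime_run(lambda n: n * n - 79 * n + 1601))
```

Both runs stop at a value of the polynomial equal to $41^2 = 1681$, which is why the ranges in item 3 cannot be extended by one more step.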