MAT3166: Elementary Number Theory
Fall 2011 Course Notes, University of Ottawa
Prof: Monica Nevins

These course notes have been developed over several years, using material from many sources, regretfully only some of which are directly acknowledged.

Last update: September 16, 2011

Contents

1 Introduction
2 Integers and divisibility
  2.1 Divisibility
  2.2 The greatest common divisor
  2.3 The Fundamental Theorem of Arithmetic
  2.4 Some applications of the Fundamental Theorem of Arithmetic
  2.5 Applications of the GCD: Linear Diophantine Equations
  2.6 Prime numbers
  2.7 Further notes on primes
  2.8 Alternate proof of the infinitude of primes: the zeta function
  2.9 Exercises
3 Modular Arithmetic
  3.1 The ring Z/nZ
    3.1.1 The set Z/nZ
    3.1.2 Arithmetic on Z/nZ
    3.1.3 Cancellation and invertibility in Z/nZ
  3.2 Exercises

Chapter 1 Introduction

Number theory is an ancient area of mathematics.
Carl Friedrich Gauss (1777-1855), whose work defines where elementary number theory turned into modern number theory, is quoted (by G.H. Hardy, in “A Mathematician’s Apology” (1940)) as saying “Mathematics is the queen of the sciences and number theory is the queen of mathematics.” The famous quote by Kronecker, “God made integers, all else is the work of man,” captures where number theory begins: with the integers, and questions arising from arithmetic.

What makes number theory so fascinating, and such a vibrant area of mathematical research, is how seemingly simple questions can end up being incredibly difficult, or incredibly deep, and how much new mathematics has been generated in the pursuit of proofs in number theory.

In 1900, the world-renowned mathematician David Hilbert proposed a list of 23 important problems in mathematics. History has proven Hilbert’s wisdom and insight: these were in many cases fundamental problems to be tackled, and in all cases interesting problems to try to resolve. Four of them concerned number theory, the most celebrated of which is Hilbert’s 8th problem, which included as a first step the Riemann hypothesis (that the nontrivial zeros of the Riemann zeta function all have real part equal to 1/2), as well as Goldbach’s conjecture (that every even integer > 2 is the sum of two primes) and the twin primes conjecture (that there are infinitely many pairs of prime numbers whose difference is two).

Hungarian mathematician George Pólya quotes Hilbert as saying, when asked what he would ask upon awakening if he went to sleep for a thousand years: “I’d ask if anyone had proved the Riemann hypothesis.” So far, no one has.

In 2000, the Clay Mathematics Institute drew up a set of 7 Millennium Prize problems, offering a million-dollar prize for the solution of each. (One problem, the Poincaré conjecture, was solved in 2003 by Grigori Perelman, but he declined the prize. He also declined the Fields medal.)
Two of these problems are in number theory: one is the Riemann hypothesis, and another is the BSD conjecture, about the number of rational points on an elliptic curve.

So why the excitement about conjectures in number theory? In part: prime numbers seem to at once possess incredible regularity and exhibit so many special properties, and yet they defy any patterns or formulas; it’s completely tantalizing. For another: so many conjectures are easy to state and (with computers) easy to verify for the first million, billion or more cases, and yet that proves absolutely nothing! Think of the famous Pólya conjecture (below).

Conjecture 1.1 (Pólya conjecture). For each n > 1, the number of numbers less than n with an odd number of prime factors is always greater than or equal to the number of numbers less than n with an even number of prime factors.

• n = 5: 2, 3 have one prime factor; 4 has two.
• n = 6: 2, 3, 5 have one prime factor; 4 has two.
• n = 10: 2, 3, 5, 7, 8 have an odd number of prime factors; 4, 6, 9 have two.

This conjecture was postulated in 1919 but proven false in 1958. The first proof showed the existence of a counterexample, but did not calculate one explicitly; an explicit one was found 2 years later, and in 1980 it was proven that the smallest counterexample is n = 906,150,257.

On the other hand, there are many conjectures which haven’t been proven but which “everyone” believes are true, and so each is called a “hypothesis”. Examples include the Riemann hypothesis, the hypothesis that P ≠ NP, and the continuum hypothesis (whose resolution as being unresolvable was yet another incredible outcome from Hilbert’s list of problems). Mathematicians prove results based on such a hypothesis, understanding that if ever the hypothesis were proven false, their results would be meaningless.

Other conjectures and results in number theory are absolutely tantalizing but completely useless; they’re just really fascinating.
A great example is “Fermat’s Last Theorem”, which was known by this name for over 350 years even though it was only proven in 1995 (by Andrew Wiles).

Theorem 1.2 (Fermat’s Last Theorem). For any integer n > 2, the equation $a^n + b^n = c^n$ has no integer solution $(a, b, c) \in \mathbb{Z}^3$ with $abc \neq 0$.

It’s a result that begs you to try your hand at finding a counterexample, and thousands of attempts were made at its proof over the years. In the end, it was proven using incredibly sophisticated mathematics; its proof was brilliant mathematics which has advanced many areas of number theory and beyond, even though the result itself has “no” mathematically interesting consequences!

What’s also exciting about number theory is its relevance. It’s just about numbers and primes and patterns, things that really don’t seem to have anything to do with the real world; in fact, G.H. Hardy (1877-1947), a brilliant number theorist (and pacifist), celebrated number theory as being truly “pure mathematics”, with no applications to the real world. Yet it has turned out to provide the key building block for our very modern real world, through its use in public-key cryptography, which is the backbone of all secure connections on the internet.

Our goal in this course is to explore the basic constructs, and some famous and interesting problems, of elementary number theory (where you can roughly define “elementary number theory” as “everything discovered before Gauss stepped in”). This also sets the stage for you to continue exploring related ideas in algebra, or in any of the modern subfields into which number theory has evolved (combinatorial, analytic, algebraic, etc). More precisely, we’ll cover much of the first 9 chapters of our text [KR], with a few topics from chapters thereafter as time permits.

Our starting point is the integers, picking up where you left off in Grade 5 or so.

Chapter 2 Integers and divisibility

The integers: Z = {..., −3, −2, −1, 0, 1, 2, 3, ...}.
They form a ring (see footnote 1) under addition and multiplication.

• Positive integers have been understood since ancient times.
• Zero has two roles: as a place-holder in our number system, and as a number representing the absence of a quantity. We credit Indian mathematicians in the 7th century AD (including principally Brahmagupta) with this discovery, which was spread westward by Arab mathematicians and eventually made it to Europe for use by Fibonacci around 1200 AD. (Independently, the Mayans of south Central America had discovered zero as part of their own base-20 number system.)
• Negative numbers predate zero (the concept of debt and credit, for example); Chinese mathematicians were using negative numbers as early as 200 BC, and Indian mathematicians set out rules for arithmetic with negative numbers also in the 7th century AD.

From the integers, the next reasonable step was to construct the rational numbers
$$\mathbb{Q} = \left\{ \tfrac{m}{n} \;\middle|\; m, n \in \mathbb{Z},\ n \neq 0 \right\} / \sim$$
where $\tfrac{a}{b} \sim \tfrac{c}{d}$ whenever ad = bc; this is an abstraction where we have created a multiplicative inverse of every nonzero integer and extended the operations of arithmetic from Z to Q in the only consistent way. From now on we simply write $\tfrac{a}{b} = \tfrac{c}{d}$, thinking of the fraction symbol as the equivalence class of this relation. Q is a field (see footnote 2).

The next historical step is a giant leap: by thinking of (positive) rational numbers as geometric quantities (lengths), one interpolates and thus infers the real numbers, R. This was put on firm footing with the advent of analysis. (In fact, there are other ways to complete the rational numbers, leading to the field of p-adic numbers for each prime p, and these are quite interesting for number theory.)

Now the question: Q and R are so easy to work with; what’s so interesting about Z? Let’s mention just two things.
• One of the key properties of the integers is the Well-Ordering Principle: Every nonempty set of positive integers contains a least element. The corresponding statement in Q or R is false. Notice that it’s the Well-Ordering Principle which allows mathematical induction to work: if we have a statement $P_n$ which we conjecture is true for every positive integer n, then either it’s true or else there’s a least n at which it is false, and induction is the process of proving there is no such least counterexample.
• Divisibility is an interesting question in Z: sometimes we can solve ax = b and sometimes we can’t. Once we are in the context of R or Q, divisibility is trivial: the solution to ax = b, when a ≠ 0, is x = b/a. Questions of divisibility form the core of elementary number theory.

Footnote 1. A (commutative) ring (with unity) is a set R endowed with two operations, say + and ×, such that (R, +) is an abelian group, and (R, ×) is a commutative monoid, such that distributivity holds (a(b + c) = ab + ac). A monoid is a set which is closed under an operation such that there is a unit for the operation in that set, and the operation is associative (and in this case, commutative). A group is a monoid such that every element has an inverse in that set.

Footnote 2. A field is a commutative ring F with unit such that additionally (F \ {0}, ×) is an abelian group.

2.1 Divisibility

Definition 2.1. Let a and b be integers, with a ≠ 0. We say that a divides b if there exists an integer c such that ac = b. We write a | b in this case and say “a is a divisor of b” and “b is divisible by a.”

Example 2.2. Every nonzero integer divides 0, and −2 divides 12.

We excluded zero as a divisor; clearly zero does not divide any nonzero number, but we don’t want to accept the statement “0 divides 0” because it’s a weird case, the only one where the quotient is not unique.

Lemma 2.3. Let a, b, c, d, x, y ∈ Z. Then the following statements are true.
(a) a divides b iff −a divides b.
(b) If a | b and c | d then ac | bd.
(c) Transitivity: if a | b and b | c then a | c.
(d) If a | b and a | c then a | (bx + cy).

Proof. (a) Suppose a | b. Then there exists c ∈ Z such that ac = b. We know −c ∈ Z, and it follows that (−a)(−c) = ac = b, so (−a) | b. The converse holds since −(−a) = a.
(b) If a | b then there exists x ∈ Z such that ax = b. If c | d then there exists y ∈ Z such that cy = d. Then bd = (ax)(cy) = (ac)(xy), so ac | bd.
(c) If a | b then there exists x ∈ Z such that ax = b. If b | c then there exists y ∈ Z such that by = c. Then c = by = (ax)y = a(xy), so a | c.
(d) If a | b then there exists r ∈ Z such that ar = b. If a | c then there exists s ∈ Z such that as = c. Then for any x, y ∈ Z, we have bx + cy = (ar)x + (as)y = a(rx + sy), so a | (bx + cy).

Lemma 2.4. If a | b and b ≠ 0 then |a| ≤ |b|.

Proof. If a | b then there is some c ∈ Z such that ac = b. Therefore |b| = |ac| = |a| |c|. Now c ∈ Z, and c ≠ 0 because b ≠ 0, so |c| ≥ 1. Thus |b| ≥ |a|.

What can we say when a ∤ b?

Lemma 2.5 (Division theorem). Suppose a, b ∈ Z and a ≠ 0. Then there exist unique q and r in Z such that b = qa + r and 0 ≤ r < |a|.

Proof. WLOG we may assume a > 0; the case a < 0 is left as an exercise. Consider the set S = {b − qa | q ∈ Z, b − qa ≥ 0}. If 0 ∈ S, then b = qa for some q ∈ Z and we may set r = 0. Otherwise, we first claim that S is nonempty. If b > 0 then since b ∈ S (take q = 0) we are done; otherwise b < 0, and since a ≥ 1, choosing q = b gives a nonnegative value b − qa = b(1 − a) ∈ S. Since 0 ∉ S, S is a nonempty set of positive integers, so it contains a least element r, which by definition is equal to b − qa for some q ∈ Z.

We have only to show that r < |a| = a. For suppose r ≥ a. Then r − a ≥ 0, so b − (q + 1)a = b − qa − a = r − a ≥ 0. Thus b − (q + 1)a ∈ S, but by construction b − (q + 1)a < r, a contradiction of the minimality of r. Hence r < a.

Now suppose there were two such remainders r1 and r2 satisfying 0 ≤ r1 < r2 < a. We could then write b = aq1 + r1 = aq2 + r2, whence a(q1 − q2) = r2 − r1.
Thus a | (r2 − r1), which means, since r2 − r1 > 0, that a ≤ r2 − r1 (by Lemma 2.4). But this is impossible, since by construction the difference r2 − r1 is strictly less than a. Hence we must have r1 = r2, which implies also q1 = q2, and uniqueness follows.

We are immediately led to the notion of prime numbers (see footnote 3).

Definition 2.6. An integer p > 1 is called a prime number if its only positive factors are 1 and p.

Footnote 3. In ring theory, we define a prime to be a nonunit p with the property that whenever p divides a product, it divides one of the factors. We show that in many rings, this is equivalent to the notion of an irreducible, which is a nonunit p with the property that whenever p = ac, one of a or c is a unit. These two notions agree over the integers (which we’ll show eventually), and so for example 5 and −5 are prime. For tradition (and ease of use), in number theory we restrict our definition to positive integers.

End of lecture # 1

2.2 The greatest common divisor

Given two nonzero integers a, b, we say that an integer c is a common divisor of a and b if c | a and c | b. By Lemma 2.4, such a c satisfies |c| ≤ |a| and |c| ≤ |b|. So the set of common divisors is bounded above, and so has a maximal element. We call this value, say g, the greatest common divisor of a and b and we write g = (a, b).

Lemma 2.7. Let a, b be nonzero integers and g = (a, b). Then g is the least element of the set S = {ax + by | x, y ∈ Z, such that ax + by > 0}.

Proof. Since 1 is a common divisor of a and b, clearly g ≥ 1. S is a nonempty set of positive integers so has a least element s. Say s = ax0 + by0. Dividing s into a by the division algorithm gives us a = qs + r with 0 ≤ r < s. Thus r = a − qs = a − q(ax0 + by0) = a(1 − qx0) + b(−qy0). If 0 < r < s then r ∈ S and r < s, contradicting the definition of s; therefore r = 0 and s divides a. Similarly s divides b, so s is a common divisor of a and b. Thus by definition s ≤ g. On the other hand, since g | a and g | b, g | (ax0 + by0), so g | s.
By Lemma 2.4, this gives g ≤ s, so g = s.

Corollary 2.8. If d is a common divisor of a and b, then d | (a, b).

Proof. Set g = (a, b). Then there exist x, y such that g = ax + by. Since d | a and d | b, d | (ax + by), so d | g.

How do we compute the gcd of two numbers?

Lemma 2.9. If a = bq + r where 0 ≤ r < b, then (a, b) = (b, r).

Proof. Let d = (a, b) and e = (b, r). Then d | r (since r = a − bq) and d | b, so d | e. Similarly, e | a and e | b, so e | d. Hence, since gcds are positive, d = e.

Corollary 2.10. One can compute the gcd of a and b by repeated application of the division algorithm. We call this the Euclidean Algorithm. [KR, Algorithm 2.5.16]

Example: 81 = 1 × 57 + 24, 57 = 2 × 24 + 9, 24 = 2 × 9 + 6, 9 = 1 × 6 + 3, 6 = 2 × 3 + 0, so the gcd is 3.

Example: (15, 11) = 1, and we can find x, y so that 15x + 11y = 1 (for instance x = 3, y = −4).

The Extended Euclidean Algorithm is the process of recovering the equation ax + by = (a, b) by reversing the steps of the Euclidean algorithm. (See [KR, Algorithm 2.6.4] for a very efficient implementation.) The simple (but inefficient) interpretation of the extended Euclidean algorithm is: at each step of the division algorithm, you have an equation of the form $r_i = q_{i+1} r_{i+1} + r_{i+2}$, and therefore can solve for $r_{i+2}$ in terms of $r_i$ and $r_{i+1}$. Repeating this process must eventually yield an equation for $r_{i+2}$ in terms of a and b. In particular this holds when $r_{i+2} = (a, b)$, that is, when $r_{i+2}$ is the last nonzero remainder in the repeated division algorithm.

2.3 The Fundamental Theorem of Arithmetic

We begin with a definition and a lemma. Note that if p is prime then (p, n) = p if p | n and is 1 otherwise. This motivates the definition: two nonzero integers a and b are relatively prime if (a, b) = 1.

Lemma 2.11. If p is prime and p | ab then p must divide one of a or b or both.

Proof. Suppose p | (ab). If p | a then we are done. So now suppose p does not divide a. Then (p, a) = 1. This means there exist x, y such that 1 = px + ay, and so b = pbx + aby. Now p | p and p | (ab), so p | b, and we’re done.
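The Euclidean algorithm and its extended form, as described in Section 2.2, can be sketched in a few lines of Python. This is only an illustrative sketch and not the implementation of [KR, Algorithm 2.6.4]; the function name `extended_gcd` and the restriction to positive inputs are assumptions made here. Rather than reversing the steps afterwards, the sketch carries the coefficients forward, so that every remainder is expressed in terms of a and b as soon as it is computed.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = (a, b) and a*x + b*y == g.

    Hypothetical helper for illustration; assumes a and b are positive integers.
    """
    # Invariants: old_r == a*old_x + b*old_y  and  r == a*x + b*y.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r                      # one step of the division algorithm
        old_r, r = r, old_r - q * r         # (old_r, r) -> (r, remainder), as in Lemma 2.9
        old_x, x = x, old_x - q * x         # update coefficients so the invariants persist
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y              # old_r is the last nonzero remainder


g, x, y = extended_gcd(81, 57)
print(g, 81 * x + 57 * y)                   # both values are the gcd, 3
print(extended_gcd(15, 11))                 # (1, 3, -4), since 15*3 - 11*4 = 1
```

Carrying the coefficients forward avoids storing all the quotients, at the cost of hiding the "reverse the steps" picture used in the text.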
End of lecture # 2

Theorem 2.12 (Fundamental Theorem of Arithmetic). Every positive integer can be factored as a product of primes in a unique way (up to ordering of the factors). (We accept that n = 1 factors as an empty product of primes.)

Proof. Let S be the set of integers n > 1 which cannot be factored into a product of prime numbers. By the Well-Ordering Principle, if S is nonempty we may let N be the least element of S. Now N cannot be prime, since then it admits the factorization N = N, a contradiction since N ∈ S. Thus N must have some factorization N = qr, with 1 < q, r < N (using Lemma 2.4 to deduce that both factors are less than N). But then neither q nor r is in S, since N was the least element of S; thus each of q and r admits a factorization into prime factors, whence N does (as the product of all the factors of q and r). This contradicts the definition of N, whence we conclude that S was empty. Thus every integer n > 1 can be factored into a product of primes.

Now suppose to the contrary that the set T of all positive integers admitting at least two distinct factorizations into prime numbers is nonempty, and let N be the smallest element of T. Since N ∈ T, there exist primes $p_1 \le p_2 \le \cdots \le p_k$ and $q_1 \le q_2 \le \cdots \le q_\ell$, such that these lists of primes are distinct and
$$N = p_1 \cdots p_k = q_1 \cdots q_\ell. \qquad (2.1)$$
So $p_1$ divides N. By repeated application of Lemma 2.11 we deduce $p_1$ must divide $q_j$ for at least one j. But $q_j$ is also prime, so $p_1 = q_j$. It follows that $N/p_1 = N/q_j$ is a positive integer, strictly smaller than N, which admits two different factorizations into primes, and hence lies in T, contradicting the minimality of N. So T is empty, and factorization is unique.

Corollary 2.13. If N > 1 is not prime then there exists a factor q of N with $1 < q \le \sqrt{N}$.

Proof. If N is not prime, let $N = p_1 \cdots p_k$ be a prime factorization of N, with k ≥ 2, and where $p_1 \le p_2 \le \cdots \le p_k$. If $p_1 > \sqrt{N}$ then $N \ge p_1 p_2 > \sqrt{N}\sqrt{N} = N$, which is a contradiction. Hence $q = p_1$ satisfies $1 < q \le \sqrt{N}$.
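Corollary 2.13 suggests a simple primality test: to decide whether N is prime, it is enough to look for a factor between 2 and √N. A minimal Python sketch of this idea follows; the helper names `smallest_factor` and `prime_factorization` are hypothetical, and trial division is used only because it mirrors the corollary, not because it is efficient.

```python
from math import isqrt


def smallest_factor(n):
    """Return the least factor q of n with 1 < q <= sqrt(n), or None if n is prime.

    Hypothetical helper; assumes n >= 2.
    """
    for q in range(2, isqrt(n) + 1):
        if n % q == 0:
            return q
    return None          # no factor up to sqrt(n), so n is prime by Corollary 2.13


def prime_factorization(n):
    """Return the prime factors of n >= 1 in nondecreasing order, with multiplicity."""
    factors = []
    while n > 1:
        q = smallest_factor(n)
        if q is None:    # what remains is itself prime
            factors.append(n)
            break
        factors.append(q)   # the least factor > 1 is necessarily prime
        n //= q
    return factors


print(smallest_factor(91))        # 7, since 91 = 7 * 13
print(smallest_factor(97))        # None: 97 is prime
print(prime_factorization(360))   # [2, 2, 2, 3, 3, 5]
```

The empty list returned for n = 1 matches the convention that 1 factors as an empty product of primes.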
2.4 Some applications of the Fundamental Theorem of Arithmetic

The Fundamental Theorem of Arithmetic has a number of useful applications. From now on we make the convention that if we say $n = p_1^{e_1} p_2^{e_2} \cdots p_m^{e_m}$ is a prime factorization of n, then this means that the $p_i$ are distinct and each $e_i \ge 1$ (unless specified otherwise).

Lemma 2.14. Suppose $n = p_1^{e_1} p_2^{e_2} \cdots p_m^{e_m}$ is a prime factorization of n > 1. Then n is a perfect square if and only if each exponent $e_i$ is even.

Proof. We say n is a square if there exists some a ∈ Z such that $n = a^2$. Clearly if $n = a^2$ and $q_1^{f_1} \cdots q_r^{f_r}$ is a prime factorization of a, then $n = a^2 = q_1^{2f_1} \cdots q_r^{2f_r}$ is a prime factorization of n (hence, by the F.T.A., the unique such factorization) such that each prime occurs with even exponent. Conversely, given a prime factorization $n = p_1^{2f_1} p_2^{2f_2} \cdots p_m^{2f_m}$ in which every exponent is even, we see that $n = a^2$, where $a = p_1^{f_1} p_2^{f_2} \cdots p_m^{f_m}$.

Lemma 2.15. Suppose n, m ≥ 1 have prime factorizations
$$n = p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r} \quad\text{and}\quad m = p_1^{f_1} p_2^{f_2} \cdots p_r^{f_r},$$
where this time we have only that each $e_i, f_i \ge 0$, so that we may assume the set of primes $\{p_1, \ldots, p_r\}$ is the same.
1. We have that n | m if and only if for each i, $e_i \le f_i$.
2. For each i, set $g_i = \min\{e_i, f_i\} \ge 0$. Then $(m, n) = p_1^{g_1} p_2^{g_2} \cdots p_r^{g_r}$.

The proof is an exercise. This lemma gives an alternate means of computing the gcd, but this is very inefficient in practice, unless the prime factorization is already known.

2.5 Applications of the GCD: Linear Diophantine Equations

An equation of the form ax + by = n for which a, b, n are integers and we seek an integer solution $(x, y) \in \mathbb{Z}^2$ is called a linear Diophantine equation. These were first systematically solved by Brahmagupta.

We begin with a lemma, generalizing a result we noted for prime numbers earlier.

Lemma 2.16. Let a, b, c ∈ Z. If c | ab and (b, c) = 1, then c | a.

Proof.
One could argue from prime factorization; alternately (and perhaps more elegantly): Since (b, c) = 1, there exist x, y ∈ Z such that bx + cy = 1. Thus abx + acy = a. Since c divides each term of the left side (recall c | ab), it divides the right; that is, c | a.

Lemma 2.17. Suppose d = (a, b). Then (a/d, b/d) = 1.

Proof. We have d = ax + by for some x, y ∈ Z; since d | a and d | b we may divide to obtain 1 = (a/d)x + (b/d)y, and thus (a/d, b/d) = 1 by Lemma 2.7, since 1 is the least possible positive value of such an expression.

Theorem 2.18. The equation ax + by = n has a solution if and only if (a, b) | n. In this case, if $(x_0, y_0)$ is any solution, then all other solutions are of the form
$$x = x_0 + \frac{b}{(a, b)}\,t, \qquad y = y_0 - \frac{a}{(a, b)}\,t$$
for t ∈ Z.

Proof. Let d = (a, b). If d | n, say n = dk, then writing d = ax′ + by′ (Lemma 2.7) gives the solution (kx′, ky′); conversely, if there is a solution then d | n by Lemma 2.3(d). Given one solution $(x_0, y_0)$, any other solution (x, y) satisfies
$$a(x - x_0) + b(y - y_0) = 0, \quad\text{or}\quad a(x - x_0) = -b(y - y_0).$$
We divide through by d to get
$$\frac{a}{d}(x - x_0) = \frac{b}{d}(y_0 - y),$$
but since these coefficients are relatively prime by Lemma 2.17, Lemma 2.16 yields
$$\frac{a}{d} \,\Big|\, (y_0 - y) \quad\text{and}\quad \frac{b}{d} \,\Big|\, (x - x_0).$$
Let t be such that $y_0 - y = t\,\frac{a}{d}$; by simplifying the equation above we get $x - x_0 = t\,\frac{b}{d}$. This yields the result.

Example 2.19. Here’s a problem posed by Euler: A farmer lays out the sum of 1770 crowns in purchasing horses and oxen. He pays 31 crowns for each horse and 21 crowns for each ox, and buys more horses than oxen. How many horses and oxen did the farmer buy?

The equation we wish to solve is 31x + 21y = 1770, which we know from linear algebra has infinitely many solutions in the real numbers. By Brahmagupta’s theorem (Theorem 2.18), since (31, 21) = 1, we know this equation has infinitely many solutions over Z. However, the only correct solutions would be those in N × N.

Solution: We can either apply the extended Euclidean algorithm, or else just look really hard, to find that 31(−2) + 21(3) = 1 is one solution to 31x + 21y = 1. Therefore, a solution $(x_0, y_0)$ to our sought equation is $x_0 = -2(1770) = -3540$, $y_0 = 3(1770) = 5310$.
By the theorem, all solutions have the form
$$x = x_0 + 21k, \qquad y = y_0 - 31k.$$
We want nonnegative solutions, so we need −3540 + 21k ≥ 0, that is, 21k ≥ 3540, or k ≥ 168.6 (approximately), as well as 5310 − 31k ≥ 0, that is, 31k ≤ 5310, or k ≤ 171.3 (approximately). We conclude that the valid answers are for 169 ≤ k ≤ 171. These give:

k = 169: x = 9, y = 71
k = 170: x = 30, y = 40
k = 171: x = 51, y = 9

and we deduce the unique solution is this last one (since there were more horses than oxen).

2.6 Prime numbers

So how many primes are there?

Theorem 2.20 (Euclid, ∼ 300 BC). There are infinitely many prime numbers.

Proof. Suppose to the contrary that there were only finitely many prime numbers. Then we could enumerate them all as $p_1, \ldots, p_n$. Consider $N = p_1 p_2 \cdots p_n + 1$. Since N ∈ Z and N > 1 it can be factored as a nontrivial product of primes. Let $p_k$ be a prime factor of N. Then $p_k \mid N$ and by construction $p_k \mid (p_1 \cdots p_n)$, so $p_k \mid (N - p_1 p_2 \cdots p_n)$. Thus $p_k \mid 1$, implying $p_k = 1$, a contradiction. Hence there must be infinitely many primes.

Since ancient times, people have been trying to find a pattern in the set of prime numbers: a formula that could produce a prime number, or predict if a number were prime. But primes stubbornly refused to present any regularity, until Gauss, at age 14, got a book of logarithms and of prime numbers. He tried to find patterns, and his key idea was to consider their distribution as a probability. That is, he defined the following function.

Definition 2.21. Let x ∈ R and define π(x) = #{n ∈ N | n ≤ x, n is prime} to be the number of primes up to x.

For example:

x        π(x)              log10(x) × π(x)     ln(x) × π(x)
100      25                50                  ∼ 115
10^3     168               504                 ∼ 1160
10^4     1229              4916                ∼ 11319
10^6     78,498            470,988             ∼ 1,084,489
10^8     5,761,455         46,091,640          ∼ 1.06 × 10^8
10^12    37,607,912,018    ∼ 4.5 × 10^11       ∼ 1.04 × 10^12

Gauss noticed that
$$\pi(x) \sim \frac{x}{\log(x)}$$
where here log is the natural logarithm (base e). More precisely, he conjectured that
$$\frac{1}{2}\,\frac{x}{\log(x)} < \pi(x) < 2\,\frac{x}{\log(x)}$$
for all x.
Chebyshev proved this result (and a slight refinement of it, with 1/2 and 2 replaced by some constants) in 1850.

Remark 2.22. In fact, Gauss refined his conjecture by defining the logarithmic integral
$$\mathrm{Li}(x) = \int_2^x \frac{1}{\log(t)}\,dt$$
and suggesting that π(x) ≃ Li(x). This is a better estimate than the above quotient if you are considering the difference |π(x) − Li(x)| as x → ∞ (rather than the quotient of π(x) by its estimate).

End of lecture # 3

2.7 Further notes on primes

Theorem 2.23 (Prime Number Theorem). (Proved by Hadamard and de la Vallée-Poussin in 1896; reproved by “elementary” methods in 1949 by Selberg and Erdős.)
$$\lim_{x \to \infty} \frac{\pi(x)}{x/\log(x)} = 1.$$

The interpretation of this theorem is, for example: consider $x = 10^9$. Then log(x) ≈ 21, so about 1 out of every 21 numbers near x is prime. Unfortunately (or fortunately), this gives you no information whatsoever about the primality or factorization of any given number near $10^9$.

Eratosthenes (around 200 BC) discovered a sieve method for producing a list of all prime numbers up to N, now called the sieve of Eratosthenes: given a table of all numbers from 2 to N, circle 2 and cross off all multiples of 2; then choose the smallest noncrossed, uncircled entry, circle it and cross off all of its multiples; repeat until the smallest such entry exceeds $\sqrt{N}$; then circle all remaining uncrossed numbers. The circled numbers are all the primes up to N.

This sieve method is quite effective for finding primes, up to a point. Nowadays, you can find lists of prime numbers on the internet, up to quite large numbers.

Another strong interest is to produce formulas or sequences which generate prime numbers. Some exist (see [R]) but are of limited use (since they depend on real constants whose exact value is unknown, for example, or because the effort of using them to compute new primes is worse than using the sieve of Eratosthenes). We discuss some interesting failed examples next.

Mersenne primes

Proposition 2.24. Let n ≥ 2, a ≥ 2. If $a^n - 1$ is prime then a = 2 and n is prime.

Proof.
We have that
$$a^n - 1 = (a - 1)(a^{n-1} + \cdots + a + 1);$$
so if $a^n - 1$ is prime, one of these factors must be 1. If n ≥ 2 the second factor is not 1, so we have a − 1 = 1, or a = 2. Now suppose n = rs is composite. Setting $a = 2^r$ in the expression above we have that
$$2^{rs} - 1 = (2^r - 1)(2^{r(s-1)} + \cdots + 2^r + 1)$$
which, for r, s > 1, is a nontrivial factorization of $2^n - 1$. So if $2^n - 1$ is prime then necessarily n is prime.

Mersenne guessed that $M_p = 2^p - 1$, for p a prime, would always be prime. We see that $M_p$ is prime for p ∈ {2, 3, 5, 7}. But $M_{11} = 2047 = 23 \times 89$ is composite; and so far there are only some 47 known Mersenne primes.

Definition 2.25. A number of the form $a = 2^n - 1$, n ≥ 1, is called a Mersenne number, and if a is prime then it’s called a Mersenne prime.

Fermat primes

Proposition 2.26. Let n ≥ 1, a ≥ 2. If $a^n + 1$ is prime, then a is even and $n = 2^r$ for some r ≥ 0.

Proof. Suppose $a^n + 1$ is prime; then (since $a^n + 1 \ge 3$) it is odd, so $a^n$ must be even, so a is even. A variant on our usual telescoping formula for factorization yields, for odd n, the factorization
$$a^n + 1 = (a + 1)(a^{n-1} - a^{n-2} + a^{n-3} - \cdots - a + 1),$$
which is nontrivial when n > 1; so if $a^n + 1$ is prime, then n = 1 or n is even. Furthermore, if $n = 2^q m$ for some odd m > 1, then we could apply the above identity with a replaced by $a^{2^q}$ and n replaced by m to again derive a contradiction; hence $n = 2^q$ for some q ≥ 0.

Fermat guessed that all $F_n = 2^{2^n} + 1$ were prime. We see that $F_n$ is prime for n ∈ {0, 1, 2, 3, 4}. No other prime $F_n$ has been found to date.

Definition 2.27. $F_n$ is called a Fermat number; when $F_n$ is prime it is called a Fermat prime.

Other formulas

The number $n^2 + n + 41$ is prime for all n up to 39, but n = 40 gives a composite answer. In fact, no polynomial in one variable can always yield primes (since for any n, p, h, we have f(n + ph) = f(n) + px for some integer x).
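The sieve of Eratosthenes described earlier in this section, together with the small cases quoted for Mersenne numbers, Fermat numbers and the polynomial n² + n + 41, can be checked numerically. The following Python sketch is one possible way to do so; the function name `sieve` and the bound 70000 are choices made here for illustration, and the divisibility of F_5 by 641 is a standard fact (due to Euler) that is not stated in these notes.

```python
def sieve(N):
    """Return the list of all primes p <= N by the sieve of Eratosthenes (assumes N >= 2)."""
    is_prime = [True] * (N + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= N:                       # stop once p exceeds sqrt(N)
        if is_prime[p]:                     # p is "circled"
            # Cross off multiples of p; those below p*p were crossed off earlier.
            for multiple in range(p * p, N + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n in range(2, N + 1) if is_prime[n]]


primes = set(sieve(70000))                  # large enough to contain F_4 = 65537

# Mersenne numbers M_p = 2^p - 1 are prime for p = 2, 3, 5, 7, but M_11 = 23 * 89.
mersenne_ok = all((2**p - 1) in primes for p in (2, 3, 5, 7))
m11_composite = (2**11 - 1) == 23 * 89

# Fermat numbers F_n = 2^(2^n) + 1 are prime for n = 0, ..., 4; F_5 is divisible by 641.
fermat_ok = all((2**(2**n) + 1) in primes for n in range(5))
f5_divisible_by_641 = (2**32 + 1) % 641 == 0

# n^2 + n + 41 is prime for n = 0, ..., 39, while n = 40 gives 41^2.
euler_ok = all((n * n + n + 41) in primes for n in range(40))
euler_fails_at_40 = (40 * 40 + 40 + 41) == 41 * 41

print(mersenne_ok, m11_composite, fermat_ok, f5_divisible_by_641, euler_ok, euler_fails_at_40)
```

Starting the inner loop at p·p rather than 2p is a common shortcut: any smaller multiple of p has a prime factor less than p and has therefore already been crossed off.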
So we don’t have formulas for primes, but: Finding huge primes isn’t too tough these days. By the Prime Number Theorem, you know about how many odd numbers of a given size you ought to generate at random to ensure that at least one is prime; and then there are many efficient probabilistic algorithms (even a deterministic polynomial time algorithm!) that tell you if a given number is prime. We’ll come back to this a bit later, when we see some of the ways that prime numbers essentially self-identify on tests.

2.8 Alternate proof of the infinitude of primes: the zeta function

Definition 2.28. The zeta function is defined on all real numbers s with s > 1 by:
$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.$$

We know from analysis that this series converges for all s > 1, but when s = 1 it is the harmonic series
$$\sum_{n=1}^{\infty} \frac{1}{n}$$
which diverges. In some cases its explicit value has been computed; for example Euler proved that $\zeta(2) = \pi^2/6$.

The zeta function is related to primes by the following theorem of Euler:

Theorem 2.29. For any s > 1,
$$\sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}}.$$

Proof. Recall the geometric series:
$$\sum_{n=0}^{\infty} x^n = \frac{1}{1 - x}$$
whenever |x| < 1. Setting $x = p^{-s} < 1$, we have
$$\frac{1}{1 - p^{-s}} = 1 + p^{-s} + p^{-2s} + \cdots \ge 1.$$
So the right hand side is
$$\prod_{p \text{ prime}} \left(1 + p^{-s} + p^{-2s} + \cdots\right)$$
but what is an infinite product of an infinite sum? Well, denote the kth prime by $p_k$ (so $p_1 = 2$, $p_2 = 3$, etc). Then we define this expression as:
$$\lim_{m \to \infty} \prod_{k=1}^{m} \left(1 + p_k^{-s} + p_k^{-2s} + \cdots\right).$$
Since the series on the right are absolutely convergent, we can manipulate at will:
$$\prod_{k=1}^{m} \left(1 + p_k^{-s} + p_k^{-2s} + \cdots\right) = \sum_{i_1, \ldots, i_m \ge 0} p_1^{-i_1 s} p_2^{-i_2 s} \cdots p_m^{-i_m s} = \sum_{i_1, \ldots, i_m \ge 0} \frac{1}{(p_1^{i_1} p_2^{i_2} \cdots p_m^{i_m})^s} = \sum_{n \in N_m} \frac{1}{n^s}$$
where $N_m$ is the set of all positive integers whose unique factorization includes only the first m primes. Now (a) since the elements of $N_m$ are distinct and (b) since the numbers 1 up to m are all in the set $N_m$, we have the two inequalities:
$$\zeta(s) \ge \sum_{n \in N_m} \frac{1}{n^s} \ge \sum_{n=1}^{m} \frac{1}{n^s}$$
which yields
$$\zeta(s) \ge \prod_{k=1}^{m} \left(1 + p_k^{-s} + p_k^{-2s} + \cdots\right) \ge \sum_{n=1}^{m} \frac{1}{n^s}$$
so that upon taking the limit as m → ∞, we have the sought equality by the squeeze theorem.

Corollary 2.30. There are infinitely many primes.

Proof. If the number of primes is finite, then the Euler product E(s) is a finite product, and hence computes to a finite number, for any s > 0. The geometric series used in the proof of the above theorem is convergent even for s = 1, so our argument holds, and we could conclude that ζ(1) = E(1) < ∞. But the zeta function diverges at s = 1, being the harmonic series; hence a contradiction. So E(1) must also diverge, and there are infinitely many primes.

Riemann (1826-1866) lived about 100 years after Euler (1707-1783). He considered the function ζ(s) for s ∈ C. The formula for ζ(s) converges for all s ∈ C such that Re(s) > 1. One can show that ζ can be extended (analytically continued) to a complex-valued function on all of C in the sense that there exists a meromorphic function on C which agrees with ζ(s) wherever ζ(s) converges. We call this meromorphic function the Riemann zeta function.

2.9 Exercises

1. Let S be a set of integers which is bounded above, that is, there exists a such that for all x ∈ S, x ≤ a. Deduce from the Well-Ordering Principle that S has a maximal element.
2. Find the prime factorization of 13!.
3. Show that if $n = p_1^{a_1} \cdots p_k^{a_k}$ with $p_1 < p_2 < \cdots < p_k$ and $a_1 a_2 \cdots a_k > 0$ (for example, $12 = 2^2 3^1$) then the total number of positive divisors of n is $\nu(n) = (a_1 + 1)(a_2 + 1) \cdots (a_k + 1)$.
4. Prove Lemma 2.15.
5. Define the least common multiple of a and b to be the least n ≥ 1 such that a | n and b | n. Show that lcm(a, b) divides any other common multiple of a and b, and that lcm(a, b) | ab.
6. Prove that the sieve of Eratosthenes works, that is, prove that all the circled numbers at the end of the algorithm are prime, and that these exhaust all primes up to N.
7. Prove that for any k > 0, there exist k consecutive composite integers.
8.
Show that for any n, p, h, if f is a polynomial with integer coefficients we have f (n + ph) = f (n) + px for some integer x. The binomial theorem is useful here. Use this to deduce that f (m) cannot be prime for all m. 14 Chapter 3 Modular Arithmetic A key tool in elementary number theory is to use modular arithmetic. For material in this chapter, see [KR, Ch 3]. 3.1 3.1.1 The ring Z/nZ The set Z/nZ Definition 3.1. Let a, b ∈ Z, and n ≥ 1. We say a is congruent to b mod n, and write a ≡ b (mod n), whenever n|(a − b). Thus, for example, 1 ≡ 5 (mod 4), and a ≡ 0 (mod n) exactly when n|a. For all a, b ∈ Z, a ≡ b (mod 1), so we often exclude this trivial case. Lemma 3.2. Congruence is an equivalence relation, that is, it satisfies: 1. (reflexivity) a ≡ a (mod n) 2. (symmetry) if a ≡ b (mod n) then b ≡ a (mod n). 3. (transitivity) if a ≡ b (mod n) and b ≡ c (mod n) then a ≡ c (mod n). The proof is an exercise. Since congruence is an equivalence relation, we deduce that we can partition the integers into equivalence classes according to this relation. Fix n and write a for the class of a (mod n). Then for example if n = 10 then 1 = {. . . , −19, −9, 1, 11, 21, . . .}; and this is 11 = −9 as well. We define Z/nZ = {a | a ∈ Z}. Suppose a ∈ Z; then we can apply the division algorithm to write a = qn + r with 0 ≤ r < n. Since n|(a − r), we have that a ≡ r (mod n), and so a = r. We conclude that every equivalence class contains a smallest nonnegative representative r, with 0 ≤ r < n. No two of these are in the same congruence class. Hence we may write Z/nZ = {0, 1, . . . , n − 1}. 15 End of lecture # 4 3.1.2 Arithmetic on Z/nZ Let us consider some properties and operations on Z, and see which ones descend to define properties and operations on Z/nZ. Throughout, let us assume n > 1 to avoid the trivial case. Proposition 3.3. Let a, b, c, d ∈ Z and suppose n ≥ 2. If a ≡ b (mod n) and c ≡ d (mod n) then • a + c ≡ b + d (mod n), and • ac ≡ bd (mod n). 
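As a quick sanity check (not a proof) of the two congruences just stated, here is a minimal Python sketch that tests them by brute force over a small window of integers. The function name `check_congruence_arithmetic` and the search bound are illustrative choices, not part of the notes.

```python
from itertools import product


def check_congruence_arithmetic(n, bound=6):
    """Brute-force check, for integers in [-bound, bound], that
    a = b (mod n) and c = d (mod n) imply
    a + c = b + d (mod n) and a*c = b*d (mod n).

    This only tests finitely many cases; it illustrates the
    statement and does not replace the proof.
    """
    values = range(-bound, bound + 1)
    for a, b, c, d in product(values, repeat=4):
        # n | (a - b) is exactly the definition of a = b (mod n).
        if (a - b) % n == 0 and (c - d) % n == 0:
            if (a + c - (b + d)) % n != 0:
                return False
            if (a * c - b * d) % n != 0:
                return False
    return True
```

Calling `check_congruence_arithmetic(7)` tests every quadruple in the window for the modulus 7; finding no counterexample is what the proposition predicts.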
Consequently, the operations of addition and multiplication mod $n$ are well-defined, that is, we may define $\bar{a} + \bar{b} := \overline{a + b}$ and $\bar{a} \cdot \bar{b} := \overline{ab}$ because this is independent of the representatives ($a$ and $b$) chosen.

The proof is an exercise. So we write $\bar{4} + \bar{5} = \bar{2} \in \mathbb{Z}/7\mathbb{Z}$, or $1 + 1 \equiv 0 \pmod{2}$.

Corollary 3.4. Let $a, b, c \in \mathbb{Z}$ and $n \geq 2$. The following properties of arithmetic hold in $\mathbb{Z}/n\mathbb{Z}$.
1. addition is commutative: $\bar{a} + \bar{b} = \bar{b} + \bar{a}$
2. addition is associative: $(\bar{a} + \bar{b}) + \bar{c} = \bar{a} + (\bar{b} + \bar{c})$
3. there is a zero element: $\bar{a} + \bar{0} = \bar{a}$
4. every element has an additive inverse: $\bar{a} + \overline{-a} = \bar{0}$, where $\overline{-a} = \overline{-1} \cdot \bar{a}$
5. multiplication is commutative: $\bar{a} \cdot \bar{b} = \bar{b} \cdot \bar{a}$
6. multiplication is associative: $(\bar{a}\bar{b})\bar{c} = \bar{a}(\bar{b}\bar{c})$
7. there is a multiplicative unit: $\bar{a}\,\bar{1} = \bar{a}$
8. multiplication distributes over addition: $\bar{a}(\bar{b} + \bar{c}) = \bar{a}\bar{b} + \bar{a}\bar{c}$.
However, in general not every nonzero element of $\mathbb{Z}/n\mathbb{Z}$ has a multiplicative inverse, nor does the cancellation property of multiplication necessarily hold.

Proof. These properties all hold for integer arithmetic, and hence hold in modular arithmetic since we may check both sides using any representative of each equivalence class. To prove the final assertions, it suffices to give examples. So for example, $\bar{2} \neq \bar{0}$ in $\mathbb{Z}/4\mathbb{Z}$ has no multiplicative inverse, since this would require the existence of an integer $a$ such that $2a \equiv 1 \pmod{4}$, meaning $4 \mid (2a - 1)$. But $2a - 1$ is odd; impossible. We also have in $\mathbb{Z}/15\mathbb{Z}$ that $\bar{5} \cdot \bar{4} = \overline{20} = \bar{5} = \bar{5} \cdot \bar{1}$; hence since $\bar{4} \neq \bar{1}$, cancellation fails.

This corollary identifies $\mathbb{Z}/n\mathbb{Z}$ as a ring, which is a well-behaved algebraic object, and will be pleasant to work with (particularly since it is finite, a nice advantage over $\mathbb{Z}$).

Example 3.5. Let $n = 9$. Then we calculate
\[ 10 \equiv 1 \pmod{9}, \qquad 100 \equiv 1 \pmod{9}, \qquad 1000 \equiv 1 \pmod{9}. \]
Hence
\[ \overline{486} = \bar{4}\,\overline{100} + \bar{8}\,\overline{10} + \bar{6} = \bar{4} + \bar{8} + \bar{6} = \overline{18} = \overline{10} + \bar{8} = \bar{1} + \bar{8} = \bar{9} = \bar{0} \]
which is just the familiar rule of casting out nines. One can formulate the rules for divisibility by 2 and 5 in the same way.

Example 3.6. Let $n = 4$.
Then $100 \equiv 0 \pmod{4}$, so $234248972 \equiv 72 \pmod{4}$, whence divisibility by 4 is determined by the last two digits.

More generally, one can use the properties to simplify what at first seem monstrous calculations.

Example 3.7. Compute $2^{32}$ mod 11. By this we mean: find the least nonnegative representative of the mod 11 congruence class of $2^{32}$.

One option: compute $2^{32}$ (around 4 billion) and then divide by 11 to find the remainder.

Second option: we note that $2^5 = 32 \equiv -1 \pmod{11}$ and so
\[ \overline{2^{10}} = \left(\overline{2^5}\right)^2 = \left(\overline{-1}\right)^2 = \bar{1} \]
whence
\[ \overline{2^{32}} = \left(\overline{2^{10}}\right)^3 \overline{2^2} = \bar{1} \cdot \bar{4} = \bar{4} \]
so the answer is 4.

We can also use modular arithmetic to change questions about all integers to questions about a finite set, which can then be answered exhaustively.

Example 3.8. Prove that if $n$ is an odd integer then $n^2 - 1$ is a multiple of 8.

One option: consider prime factorization.

Another option: Let’s rephrase this as a question in modular arithmetic. We wish to show that for any odd integer $n$, $n^2 \equiv 1 \pmod{8}$. Thus it suffices to show that, in $\mathbb{Z}/8\mathbb{Z}$, $\bar{n}^2 = \bar{1}$ holds for $\bar{n} \in \{\bar{1}, \bar{3}, \bar{5}, \bar{7}\}$. It does (the squares 1, 9, 25, 49 each leave remainder 1 on division by 8), and so we are done.

Example 3.9. Show that all integer solutions to $x^2 + y^2 + z^2 = xyz$ are such that $x$, $y$ and $z$ are divisible by 3.

Solution: It suffices to show that if equality holds mod 3, then $\bar{x}$, $\bar{y}$ and $\bar{z}$ are all $\bar{0}$. We can fill in a table of all possible combinations. If any of $\bar{x}$, $\bar{y}$ or $\bar{z}$ is $\bar{0}$, then the right side is $\bar{0}$ and the left side is a sum of 0, 1 or 2 nonzero squares. But the only squares mod 3 are $\bar{0}$ and $\bar{1}$, so equality can only hold if all are $\bar{0}$. Now suppose that none are $\bar{0}$; then the right side is nonzero while the left side is $\bar{1} + \bar{1} + \bar{1} = \bar{0}$, a contradiction. Hence the only solution is all $\bar{0}$, as required.

Exercise: Show that if $m \equiv 3 \pmod{4}$, then $m$ cannot be the sum of two squares.

Exercise: Find a formula for $2^{4k}$ mod 5.

Remark 3.10. Let us conclude this section by listing some properties which do not descend to properties of $\mathbb{Z}/n\mathbb{Z}$.
That is, the following statements and symbols are undefinable:
• $\bar{a}$ is prime
• $\bar{a}$ is even/odd (unless $n$ is even)
• $(\bar{a}, \bar{b})$
• $\bar{a}^{\bar{b}}$
And there are many more.

3.1.3 Cancellation and invertibility in Z/nZ

So far we have held $n$ fixed and considered operations on the set of classes mod $n$.

Lemma 3.11. If $a \equiv b \pmod{n}$ and $d \mid n$ then $a \equiv b \pmod{d}$.

This is clear from the definition, and the transitivity of divisibility.

Proposition 3.12. Suppose $a, b, c \in \mathbb{Z}$ with $c \neq 0$, and $n \geq 1$. Then
\[ ac \equiv bc \pmod{n} \quad \Rightarrow \quad a \equiv b \pmod{\frac{n}{(n, c)}}. \]

Proof. Set $d = (c, n)$ and suppose $ac \equiv bc \pmod{n}$. Then $n \mid (ac - bc)$, which implies $n \mid c(a - b)$. Consequently,
\[ \frac{n}{d} \;\Big|\; \frac{c}{d}(a - b) \]
since these fractions are integers. Now by a previous lemma, $(n/d, c/d) = 1$, which allows us to conclude that
\[ \frac{n}{d} \;\Big|\; (a - b) \]
whence the result.

We deduce that the cancellation law holds for those $c$ for which $(n, c) = 1$ holds. (Exercise: check that this property is independent of the choice of representative $c$; this is implicit in the proof of the preceding lemma.) The subset of such $c$ is of particular importance to us.

Definition 3.13. Let $n \geq 2$ and set $U_n = \{\bar{a} \in \mathbb{Z}/n\mathbb{Z} \mid (a, n) = 1\}$; we call this the group of units of $\mathbb{Z}/n\mathbb{Z}$, and it is sometimes denoted $(\mathbb{Z}/n\mathbb{Z})^*$.

For example, $U_8 = \{\bar{1}, \bar{3}, \bar{5}, \bar{7}\}$ whereas $U_5 = \{\bar{1}, \bar{2}, \bar{3}, \bar{4}\}$.

Definition 3.14. Let $n \geq 2$. The number of elements in $U_n$ is denoted $\varphi(n)$. We set $\varphi(1) = 1$. The function $\varphi$ is called Euler’s totient function, or sometimes just Euler’s phi function.

For example, we have $\varphi(2) = 1$, $\varphi(3) = 2$, $\varphi(4) = 2$, $\varphi(5) = 4$, $\varphi(6) = 2$, $\varphi(7) = 6$, $\varphi(8) = 4$, . . .

Is there an easier way to compute this function, rather than listing the elements of $U_n$?

Lemma 3.15. We have
• If $p$ is prime, then $\varphi(p) = p - 1$ and for any $k \geq 1$, $\varphi(p^k) = p^k - p^{k-1}$.
• If $m, n \geq 1$ and $(m, n) = 1$ then $\varphi(mn) = \varphi(m)\varphi(n)$.

Proof. First suppose that $p$ is prime. Then $p$ is relatively prime to all positive integers less than $p$, whence $|U_p| = p - 1 = \varphi(p)$. Now let $k \geq 1$.
An integer $x$ with $0 \leq x < p^k$ fails to be relatively prime to $p^k$ if and only if $p$ divides $x$. There are exactly $p^{k-1}$ nonnegative multiples of $p$ less than $p^k$ (including 0), so $\varphi(p^k) = p^k - p^{k-1}$.

Finally, suppose $m, n \geq 2$ and $(m, n) = 1$. (The case that either is 1 is trivial.) Note that $(a, mn) = 1$ iff $(a, m) = 1$ and $(a, n) = 1$. Form a matrix of all the integers from 0 to $mn - 1$, as follows:
\[
\begin{array}{ccccc}
0 & 1 & 2 & \cdots & m-1 \\
m & m+1 & m+2 & \cdots & 2m-1 \\
2m & 2m+1 & 2m+2 & \cdots & 3m-1 \\
3m & 3m+1 & 3m+2 & \cdots & 4m-1 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
(n-1)m & (n-1)m+1 & (n-1)m+2 & \cdots & nm-1
\end{array}
\]
Consider the $j$th column (for $0 \leq j \leq m - 1$), whose entries are
\[ j, \; m + j, \; 2m + j, \; \cdots, \; (n-1)m + j. \]
• We see that either each of these numbers is relatively prime to $m$, or else none of them are, since $(km + j, m) = (j, m)$. Therefore there are $\varphi(m)$ columns in which all elements are relatively prime to $m$.
• Also, no two of these elements can be congruent modulo $n$, since $km + j \equiv \ell m + j \pmod{n}$ implies $km \equiv \ell m \pmod{n}$ which, since $(m, n) = 1$, implies $k \equiv \ell \pmod{n}$. But since $0 \leq k, \ell < n$, this is impossible unless $k = \ell$. Hence the $n$ elements in the $j$th column represent the $n$ different congruence classes modulo $n$. Of these, there are $\varphi(n)$ which are relatively prime to $n$ (but we can’t say in which order they appear in the column).

We have identified $\varphi(m)$ columns in which all elements are relatively prime to $m$, and within each of these, $\varphi(n)$ elements are also relatively prime to $n$. This gives exactly $\varphi(m)\varphi(n)$ elements in the table which are relatively prime to both $m$ and $n$, and hence to $mn$.

Example 3.16. $\varphi(49) = 49 - 7 = 42$; $\varphi(15) = \varphi(5)\varphi(3) = 8$.

Exercise: find $\varphi(360)$ and $\varphi(100)$.

Now that we know how many elements are in the group of units $U_n$, what else can we say about this set?

Proposition 3.17. For $n \geq 2$, $U_n$ is a group under the multiplication in $\mathbb{Z}/n\mathbb{Z}$. That is, it is closed under multiplication, contains $\bar{1}$, and each element $\bar{a}$ of $U_n$ has a multiplicative inverse.

We note that commutativity and associativity of multiplication in $U_n$ follow from that of multiplication in $\mathbb{Z}/n\mathbb{Z}$.

Proof.
Clearly $(1, n) = 1$ so $\bar{1} \in U_n$. Closure: if $(a, n) = 1$ and $(b, n) = 1$ then $(ab, n) = 1$, whence if $\bar{a}, \bar{b} \in U_n$, we have $\bar{a}\bar{b} \in U_n$. Multiplicative inverse: If $(a, n) = 1$ then there exist $x, y \in \mathbb{Z}$ such that $ax + ny = 1$. This means that modulo $n$ we have $\overline{ax + ny} = \bar{1}$, whence $\bar{a}\bar{x} = \bar{1}$. The equation $ax + ny = 1$ assures us that $(x, n) = 1$, so $\bar{x} \in U_n$ is the multiplicative inverse we sought.

Example 3.18. In $U_5$, we have $\bar{1}^{-1} = \bar{1}$, $\bar{2}^{-1} = \bar{3}$ and $\bar{4}^{-1} = \bar{4}$; the remaining inverse follows from the identity $(\bar{a}^{-1})^{-1} = \bar{a}$. In $U_{31}$ we have $\overline{21}^{-1} = \bar{3}$, by a previous calculation. In $U_8$, we have $\bar{5}^{-1} = \bar{5}$; in fact all elements are self-inverse.

3.2 Exercises

1. Give examples to show that each of the properties and each of the symbols in Remark 3.10 is undefined, by showing that these concepts depend on the choice of representative of the given class.

2. Is the sum of three consecutive cubes always divisible by 9?

3. If $a \equiv b \pmod{n}$, does it follow that $(a, m) = (b, m)$?

4. Prove the formula for $n > 1$:
\[ \varphi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right). \]
That is, take all primes dividing $n$, compute $1 - 1/p$ for each of these, multiply them all together, and multiply the result by $n$. Here $\prod$ is the product symbol, and the subscript indicates that the product runs over the set of prime divisors of $n$.

Bibliography

[KR] Ramanujachary Kumanduri, Cristina Romero, Number Theory with Computer Applications, Prentice Hall, 1998.

[R] Paulo Ribenboim, The Little Book of Big Primes, Springer, 1991.