MATH 537 Class Notes
Ed Belk
Fall, 2014

1 Week One

1.1 Lecture One

Instructor: Greg Martin, Office Math 212
Text: Niven, Zuckerman & Montgomery

Conventions: N will denote the set of positive integers, and N_0 the set of nonnegative integers. Unless otherwise stated, all variables are assumed to be elements of N.

§1.2 – Divisibility

Definition: Let a, b ∈ Z with a ≠ 0. Then a is said to divide b, denoted a|b, if there exists some c ∈ Z such that ac = b. If in addition a ∈ N, then a is called a divisor of b.

Properties of Divisibility: For all a, b, c ∈ Z with a ≠ 0, one has:
• If a|b then ±a | ±b
• 1|b, b|b, a|0
• If a|b and b|a then a = ±b
• If a|b and a|c, then a|(bx + cy) for any x, y ∈ Z
If we assume that a and b are positive, we also have
• If a|b then a ≤ b

The Division Algorithm: Let a, b ∈ N. Then there exist unique q, r ∈ N_0 such that:
1. b = aq + r, and
2. 0 ≤ r < a

Proof : We prove existence first; consider the set R = {b − an : n ∈ N_0} ∩ N_0, which is nonempty since it contains b. By the well-ordering axiom, R has a least element r, and we define q to be the nonnegative integer such that b − aq = r. Then b = aq + r and r ≥ 0; moreover, if r ≥ a then one has
0 ≤ r − a = (b − aq) − a = b − a(q + 1) < b − aq = r,
so that r − a ∈ R, contradicting the minimality of r ∈ R. Hence r < a, and we are done.

Now, suppose q′ and r′ are such that we have b = aq + r = aq′ + r′ with 0 ≤ r, r′ < a. Without loss of generality we may assume that r ≥ r′. Then
r − r′ = (b − aq) − (b − aq′) = a(q′ − q) ⇒ a|(r − r′);
but 0 ≤ r − r′ ≤ r < a, and so the above equation is a contradiction unless r − r′ = 0. Then also q = q′, and the result is immediate.

Greatest Common Divisor: Given any two integers a and b not both equal to zero, we define their greatest common divisor (commonly abbreviated gcd) to be the largest d ∈ N such that d|a and d|b; we write d = (a, b). Note that because a and b are not both zero, they have only finitely many common divisors, so the gcd is always well-defined.

Theorem 1.1.1 Let a, b ∈ Z, not both equal to zero. Then:

1.
(a, b) = min S, where S = {ax + by : x, y ∈ Z} ∩ N, and
2. For any c ∈ Z such that c|a and c|b, we have c|(a, b).

The existence of integers x, y so that ax + by = (a, b) as in part (1) is known as Bézout’s identity.

Proof :
1. Let m = min S, with u and v such that m = au + bv, and let g = (a, b); note that m ≤ |a| if a ≠ 0. Since g|a and g|b, we know from the properties of divisibility that g|m and so g ≤ m. Now, if m ∤ a then by the division algorithm we may write a = mq + r with 0 < r < m, and thus
r = a − mq = a − q(au + bv) = a(1 − qu) + b(−qv) ∈ S,
and we deduce that r ≥ m = min S, a contradiction; thus m|a. In the same fashion we show m|b, and so by definition m ≤ (a, b) = g, and we are done.
2. If c|a and c|b, then we know c|(ax + by) for every x, y ∈ Z, and in particular for those u, v such that (a, b) = au + bv, whose existence is guaranteed by part 1.

1.2 Lecture Two

Recall: Bézout’s identity states that (a, b) is the smallest positive integer that may be written ax + by, where x, y ∈ Z.

Proposition 1.2.1 For a, b, m ∈ N, one has (ma, mb) = m(a, b).

Corollary 1: If d|a and d|b, then (a/d, b/d) = (1/d)(a, b); in particular, (a/(a, b), b/(a, b)) = 1.

Proof (of Proposition 1.2.1): Set g = (a, b), so that we may write ax + by = g, for some x, y ∈ Z. Then mg = (ma)x + (mb)y, thus mg ≥ (ma, mb). Furthermore, g|a and so mg|ma; similarly mg|mb, thus mg ≤ (ma, mb), and we are done.

Definition: Two integers a and b are called relatively prime (or coprime) if (a, b) = 1.

nb. We observe that (a, b) = 1 if and only if there exist x, y ∈ Z such that ax + by = 1. The corresponding statement with (a, b) = k > 1 is not, in general, true; however, it is the case that ax + by = k ⇒ (a, b)|k.

Proposition 1.2.2 If (a, n) = (b, n) = 1, then (ab, n) = 1.

Proof : Suppose we have u, v, x, y so that au + nv = bx + ny = 1; then we have
1 = 1 · 1 = (au + nv)(bx + ny) = ab(ux) + n(auy + bvx + nvy),
and the result is immediate.

[Aside: Compare with the analogous result in commutative algebra.
If R is a commutative, unital ring and I, J, K ⊂ R are ideals such that I + K = J + K = R, then IJ + K = R.]

Proposition 1.2.3 If a|c, b|c, and (a, b) = 1, then ab|c. (Note that this is not, in general, true for (a, b) > 1, e.g. a = b = c = 2.)

Proof : Choose m, n, x, y so that c = am = bn and ax + by = 1. Then
c = cax + cby = (bn)ax + (am)by = ab(nx + my),
and we deduce that ab|c.

Theorem 1.2.4 (Theorem 1.10, Niven) If d|ab and (b, d) = 1, then d|a.

Proof : Exercise.

nb. If d|a and d|b, then d|(b + ax) for any x ∈ Z. Conversely, if d|a and d|(b + ax), then d|b, since b = (b + ax) − x · a. Hence a and b have the same common divisors as a and b + ax, and in particular (a, b) = (a, b + ax).

The Euclidean Algorithm: How can we find the gcd of two integers, for example 537 and 105? By the division algorithm, we have
537 = 5 · 105 + 12,
and so by the above note we know (537, 105) = (105, 12). Repeating this process, we see
105 = 8 · 12 + 9 ⇒ (105, 12) = (12, 9);
12 = 1 · 9 + 3 ⇒ (12, 9) = (9, 3);
9 = 3 · 3 + 0 ⇒ (9, 3) = (3, 0) = 3.
Thus (537, 105) = 3.

Notation: The least common multiple of a and b is denoted lcm(a, b) or, more commonly, [a, b].

Exercise: Show that (a, b)[a, b] = ab.

§1.3 – Primes

Definition: A natural number n is called prime if it has exactly two divisors. n is called composite if there exists some d with 1 < d < n such that d|n. The integer n = 1 is neither prime nor composite.

Notation: Unless otherwise stated, p will denote a prime number.

Lemma 1.2.5 (Euclid’s lemma) If p|ab, then p|a or p|b.

Proof : Suppose p ∤ b. Then (p, b) = 1, and so by theorem 1.2.4 we know that p|a.

Theorem 1.2.6 (The Fundamental Theorem of Arithmetic) Every n ∈ N with n ≥ 2 may be written as a product of primes; moreover this expression is unique up to reordering of the factors.

Proof : (existence) We use strong induction. The case n = 2 is trivial from the definition of a prime, therefore suppose n > 2. If n is prime we have the trivial factorization n = n; otherwise we may write n = ab, with 1 < a < n and 1 < b < n.
By the inductive hypothesis we may write a = p_1 p_2 ⋯ p_k, b = q_1 q_2 ⋯ q_l, with each p_i, q_j prime, and the result is immediate.

(uniqueness) Let n ∈ N and suppose we have n = p_1 p_2 ⋯ p_k = q_1 q_2 ⋯ q_l, each p_i, q_j prime. Since p_1 | q_1 q_2 ⋯ q_l we have by lemma 1.2.5 that p_1|q_1 or p_1 | q_2 ⋯ q_l. Repeating this process as many times as necessary, we find q_t such that p_1|q_t, and by relabelling the q_j if necessary we will assume t = 1. Since p_1 ≠ 1 this implies that p_1 = q_1, as q_1 has no other factors. We then cancel p_1 = q_1 on both sides of the equation and we have p_2 p_3 ⋯ p_k = q_2 q_3 ⋯ q_l. We apply the same argument to this expression to obtain p_2 = q_2, p_3 = q_3, and so on; it follows that k = l, and we are done.

2 Week Two

2.1 Lecture Three

Doing a linear algebra problem backwards. Consider the augmented matrix
[ 1  0 | 537 ]
[ 0  1 | 105 ];
this system clearly has solution (x, y) = (537, 105). Moreover, from basic linear algebra we know that the application of elementary row operations to this augmented system will not change the solution; with R_1, R_2 respectively denoting the first and second row of the matrix, we therefore observe that (x, y) = (537, 105) is also a solution to the augmented matrices

[ 1  −5 |  12 ]
[ 0   1 | 105 ]   (R_1 → R_1 − 5R_2),

[  1  −5 | 12 ]
[ −8  41 |  9 ]   (R_2 → R_2 − 8R_1),

[  9  −46 | 3 ]
[ −8   41 | 9 ]   (R_1 → R_1 − R_2),

[   9  −46 | 3 ]
[ −35  179 | 0 ]   (R_2 → R_2 − 3R_1).

Thus we have the matrix equation
[   9  −46 ] [ 537 ]   [ 3 ]
[ −35  179 ] [ 105 ] = [ 0 ].
The first entry of this equation indicates that 9(537) + (−46)(105) = 3 = (537, 105), while the entries in the second row of the matrix are −35 = −105/(537, 105) and 179 = 537/(537, 105). This operation is known as the extended Euclidean algorithm.

Lemma 2.1.1 Let a, b ∈ N and use the division algorithm to write b = aq + r with 0 ≤ r < a. Then a|b if and only if r = 0.

Proof : If r = 0 then b = aq and we are done. Conversely, if a|b then a|(b − ax) for every x, and in particular a|r since r = b − aq; because 0 ≤ r < a, we must have r = 0.
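The row-reduction computation of the extended Euclidean algorithm above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `extended_gcd` and the tuple representation of the rows are assumptions made here, not notation from the notes. Each row (x, y, r) keeps the invariant a·x + b·y = r, exactly as each row of the augmented matrix does.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b).

    Assumes a, b are nonnegative integers, not both zero.
    (Hypothetical helper; the name is not from the notes.)
    """
    # Each row (x, y, r) satisfies the invariant a*x + b*y == r,
    # mirroring a row of the augmented matrix [x  y | r].
    row1 = (1, 0, a)
    row2 = (0, 1, b)
    while row2[2] != 0:
        q = row1[2] // row2[2]
        # Row operation R1 -> R1 - q*R2, then swap the roles of the rows.
        row1, row2 = row2, tuple(u - q * v for u, v in zip(row1, row2))
    return row1[2], row1[0], row1[1]


g, x, y = extended_gcd(537, 105)
print(g, x, y)  # 3 9 -46, matching 9(537) + (-46)(105) = 3
```

When the loop ends, the discarded row is (−35, 179, 0) for this input, which reproduces the second row of the final matrix in the notes.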
Theorem 2.1.2 (Euclid’s theorem) There are infinitely many prime numbers.

Proof : It suffices to show that every finite list of primes excludes at least one prime number. Let {p_1, p_2, . . . , p_k} be a set of finitely many primes and let N = p_1 p_2 ⋯ p_k + 1. Then N ≥ 2 and so by the fundamental theorem of arithmetic N is a product of primes, so there exists some prime p such that p|N. Applying the division algorithm with N and any p_j yields
N = p_j (p_1 ⋯ p_{j−1} p_{j+1} ⋯ p_k) + 1,
which (since 1 < p_j) by lemma 2.1.1 implies that p_j ∤ N for any j. Thus we deduce that p ≠ p_j for any j = 1, 2, . . . , k, and therefore that the set of primes {p_1, p_2, . . . , p_k} is not exhaustive.

§2.1 – Congruences

Definition: Let m ∈ Z, m ≠ 0. Given a, b ∈ Z, we say that a is congruent to b modulo m, written a ≡ b mod m, if m|(b − a).

For example, we have 53 ≡ 7 mod 23, but 5 ≢ 37 mod 23.

Lemma 2.1.3 For fixed m ≠ 0, “congruence modulo m” is an equivalence relation.

Proof : Clearly a ≡ a mod m because m|0 = a − a, which proves reflexivity. Symmetry is an immediate consequence of the fact that m|(b − a) ⇔ m|(a − b), and to prove transitivity we observe that
a ≡ b mod m, b ≡ c mod m ⇒ m|(b − a), m|(c − b) ⇒ m | (c − b) + (b − a) = (c − a),
and we are done.

Thus in particular, congruence modulo m (as every equivalence relation) partitions Z into equivalence classes, called residue classes modulo m. For example, one residue class modulo 23 is the set
{. . . , −39, −16, 7, 30, 53, . . .}.
In general, a residue class modulo m is of the form {a + km : k ∈ Z}. Note in particular that a ≡ b mod m if and only if a and b have the same remainder when divided by m.

Lemma 2.1.4 Suppose a ≡ b mod m, c ≡ d mod m. Then:
1. If n|m then a ≡ b mod n,
2. a + c ≡ b + d mod m,
3. ac ≡ bd mod m.

Proof : We prove only (3), as the others are clear from the definitions: since m|(b − a) and m|(d − c), we must have that m divides
c(b − a) + b(d − c) = bd − ac,
and the result follows.
The last two parts of lemma 2.1.4 imply further that a − c ≡ b − d mod m, and more generally, if f(X) ∈ Z[X], then f(a) ≡ f(b) mod m whenever a ≡ b mod m. In particular, we have that a^k ≡ b^k mod m for any k ∈ N.

Question: If j ≡ k mod m, do we have a^j ≡ a^k mod m? In general, no: some counterexamples include a = 2, m = 3 or a = 2, m = 4.

We have seen that the operations of addition, subtraction, and multiplication behave well with respect to congruence modulo m; does division? Again, in general the answer is no: 18 ≡ 28 mod 10, but 9 ≢ 14 mod 10, as we might expect if we were allowed to “divide by 2.”

Theorem 2.1.5 (Theorem 2.3, Niven) We have ax ≡ ay mod m if and only if x ≡ y mod m/(a, m). In particular, if (a, m) = 1 then ax ≡ ay mod m ⇔ x ≡ y mod m.

Proof : Suppose ax ≡ ay mod m so that m|a(y − x); then we have m/(a, m) | (a/(a, m))(y − x), and since (m/(a, m), a/(a, m)) = 1 we know that m/(a, m) | (y − x), hence x ≡ y mod m/(a, m). Now, suppose x ≡ y mod m/(a, m) so that m/(a, m) | (y − x). Then we certainly have a · m/(a, m) | a(y − x), hence (a/(a, m)) · m | a(y − x) and so in particular m|a(y − x), and we are done.

Definition: Given m ∈ Z, m ≠ 0, a complete residue system modulo m is a set containing exactly one element from each residue class modulo m. For example, with m = 5 we may take any of the sets {0, 1, 2, 3, 4}, {1, 2, 3, 4, 5}, {−2, −1, 0, 1, 2}, or {−17, 60, 101, 12, −111}. A reduced residue system is a set of representatives from all residue classes relatively prime to m; continuing in the same example, we may take {1, 2, 3, 4} or {537, −7, 1, 99999929}.

2.2 Lecture Four

Recall: A reduced residue system modulo m is a set consisting of exactly one element from each residue class modulo m whose elements are relatively prime to m; these are called reduced residue classes. Equivalently, we may take any complete residue system modulo m, and discard all elements d such that (d, m) > 1.
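The two definitions just recalled are easy to test mechanically. The following Python sketch checks whether a given list of integers is a complete or a reduced residue system; the function names are illustrative assumptions, not notation from the notes, and the check simply reduces every element modulo m and compares against the expected set of classes.

```python
from math import gcd


def is_complete_residue_system(values, m):
    """True if `values` has exactly one element in each residue class mod m."""
    m = abs(m)
    return len(values) == m and {v % m for v in values} == set(range(m))


def is_reduced_residue_system(values, m):
    """True if `values` has exactly one element in each class coprime to m."""
    m = abs(m)
    units = {r for r in range(m) if gcd(r, m) == 1}
    return len(values) == len(units) and {v % m for v in values} == units


# The less obvious examples from the notes, with m = 5:
print(is_complete_residue_system([-17, 60, 101, 12, -111], 5))  # True
print(is_reduced_residue_system([537, -7, 1, 99999929], 5))     # True
```

Python's `%` always returns a value in the range 0, . . . , m − 1 for positive m, which is why negative representatives such as −17 need no special handling here.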
Example: If m = 10, a complete residue system is given by {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; by discarding all elements not relatively prime to 10, we obtain the reduced residue system {1, 3, 7, 9}. If m is prime, a reduced residue system is given by {1, 2, . . . , m − 1}.

Definition: The Euler φ-function (or Euler totient function) is the function which assigns to m ∈ N the cardinality of a reduced residue system modulo m; that is,
φ(m) = #{1 ≤ i ≤ m : (i, m) = 1}.
For example, φ(10) = 4, and φ(p) = p − 1 for any prime p.

Lemma 2.2.1 Let {r_1, r_2, . . . , r_{φ(m)}} be a reduced residue system modulo m and let a ∈ Z with (a, m) = 1. Then {ar_1, ar_2, . . . , ar_{φ(m)}} is also a reduced residue system modulo m.

For example, with m = 10, a = 13, we see that {13, 39, 91, 117} = {13 · 1, 13 · 3, 13 · 7, 13 · 9} is a reduced residue system modulo 10.

Proof : By assumption a and each r_j are relatively prime to m, and so each ar_j is also relatively prime to m. Moreover, if ar_i, ar_j lie in the same residue class, then one has ar_i ≡ ar_j mod m. By theorem 2.1.5, we may cancel a (which is relatively prime to the modulus) to yield the congruence r_i ≡ r_j mod m, and hence (since we began with a reduced residue system) we know that i = j. Thus the ar_j represent φ(m) distinct reduced residue classes, and the result is immediate.

Theorem 2.2.2 (Euler’s theorem) If (a, m) = 1, then a^{φ(m)} ≡ 1 mod m.

Proof : Let {r_1, r_2, . . . , r_{φ(m)}} be a reduced residue system modulo m. Then by lemma 2.2.1, the elements ar_1, ar_2, . . . , ar_{φ(m)} are congruent (in some order) to the elements r_1, r_2, . . . , r_{φ(m)}, and therefore
r_1 r_2 ⋯ r_{φ(m)} ≡ (ar_1)(ar_2) ⋯ (ar_{φ(m)}) mod m
≡ a^{φ(m)} r_1 r_2 ⋯ r_{φ(m)} mod m.
Since (r_1 r_2 ⋯ r_{φ(m)}, m) = 1, we may cancel it, and the result follows.

Corollary 1: (Fermat’s little theorem) If p is prime and p ∤ a, then a^{p−1} ≡ 1 mod p, and for all a ∈ Z one has a^p ≡ a mod p.

Corollary 2: Let (a, m) = 1. If e, f ∈ N_0 satisfy e ≡ f mod φ(m), then a^e ≡ a^f mod m.
For example, 537 ≡ 1 mod 4, and since 4 = φ(10) we have that 3^537 ≡ 3^1 mod 10.

Proof : Suppose without loss of generality that f ≥ e and write f = e + kφ(m). We have
a^f = a^{e+kφ(m)} = a^e (a^{φ(m)})^k ≡ a^e (1)^k mod m ≡ a^e mod m,
as claimed.

Definition: Given a, m ∈ Z with m ≠ 0, we call x ∈ Z a (multiplicative) inverse of a modulo m if ax ≡ 1 mod m.

Theorem 2.2.3 (Theorem 2.9, Niven) If (a, m) > 1, then a has no inverse modulo m. If (a, m) = 1, then there exists a unique reduced residue class modulo m which contains all inverses of a. We denote any such inverse as ā or a^{−1}.

Note that the notation a^{−1} is justified, as for example if we define a^{−k} to be (a^{−1})^k mod m, then we indeed have (a^k)^{−1} ≡ (a^{−1})^k mod m.

Proof : Let g = (a, m); note that if ax ≡ 1 mod m then ax ≡ 1 mod g, and since g|a this congruence becomes 0x ≡ 1 mod g, a contradiction unless g = 1. Thus with the assumption that g = 1, we first prove uniqueness: if ax ≡ 1 mod m and ay ≡ 1 mod m, then ax ≡ ay mod m, hence (since (a, m) = 1) x ≡ y mod m, as claimed. To show existence, we give two short proofs:
(1) By Euler’s theorem, we have 1 ≡ a^{φ(m)} mod m ≡ a · a^{φ(m)−1} mod m, so we may take a^{−1} = a^{φ(m)−1}.
(2) Since (a, m) = 1, there exist integers u, v such that au + mv = 1. Taking this equation modulo m yields the congruence au ≡ 1 mod m, and so we may take a^{−1} = u.

2.3 Lecture Five

Calculating inverses: Suppose we want to calculate the (multiplicative) inverse of 9 modulo 20; note that this inverse exists, as (9, 20) = 1. We perform the Euclidean algorithm:
20 = 9 · 2 + 2;
9 = 2 · 4 + 1
⇒ 1 = 9 − 4 · 2 = 9 − 4 · (20 − 2 · 9) = 9 · 9 − 4 · 20.
Taking this last equation modulo 20, we see that 9^2 ≡ 1 mod 20, so 9^{−1} ≡ 9 mod 20. The same equation also tells us that 20^{−1} ≡ −4 ≡ 5 mod 9. One clearly has
20^{−1} ≡ 1 mod 19, 19^{−1} ≡ −1 mod 20, 19^{−1} ≡ 1 mod 9, 9^{−1} ≡ −2 mod 19.

Definition: A collection of integers m_1, m_2, . . .
, m_r are called pairwise coprime (or pairwise relatively prime) if (m_i, m_j) = 1 for all i ≠ j.

Note that this is stronger than the statement that (m_1, m_2, . . . , m_r) = 1. For example, (6, 10, 15) = 1, but (6, 10) = 2, (6, 15) = 3, (10, 15) = 5.

Theorem 2.3.1 (Theorem 2.18, Niven; the Chinese remainder theorem) Let m_1, m_2, . . . , m_r be pairwise coprime, and let a_1, a_2, . . . , a_r be any integers. Then there exists a solution x to the system of congruences
x ≡ a_1 mod m_1,
x ≡ a_2 mod m_2,
⋮
x ≡ a_r mod m_r,
and moreover the set of all solutions is exactly the residue class of x modulo M = m_1 m_2 ⋯ m_r.

Proof : For j = 1, 2, . . . , r, let N_j = (m_1 m_2 ⋯ m_r)/m_j = M/m_j, and note that (m_j, N_j) = 1. Therefore we may define b_j to be the inverse of N_j modulo m_j, so N_j b_j ≡ 1 mod m_j. Set
x_0 = Σ_{j=1}^{r} N_j b_j a_j;
we claim that x_0 solves our system. Indeed, each N_i with i ≠ j is congruent to 0 modulo m_j, and so
x_0 ≡ (N_j b_j) a_j mod m_j ≡ a_j mod m_j,
as claimed. Now, if x ≡ x_0 mod M, then in particular for each j we have x ≡ x_0 mod m_j ≡ a_j mod m_j, so x is also a solution. Finally, if y is any solution to our system, then
y ≡ a_j mod m_j ≡ x_0 mod m_j
for every j, so m_j | (y − x_0). Since the m_i are pairwise coprime, repeated application of proposition 1.2.3 gives m_1 m_2 | (y − x_0), m_1 m_2 m_3 | (y − x_0), and so on, until we obtain M | (y − x_0), and we are done.

Remark: If m_1, m_2, . . . , m_r are not pairwise coprime, then there may be no solution, or there may be one residue class of solutions modulo [m_1, m_2, . . . , m_r]. For example, the system
x ≡ 0 mod 6,
x ≡ 1 mod 4,
has no solution, while
x ≡ 0 mod 6,
x ≡ 2 mod 4,
has as its solution the residue class of 6 modulo 12.

Example: Greg steals B boxes of 20 Timbits each. There are an equal number of each of the 9 flavours, and one extra to fill the last box. In class, he divides the Timbits equally among the 19 students, with 4 left over for himself. What is the smallest possible value of B?
Solution: Let t be the total number of Timbits; we have
t ≡ 0 mod 20,
t ≡ 1 mod 9,
t ≡ 4 mod 19.
Set m_1 = 20, m_2 = 9, m_3 = 19; then N_1 = 171, N_2 = 380, N_3 = 180. We need
b_1 ≡ N_1^{−1} mod m_1 ≡ (9 · 19)^{−1} mod 20 ≡ (9)^{−1}(19)^{−1} mod 20 ≡ 11 mod 20,
from our previous work. Similarly, b_2 ≡ 5 mod 9, b_3 ≡ −2 mod 19. Hence
x_0 = N_1 b_1 a_1 + N_2 b_2 a_2 + N_3 b_3 a_3 = (171)(11)(0) + (380)(5)(1) + (180)(−2)(4) = 460.
Since the solutions are exactly the integers congruent to 460 modulo M = 20 · 9 · 19 = 3420, the smallest possible total is t = 460, that is, B = 460/20 = 23.

Structural comments: Let Z_m = Z/mZ be the set of residue classes modulo m. If d|m, then there is a well-defined projection map π_d : Z_m → Z_d given by π_d(a mod m) = a mod d. Note that this map is not well-defined if d ∤ m.

Now, let m_1, m_2, . . . , m_r be pairwise coprime. We have a map
π : Z_{m_1 m_2 ⋯ m_r} → Z_{m_1} × Z_{m_2} × ⋯ × Z_{m_r},
given in each component Z_{m_i} by π_{m_i}. The Chinese remainder theorem gives a map
ρ : Z_{m_1} × Z_{m_2} × ⋯ × Z_{m_r} → Z_{m_1 m_2 ⋯ m_r}
such that π ◦ ρ = id. Since each set is finite and both have the same cardinality, we know that π and ρ are bijections. One can check that:
1. π and ρ respect coprimality, and
2. π and ρ respect multiplication and addition.
Hence, π and ρ are ring isomorphisms. In particular, if Z_m^× is the set of reduced residue classes modulo m, then
π : (Z_{m_1 m_2 ⋯ m_r})^× → Z_{m_1}^× × Z_{m_2}^× × ⋯ × Z_{m_r}^×
is an isomorphism of multiplicative groups. It follows from this, and the definition of the Euler φ-function, that
φ(m_1 m_2 ⋯ m_r) = φ(m_1)φ(m_2) ⋯ φ(m_r).

3 Week Three

3.1 Lecture Six

Suppose n ∈ N has prime factorization n = p_1^{α_1} p_2^{α_2} ⋯ p_r^{α_r}, with α_i > 0 and p_i ≠ p_j for all i ≠ j. Then as discussed last time, writing m_i = p_i^{α_i}, we have maps
π : Z_{m_1 m_2 ⋯ m_r} → Z_{m_1} × Z_{m_2} × ⋯ × Z_{m_r},
ρ : Z_{m_1} × Z_{m_2} × ⋯ × Z_{m_r} → Z_{m_1 m_2 ⋯ m_r},
where π = π_{p_1^{α_1}} × π_{p_2^{α_2}} × ⋯ × π_{p_r^{α_r}} and ρ is the map given by the Chinese remainder theorem. These maps are mutual inverses, and moreover are ring isomorphisms.
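The maps ρ and π can be made concrete with a short Python sketch. The function names `crt` and `project` are illustrative assumptions, not from the notes; `crt` follows the construction x_0 = Σ N_j b_j a_j from the proof of the Chinese remainder theorem, and it relies on Python's built-in `pow(N, -1, m)` (available from Python 3.8) to compute the inverse b_j.

```python
def crt(residues, moduli):
    """The map rho: return (x0, M) with x0 ≡ residues[j] mod moduli[j].

    Assumes the moduli are pairwise coprime; x0 is normalised to 0 <= x0 < M.
    """
    M = 1
    for m in moduli:
        M *= m
    x0 = 0
    for a_j, m_j in zip(residues, moduli):
        N_j = M // m_j
        b_j = pow(N_j, -1, m_j)  # inverse of N_j modulo m_j (Python 3.8+)
        x0 += N_j * b_j * a_j
    return x0 % M, M


def project(x, moduli):
    """The map pi: reduce x modulo each m_j."""
    return tuple(x % m for m in moduli)


moduli = [20, 9, 19]
x0, M = crt([0, 1, 4], moduli)
print(x0, M)                # 460 3420, as in the Timbits example
print(project(x0, moduli))  # (0, 1, 4)
```

Checking `crt(project(x, moduli), moduli)[0] == x` for every x in `range(M)` is a brute-force way to see that π and ρ are mutually inverse for this choice of moduli.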
In particular, these maps respect coprimality, and so their restrictions to their respective multiplicative groups of units yield mutually inverse group isomorphisms
π̃ : (Z_{m_1 m_2 ⋯ m_r})^× → Z_{m_1}^× × Z_{m_2}^× × ⋯ × Z_{m_r}^×,
ρ̃ : Z_{m_1}^× × Z_{m_2}^× × ⋯ × Z_{m_r}^× → (Z_{m_1 m_2 ⋯ m_r})^×.
By definition, (Z_n)^× has cardinality φ(n), and so it follows that
φ(m_1 m_2 ⋯ m_r) = φ(m_1)φ(m_2) ⋯ φ(m_r).
Thus we are led to compute φ(p^α) for prime p; but since any 1 ≤ k ≤ p^α with (p^α, k) > 1 must have p | k, we deduce that exactly the multiples of p are not relatively prime to p^α, hence
φ(p^α) = p^α − p^{α−1} = p^α (1 − 1/p).
It follows that
φ(n) = n ∏_{p|n} (1 − 1/p),
with the product running over all prime divisors p of n.

Lemma 3.1.1 Fix m ∈ N, and consider the following statements:
1. x^2 ≡ 1 mod m
2. x^{−1} ≡ x mod m
3. x ≡ ±1 mod m
For any m, one has (1) if and only if (2), and (3) implies (1). If m is prime, then all three are equivalent.

Proof : The first statement is clear, as is the statement that (3) implies (1). Thus we will assume m is prime; then one has (1) if and only if m | x^2 − 1 = (x + 1)(x − 1). By Euclid’s lemma this forces m|(x + 1) or m|(x − 1), which is (3), and the result is immediate.

We saw in the last lecture that 9^{−1} ≡ 9 mod 20, but clearly 9 ≢ ±1 mod 20. The same is true for 11 ≡ −9 mod 20.

Theorem 3.1.2 (Wilson’s theorem) If p is prime, then (p − 1)! ≡ −1 mod p.

Proof : The cases p = 2, p = 3 are clear by computation. For p > 3, we pair off the numbers {2, 3, . . . , p − 2} as {a_1, b_1, a_2, b_2, . . . , a_k, b_k}, where k = (p − 3)/2 and a_i b_i ≡ 1 mod p. We know that this is well-defined by lemma 3.1.1 (no element of this set is its own inverse), and the fact that inverses modulo p are unique. One then has
(p − 1)! = 1 · 2 ⋯ (p − 1) = 1 · (p − 1) · a_1 b_1 ⋯ a_k b_k
≡ 1 · (p − 1) · 1 · 1 ⋯ 1 mod p
≡ −1 mod p,
as claimed.

§2.2 – Solutions of congruences

How many solutions has X^4 + 2X^3 + X + 1 ≡ 0 mod 5?
As integers, we have solutions
x ∈ {. . . , −14, −13, −9, −8, −4, −3, 1, 2, 6, 7, 11, 12, . . .}.
As residue classes modulo 5, we have only x ≡ 1 mod 5 and x ≡ 2 mod 5; we say that our congruence has only 2 solutions modulo 5.

Definition: Given a polynomial f(X) ∈ Z[X], the number of solutions of f(X) ≡ 0 mod m, denoted σ_f(m), is the number of residue classes modulo m which satisfy the congruence; equivalently,
σ_f(m) = #{1 ≤ x ≤ m : f(x) ≡ 0 mod m}.

Example: Let f(X) = X^2 − 1. We saw that σ_f(20) ≥ 4, while by lemma 3.1.1 we know that if p is an odd prime then σ_f(p) = 2, while σ_f(2) = 1.

We begin our investigation by studying linear congruences of the form ax ≡ b mod m.

Theorem 3.1.3 (Theorem 2.17, Niven) Let m ∈ N and set f(X) = aX − b, a, b ∈ Z. Set g = (a, m). Then σ_f(m) = 0 unless g|b, in which case σ_f(m) = g.

Proof : If ax ≡ b mod m, then ax ≡ b mod g, i.e. 0x ≡ b mod g, since g|a, and hence we must have g|b. Now, suppose g|b and write a = αg, b = βg, m = µg. Then
ax ≡ b mod m ⇔ αx ≡ β mod µ,
by theorem 2.1.5. But (α, µ) = 1 by construction, so α^{−1} modulo µ exists, and we have the unique solution given by x ≡ α^{−1}β mod µ. This yields g = m/µ solutions modulo m, as claimed.

Example: Let m = 100 and g = 5, so that µ = 20. Then x ≡ 14 mod 20 if and only if x ≡ 14, 34, 54, 74, or 94 modulo 100.

Let m have prime factorization m = p_1^{e_1} p_2^{e_2} ⋯ p_r^{e_r}. By the Chinese remainder theorem, the congruence f(x) ≡ 0 mod m is equivalent to the system of congruences
f(x) ≡ 0 mod p_1^{e_1},
f(x) ≡ 0 mod p_2^{e_2},
⋮
f(x) ≡ 0 mod p_r^{e_r}.
In particular, this implies that
σ_f(m) = ∏_{i=1}^{r} σ_f(p_i^{e_i}),
and thus it suffices to study polynomial congruences modulo prime powers; this will be the focus of our next lecture.

3.2 Lecture Seven

Exercise: Prove that the product of any k consecutive integers is a multiple of k!.
Solution: The pigeonhole principle implies that among any k consecutive integers must be a multiple of 1, of 2, and so on up to k, but this is not quite enough, since these numbers need not be pairwise coprime. Instead, we may prove it one prime at a time, from which the general case follows. On the other hand, we may simply use the identity
j(j − 1) ⋯ (j − k + 1)/k! = j!/(k!(j − k)!) = C(j, k) ∈ Z,
where C(j, k) denotes the binomial coefficient, from which the fact is apparent; granted, the last method is a deus ex machina.

§2.6 – Prime power moduli

Lemma 3.2.1 Let f(X) ∈ C[X] have degree d. Then for any a ∈ C, we have
f(a + h) = f(a) + h f′(a) + h^2 f″(a)/2! + ⋯ + h^d f^{(d)}(a)/d!.

Proof : Fix a; both expressions above are polynomials in h of degree d, and their zeroth derivatives agree at h = 0, as do their first derivatives, second, and so on up to the dth derivatives. Thus their difference, which is a polynomial in h of degree at most d, is divisible by h^{d+1}, which implies that it vanishes identically; that is, the two expressions must, in fact, be equal.

nb. With the notion of a derivative not defined here, we instead will use the formal derivative of a polynomial or power series, i.e.
if f(X) = Σ_{n=0}^{m} a_n X^n, then f′(X) = Σ_{n=0}^{m} n a_n X^{n−1}, m ∈ N_0 ∪ {∞}.

Lemma 3.2.2 If f(X) ∈ Z[X], then for any a ∈ Z, k ∈ N, we have that f^{(k)}(a)/k! is an integer.

Proof : Write f(X) = Σ_{n=0}^{d} a_n X^n, a_n ∈ Z. Then
f^{(k)}(a)/k! = Σ_{n=0}^{d} a_n · [n(n − 1) ⋯ (n − k + 1)/k!] · a^{n−k},
where the terms with n < k vanish, and by the exercise we know that n(n − 1) ⋯ (n − k + 1)/k! ∈ Z.

Theorem 3.2.3 (Hensel’s lemma) Let f(X) ∈ Z[X] and let p^j be a prime power. Suppose there exists a ∈ Z so that f(a) ≡ 0 mod p^j and f′(a) ≢ 0 mod p. Then there exists a unique integer t, 0 ≤ t < p, such that f(a + tp^j) ≡ 0 mod p^{j+1}.

Example: Take f(X) = X^2 − 2, a = 4, p^j = 7^1. Then
f(4) = 16 − 2 ≡ 0 mod 7, f′(4) = 2(4) ≢ 0 mod 7.
It follows that exactly one element of {4, 11, 18, 25, 32, 39, 46} is a root of f(X) modulo 7^2; it turns out to be 39.
Note that the residue class a modulo p^j is the union of the p residue classes a + tp^j modulo p^{j+1}, 0 ≤ t < p. The one which is a root modulo p^{j+1} is called a lift of a.

Proof of Hensel’s lemma: By lemma 3.2.1, we may write
f(a + tp^j) = f(a) + tp^j f′(a) + (tp^j)^2 f″(a)/2! + ⋯ + (tp^j)^d f^{(d)}(a)/d!.
By lemma 3.2.2 each f^{(k)}(a)/k! is an integer, and every term from the third onward is divisible by p^{2j}, hence by p^{j+1}. Taking this expression modulo p^{j+1} therefore yields
f(a + tp^j) ≡ f(a) + tp^j f′(a) mod p^{j+1}.
Since f(a) ≡ 0 mod p^j, the right-hand side is congruent to 0 modulo p^{j+1} if and only if
f(a)/p^j ≡ −t f′(a) mod p.
Since f′(a) ≢ 0 mod p, we have that f′(a) is a unit modulo p, and so we find the unique class t to be given by
t ≡ −(f′(a))^{−1} · f(a)/p^j mod p,
as can be easily verified.

Example: Using the same example from before, we calculate f(a)/p^j = 14/7 = 2 and f′(a) = 8 ≡ 1 mod 7, so we ought to take t = −(1)^{−1}(2) ≡ 5 mod 7, and indeed
f(4 + 5 · 7) = f(39) = 1519 ≡ 0 mod 7^2.

Corollary 1: Given f(X) ∈ Z[X], a prime p, and a ∈ Z with f(a) ≡ 0 mod p and f′(a) ≢ 0 mod p, then for every j ≥ 2 there exists a unique lift of a to a root of f modulo p^j; that is, a unique residue class a_j mod p^j such that f(a_j) ≡ 0 mod p^j and a_j ≡ a mod p.

Proof : Exercise. (hint: use induction and Hensel’s lemma)

Remark: The a_j of the corollary are given recursively by a_1 = a and, for j ≥ 1,
a_{j+1} ≡ a_j − f′(a_j)^{−1} f(a_j) mod p^{j+1},
where f′(a_j)^{−1} denotes an inverse of f′(a_j) modulo p.

nb. The condition f′(a) ≢ 0 mod p is the condition that a is a nonsingular root of f(X) modulo p. As written, this formula fails for singular roots: consider f(X) = X^2. Then a = 0 is a root modulo p, and every lift of a is a root of f modulo p^2. Similarly, for g(X) = X^2 − p, a = 0 is a root modulo p, but no lifts of a are roots modulo p^2. There is a more general version of Hensel’s lemma (theorem 2.24 of Niven) which accommodates such roots.

Fact: There exist polynomials, such as (X^2 − 2)(X^2 − 17)(X^2 − 34), or 3X^3 + 4Y^3 + 5Z^3, which have roots modulo m for every m ∈ N, but have no (nontrivial) roots over the rationals.
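The lifting step in the proof of Hensel's lemma translates directly into code. The Python sketch below is illustrative only: the helper names and the coefficient-list representation of f (lowest degree first) are assumptions made here, not notation from the notes. It computes t ≡ −(f′(a))^{−1} · f(a)/p^j mod p and returns a + tp^j.

```python
def poly_eval(coeffs, x, mod):
    """Evaluate sum(coeffs[n] * x**n) modulo `mod` using Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % mod
    return result


def formal_derivative(coeffs):
    """Coefficients of the formal derivative (lowest degree first)."""
    return [n * c for n, c in enumerate(coeffs)][1:]


def hensel_lift(coeffs, a, p, j):
    """Lift a nonsingular root a of f mod p**j to the root mod p**(j+1)."""
    pj = p ** j
    fa = poly_eval(coeffs, a, pj * p)  # f(a) modulo p**(j+1)
    if fa % pj != 0:
        raise ValueError("a is not a root modulo p**j")
    dfa = poly_eval(formal_derivative(coeffs), a, p)
    if dfa == 0:
        raise ValueError("a is a singular root modulo p")
    # t = -(f'(a))^{-1} * f(a)/p^j mod p, as in the proof above.
    t = (-pow(dfa, -1, p) * (fa // pj)) % p
    return (a + t * pj) % (pj * p)


f = [-2, 0, 1]  # X^2 - 2
a2 = hensel_lift(f, 4, 7, 1)
print(a2)       # 39, the lift found in the example
a3 = hensel_lift(f, a2, 7, 2)
print(a3, (a3 * a3 - 2) % 7 ** 3)  # 235 0
```

Iterating `hensel_lift` in this way is one possible reading of the recursion in the Remark; only the value of f(a) modulo p^{j+1} and of f′(a) modulo p are needed at each step.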
3.3 Lecture Eight

§2.7 – Prime modulus

Definition: Let f(X) = Σ a_j X^j, g(X) = Σ b_j X^j ∈ Z[X]. We will say that f(X) is congruent to g(X) modulo m, written f(X) ≡ g(X) mod m, if a_j ≡ b_j mod m for every j. In other words, f(X) ≡ g(X) mod m if and only if f(X) and g(X) have the same image in (Z[X])/(m) ≅ (Z/mZ)[X].

Example: Suppose f(X) = 15X^2 + 3X + 8 ∈ Z[X]. We note that deg f = 2 over Z, but deg f = 1 over Z_5, and deg f = 0 over Z_3.

Lemma 3.3.1 Let p be prime, a an integer, and f(X) ∈ Z[X]. If f(a) ≡ 0 mod p, then there exists g(X) ∈ Z[X] with deg g = deg f − 1 such that f(X) ≡ (X − a)g(X) mod p.

Proof : We saw in our last lecture that (with d = deg f)
f(a + h) = f(a) + h f′(a) + h^2 f″(a)/2! + ⋯ + h^d f^{(d)}(a)/d!.
We set
g(X) = Σ_{j=1}^{d} (X − a)^{j−1} f^{(j)}(a)/j!,
which has integer coefficients by lemma 3.2.2, and (taking h = X − a) we have that f(X) = f(a) + (X − a)g(X) ≡ (X − a)g(X) mod p. Note that the leading coefficient of f(X) is f^{(d)}(a)/d! and that deg g = d − 1.

Observe that the primality condition is necessary for the root count that follows; indeed, modulo a composite modulus such as 24, the polynomial f(X) = X^2 − 1 has roots ±1, but we may also factor f(X) ≡ (X − 5)(X + 5), so that ±5 are roots as well.

Theorem 3.3.2 (Theorem 2.26, Niven) Let f(X) ∈ Z[X], deg f = d modulo p, with p prime. Then f has at most d roots modulo p.

Proof : We induct on deg f. For deg f = 0 the result is clear, so suppose deg f = d > 0. If f has no roots modulo p we are done; otherwise, write f(X) ≡ (X − a)g(X) mod p, where f(a) ≡ 0 mod p and deg g = d − 1, as guaranteed by lemma 3.3.1. Since p is prime, any root of f(X) modulo p is a root of X − a or g(X). By the inductive hypothesis, g has at most d − 1 roots modulo p, and X − a has a single root modulo p, from which we deduce the result.

Example: Consider f(X) = X^p − X with p prime. By Fermat’s little theorem, every residue class modulo p is a root of f, and by lemma 3.3.1 it follows that
f(X) ≡ X(X − 1)(X − 2) ⋯ (X − p + 1) mod p.
Comparing coefficients yields some interesting congruences, among which we have in the coefficient of X^{p−1}
0 + 1 + 2 + ⋯ + (p − 1) ≡ 0 mod p, p > 2,
and in the coefficient of X^{p−2}
Σ_{0≤j<k≤p−1} jk ≡ 0 mod p, p > 3.
Finally, from the coefficient of X we may deduce Wilson’s theorem (p − 1)! ≡ −1 mod p.

Remark: This example implies that if f(X), g(X) ∈ Z[X] are such that f(a) ≡ g(a) mod p for every a ∈ Z, then f(X) − g(X) ≡ h(X)(X^p − X) mod p for some h(X) ∈ Z[X]. In fact, this condition is also sufficient.

Proposition 3.3.3 Let F(X) be any function (i.e. set map) from Z_p to Z_p. Then there exists a unique polynomial g(X) modulo p of degree at most p − 1 such that F(a) ≡ g(a) mod p for every a ∈ Z.

Proof : We show uniqueness first. If g(X), h(X) both satisfy the condition, then from our remark above we have that g(X) − h(X) ≡ q(X)(X^p − X) mod p, for some q(X) ∈ Z[X]. Comparing degrees, we see that we must have g ≡ h mod p. For existence, we give two proofs. First of all, if we set
g(X) = Σ_{a=0}^{p−1} (1 − (X − a)^{p−1}) F(a),
then by Fermat’s little theorem every term with a ≢ a_0 mod p vanishes at X = a_0, and we see that
g(a_0) ≡ (1 − 0)F(a_0) mod p ≡ F(a_0) mod p.
Alternatively, we observe that there are exactly p^p functions Z_p → Z_p, and there are exactly p^p polynomials over Z_p of degree at most p − 1. No two of these polynomials give the same function, and it follows that the two sets must coincide.

Corollary 1: (Corollary 2.30, Niven) Let p be prime and suppose that d|(p − 1). Then X^d − 1 has exactly d roots modulo p.

Proof : By theorem 3.3.2 there are at most d roots, so we need only show there are at least d roots. Note that
X^{p−1} − 1 ≡ (X − 1)(X − 2) ⋯ (X − p + 1) mod p
has exactly p − 1 roots modulo p. Since d|(p − 1), we have
X^{p−1} − 1 = (X^d − 1)(X^{p−1−d} + X^{p−1−2d} + ⋯ + X^{2d} + X^d + 1).
The second factor has at most p − 1 − d roots modulo p, and so by the pigeonhole principle X^d − 1 must have at least d roots modulo p, as claimed.
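Corollary 1 is easy to check numerically for a small prime. The Python sketch below counts roots by brute force over all residue classes; the function name `count_roots` and the coefficient-list representation (lowest degree first) are assumptions made for illustration, and the choice p = 13 is arbitrary.

```python
def count_roots(coeffs, p):
    """Number of residue classes x mod p with f(x) ≡ 0 mod p (brute force)."""
    def f(x):
        return sum(c * pow(x, n, p) for n, c in enumerate(coeffs)) % p
    return sum(1 for x in range(p) if f(x) == 0)


p = 13
divisors = [d for d in range(1, p) if (p - 1) % d == 0]
for d in divisors:
    x_d_minus_1 = [-1] + [0] * (d - 1) + [1]  # coefficients of X^d - 1
    print(d, count_roots(x_d_minus_1, p))     # prints d twice on each line

# The quartic from the earlier section on solutions of congruences:
print(count_roots([1, 1, 0, 2, 1], 5))        # 2
```

Brute force costs on the order of p evaluations per polynomial, which is fine for illustration but is not how one would count roots for large p.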
§2.8 – Primitive roots and power residues

Consider the congruence X^n ≡ 1 mod m; note that any solution a must satisfy (a, m) = 1.

Definition: Given a with (a, m) = 1, the multiplicative order of a modulo m (often called simply the order of a) is the least positive integer k such that a^k ≡ 1 mod m. One sometimes says that a belongs to the exponent k modulo m.

Example: Let m = 11, a = 3. We have
3^1 ≡ 3 mod 11, 3^2 ≡ 9 mod 11, 3^3 ≡ 5 mod 11, 3^4 ≡ 4 mod 11, 3^5 ≡ 1 mod 11,
and we see that the order of 3 modulo 11 is 5.

Fact: The order of a modulo m always divides φ(m).

4 Week Four

4.1 Lecture Nine

Lemma 4.1.1 (Lemma 2.31, Niven) a^k ≡ 1 mod m if and only if the order of a modulo m divides k.

Proof : Let h be the order of a modulo m. If h|k, we have k = hq for some q, hence
a^k = a^{hq} = (a^h)^q ≡ 1^q mod m ≡ 1 mod m.
Conversely, if a^k ≡ 1 mod m, we may use the division algorithm to write k = hq + r, 0 ≤ r < h. One then has
1 ≡ a^k mod m ≡ (a^h)^q a^r mod m ≡ a^r mod m.
Since h is the minimal positive integer such that a^h ≡ 1 mod m, it follows that r = 0, and we are done.

In particular, by Euler’s theorem, if (a, m) = 1, then the order of a modulo m divides φ(m).

Lemma 4.1.2 (Lemma 2.33, Niven) If a has order h modulo m, then a^k has order h/(h, k) modulo m.

For example, the order of a^2 modulo m is h/2 if h is even, and h if h is odd.

Proof : The following statements about positive integers j are equivalent:
1. (a^k)^j ≡ 1 mod m
2. h | kj
3. h/(h, k) | (k/(h, k)) j
4. h/(h, k) | j
It follows that the least positive j satisfying (4), and hence (1), is exactly j = h/(h, k).

Remark: The subgroup of Z_m^× generated by a is a cyclic group of order h. The same proof shows that the smallest positive integer y such that ky ≡ 0 mod h is y = h/(h, k).

Lemma 4.1.3 Let a have order r modulo m, and let b have order s modulo m. Then the order of ab modulo m divides rs/(r, s) = [r, s], and moreover is a multiple of rs/(r, s)^2 = [r, s]/(r, s). In particular (Lemma 2.34, Niven), if (r, s) = 1, then the order of ab modulo m is exactly rs.
Proof: Let $t$ be the order of $ab$ modulo $m$. Then
\[ (ab)^{rs/(r,s)} = (a^r)^{s/(r,s)} (b^s)^{r/(r,s)} \equiv (1)(1) \equiv 1 \pmod m, \]
and it follows that $t \mid \frac{rs}{(r,s)}$. We also have
\[ a^{st} \equiv a^{st} (b^s)^t \equiv ((ab)^t)^s \equiv 1 \pmod m, \]
hence $r \mid st$, so $\frac{r}{(r,s)} \mid \frac{s}{(r,s)}\, t$, which implies $\frac{r}{(r,s)} \mid t$. By a symmetric argument we may show that $\frac{s}{(r,s)} \mid t$, and since $\left(\frac{r}{(r,s)}, \frac{s}{(r,s)}\right) = 1$ it follows that $\frac{rs}{(r,s)^2} \mid t$.

Definition: An integer $a$ is called a primitive root modulo $m$ if it has order $\phi(m)$ modulo $m$. In this case, $\mathbb{Z}_m^\times$ is a cyclic group of order $\phi(m)$, generated by $a$.

Proposition 4.1.4 If $m$ has a primitive root, then it has exactly $\phi(\phi(m))$ primitive roots.

Proof: Let $g$ be a primitive root modulo $m$. Then we have a reduced residue system modulo $m$ given by $\{g, g^2, \ldots, g^{\phi(m)}\}$. By lemma 4.1.2, the order of $g^j$ modulo $m$ is exactly $\frac{\phi(m)}{(j,\phi(m))}$, which equals $\phi(m)$ exactly when $(j, \phi(m)) = 1$. There are exactly $\phi(\phi(m))$ such residue classes, and we are done.

Lemma 4.1.5 (Lemma 2.35, Niven) Let $p, q$ be primes and let $r \in \mathbb{N}$ be such that $q^r \mid (p - 1)$. Then there are $q^r - q^{r-1}$ residue classes of order $q^r$ modulo $p$.

Proof: The order of $a$ modulo $p$ divides $q^r$ if and only if $a^{q^r} \equiv 1 \pmod p$. This congruence has exactly $q^r$ solutions by corollary 1 of proposition 3.3.3. The order of $a$ modulo $p$ divides $q^{r-1}$ if and only if $a^{q^{r-1}} \equiv 1 \pmod p$, which has exactly $q^{r-1}$ solutions. An element has order exactly $q^r$ precisely when its order divides $q^r$ but not $q^{r-1}$, and the result is now immediate.

Theorem 4.1.6 (Theorem 2.36, Niven) Every prime $p$ has a primitive root.

Proof: If $p = 2$ the result is immediate, so assume $p$ is odd and write $p - 1$ in its prime factorization
\[ p - 1 = q_1^{r_1} q_2^{r_2} \cdots q_k^{r_k}. \]
For each $1 \le j \le k$, let $a_j$ be some integer of order $q_j^{r_j}$ modulo $p$, whose existence is guaranteed by lemma 4.1.5. Since $(q_i^{r_i}, q_j^{r_j}) = 1$ for all $i \ne j$, we have by lemma 4.1.3 (lemma 2.34 of Niven) that $a_1 a_2$ has order $q_1^{r_1} q_2^{r_2}$ modulo $p$, that $a_1 a_2 a_3$ has order $q_1^{r_1} q_2^{r_2} q_3^{r_3}$ modulo $p$, and continuing in this fashion, we eventually see that $a_1 a_2 \cdots a_k$ has order $p - 1$ modulo $p$, as claimed.
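The notions of order and primitive root above are easy to test numerically for small moduli. The following is a minimal brute-force sketch in Python; the function names `multiplicative_order`, `phi`, and `primitive_roots` are illustrative choices of ours (they do not come from the notes or from Niven), and the method is only sensible when the modulus is small.

```python
from math import gcd


def multiplicative_order(a, m):
    """Least k >= 1 with a**k congruent to 1 mod m; needs gcd(a, m) == 1."""
    if gcd(a, m) != 1:
        raise ValueError("a must be coprime to m")
    k, power = 1, a % m
    # Multiply by a until we return to the class of 1.
    while power != 1 % m:
        power = (power * a) % m
        k += 1
    return k


def phi(m):
    """Euler's totient, by direct count (fine for small m)."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def primitive_roots(m):
    """All residues in [1, m] whose order modulo m equals phi(m)."""
    target = phi(m)
    return [a for a in range(1, m + 1)
            if gcd(a, m) == 1 and multiplicative_order(a, m) == target]


if __name__ == "__main__":
    print(multiplicative_order(3, 11))   # order of 3 modulo 11
    print(primitive_roots(5))            # primitive roots modulo 5
    print(primitive_roots(8))            # empty: 8 has no primitive root
```

Here `multiplicative_order(3, 11)` reproduces the order 5 found in the example of §2.8, and whenever `primitive_roots(m)` is nonempty its length should agree with `phi(phi(m))`, in line with proposition 4.1.4.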
4.2 Lecture Ten

Example: Modulo 5, the reduced residue classes are 1, 2, 3, and 4, with respective orders 1, 4, 4, and 2; we see that 2 and 3 are the $\phi(\phi(5)) = 2$ primitive roots modulo 5. What are the primitive roots modulo 25? Exactly $\{2, 3, 8, 12, 13, 17, 22, 23\}$. Note that there are $8 = \phi(\phi(25))$ of them, and that all are also primitive roots modulo 5. In fact, for an odd prime $p$ we may lift any primitive root modulo $p$ to $p - 1$ primitive roots modulo $p^2$, and for $j \ge 2$, any primitive root modulo $p^j$ lifts to exactly $p$ primitive roots modulo $p^{j+1}$.

Proposition 4.2.1 For $n \ge 1$, we have
\[ \sum_{d \mid n} \phi(d) = n. \]

Proof: The fractions $\left\{ \frac{1}{n}, \frac{2}{n}, \ldots, \frac{n}{n} \right\}$ are not all in lowest terms; once we reduce them to lowest terms, we may consider their denominators. For every divisor $d$ of $n$, exactly $\phi(d)$ of these fractions have denominator $d$; indeed, these fractions are exactly
\[ \left\{ \frac{k(n/d)}{n} : 1 \le k \le d,\ (k, d) = 1 \right\}. \]
Since there are exactly $n$ fractions in our original set, the result follows.

Alternative proof of the existence of primitive roots modulo $p$: We use strong induction on $k$ to find the number of elements of order $k$ modulo $p$, namely $\phi(k)$ if $k \mid (p - 1)$, and 0 if $k \nmid (p - 1)$. The case $k = 1$ is trivial. For $k > 1$, $k \mid (p - 1)$, we first note that
\[ \phi(k) + \sum_{d \mid k,\ d < k} \phi(d) = \sum_{d \mid k} \phi(d) = k. \]
Since $p$ is prime and $k \mid (p - 1)$, there are exactly $k$ solutions to the congruence $x^k \equiv 1 \pmod p$, which are exactly those $x$ modulo $p$ with order dividing $k$. This count, again, is exactly the sum
\[ \#\{x : \operatorname{ord}_p(x) = k\} + \sum_{d \mid k,\ d < k} \#\{x : \operatorname{ord}_p(x) = d\}, \]
where $\operatorname{ord}_p(x)$ denotes the order of $x$ modulo $p$. By the induction hypothesis each term of the last sum equals $\phi(d)$, and the result is now immediate.

Lemma 4.2.2 If $d \mid n$, then for any $a$ with $(a, n) = 1$, the order of $a$ modulo $d$ divides the order of $a$ modulo $n$.

Proof: If $\operatorname{ord}_n(a) = h$, then $a^h \equiv 1 \pmod n$, so $a^h \equiv 1 \pmod d$, and lemma 4.1.1 gives $\operatorname{ord}_d(a) \mid h$.

Proposition 4.2.3 If $g$ is a primitive root modulo $p^r$ with $r \ge 2$, then
\[ g^{p^{r-2}(p-1)} \not\equiv 1 \pmod{p^r}. \]
Moreover, the converse holds if $g$ is a primitive root modulo $p^{r-1}$.
Proof: If $g$ is a primitive root modulo $p^r$, then
\[ \operatorname{ord}_{p^r}(g) = \phi(p^r) = p^{r-1}(p - 1) > p^{r-2}(p - 1), \]
from which it follows that $g^{p^{r-2}(p-1)} \not\equiv 1 \pmod{p^r}$. Now, suppose that $g$ is a primitive root modulo $p^{r-1}$ and that $g^{p^{r-2}(p-1)} \not\equiv 1 \pmod{p^r}$. The order of $g$ modulo $p^r$ divides $\phi(p^r) = p^{r-1}(p - 1)$, and by lemma 4.2.2 must be a multiple of $\operatorname{ord}_{p^{r-1}}(g) = p^{r-2}(p - 1)$. The only such values are $p^{r-2}(p - 1)$ and $p^{r-1}(p - 1)$, and since $\operatorname{ord}_{p^r}(g) \ne p^{r-2}(p - 1)$ by assumption, we deduce the result.

Theorem 4.2.4 Primitive roots exist modulo $p^2$ for any prime $p$.

Proof: Let $g$ be a primitive root modulo $p$ and consider the lifts $g + tp$ modulo $p^2$, $0 \le t \le p - 1$. We claim that all but one of these lifts are primitive roots modulo $p^2$. Indeed, by proposition 4.2.3 it suffices to show that exactly one lift satisfies $(g + tp)^{p-1} \equiv 1 \pmod{p^2}$. Let $f(X) = X^{p-1} - 1$. Then $g$ is a root of $f(X)$ modulo $p$, and $f'(g) = (p - 1)g^{p-2} \not\equiv 0 \pmod p$. Thus $g$ is a nonsingular root of $f$ modulo $p$, and so by Hensel’s lemma exactly one lift of $g$ is a root of $f$ modulo $p^2$; every other such lift must then yield a primitive root.

Lemma 4.2.5 If $g$ is a primitive root modulo $p^2$, then it is also a primitive root modulo $p$.

Proof: If $a^k \equiv 1 \pmod p$, then
\[ a^{pk} - 1 = (a^k - 1)\left((a^k)^{p-1} + (a^k)^{p-2} + \cdots + a^k + 1\right). \]
Both factors are multiples of $p$ (the second is a sum of $p$ terms, each congruent to 1 modulo $p$), so it follows that $a^{pk} \equiv 1 \pmod{p^2}$. In particular, if $g$ is a primitive root modulo $p^2$, then $g^{pk} \not\equiv 1 \pmod{p^2}$ for $k = 1, 2, \ldots, p - 2$, since $pk < \phi(p^2)$. Hence $g^k \not\equiv 1 \pmod p$ for $1 \le k \le p - 2$, and it follows that the order of $g$ modulo $p$ is $p - 1$.

Next, we will consider primitive roots modulo $p^r$ for $r \ge 3$. No more degenerate cases arise here, except when $p = 2$. In this case, there are no primitive roots modulo $2^r$ for any $r \ge 3$.

4.3 Lecture Eleven

Theorem 4.3.1 Let $p$ be an odd prime and let $r \ge 2$. Then any primitive root modulo $p^2$ is a primitive root modulo $p^r$.

Proof: We induct on $r$. The case $r = 2$ is trivial, so for $r \ge 2$ assume $g$ is a primitive root modulo $p^r$; we will show that $g$ is a primitive root modulo $p^{r+1}$.
Indeed, by proposition 4.2.3 we have that
\[ g^{p^{r-2}(p-1)} \not\equiv 1 \pmod{p^r}, \]
and so by the same proposition it suffices to show that $g^{p^{r-1}(p-1)} \not\equiv 1 \pmod{p^{r+1}}$. By Euler’s theorem we have that
\[ g^{p^{r-2}(p-1)} \equiv 1 \pmod{p^{r-1}}, \]
so we can write $g^{p^{r-2}(p-1)} = 1 + np^{r-1}$ for some $n \not\equiv 0 \pmod p$. By the binomial theorem we have that
\[ g^{p^{r-1}(p-1)} = (1 + np^{r-1})^p = \sum_{k=0}^{p} \binom{p}{k} (np^{r-1})^k, \]
and since $p \mid \binom{p}{k}$ for $2 \le k \le p - 1$, we see that $p^{r+1} \mid \binom{p}{k}(np^{r-1})^k$ for these $k$. In fact we also have this divisibility when $k = p$, and so
\[ g^{p^{r-1}(p-1)} \equiv 1 + np^r \not\equiv 1 \pmod{p^{r+1}}, \]
and we are done.

nb. We only use the fact that $p$ is odd in the cancellation of $\binom{p}{2} n^2 p^{2r-2}$.

Lemma 4.3.2 If $r \ge 3$, then the order of every odd integer modulo $2^r$ divides $2^{r-2} = \frac{1}{2}\phi(2^r)$. In particular, there are no primitive roots modulo $2^r$.

Proof: Again we induct on $r$. We did the case $r = 3$ in the last lecture, so assume the claim is true for some $r \ge 3$; then
\[ a^{2^{r-2}} \equiv 1 \pmod{2^r} \]
for every odd $a$. Then $2^r \mid (a^{2^{r-2}} - 1)$ and $2 \mid (a^{2^{r-2}} + 1)$ by parity, hence
\[ 2^{r+1} \mid (a^{2^{r-2}} - 1)(a^{2^{r-2}} + 1) = a^{2^{r-1}} - 1, \]
whence $a^{2^{r-1}} \equiv 1 \pmod{2^{r+1}}$, as claimed.

nb. The same proof shows that if $a \equiv 5 \pmod 8$, then $2^{\alpha+2} \,\|\, (a^{2^\alpha} - 1)$, where $p^k \,\|\, n$ if and only if $p^k \mid n$ and $p^{k+1} \nmid n$.

Theorem 4.3.3 (Theorem 2.43, Niven) Let $r \ge 3$; then the set $\{\pm 5, \pm 5^2, \ldots, \pm 5^{2^{r-2}}\}$ is a reduced residue system modulo $2^r$. In particular, 5 has order $2^{r-2}$ modulo $2^r$, and the abelian group homomorphism
\[ f : \mathbb{Z}_{2^{r-2}} \times \mathbb{Z}_2 \longrightarrow \mathbb{Z}_{2^r}^\times \]
given by $f(x, y) = 5^x (-1)^y$ is an isomorphism.

By way of comparison, note that if $p$ is odd, the map $f : \mathbb{Z}_{p^{r-1}(p-1)} \longrightarrow \mathbb{Z}_{p^r}^\times$ given by $f(x) = g^x$ is an isomorphism for any primitive root $g$ modulo $p^r$.

Proof: The order of 5 modulo $2^r$ divides $2^{r-2}$ by lemma 4.3.2, and so if $2^{r-2}$ is not the order, then the order divides $2^{r-3}$, hence
\[ 5^{2^{r-3}} \equiv 1 \pmod{2^r}. \]
But then $2^r \mid 5^{2^{r-3}} - 1$, contradicting our previous remark with $\alpha = r - 3$, which gives $2^{r-1} \,\|\, (5^{2^{r-3}} - 1)$. Thus 5 has order $2^{r-2}$ modulo $2^r$, and so the residue classes
\[ \{5, 5^2, \ldots, 5^{2^{r-2}}\} \]
are distinct modulo $2^r$, as are the residue classes
\[ \{-5, -5^2, \ldots, -5^{2^{r-2}}\}. \]
Finally, $5^k \equiv 1 \pmod 4$, while $-5^k \equiv 3 \pmod 4$, so the two sets above are disjoint, and we are done.

We now know the group structure of $\mathbb{Z}_n^\times$ for every $n$. If $n$ has prime factorization $n = p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}$, then by the Chinese remainder theorem
\[ \mathbb{Z}_n^\times \cong \mathbb{Z}_{p_1^{e_1}}^\times \times \mathbb{Z}_{p_2^{e_2}}^\times \times \cdots \times \mathbb{Z}_{p_r^{e_r}}^\times. \]
If $p_i$ is odd, then
\[ \mathbb{Z}_{p_i^{e_i}}^\times \cong \mathbb{Z}_{p_i^{e_i - 1}(p_i - 1)}, \]
and for $p = 2$ we have
\[ \mathbb{Z}_{2^r}^\times \cong \begin{cases} \mathbb{Z}_1 & \text{if } r = 1, \\ \mathbb{Z}_2 & \text{if } r = 2, \text{ and} \\ \mathbb{Z}_{2^{r-2}} \times \mathbb{Z}_2 & \text{if } r \ge 3. \end{cases} \]

Primitive roots modulo non-prime powers

Note that $\phi(n)$ is even for every $n \ge 3$. If we can write $n = cd$ with $(c, d) = 1$ and $c, d \ge 3$, then the order of any $a$ modulo $n$ must divide $\frac{1}{2}\phi(n) = \frac{1}{2}\phi(c)\phi(d)$, as we have
\[ a^{\phi(n)/2} = (a^{\phi(c)})^{\phi(d)/2} \equiv 1^{\phi(d)/2} \equiv 1 \pmod c, \]
and similarly
\[ a^{\phi(n)/2} = (a^{\phi(d)})^{\phi(c)/2} \equiv 1^{\phi(c)/2} \equiv 1 \pmod d, \]
since by our assumption $2 \mid \phi(c)$ and $2 \mid \phi(d)$. Our claim then follows by the Chinese remainder theorem. The only integers $n$ which do not have such a factorization are powers of 2, or are of the form $n = p^r$ or $n = 2p^r$, where $p$ is an odd prime and $r \ge 1$. Numbers of this form are the only ones which could possibly have primitive roots.

Theorem 4.3.4 (Theorem 2.41, Niven) The moduli that have primitive roots are exactly 1, 2, 4, $p^r$, and $2p^r$, where $p$ is an odd prime and $r \ge 1$.

Proof: Next lecture.

5 Week Five

5.1 Lecture Twelve

Fun fact! If $S(x)$ denotes the set of squarefree numbers $s$ with $s \le x$, then one has
\[ \lim_{x \to \infty} \frac{\#S(x)}{x} = \frac{6}{\pi^2}. \]

Recall theorem 4.3.4 from last lecture, and let $PR$ denote the set of moduli which have primitive roots. For example, modulo 18, we have $\phi(18) = 6$, and indeed a reduced residue system is given by $\{1, 5, 7, 11, 13, 17\}$, whose elements have respective orders 1, 6, 3, 6, 3, and 2. Thus 5 and 11 are primitive roots modulo 18, and as expected we find there are $2 = \phi(\phi(18))$ of them.
Similarly, modulo 9 a reduced residue system is given by $\{1, 2, 4, 5, 7, 8\}$ with respective orders 1, 6, 3, 6, 3, and 2 (note the similarity with $\mathbb{Z}_{18}^\times$), and we have the same result with the primitive roots 2 and 5.

Proof: (of theorem 4.3.4) We need only check that $m = 2p^r$ has primitive roots, the other claims having already been proven. If $\{a_1, a_2, \ldots, a_{\phi(p^r)}\}$ is a reduced residue system modulo $p^r$, then we claim that
\[ \{a_j : 2 \nmid a_j\} \cup \{a_j + p^r : 2 \mid a_j\} \]
is a reduced residue system modulo $2p^r$. Indeed, we see that we have exactly $\phi(2p^r) = \phi(2)\phi(p^r) = \phi(p^r)$ residue classes, that all are distinct, and since $(a_j, p) = 1$ we have $u, v$ so that $a_j u + pv = 1$; thus writing $x = u$ and $y = v - p^{r-1}u$, we have
\[ 1 = a_j x + p(y + p^{r-1}x) = (a_j + p^r)x + py \ \Rightarrow\ (a_j + p^r, p) = 1, \]
and hence (since $p$ is assumed odd, so that $a_j + p^r$ is odd when $a_j$ is even) $a_j + p^r$ is indeed a unit modulo $2p^r$, by the Chinese remainder theorem. Furthermore, the orders of the elements of the latter set (the lifts of the even $a_j$) do not drop, as for $0 < k < \operatorname{ord}_{p^r}(a_j)$ we have
\[ (a_j + p^r)^k = \sum_{n=0}^{k} \binom{k}{n} a_j^n p^{r(k-n)} \equiv a_j^k \pmod{p^r}, \]
which is not congruent to 1 by assumption, thus $(a_j + p^r)^k \not\equiv 1 \pmod{2p^r}$. The same argument holds for the odd $a_j$, and we see that one of the elements in our reduced residue system must have order $\phi(p^r) = \phi(2p^r)$, which completes the proof.

Remark: When $m$ is odd, we have an isomorphism of groups $\pi : \mathbb{Z}_m^\times \xrightarrow{\ \sim\ } \mathbb{Z}_{2m}^\times$.

Corollary 1: (Corollary 2.42, Niven) Let $m \in PR$ and let $(a, m) = 1$. The congruence $x^n \equiv a \pmod m$ has $d$ solutions if $a^{\phi(m)/d} \equiv 1 \pmod m$, where $d = (n, \phi(m))$, and zero solutions otherwise.

Remark: The analogue for $m = 2^r$, $r \ge 3$, is corollary 2.44 in Niven.

Proof: Let $g$ be a primitive root modulo $m$. Choose $j$, $1 \le j \le \phi(m)$, so that $g^j \equiv a \pmod m$, and note that if $x^n \equiv a \pmod m$ then one must have $(x, m) = 1$. For every such $x$, there exists $k$ so that $g^k \equiv x \pmod m$, and thus it suffices to solve the congruence $(g^k)^n \equiv g^j \pmod m$ for $k$.
Since the order of $g$ is $\phi(m)$, this congruence holds if and only if $kn \equiv j \pmod{\phi(m)}$. For fixed $j$, theorem 3.1.3 tells us that this linear congruence in $k$ has $d = (n, \phi(m))$ solutions if $d \mid j$, and none otherwise. Finally, $d \mid j$ is equivalent to $a^{\phi(m)/d} \equiv 1 \pmod m$: indeed $a^{\phi(m)/d} \equiv g^{j\phi(m)/d} \pmod m$, and since $g$ has order $\phi(m)$ this is congruent to 1 exactly when $\phi(m) \mid \frac{j\phi(m)}{d}$, that is, when $d \mid j$. This completes the proof.

Corollary 2: (Corollary 2.38, Niven; Euler’s criterion): Let $p$ be an odd prime and let $p \nmid a$. The congruence $X^2 \equiv a \pmod p$ has two solutions if $a^{\frac{p-1}{2}} \equiv 1 \pmod p$, and no solutions otherwise. There is one solution if $p \mid a$.

Definition: The Carmichael lambda function, denoted $\lambda(m)$, is the smallest exponent $e \in \mathbb{N}$ such that $a^e \equiv 1 \pmod m$ for every $a$ with $(a, m) = 1$.

Remark: We know $\lambda(m) \mid \phi(m)$, and $\lambda(m) = \phi(m)$ if and only if $m \in PR$. Moreover, as seen last week, if $m \notin PR$ then $\lambda(m) \le \frac{\phi(m)}{2}$. By the Chinese remainder theorem,
\[ \lambda(p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}) = [\lambda(p_1^{e_1}), \lambda(p_2^{e_2}), \ldots, \lambda(p_r^{e_r})]. \]
For odd primes, we have $\lambda(p^r) = p^{r-1}(p - 1)$, which also holds for $p = 2$ and $r \le 2$. For $r \ge 3$, one has instead $\lambda(2^r) = 2^{r-2}$. Group theoretically, $\lambda(m)$ is the exponent of the group $\mathbb{Z}_m^\times$.

Definition: A base-$b$ pseudoprime is a composite number $m$ such that $b^{m-1} \equiv 1 \pmod m$.

For example, we may take $b = 2$, $m = 341$; then $2^{10} = 1024 = 3 \cdot 341 + 1$, and so
\[ 2^{341-1} = (2^{10})^{34} \equiv 1^{34} \equiv 1 \pmod{341}. \]
Thus 341 is a base-2 pseudoprime. This notion gives rise to the Fermat test for primality: if $(b, m) = 1$ and $b^{m-1} \not\equiv 1 \pmod m$, then $m$ is composite. For example, with $m = 341$, $b = 3$, we have
\[ 3^{341-1} \equiv 56 \not\equiv 1 \pmod{341}, \]
and it follows that 341 is not prime.

5.2 Lecture Thirteen

Recall: Fermat’s test for primality.

Definition: Let $m$ be composite. Then $m$ is called a Carmichael number if $b^{m-1} \equiv 1 \pmod m$ for all $b$ with $(b, m) = 1$.

For example, we might take $m = 561 = 3 \cdot 11 \cdot 17$.
If $(b, m) = 1$, then we have by Euler’s theorem
\[ b^{561-1} \equiv \begin{cases} (b^2)^{280} \equiv 1 \pmod 3, \\ (b^{10})^{56} \equiv 1 \pmod{11}, \\ (b^{16})^{35} \equiv 1 \pmod{17}. \end{cases} \]
The Chinese remainder theorem then implies that $b^{560} \equiv 1 \pmod m$.

In 1994, Alford, Granville, and Pomerance showed that there are infinitely many Carmichael numbers, in the paper of the same name. In fact, if $6k + 1$, $12k + 1$, and $18k + 1$ are all prime for some $k \in \mathbb{N}$, then their product is a Carmichael number. For example, with $k = 1$ we get that $1729 = 7 \cdot 13 \cdot 19$ is a Carmichael number.

§3.1 – Quadratic residues

Most generally, we will investigate congruences of the form $aX^2 + bX + c \equiv 0 \pmod p$, where $p$ is an odd prime and $p \nmid a$. Completing the square gives
\[ 4a^2X^2 + 4abX + 4ac \equiv 0 \pmod p \ \Rightarrow\ (2aX + b)^2 \equiv b^2 - 4ac \pmod p. \]
Thus we are led to ask when $y^2 \equiv \Delta \pmod p$ (where $\Delta = b^2 - 4ac$ is the discriminant of our polynomial) has a solution. If so, then
\[ 2aX + b \equiv y \pmod p \ \Leftrightarrow\ X \equiv (y - b)(2a)^{-1} \pmod p. \]
We note the obvious analogue of the quadratic formula. Thus it suffices to investigate when $X^2 \equiv a \pmod p$ can be solved. By Euler’s criterion, if $p \nmid a$ this occurs exactly when $a^{\frac{p-1}{2}} \equiv 1 \pmod p$.

Example: We investigate such congruences modulo 7, when $\frac{p-1}{2} = 3$.
\[
\begin{array}{c|c|c|l}
a & \operatorname{ord}_7(a) & a^3 \bmod 7 & \text{Solutions of } x^2 \equiv a \pmod 7 \\ \hline
0 & - & 0 & x \equiv 0 \pmod 7 \\
1 & 1 & 1 & x \equiv 1, 6 \pmod 7 \\
2 & 3 & 1 & x \equiv 3, 4 \pmod 7 \\
3 & 6 & -1 & \text{none} \\
4 & 3 & 1 & x \equiv 2, 5 \pmod 7 \\
5 & 6 & -1 & \text{none} \\
6 & 2 & -1 & \text{none}
\end{array}
\]

Definition: If $(a, m) = 1$, then $a$ is called a quadratic residue modulo $m$ if $X^2 \equiv a \pmod m$ has a solution, and a quadratic nonresidue otherwise.

Definition: If $p$ is an odd prime, define the Legendre symbol $\left(\frac{a}{p}\right)$ via
\[ \left(\frac{a}{p}\right) = \begin{cases} 1 & \text{if } a \text{ is a quadratic residue modulo } p, \\ -1 & \text{if } a \text{ is a quadratic nonresidue modulo } p, \\ 0 & \text{if } p \mid a. \end{cases} \]

Remark: If $a \equiv b \pmod p$, then $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$. Moreover, the number of solutions of $X^2 \equiv a \pmod p$ is exactly $\left(\frac{a}{p}\right) + 1$.

Theorem 5.2.1 (Theorem 3.1, Niven) If $p$ is an odd prime and $(a, p) = 1$, then
\[ \left(\frac{a}{p}\right) \equiv a^{\frac{p-1}{2}} \pmod p. \]

Proof: We give two proofs. In the first, we simply use Euler’s criterion (this is left as an exercise).
For the second, we observe that if $a$ is a quadratic residue modulo $p$, then we can choose some $z$ such that $z^2 \equiv (-z)^2 \equiv a \pmod p$. We then pair the reduced residue classes modulo $p$ apart from $\pm z$ as $(x_i, y_i)$, with $x_i y_i \equiv a \pmod p$. There are $\frac{p-3}{2}$ such pairs, and by Wilson’s theorem
\[ -1 \equiv (p - 1)! \equiv z(-z) \prod_{i=1}^{(p-3)/2} x_i y_i \equiv -a \cdot a^{\frac{p-3}{2}} \equiv -a^{\frac{p-1}{2}} \pmod p, \]
and the result follows. If $a$ is a nonresidue, we repeat the above construction, this time pairing all residue classes as $x_i y_i \equiv a \pmod p$, $i = 1, 2, \ldots, \frac{p-1}{2}$; Wilson’s theorem then gives $-1 \equiv a^{\frac{p-1}{2}} \pmod p$, and we are done.

Corollary 1: For any integers $a, b$, we have $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$; in particular, if $(a, p) = 1$ we have $\left(\frac{a^2}{p}\right) = 1$.

In other words, the product of two quadratic residues is a quadratic residue, as is the product of two quadratic nonresidues. The product of a residue and a nonresidue is a nonresidue; compare this behaviour with that of the positive and negative integers.

5.3 Lecture Fourteen

Recall: The Legendre symbol for $p \nmid a$ is defined by
\[ \left(\frac{a}{p}\right) = \begin{cases} 1 & \text{if } x^2 \equiv a \pmod p \text{ has a solution,} \\ -1 & \text{otherwise.} \end{cases} \]
By Euler’s criterion, we showed that
\[ \left(\frac{a}{p}\right) \equiv a^{\frac{p-1}{2}} \pmod p. \]

Example: When $a = -1$ and $p$ is odd, we have that
\[ \left(\frac{-1}{p}\right) \equiv (-1)^{\frac{p-1}{2}} = \begin{cases} 1 & \text{if } p \equiv 1 \pmod 4, \\ -1 & \text{if } p \equiv 3 \pmod 4. \end{cases} \]
So $X^2 \equiv -1 \pmod p$ has two solutions if $p \equiv 1 \pmod 4$, and no solutions if $p \equiv 3 \pmod 4$.

nb. For odd primes $p$, we have
\[ \prod_{i=(p+1)/2}^{p-1} i \equiv (-1)^{\frac{p-1}{2}} \prod_{j=1}^{(p-1)/2} j \equiv (-1)^{\frac{p-1}{2}} \left(\tfrac{p-1}{2}\right)! \pmod p. \tag{1} \]
In particular, if $p \equiv 1 \pmod 4$ we get
\[ \left(\left(\tfrac{p-1}{2}\right)!\right)^2 \equiv \left(\tfrac{p-1}{2}\right)! \, (-1)^{\frac{p-1}{2}} \prod_{i=(p+1)/2}^{p-1} i \equiv (p - 1)! \equiv -1 \pmod p, \]
and hence $x = \left(\tfrac{p-1}{2}\right)!$ solves $x^2 \equiv -1 \pmod p$.

Theorem 5.3.1 (The Law of Quadratic Reciprocity) Let $p \ne q$ be odd primes; then
\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}. \]
In other words, $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$ if $p$ or $q \equiv 1 \pmod 4$, and $\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right)$ if $p \equiv q \equiv 3 \pmod 4$. Knowing whether or not $X^2 \equiv p \pmod q$ has solutions is the same as knowing whether or not $X^2 \equiv q \pmod p$ has solutions.

Proof: (due to Rousseau, 1991) First, some background.
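Let $\alpha = \frac{p-1}{2}$, $\beta = \frac{q-1}{2}$. Let
\[ F = \left\{ 1 \le k < \frac{pq}{2} : (k, pq) = 1 \right\} \]
be the “first half” of $\mathbb{Z}_{pq}^\times$ and let
\[ L = \left\{ (i, j) \in \mathbb{Z}_p^\times \times \mathbb{Z}_q^\times : 1 \le i \le p - 1,\ 1 \le j < \frac{q}{2} \right\} \]
be the “left half” of $\mathbb{Z}_p^\times \times \mathbb{Z}_q^\times$, and let $\pi : \mathbb{Z}_{pq}^\times \to \mathbb{Z}_p^\times \times \mathbb{Z}_q^\times$ be the map given by the Chinese remainder theorem. One can see that for every $k \in \mathbb{Z}_{pq}^\times$, one has $\pi(k) \in L$ or $-\pi(k) \in L$ (in the latter case we will write $\pi(k) \in -L$). For each $k \in F$, choose $\epsilon_k \in \{\pm 1\}$, $i_k \in \{1, 2, \ldots, p - 1\}$, $j_k \in \{1, 2, \ldots, \beta\}$ such that $\pi(k) = \epsilon_k (i_k, j_k)$.

In particular, if $k \ne k' \in F$, then $\pi(k) \ne \pi(k')$ and $\pi(k) \ne -\pi(k')$. Thus the ordered pairs $(i_k, j_k)$ are distinct, and since $\#F = \#L$ we obtain
\[ \prod_{k \in F} (k, k) \equiv \prod_{k \in F} \pi(k) \equiv \prod_{k \in F} \epsilon_k (i_k, j_k) \equiv \left( \prod_{k \in F} \epsilon_k \right) \prod_{(i,j) \in L} (i, j), \tag{2} \]
the calculation taking place in $\mathbb{Z}_p^\times \times \mathbb{Z}_q^\times$ and the congruences taken $(\bmod\ p, \bmod\ q)$.

Now, consider the right-hand side of (2): we have (with the same notation convention)
\[ \prod_{(i,j) \in L} (i, j) \equiv \prod_{i=1}^{p-1} \prod_{j=1}^{\beta} (i, j) \equiv \left( ((p - 1)!)^\beta, (\beta!)^{p-1} \right). \]
From (1), we have that
\[ \prod_{i=\beta+1}^{q-1} i \equiv (-1)^\beta \beta! \pmod q, \]
hence $(\bmod\ p, \bmod\ q)$ we have
\[ \prod_{(i,j) \in L} (i, j) \equiv \left( ((p - 1)!)^\beta, \left( \beta! \cdot (-1)^\beta \prod_{i=\beta+1}^{q-1} i \right)^{\alpha} \right) \equiv \left( ((p - 1)!)^\beta, (-1)^{\alpha\beta} ((q - 1)!)^\alpha \right), \]
and finally by Wilson’s theorem we obtain
\[ \prod_{(i,j) \in L} (i, j) \equiv \left( (-1)^\beta, (-1)^{\alpha\beta} (-1)^\alpha \right). \]
Thus with $\epsilon = \prod_{k \in F} \epsilon_k$, the right-hand side of (2) becomes $\epsilon \left( (-1)^\beta, (-1)^{\alpha\beta} (-1)^\alpha \right)$.

Now, on the left-hand side, we look at the first co-ordinate modulo $p$:
\[ \prod_{k \in F} k \equiv \left( \prod_{\substack{1 \le k < pq/2 \\ p \nmid k}} k \right) \left( \prod_{\substack{1 \le k < pq/2 \\ q \mid k}} k \right)^{-1}. \tag{3} \]
The first factor in (3) splits into intervals of length $p - 1$, with one exception, namely the interval ending at $\frac{pq}{2}$. Thus modulo $p$ we see
\[ \prod_{\substack{1 \le k < pq/2 \\ p \nmid k}} k = \left( \prod_{1 \le k \le p-1} k \right) \left( \prod_{p+1 \le k \le 2p-1} k \right) \cdots \left( \prod_{(\beta-1)p+1 \le k \le \beta p - 1} k \right) \left( \prod_{\beta p + 1 \le k \le \beta p + \alpha} k \right); \]
but $\beta p + \alpha = \frac{pq - 1}{2}$ is the largest integer below $\frac{pq}{2}$, so we see that
\[ \prod_{\substack{1 \le k < pq/2 \\ p \nmid k}} k \equiv ((p - 1)!)^\beta \, \alpha! \pmod p. \]
The second factor of (3) is the inverse of
\[ \prod_{\substack{1 \le k < pq/2 \\ q \mid k}} k \equiv q \cdot 2q \cdots \alpha q \equiv q^\alpha \alpha! \equiv \left(\frac{q}{p}\right) \alpha! \pmod p, \]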
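with the last congruence following by Euler’s criterion. Thus (3) becomes
\[ \prod_{k \in F} k \equiv ((p - 1)!)^\beta \, \alpha! \left( \left(\frac{q}{p}\right) \alpha! \right)^{-1} \pmod p, \]
which by Wilson’s theorem is congruent modulo $p$ to $(-1)^\beta \left(\frac{q}{p}\right)$. The same proof shows
\[ \prod_{k \in F} k \equiv (-1)^\alpha \left(\frac{p}{q}\right) \pmod q, \]
and so (2) becomes
\[ \left( (-1)^\beta \left(\frac{q}{p}\right), (-1)^\alpha \left(\frac{p}{q}\right) \right) \equiv \epsilon \left( (-1)^\beta, (-1)^{\alpha\beta} (-1)^\alpha \right) \quad (\bmod\ p, \bmod\ q). \]
The first co-ordinate tells us that $\left(\frac{q}{p}\right) \equiv \epsilon \pmod p$, and the second that $\left(\frac{p}{q}\right) = \epsilon (-1)^{\alpha\beta} = (-1)^{\alpha\beta} \left(\frac{q}{p}\right)$ (where we have equality rather than congruence, as $\left(\frac{q}{p}\right) \in \{\pm 1\}$ and $p$ is odd), hence
\[ \left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = (-1)^{\alpha\beta}, \]
as claimed.

6 Week Six

6.1 Lecture Fifteen

Recall: Last week, we saw that Euler’s criterion implies that $\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}}$ for any odd prime $p$. In other words, $x^2 \equiv -1 \pmod p$ has 2 solutions if $p \equiv 1 \pmod 4$, and no solutions if $p \equiv 3 \pmod 4$. There is a single solution if $p = 2$.

Consequently, we see that, for every integer $x$, all of the prime factors of $x^2 + 1$ (other than 2) must be congruent to 1 modulo 4. Similarly, for any $x, k \in \mathbb{Z}$ we have that all prime factors $p$ of $x^2 + k^2$ satisfy $p \mid 2k$ or $p \equiv 1 \pmod 4$, since if $p \nmid k$ then $x^2 + k^2 \equiv 0 \pmod p$ implies that $x^2 \equiv -k^2 \pmod p$, hence $(xk^{-1})^2 \equiv -1 \pmod p$ and so $p = 2$ or $p \equiv 1 \pmod 4$. Note that if $p \mid k$, then also $p \mid x$, and so we must have $(x, k) > 1$.

Example: We use quadratic reciprocity to answer the question: Does $x^2 \equiv 55 \pmod{367}$ have a solution? Note that 367 is a prime congruent to 3 modulo 4.

To answer this question we compute the Legendre symbol $\left(\frac{55}{367}\right)$: by multiplicativity we have
\[ \left(\frac{55}{367}\right) = \left(\frac{5}{367}\right) \left(\frac{11}{367}\right). \]
The law of quadratic reciprocity then implies that
\[ \left(\frac{5}{367}\right) = \left(\frac{367}{5}\right) = \left(\frac{2}{5}\right) = -1, \]
since the quadratic residues modulo 5 are 1 and 4, and similarly
\[ \left(\frac{11}{367}\right) = -\left(\frac{367}{11}\right) = -\left(\frac{4}{11}\right) = -\left(\frac{2}{11}\right)^2 = -1. \]
Thus $\left(\frac{55}{367}\right) = (-1)(-1) = 1$, and we see that 55 is a quadratic residue modulo 367. The theorem is nonconstructive, but one may check that $(\pm 34)^2 \equiv 55 \pmod{367}$.

We see from this example that one algorithm for calculating $\left(\frac{a}{p}\right)$ is given by:
1. Factor $a$ completely, $a = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$.
2. Use multiplicativity and periodicity:
\[ \left(\frac{a}{p}\right) = \left(\frac{p_1}{p}\right)^{e_1} \left(\frac{p_2}{p}\right)^{e_2} \cdots \left(\frac{p_k}{p}\right)^{e_k}. \]
3. Use the law of quadratic reciprocity.
4. If not finished, return to 1.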
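The reciprocity-based procedure above requires factoring the numerator at each pass. For a quick numerical cross-check of such hand computations, one can instead evaluate the Legendre symbol directly from Euler’s criterion. The following Python sketch does this; the function name `legendre` is our own illustrative choice, and the code assumes (without verifying) that `p` is an odd prime.

```python
def legendre(a, p):
    """Legendre symbol (a/p) via Euler's criterion.

    Assumes p is an odd prime; this is not checked here.
    """
    r = pow(a, (p - 1) // 2, p)   # a^((p-1)/2) mod p, one of 0, 1, p-1
    return -1 if r == p - 1 else r


if __name__ == "__main__":
    # The worked example: 55 modulo 367.
    print(legendre(5, 367), legendre(11, 367), legendre(55, 367))
    # Brute-force search for the square roots of 55 modulo 367.
    print([x for x in range(367) if (x * x) % 367 == 55])
```

Modular exponentiation takes only about $\log p$ multiplications, so this check is fast even though it gives no insight into why the symbol takes the value it does; the reciprocity computation remains the conceptual tool.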
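Theorem 6.1.1 (Theorem 3.3, Niven) If $p$ is an odd prime, then
\[ \left(\frac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}}; \]
that is,
\[ \left(\frac{2}{p}\right) = \begin{cases} 1 & \text{if } p \equiv \pm 1 \pmod 8, \\ -1 & \text{if } p \equiv \pm 3 \pmod 8. \end{cases} \]
The proof is not given here.

§3.3 – The Jacobi symbol

Let $p_1, p_2, \ldots, p_k$ be odd primes (not necessarily distinct), and let $Q$ be their product. The Jacobi symbol $\left(\frac{a}{Q}\right)$ is defined by
\[ \left(\frac{a}{Q}\right) = \prod_{j=1}^{k} \left(\frac{a}{p_j}\right), \]
where the symbols on the right are Legendre symbols.

Example: We compute the Jacobi symbol $\left(\frac{8}{15}\right)$. We have
\[ \left(\frac{8}{15}\right) = \left(\frac{8}{3}\right) \left(\frac{8}{5}\right) = \left(\frac{2}{3}\right) \left(\frac{3}{5}\right) = (-1)(-1) = 1. \]
Note that although the Jacobi symbol $\left(\frac{8}{15}\right)$ is 1, the congruence $x^2 \equiv 8 \pmod{15}$ has no solution, as $x^2 \equiv 2 \pmod 3$ hasn’t any. However, we can say that, if $\left(\frac{a}{Q}\right) = -1$, then $x^2 \equiv a \pmod Q$ has no solutions.

Our example shows that the converse is false; why, then, define the Jacobi symbol at all? There are several reasons, chief among which are
1. It agrees with the Legendre symbol when $Q$ is prime, and
2. It is easy to compute without factoring any integers.
The first of these assertions is clear, but the second is not yet.

Properties of the Jacobi symbol
• It is totally multiplicative in both arguments; that is, if $Q$ and $R$ are odd positive integers, then for any $a, b$ we have
\[ \left(\frac{ab}{Q}\right) = \left(\frac{a}{Q}\right) \left(\frac{b}{Q}\right), \qquad \left(\frac{a}{QR}\right) = \left(\frac{a}{Q}\right) \left(\frac{a}{R}\right). \]
• It is periodic in the top argument with period $Q$, i.e. if $a \equiv b \pmod Q$ then $\left(\frac{a}{Q}\right) = \left(\frac{b}{Q}\right)$.

The second property is immediate if $Q$ is squarefree, and if not then we write $Q = Q_0 S$ with $Q_0$ squarefree and $S$ a perfect square, and we have that
\[ \left(\frac{a}{Q}\right) = \left(\frac{a}{Q_0}\right) \left(\frac{a}{S}\right) = \left(\frac{a}{Q_0}\right) \left(\frac{a}{\sqrt{S}}\right)^2 = \left(\frac{a}{Q_0}\right) \]
whenever $(a, S) = 1$ (both sides of the periodicity claim vanish otherwise).

Before proceeding, we first record the following

Lemma 6.1.2 If $b_1, b_2, \ldots, b_k$ are odd, then
\[ \sum_{j=1}^{k} \frac{b_j - 1}{2} \equiv \frac{b_1 b_2 \cdots b_k - 1}{2} \pmod 2. \]

Proof: If $k = 2$, then
\[ \frac{b_1 b_2 - 1}{2} - \left( \frac{b_1 - 1}{2} + \frac{b_2 - 1}{2} \right) = \frac{(b_1 - 1)(b_2 - 1)}{2} \equiv 0 \pmod 2, \]
and the general case follows by induction (exercise).

Theorem 6.1.3 (Theorem 3.7, Niven) If $Q > 0$ is odd, then the Jacobi symbol $\left(\frac{-1}{Q}\right)$ equals
\[ (-1)^{\frac{Q-1}{2}} = \begin{cases} 1 & \text{if } Q \equiv 1 \pmod 4, \\ -1 & \text{if } Q \equiv 3 \pmod 4. \end{cases} \]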
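Proof: Since square factors of $Q$ do not affect the Jacobi symbol (as illustrated above), we may assume without loss of generality that $Q = p_1 p_2 \cdots p_k$ is squarefree. Then by lemma 6.1.2 we have that
\[ \frac{Q - 1}{2} \equiv \frac{p_1 - 1}{2} + \frac{p_2 - 1}{2} + \cdots + \frac{p_k - 1}{2} \pmod 2, \]
hence
\[ \left(\frac{-1}{Q}\right) = \left(\frac{-1}{p_1}\right) \left(\frac{-1}{p_2}\right) \cdots \left(\frac{-1}{p_k}\right) = (-1)^{\frac{p_1 - 1}{2}} (-1)^{\frac{p_2 - 1}{2}} \cdots (-1)^{\frac{p_k - 1}{2}} = (-1)^{\frac{Q-1}{2}}, \]
as claimed.

6.2 Lecture Sixteen

Theorem 6.2.1 (Theorem 3.8, Niven; the law of Quadratic reciprocity for Jacobi symbols) Let $P, Q \in \mathbb{N}$ be odd with $(P, Q) = 1$. Then
\[ \left(\frac{P}{Q}\right) \left(\frac{Q}{P}\right) = (-1)^{\frac{P-1}{2} \cdot \frac{Q-1}{2}} = \begin{cases} -1 & \text{if } P \equiv Q \equiv 3 \pmod 4, \\ 1 & \text{otherwise.} \end{cases} \]
Note that if $(P, Q) > 1$, we must have $\left(\frac{P}{Q}\right) = 0$.

Proof: Write $P = p_1 p_2 \cdots p_k$, $Q = q_1 q_2 \cdots q_l$, where the $p_i$ and $q_j$ are odd (not necessarily distinct) primes. By multiplicativity, we have
\[ \left(\frac{P}{Q}\right) = \prod_{i=1}^{k} \left(\frac{p_i}{Q}\right) = \prod_{i=1}^{k} \prod_{j=1}^{l} \left(\frac{p_i}{q_j}\right), \]
where the factors in the last product are Legendre symbols. The law of quadratic reciprocity (for Legendre symbols) then implies that
\[ \left(\frac{P}{Q}\right) = \prod_{i=1}^{k} \prod_{j=1}^{l} \left(\frac{q_j}{p_i}\right) (-1)^{\frac{p_i - 1}{2} \cdot \frac{q_j - 1}{2}} = \left(\frac{Q}{P}\right) (-1)^{\sum_{i=1}^{k} \sum_{j=1}^{l} \frac{p_i - 1}{2} \cdot \frac{q_j - 1}{2}}. \]
By lemma 6.1.2 from our last lecture, the exponent of $-1$ satisfies
\[ \sum_{i=1}^{k} \sum_{j=1}^{l} \frac{p_i - 1}{2} \cdot \frac{q_j - 1}{2} \equiv \frac{P - 1}{2} \cdot \frac{Q - 1}{2} \pmod 2, \]
hence
\[ \left(\frac{P}{Q}\right) = (-1)^{\frac{P-1}{2} \cdot \frac{Q-1}{2}} \left(\frac{Q}{P}\right), \]
as claimed.

Application: We calculate the Legendre symbol $\left(\frac{2}{p}\right)$, where $p$ is an odd prime; rather, we will show that the Jacobi symbol $\left(\frac{2}{Q}\right)$ obeys the formula from last lecture, namely
\[ \left(\frac{2}{Q}\right) = (-1)^{\frac{Q^2 - 1}{8}} = \begin{cases} 1 & \text{if } Q \equiv \pm 1 \pmod 8, \\ -1 & \text{if } Q \equiv \pm 3 \pmod 8, \end{cases} \]
from which the special case of the Legendre symbol follows. By periodicity in the top argument, we have that
\[ \left(\frac{2}{Q}\right) = \left(\frac{2 - Q}{Q}\right) = \left(\frac{-1}{Q}\right) \left(\frac{Q - 2}{Q}\right) = (-1)^{\frac{Q-1}{2}} \left(\frac{Q - 2}{Q}\right). \]
Since $Q$ is odd and positive, we must have that $(Q, Q - 2) = 1$, and so by quadratic reciprocity we see that
\[ \left(\frac{2}{Q}\right) = (-1)^{\frac{Q-1}{2}} (-1)^{\frac{Q-1}{2} \cdot \frac{Q-3}{2}} \left(\frac{Q}{Q - 2}\right); \]
again, since one of $Q - 1$ and $Q - 3$ must be divisible by 4, we cancel the last sign factor and obtain
\[ \left(\frac{2}{Q}\right) = (-1)^{\frac{Q-1}{2}} \left(\frac{Q}{Q - 2}\right) = (-1)^{\frac{Q-1}{2}} \left(\frac{2}{Q - 2}\right). \]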
By descent, we obtain
\[ \left(\frac{2}{Q}\right) = (-1)^{\frac{Q-1}{2}} (-1)^{\frac{Q-3}{2}} \cdots (-1)^{3} (-1)^{2} \left(\frac{2}{3}\right), \]
and finally since 2 is a quadratic nonresidue modulo 3 we have
\[ \left(\frac{2}{Q}\right) = (-1)^{1 + 2 + \cdots + \frac{Q-1}{2}} = (-1)^{\frac{1}{2} \cdot \frac{Q-1}{2} \cdot \frac{Q+1}{2}} = (-1)^{\frac{Q^2 - 1}{8}}, \]
and we are done.

We can turn this into a general algorithm for computing the Jacobi symbol. Indeed, to compute $\left(\frac{a}{Q}\right)$, we may apply the following steps:
1. Factor $-1$ and any powers of 2 from $a$, leaving $\left(\frac{P}{Q}\right)$ with $P$ an odd positive number.
2. Use quadratic reciprocity and periodicity.
3. If not finished, return to 1.
Note, in particular, that this algorithm doesn’t require us to factor any integers (beyond removing powers of 2).

Example: 53681 is prime and congruent to 1 modulo 4. Is 1311 a quadratic residue modulo 53681? It suffices to compute the Jacobi symbol, which in the case that $Q$ is an odd prime is exactly the Legendre symbol. Using the algorithm outlined above, we find
\[ \left(\frac{1311}{53681}\right) = \left(\frac{53681}{1311}\right) = \left(\frac{-70}{1311}\right) = \left(\frac{-1}{1311}\right) \left(\frac{2}{1311}\right) \left(\frac{35}{1311}\right) \]
\[ = (-1)(1) \left(\frac{35}{1311}\right) = -(-1) \left(\frac{1311}{35}\right) = \left(\frac{16}{35}\right) = \left(\frac{4}{35}\right)^2 = 1. \]
So 1311 is indeed a square modulo 53681.

Here we will give an outline of a more “traditional” proof of the law of quadratic reciprocity, nearer to the proof given in Niven. We start with a preliminary result.

Lemma 6.2.2 (Gauss’s lemma) Let $p$ be an odd prime and let
\[ F = \left\{ 1, 2, \ldots, \frac{p-1}{2} \right\}, \qquad -F = \left\{ \frac{p+1}{2}, \frac{p+3}{2}, \ldots, p - 1 \right\}. \]
Given $a$ with $(a, p) = 1$, let $n = \#\{k \in F : ak \bmod p \in -F\}$. Then $\left(\frac{a}{p}\right) = (-1)^n$.

Note that from this we can immediately compute $\left(\frac{2}{p}\right)$, since in this case $n = \#\{k : \frac{p}{4} < k < \frac{p}{2}\}$. Next, we show that for odd $a$
\[ n \equiv \sum_{j=1}^{(p-1)/2} \left\lfloor \frac{aj}{p} \right\rfloor \pmod 2, \]
and we also use the fact that for distinct odd primes $p$ and $q$
\[ \sum_{j=1}^{(p-1)/2} \left\lfloor \frac{qj}{p} \right\rfloor + \sum_{k=1}^{(q-1)/2} \left\lfloor \frac{kp}{q} \right\rfloor = \frac{p - 1}{2} \cdot \frac{q - 1}{2}. \]
One proof of this fact counts lattice points in the rectangle $R$ in the first quadrant whose vertices are at $(0, 0)$, $(0, \frac{q}{2})$, $(\frac{p}{2}, 0)$ and $(\frac{p}{2}, \frac{q}{2})$; specifically, those lying above and below the line through the origin and $(p, q)$. This is all the detail we give here.

With this machinery, we can show that there are infinitely many primes congruent to 1 modulo 4.
Indeed, if $p_1, p_2, \ldots, p_k$ is any finite list of such primes, let $N = (2p_1 p_2 \cdots p_k)^2 + 1$. Then $p_i \nmid N$ for $i = 1, 2, \ldots, k$. But since $N$ is odd and one more than a square, we know that all of its prime factors must be congruent to 1 modulo 4; in particular, there must be such a prime which is not on the list.

6.3 Lecture Seventeen

Final exam date: Friday, December 8, at noon.

Definition: A degree-$d$ form (or homogeneous polynomial) is a polynomial, each of whose monomials has degree $d$. For example, $X^3 + 2Y^3 + 3Y^2Z - 4XYZ$ is a degree-3 form. A binary form is a form in two variables, and a quadratic form is a degree-2 form.

We will focus on binary quadratic forms.

Example: One binary quadratic form is $f(X, Y) = X^2 + Y^2$; another is $g(X, Y) = 53X^2 + 152XY + 109Y^2$.

Among the questions we might ask about binary quadratic forms $f(X, Y)$, two important ones are:
1. Which $m \in \mathbb{Z}$ are represented by $f$? That is, for which $m \in \mathbb{Z}$ do we have $x, y \in \mathbb{Z}$ with $f(x, y) = m$?
2. Which $m \in \mathbb{Z}$ can be properly represented by $f$? That is, when can we write $m = f(x, y)$ with $(x, y) = 1$?
One motivation for the second question is the observation that for any binary quadratic form $f$, we have $f(dx, dy) = d^2 f(x, y)$.

We first investigate the form $f(X, Y) = X^2 + Y^2$, and ask when $f$ represents a prime $p$. We observe that $2 = 1^2 + 1^2$, and from now on will restrict our attention to odd primes $p$.

Lemma 6.3.1 If $p \equiv 3 \pmod 4$ and $p \mid (x^2 + y^2)$, then $p \mid x$ and $p \mid y$.

Proof: Since $p \mid (x^2 + y^2)$, we have that $x^2 \equiv -y^2 \pmod p$. If $p \nmid y$, then $y$ is a unit modulo $p$ and we have the equivalent congruence $(xy^{-1})^2 \equiv -1 \pmod p$, or $p \mid ((xy^{-1})^2 + 1)$, contradicting our earlier result that every odd prime dividing an integer of the form $z^2 + 1$ is congruent to 1 modulo 4. Thus $p \mid y$, from which we immediately see $p \mid x$.

In particular, if $p \equiv 3 \pmod 4$, then there is no way to express $p$ as the sum of two squares: we would have $p \mid x$ and $p \mid y$, hence $p^2 \mid p$.
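The consequence of lemma 6.3.1 can be observed experimentally by listing all representations of small integers as sums of two squares. Below is a minimal brute-force sketch in Python; the helper name `two_square_reps` is an illustrative choice of ours, and the search is exhaustive rather than clever, so it is only meant for small $n$.

```python
from math import gcd, isqrt


def two_square_reps(n):
    """All pairs (x, y) with 0 <= x <= y and x*x + y*y == n."""
    reps = []
    # x <= y forces x*x <= n/2, so x ranges up to isqrt(n // 2).
    for x in range(isqrt(n // 2) + 1):
        y2 = n - x * x
        y = isqrt(y2)
        if y * y == y2:
            reps.append((x, y))
    return reps


def proper_reps(n):
    """Those representations with gcd(x, y) == 1 (proper representations)."""
    return [(x, y) for (x, y) in two_square_reps(n) if gcd(x, y) == 1]


if __name__ == "__main__":
    for p in [3, 5, 7, 11, 13, 17, 19, 23, 29]:
        print(p, p % 4, two_square_reps(p))
```

Running the loop, the primes congruent to 3 modulo 4 come back with an empty list, as the lemma predicts, while each of the listed primes congruent to 1 modulo 4 has a representation. The helper `proper_reps` separates out the representations with coprime $x$ and $y$, matching question 2 above.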
Proposition 6.3.2 If $p \equiv 1 \pmod 4$, then there exist $x, y \in \mathbb{Z}$ such that $x^2 + y^2 = p$ and $(x, y) = 1$.

Proof: Fix some $z$ so that $z^2 \equiv -1 \pmod p$, and consider the set
\[ S = \{u + zv : 0 \le u < \sqrt{p},\ 0 \le v < \sqrt{p}\}. \]
It is not difficult to see that the number of pairs $(u, v)$ is $(1 + \lfloor \sqrt{p} \rfloor)^2$, and that
\[ (1 + \lfloor \sqrt{p} \rfloor)^2 = \lceil \sqrt{p} \rceil^2 > p, \]
where $\lceil x \rceil$ denotes the ceiling function. Thus by the pigeonhole principle there must be two distinct elements $u + zv$, $u' + zv'$ (i.e. with not both $u = u'$ and $v = v'$) which are congruent modulo $p$. Define $x = u - u'$, $y = v' - v$. Then since $u - u' \equiv z(v' - v) \pmod p$, we see that $x^2 \equiv z^2 y^2 \equiv -y^2 \pmod p$, and so $p \mid (x^2 + y^2)$. Moreover, since $|x|, |y| < \sqrt{p}$, we see that
\[ |x^2 + y^2| \le |x|^2 + |y|^2 < 2p, \]
and since we do not have $x = y = 0$ by our earlier remarks, it follows that $x^2 + y^2 = p$. Furthermore, if $d = (x, y)$, then it follows that $d^2 \mid p$ and hence $d = 1$.

Theorem 6.3.3 (due to Fermat) A positive integer $n$ is properly represented by $X^2 + Y^2$ if and only if $4 \nmid n$ and no prime $p \equiv 3 \pmod 4$ has $p \mid n$.

Proof: Suppose first that $n = x^2 + y^2$ with $(x, y) = 1$, and let $p \equiv 3 \pmod 4$ be prime. If $p \mid (x^2 + y^2)$, then by lemma 6.3.1 $p \mid x$ and $p \mid y$, thus $(x, y) > 1$, a contradiction. Conversely suppose that no prime factor $p$ of $n$ has $p \equiv 3 \pmod 4$. Since we know each prime factor is properly represented, it suffices to prove that the product $mn$ of any numbers $m, n$ properly represented by $X^2 + Y^2$ is itself properly represented. Write $m = w^2 + z^2$ and $n = x^2 + y^2$ with $(w, z) = (x, y) = 1$. Then
\[ mn = (wx)^2 + (wy)^2 + (xz)^2 + (yz)^2 = (wx - yz)^2 + (wy + xz)^2, \]
and it suffices to check coprimality. [Here we encounter an error in the proof, the rest of which has been omitted.]

In the next lecture, we will prove the following, also due to Fermat.

Theorem 6.3.4 Given $n \in \mathbb{N}$, write $n$ in its prime factorization as
\[ n = 2^\alpha \prod_{i=1}^{k} p_i^{\beta_i} \prod_{j=1}^{l} q_j^{\gamma_j}, \]
where every $p_i$ has $p_i \equiv 1 \pmod 4$ and every $q_j$ has $q_j \equiv 3 \pmod 4$.
Then $n$ is represented by $X^2 + Y^2$ if and only if every $\gamma_j$ is even; in other words, if and only if we can write $n = ab^2$, where $p \mid a \Rightarrow p \not\equiv 3 \pmod 4$ and $p \mid b \Rightarrow p \equiv 3 \pmod 4$.

7 Week Seven

7.1 Lecture Eighteen

Recall: Theorem 6.3.4.

Proof: Lemma 6.3.1 showed that if $q \mid (x^2 + y^2)$ and $q \equiv 3 \pmod 4$ is prime, then $q \mid x$ and $q \mid y$, thus $q^2 \mid (x^2 + y^2)$; dividing through by $q^2$ and repeating shows that every $\gamma_j$ is even. Conversely, proposition 6.3.2 showed that every prime $p \equiv 1 \pmod 4$ is represented, while $2 = 1^2 + 1^2$ and $q^2 = q^2 + 0^2$ for $q \equiv 3 \pmod 4$ are represented directly, and since
\[ (a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2 \]
we see that representability by $X^2 + Y^2$ is multiplicative, which completes the proof.

Fact: A positive integer $n$ can be properly represented by $X^2 + Y^2$ if and only if $\alpha \le 1$ and each $\gamma_j = 0$; that is, if and only if $4 \nmid n$ and no prime congruent to 3 modulo 4 divides $n$.

The proof of one implication was attempted at the end of the last lecture; today, we develop machinery to prove more general statements.

[Aside: Lagrange’s Four-Square theorem asserts that any nonnegative integer can be written as the sum of at most four squares. One proves this first for primes, then by showing multiplicative closure of representability by $W^2 + X^2 + Y^2 + Z^2$. We may draw an analogy between the corresponding observation in the proof of theorem 6.3.4 and multiplicativity of the complex norm $|a + ib|^2 = a^2 + b^2$, and that of the norm in the ring of quaternions, $|a + ib + jc + kd|^2 = a^2 + b^2 + c^2 + d^2$.

Moreover let $f(X_1, X_2, \ldots, X_n)$ be any positive definite quadratic form with integer matrix. If $f$ represents every integer in the set $\{1, 2, \ldots, 15\}$, then $f$ represents every positive integer. This is known as the Fifteen Theorem.]

§3.4 – Binary quadratic forms

Notation: For the remainder of this lecture, $f(X, Y) = aX^2 + bXY + cY^2$ will denote an arbitrary binary quadratic form with integer coefficients, of discriminant $d = b^2 - 4ac$.

When does $f(x, y) = 0$ for $x, y$ not both 0? Suppose $d$ is a perfect square. If $a \ne 0$ then we may factor $f$ over $\mathbb{Q}$ via
b−2 d b− d f (x, y) = a x + x+ y y , 2a 2a and so by proposition 6.2.2 we see that f also factors over Z. In this case, there are many ways to represent 0, as we need only make one of the factors equal zero. If a = 0 then f (X, Y ) = Y (bX + cY ) and we have the same observation. In the case d = 0, we can write f (X, Y ) = e(gX + hY )2 for some integers e, g, h. If e > 0 then f is positive semidefinite; that is, f (x, y) ≥ 0 for any x, y ∈ Z. Similarly if e < 0 then f (x, y) ≤ 0 for all x, y ∈ Z, and f is said to be negative semidefinite. If furthermore f (x, y) = 0 implies that x = y = 0, then f is said to be positive definite (resp. negative definite). Now, suppose d is not a perfect square; then f is irreducible over Q. In particular, ac 6= 0, else d = b2 which is not the case. Theorem 7.1.1 (Theorem 3.10, Niven) Suppose that a binary quadratic form f (X, Y ) has discriminant d < 0; then f is definite (i.e. positive definite or negative definite). Proof : Suppose f (m, n) = 0 and suppose n 6= 0. The identity 4af (x, y) = (2ax + by)2 − dy 2 41 implies that m + b)2 , n so d < 0 is the square of a rational number, which is a contradiction. A symmetric argument with the assumption m 6= 0 completes the proof. (2am + bn)2 − dn2 = 0 ⇔ dn2 = (2am + bn)2 ⇔ d = (2a We might ask: when is f positive? negative? Theorem 7.1.2 (Theorem 3.11, Niven) Let f be a binary quadratic form of discriminant d. If d > 0 then f is indefinite, that is, f represents both positive and negative values. If d < 0 and a > 0, then f is positive definite. If d < 0 and a < 0, then f is negative definite. Proof : Suppose d > 0. Then if a 6= 0 we have that f (1, 0) = a and f (b, −2a) = −ad, and since d > 0 we know that a and −ad have opposite signs, so f is indefinite. The same argument works if we assume c 6= 0, using f (0, 1) = c, f (−2c, b) = −cd. Finally if a = c = 0 then f (1, 1) = b, f (−1, 1) = −b, and since f 6= 0 by assumption this exhausts all cases. 
Suppose now that $d < 0$, so that in particular $d$ is not a perfect square. Then we know $a \ne 0$, and so by our identity we have that
\[ 4a f(x, y) = (2ax + by)^2 + |d| y^2 \ge 0, \]
from which it follows that $a$ must have the same sign as $f(x, y)$. The same equation shows that if $f(x, y) = 0$ then $y = 0$, thus $x = 0$, and we are done.

7.2 Lecture Nineteen

Theorem 7.2.1 (Theorem 3.12, Niven) Let $d \in \mathbb{Z}$; then there exists a binary quadratic form of discriminant $d$ if and only if $d \equiv 0$ or $1 \bmod 4$.

Proof: Suppose $f(X, Y) = aX^2 + bXY + cY^2$ has discriminant $d$; then $d = b^2 - 4ac \equiv b^2 \bmod 4$, and since the squares modulo 4 are 0 and 1 the result is clear. Conversely, if $d \equiv 0 \bmod 4$ we may take $f(X, Y) = X^2 - \frac{d}{4} Y^2$, which has discriminant $d$, and if $d \equiv 1 \bmod 4$ we instead take $f(X, Y) = X^2 + XY - \frac{d-1}{4} Y^2$ with the same result.

Theorem 7.2.2 (Theorem 3.13, Niven) Let $d, n \in \mathbb{Z}$ with $n \ne 0$. There exists a binary quadratic form of discriminant $d$ that properly represents $n$ if and only if the congruence $x^2 \equiv d \bmod 4n$ has a solution.

Remark: This theorem guarantees the existence of some binary quadratic form of discriminant $d$, but representability by a specific form is a much harder question.

Example: Take $n = -3$. There is a binary quadratic form of discriminant $d$ properly representing $-3$ if and only if $x^2 \equiv d \bmod 12$ has a solution. The squares modulo 12 are 0, 1, 4, and 9, and so the only discriminants $d$ for which some form properly represents $-3$ are those lying in one of these residue classes modulo 12.

Proof: Suppose $u^2 \equiv d \bmod 4n$, and write $u^2 - d = 4nv$ for some integer $v$. Then with $f(X, Y) = nX^2 + uXY + vY^2$, we see that the discriminant of $f$ is $u^2 - 4nv = d$ and that $f(1, 0) = n$.

Conversely, suppose that $as^2 + bst + ct^2 = n$ with $(s, t) = 1$ and $b^2 - 4ac = d$. Choose $m_1, m_2 \in \mathbb{Z}$ such that $(m_1, m_2) = 1$, $m_1 m_2 = 4n$, and also $(m_1, t) = (m_2, s) = 1$. Note that we can always choose such $m_1, m_2$: for example,
\[ m_1 = \prod_{p \mid s} p^{\mathrm{ord}_p(4n)}, \qquad m_2 = \frac{4n}{m_1}. \]
Recall from last lecture the identity $4a f(x, y) = (2ax + by)^2 - dy^2$. Since $f(s, t) = n$ and $m_1 \mid 4n$, it gives
\[ (2as + bt)^2 - dt^2 \equiv 0 \bmod m_1 \iff d \equiv (2ast^{-1} + b)^2 \bmod m_1, \]
since $(t, m_1) = 1$. A symmetric argument shows that $d \equiv (2cts^{-1} + b)^2 \bmod m_2$, and since $(m_1, m_2) = 1$ the Chinese remainder theorem implies that we have a solution to the congruence $x^2 \equiv d \bmod m_1 m_2$, that is, modulo $4n$, and we are done.

Corollary 1: Let $d \equiv 0$ or $1 \bmod 4$, and let $p$ be an odd prime. There exists a binary quadratic form of discriminant $d$ representing $p$ if and only if $\left(\frac{d}{p}\right) = 0$ or $1$.

Proof: By Theorem 7.2.2 it suffices to show that $x^2 \equiv d \bmod 4p$ has a solution if and only if $\left(\frac{d}{p}\right) = 0$ or $1$. Suppose $x^2 \equiv d \bmod 4p$, so that $x^2 \equiv d \bmod p$; it follows that $\left(\frac{d}{p}\right) = 0$ or $1$. Conversely, if $\left(\frac{d}{p}\right) = 0$ or $1$, then we may write $x^2 \equiv d \bmod p$, and since $d$ is a square modulo 4 by assumption we have $y^2 \equiv d \bmod 4$, and the Chinese remainder theorem completes the proof.

Thus we are led to investigate the set of all binary quadratic forms of a given discriminant.

Example: Determine all integers represented by $f(X, Y) = 53X^2 + 152XY + 109Y^2$. If we set $x = -3u + 10v$, $y = 2u - 7v$, then a calculation shows that $f(x, y) = u^2 + v^2$, and thus if $n$ is represented by $f$, it is also represented by $X^2 + Y^2$. Conversely, if $n$ is represented by this latter form, then $n = u^2 + v^2 = f(-3u + 10v, 2u - 7v)$, and we see that both forms represent exactly the same set of integers.

We can associate to any binary quadratic form $f(X, Y) = aX^2 + bXY + cY^2$ the $2 \times 2$ symmetric matrix
\[ F = \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix}, \]
which has the property that
\[ \vec{x}^{T} F \vec{x} = f(x, y), \qquad \vec{x} = \begin{pmatrix} x \\ y \end{pmatrix}, \]
where $A^{T}$ denotes the matrix transpose. In our above example, $F = \begin{pmatrix} 53 & 76 \\ 76 & 109 \end{pmatrix}$ is associated to $f(X, Y) = 53X^2 + 152XY + 109Y^2$, and $G = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ is associated to $g(X, Y) = X^2 + Y^2$. With this in mind, we write our change of variables from our example above as
\[ \vec{x} = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -3 & 10 \\ 2 & -7 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} =: M \vec{u}, \]
hence
\[ f(x, y) = \vec{x}^{T} F \vec{x} = (M\vec{u})^{T} F (M\vec{u}) = \vec{u}^{T} (M^{T} F M) \vec{u}, \]
and indeed, $M^{T} F M = G$.
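The matrix identity from this example can be checked with a few lines of integer arithmetic. The sketch below is only an illustration: the helper names `transpose`, `matmul` and `f` are hypothetical, and the grid used for the spot check is an arbitrary choice.

```python
# Hypothetical helper names; only built-in integer arithmetic is used.

def transpose(A):
    """Transpose of a 2x2 matrix given as nested lists."""
    return [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def f(x, y):
    """The form f(X, Y) = 53X^2 + 152XY + 109Y^2 from the example."""
    return 53 * x * x + 152 * x * y + 109 * y * y

F = [[53, 76], [76, 109]]   # matrix associated to f
M = [[-3, 10], [2, -7]]     # change of variables x = -3u + 10v, y = 2u - 7v

G = matmul(matmul(transpose(M), F), M)
det_M = M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Spot-check f(-3u + 10v, 2u - 7v) = u^2 + v^2 on a small (arbitrary) grid.
substitution_ok = all(
    f(-3 * u + 10 * v, 2 * u - 7 * v) == u * u + v * v
    for u in range(-5, 6) for v in range(-5, 6)
)

print(G, det_M, substitution_ok)   # [[1, 0], [0, 1]] 1 True
```

Because the determinant of the change of variables is 1, the substitution can be inverted over the integers, which is what makes the two forms represent the same set of values.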
8 Week Eight

8.1 Lecture Twenty

Recall from last lecture the binary quadratic forms
\[ f(X, Y) = 53X^2 + 152XY + 109Y^2, \qquad g(X, Y) = X^2 + Y^2, \]
with their associated matrices
\[ F = \begin{pmatrix} 53 & 76 \\ 76 & 109 \end{pmatrix} \quad \text{and} \quad G = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \]
respectively. We saw that $M^{T} F M = G$, where $M = \begin{pmatrix} -3 & 10 \\ 2 & -7 \end{pmatrix}$. Recall that if $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, then
\[ A^{-1} = \frac{1}{\det A} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \]
In our case, $\det M = 1$ and so $M^{-1} = \begin{pmatrix} -7 & -10 \\ -2 & -3 \end{pmatrix}$. Consequently, if $M \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix}$, then
\[ \begin{pmatrix} u \\ v \end{pmatrix} = M^{-1} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -7x - 10y \\ -2x - 3y \end{pmatrix}. \]
Since $f(-u, -v) = f(u, v)$ for any binary quadratic form, the negative signs in this matrix are of no concern. Thus we obtain $F = (M^{-1})^{T} G M^{-1}$, which combined with our previous relation $G = M^{T} F M$ implies that $f$ and $g$ represent exactly the same integers.

Definition: The modular group $\Gamma$ is the set of all $2 \times 2$ matrices over $\mathbb{Z}$ with determinant 1, with the group operation being multiplication. Also used to denote $\Gamma$ are $SL_2(\mathbb{Z})$ and $SL(2, \mathbb{Z})$. Since $\Gamma$ is a group we have that $M \in \Gamma \iff M^{-1} \in \Gamma$.

Definition: Two binary quadratic forms $f$ and $g$ are called equivalent, denoted $f \sim g$, if there exists some $M \in \Gamma$ such that $M^{T} F M = G$, where $F$ and $G$ are the associated matrices of $f$ and $g$, respectively.

It is easy to see that if $f \sim g$ with $M^{T} F M = G$, $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, then $f(ax + by, cx + dy) = g(x, y)$. In our previous example, we showed that $53X^2 + 152XY + 109Y^2 \sim X^2 + Y^2$.

Remark: If $M^{T} F M = G$, then $(-M)^{T} F (-M) = G$. Thus we may take $M$ or $-M$ as we see fit, or equivalently choose a representative from $PSL_2(\mathbb{Z}) = \Gamma / \{\pm I\}$.

Theorem 8.1.1 (Theorem 3.16, Niven) $\sim$ is an equivalence relation.

Proof: Reflexivity is clear, as $F = I^{T} F I$, as is symmetry by our remarks above, so it suffices to prove transitivity. Suppose $f \sim g$, $g \sim h$, and let $M, N \in \Gamma$ be such that $M^{T} F M = G$, $N^{T} G N = H$. Then $MN \in \Gamma$ and $(MN)^{T} F (MN) = H$, so $f \sim h$, and we are done.
Note that if $f(X, Y) = aX^2 + bXY + cY^2$ has associated matrix $F$, then $\det F = ac - \frac{b^2}{4} = -\frac{d}{4}$, where $d$ is the discriminant of $f$. In particular, this means that if $f \sim g$ then their discriminants are equal. Indeed, in our perennial example $f(X, Y) = 53X^2 + 152XY + 109Y^2$, $g(X, Y) = X^2 + Y^2$, it is not difficult to see that the discriminant of $f$ is $-4$, as is the discriminant of $g$.

Theorem 8.1.2 (Theorem 3.17, Niven) Let $f \sim g$ be binary quadratic forms, and let $n \in \mathbb{Z}$. Then:
1. The representations of $n$ by $f$ are in one-to-one correspondence with the representations of $n$ by $g$.
2. The proper representations of $n$ by $f$ are in one-to-one correspondence with the proper representations of $n$ by $g$.

Proof:
1. If $f(x, y) = n$, then $\vec{x}^{T} F \vec{x} = (n)$, and so with $M^{T} F M = G$, that is, $F = (M^{-1})^{T} G M^{-1}$, we have $(M^{-1}\vec{x})^{T} G (M^{-1}\vec{x}) = (n)$. This process is invertible, whence we deduce the result.
2. In the calculation in the proof of the first statement, if $m \mid x$ and $m \mid y$ then $m$ divides both entries of $M^{-1}\vec{x}$, and conversely.

We seek to understand the structure of the set of binary quadratic forms of discriminant $d$, which our work above shows to be partitioned into equivalence classes by $\sim$. We begin by showing that every equivalence class contains a "nice" form; that is, roughly speaking, one in which $b$ is the smallest coefficient in absolute value and $c$ the largest.

Definition: Let $f(X, Y) = aX^2 + bXY + cY^2$ be a binary quadratic form. Then $f$ is said to be reduced if one of the following conditions holds:
1. $-|a| < b \le |a| < |c|$.
2. $0 \le b \le |a| = |c|$.

8.2 Lecture Twenty-One

Recall from last time the notion of a reduced binary quadratic form; there is an algorithm for converting any given binary quadratic form $f$ into an equivalent, reduced binary quadratic form.

Example: We will reduce $f = f_0(X, Y) = 53X^2 + 152XY + 109Y^2$, which corresponds to the matrix $F_0 = \begin{pmatrix} 53 & 76 \\ 76 & 109 \end{pmatrix}$. For $n \in \mathbb{Z}$, let
\[ T_n = \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}, \qquad S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]
We first define $F_1$ via
\[ F_1 = T_{-1}^{T} F_0 T_{-1} = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}^{T} \begin{pmatrix} 53 & 76 \\ 76 & 109 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 53 & 23 \\ 23 & 10 \end{pmatrix}, \]
which corresponds to the form $f_1(X, Y) = 53X^2 + 46XY + 10Y^2$. Next, we set
\[ F_2 = S^{T} F_1 S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}^{T} \begin{pmatrix} 53 & 23 \\ 23 & 10 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 10 & -23 \\ -23 & 53 \end{pmatrix}, \]
so that $f_2(X, Y) = 10X^2 - 46XY + 53Y^2$. Continuing in this way, we set
\[ F_3 = T_2^{T} F_2 T_2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}^{T} \begin{pmatrix} 10 & -23 \\ -23 & 53 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 10 & -3 \\ -3 & 1 \end{pmatrix}, \]
\[ F_4 = S^{T} F_3 S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}^{T} \begin{pmatrix} 10 & -3 \\ -3 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 3 & 10 \end{pmatrix}, \]
\[ F_5 = T_{-3}^{T} F_4 T_{-3} = \begin{pmatrix} 1 & -3 \\ 0 & 1 \end{pmatrix}^{T} \begin{pmatrix} 1 & 3 \\ 3 & 10 \end{pmatrix} \begin{pmatrix} 1 & -3 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]
We see that $f_0 \sim f_5$ and that $f_5(X, Y) = X^2 + Y^2$ is reduced. Thus, if $M = T_{-1} S T_2 S T_{-3} = \begin{pmatrix} -3 & 10 \\ 2 & -7 \end{pmatrix}$, then we have that $M^{T} F_0 M = F_5$.

Theorem 8.2.1 (Theorem 3.18, Niven) Let $d \equiv 0$ or $1 \bmod 4$, with $d$ not a perfect square. Then every equivalence class of binary quadratic forms of discriminant $d$ contains a reduced form.

Proof: Let $f_0(X, Y) = a_0 X^2 + b_0 XY + c_0 Y^2$ have discriminant $d$, and for $s \ge 0$ let $F_s = \begin{pmatrix} a_s & b_s/2 \\ b_s/2 & c_s \end{pmatrix}$, with $T_n$ and $S$ as above. Define an algorithm via:
(A) If $|c_s| < |a_s|$, set $F_{s+1} = S^{T} F_s S$, so that $a_{s+1} = c_s$, $c_{s+1} = a_s$, $b_{s+1} = -b_s$.
(B) If $|a_s| \le |c_s|$ but $b_s \notin (-|a_s|, |a_s|]$, then choose $n \in \mathbb{Z}$ so that $2a_s n + b_s \in (-|a_s|, |a_s|]$. Indeed, this choice is unique by the division algorithm, writing $|a_s| - b_s = (2a_s)q + r$ with $0 \le r < 2|a_s|$; set $n = q$. Then set $F_{s+1} = T_n^{T} F_s T_n$, so that $a_{s+1} = a_s$, $b_{s+1} = 2a_s n + b_s$, $c_{s+1} = a_s n^2 + b_s n + c_s = f_s(n, 1)$.
(C) If $|a_s| = |c_s|$ but $b_s < 0$, then set $F_{s+1} = S^{T} F_s S$.

We observe that if a binary quadratic form does not satisfy the premises of (A), (B), or (C), then it is reduced; thus it suffices to show that the algorithm terminates. Since $d$ is assumed not to be a perfect square we know that $a_s \ne 0$ for any $s$. We see that (A) is never followed by (A), nor (B) by (B), nor (C) by (C), and moreover since the output of (C) is reduced by construction it remains only to show that we cannot have an infinite loop (A) followed by (B) followed by (A), and so on.
But this is clear, since every time we apply step (A), $|a_s|$ decreases, and so the well-ordering axiom implies that the algorithm terminates.

Note that if $d$ is a perfect square, then applying the above algorithm may produce $a_s = 0$, meaning that none of the steps (A), (B), or (C) is triggered unless $a_s = b_s = c_s = 0$.

Theorem 8.2.2 (Theorem 3.19, Niven) Let $d \in \mathbb{Z}$ with $d$ not a perfect square, and let $f(X, Y) = aX^2 + bXY + cY^2$ be a reduced binary quadratic form of discriminant $d$. Then:
1. If $d > 0$ then $ac < 0$ and $0 < |a| < \sqrt{d/2}$.
2. If $d < 0$ then $ac > 0$ and $0 < |a| \le \sqrt{|d|/3}$.

It is an immediate consequence of this theorem that there are only finitely many equivalence classes of binary quadratic forms of discriminant $d$, as there are only finitely many such reduced forms: indeed, we must have
\[ 0 \le |b| \le |a| \le \sqrt{|d|}, \qquad c = \frac{b^2 - d}{4a}. \]
The proof will be given in the next lecture; today, we end with the following definition.

Definition: Let $d \in \mathbb{Z}$ with $d$ not a perfect square. The number of equivalence classes of binary quadratic forms of discriminant $d$ is called the class number of $d$ and is denoted $H(d)$.

8.3 Lecture Twenty-Two

Recall Theorem 8.2.2 from last time. Today, we prove the second assertion of the theorem.

Proof (of Theorem 8.2.2, part (2)): Since $d < 0$ we know that $ac > 0$, as $b^2 - 4ac < 0$, so in particular $|a| > 0$. Then
\[ |d| = -d = 4ac - b^2 = 4|ac| - b^2. \]
Since $f$ is reduced, we have that $|b| \le |a| \le |c|$, and so
\[ 4|ac| - b^2 \ge 4a^2 - a^2 = 3a^2, \]
and we have that $|a| \le \sqrt{|d|/3}$, as claimed.

Recall also the definition of the class number $H(d)$ of $d$.

Example: We compute $H(-7)$. We proceed by listing all reduced binary quadratic forms of discriminant $-7$ and then checking whether any are equivalent. Theorem 8.2.2 shows that if $f(X, Y) = aX^2 + bXY + cY^2$ is reduced of discriminant $-7$, then $0 < |a| \le \sqrt{7/3} < 2$, hence $a = \pm 1$. If $|a| < |c|$ then we have $-1 < b \le 1$, and if $|a| = |c| = 1$ we have $0 \le b \le 1$; that is, in both cases $b \in \{0, 1\}$.
Calculating the possibilities for $c = \frac{b^2 - d}{4a}$ yields the following table:

a     b     c       valid?
1     0     7/4     no
1     1     2       yes
-1    0     -7/4    no
-1    1     -2      yes

(where the last column indicates whether or not $aX^2 + bXY + cY^2$ is a valid binary quadratic form, that is, whether $c$ is an integer). It follows from this that $H(-7) \le 2$. Since the discriminant is negative, it follows that both of the binary quadratic forms
\[ f(X, Y) = X^2 + XY + 2Y^2, \qquad g(X, Y) = -X^2 + XY - 2Y^2 \]
are (positive or negative) definite, and a calculation shows that $f(1, 1) = 4 > 0$, $g(1, 1) = -2 < 0$. Thus $f$ is positive definite, $g$ is negative definite, and so in particular $f \not\sim g$ and we have that $H(-7) = 2$.

Note that for any binary quadratic form of discriminant $d$, we have that $d = b^2 - 4ac \equiv b^2 \bmod 2$, so $b$ must have the same parity as $d$.

Example: Which primes are represented by the reduced form $f$ found in our example above? By Theorem 7.2.2 we have that $n$ is properly represented by some binary quadratic form of discriminant $-7$ if and only if there exists a solution to the congruence $x^2 \equiv -7 \bmod 4|n|$. If $n > 0$, then $x^2 \equiv -7 \bmod 4n$ implies that $n$ is properly represented by $f$, since $f$ is the only positive definite reduced binary quadratic form of discriminant $-7$. Furthermore, if $n = p$ is prime, then every representation of $p$ is proper.

For $p = 2$, take $(x, y) = (0, 1)$ so that $f(x, y) = 2$. For odd $p$, we see that $f$ represents $p$ if and only if $x^2 \equiv -7 \bmod p$ has a solution, by the Chinese remainder theorem. If $p = 7$ this is clear; otherwise,
• If $p \equiv 1 \bmod 4$ then $\left(\frac{-7}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{7}{p}\right) = (1)\left(\frac{p}{7}\right) = \left(\frac{p}{7}\right)$.
• If $p \equiv 3 \bmod 4$ then $\left(\frac{-7}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{7}{p}\right) = (-1)\left(-\left(\frac{p}{7}\right)\right) = \left(\frac{p}{7}\right)$.
The quadratic residues modulo 7 are 1, 2, and 4; thus $p$ is represented by $f$ if and only if $p \equiv 0, 1, 2$ or $4 \bmod 7$.

Theorem 8.3.1 (Theorem 3.25, Niven) Let $f(X, Y) = aX^2 + bXY + cY^2$, $g(X, Y) = a'X^2 + b'XY + c'Y^2$ be reduced, positive definite binary quadratic forms. If $f \sim g$, then $f = g$.

Proof: Exercise.
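The reduction steps (A), (B), (C) of Theorem 8.2.1 and the search for reduced forms of a negative discriminant can be sketched in code. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names `is_reduced`, `reduce_form` and `reduced_forms` are hypothetical, forms are passed as coefficient triples (a, b, c), and the search bound uses the inequality $|a| \le \sqrt{|d|/3}$ from the proof of Theorem 8.2.2.

```python
from math import isqrt

def is_reduced(a, b, c):
    """The two conditions in the definition of a reduced form."""
    return (-abs(a) < b <= abs(a) < abs(c)) or (0 <= b <= abs(a) == abs(c))

def reduce_form(a, b, c):
    """Apply steps (A), (B), (C) until (a, b, c) is reduced.

    Assumes the discriminant b*b - 4*a*c is not a perfect square,
    so that the leading coefficient never becomes 0.
    """
    while not is_reduced(a, b, c):
        if abs(c) < abs(a):
            # Step (A): apply S, which swaps a and c and negates b.
            a, b, c = c, -b, a
        elif not (-abs(a) < b <= abs(a)):
            # Step (B): apply T_n with 2*a*n + b in (-|a|, |a|].
            A = abs(a)
            k = -((A - b) // (2 * A))    # unique k with b - 2*A*k in (-A, A]
            n = -k if a > 0 else k       # so that 2*a*n == -2*A*k
            a, b, c = a, 2 * a * n + b, a * n * n + b * n + c
        else:
            # Step (C): here |a| == |c| and b < 0; apply S.
            a, b, c = c, -b, a
    return a, b, c

def reduced_forms(d):
    """All reduced integer forms (a, b, c) of discriminant d < 0."""
    forms = []
    bound = isqrt(abs(d) // 3)           # |a| <= sqrt(|d| / 3)
    for a in range(-bound, bound + 1):
        if a == 0:
            continue
        for b in range(-abs(a) + 1, abs(a) + 1):
            numerator = b * b - d        # c = (b^2 - d) / (4a) must be an integer
            if numerator % (4 * a) == 0:
                c = numerator // (4 * a)
                if is_reduced(a, b, c):
                    forms.append((a, b, c))
    return forms

print(reduce_form(53, 152, 109))   # (1, 0, 1)
print(reduced_forms(-7))           # [(-1, 1, -2), (1, 1, 2)]
```

The first call retraces the worked reduction of $53X^2 + 152XY + 109Y^2$ step by step, and the second reproduces the two valid rows of the table for discriminant $-7$.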
By Theorem 8.3.1, if $d < 0$ then $H(d)$ equals the number of reduced binary quadratic forms of discriminant $d$, which is twice the number of such positive definite forms.

[Aside: there is also the notion of the class number of a number field; when $d < 0$, the class number of $\mathbb{Q}(\sqrt{-|d|})$ equals $\frac{1}{2} H(d)$.]

9 Week Nine

9.1 Lecture Twenty-Three

Recall: Theorem 8.3.1.

Can we "compose" two binary quadratic forms? We can generalize the multiplication formula
\[ (a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2. \]
Note that if $z = a + ib$, $w = c + id$ are complex numbers, then the above formula states exactly that $|z|^2 |w|^2 = |zw|^2$. Thus, the binary quadratic form $f(X, Y) = X^2 + Y^2$ has a "composition law" given by
\[ f(a, b) f(c, d) = f(ac - bd, ad + bc); \]
in particular, this implies that the set of numbers represented by $f$ is multiplicatively closed. Can we generalize this idea to arbitrary binary quadratic forms?

Example: Let $d = -7$. We saw last week that the single equivalence class of positive definite binary quadratic forms of discriminant $-7$ is represented by the reduced form $f(X, Y) = X^2 + XY + 2Y^2$. We factor over the complex numbers, using the quadratic formula:
\[ f(a, b) = \left( a + \frac{1 + i\sqrt{7}}{2}\, b \right) \left( a + \frac{1 - i\sqrt{7}}{2}\, b \right). \]
Thus we are led to compute
\[ \left( a + \frac{1 + i\sqrt{7}}{2}\, b \right) \left( c + \frac{1 + i\sqrt{7}}{2}\, d \right) = (ac - 2bd) + \frac{1 + i\sqrt{7}}{2} (ad + bc + bd), \]
which implies
\[ f(a, b) f(c, d) = f(ac - 2bd, ad + bc + bd), \]
and again we see that the set of represented values is multiplicatively closed.

Example: Suppose $d = -20$. In assignment 4, we verify that there are exactly two positive definite reduced binary quadratic forms of discriminant $-20$, namely
\[ f_+(X, Y) = X^2 + 5Y^2 \quad \text{and} \quad f_-(X, Y) = 2X^2 + 2XY + 3Y^2. \]
Observe that the set of values represented by $f_-$ is not multiplicatively closed, as indeed $f_-(1, 0) = 2$, $f_-(0, 1) = 3$, but $f_-(x, y) \ne 6$ for any $x, y \in \mathbb{Z}$.
Indeed, we have the identity $4a f_-(x, y) = (2ax + by)^2 - dy^2$, hence
\[ 8 f_-(x, y) = (4x + 2y)^2 + 20y^2 \iff 2 f_-(x, y) = (2x + y)^2 + 5y^2, \]
and thus $f_-(x, y) = 6$ implies that $(2x + y)^2 + 5y^2 = 12$, which is never satisfied, as can easily be verified by checking possible values of $x$ and $y$. In particular, this means that there is no multiplicative formula (or "composition law") for $f_-$ as there were for our previous examples.

Does such a formula exist for $f_+$? The identity
\[ (a + i\sqrt{5}\, b)(c + i\sqrt{5}\, d) = (ac - 5bd) + i\sqrt{5}\,(ad + bc) \]
implies
\[ f_+(a, b) f_+(c, d) = f_+(ac - 5bd, ad + bc). \]
We see that if we factor $f_-$ using the quadratic formula, we obtain
\[ f_-(a, b) = 2 \left( a + \frac{1 + i\sqrt{5}}{2}\, b \right) \left( a + \frac{1 - i\sqrt{5}}{2}\, b \right) = \left( \sqrt{2}\, a + \frac{1 + i\sqrt{5}}{\sqrt{2}}\, b \right) \left( \sqrt{2}\, a + \frac{1 - i\sqrt{5}}{\sqrt{2}}\, b \right). \]
Calculating as before, we obtain
\[ \left( \sqrt{2}\, a + \frac{1 + i\sqrt{5}}{\sqrt{2}}\, b \right) \left( \sqrt{2}\, c + \frac{1 + i\sqrt{5}}{\sqrt{2}}\, d \right) = (2ac + ad + bc - 2bd) + i\sqrt{5}\,(ad + bc + bd), \]
which implies
\[ f_-(a, b) f_-(c, d) = f_+(2ac + ad + bc - 2bd, ad + bc + bd). \]
What happens if we consider the product $f_+(a, b) f_-(c, d)$? The relevant calculation is
\[ (a + i\sqrt{5}\, b) \left( \sqrt{2}\, c + \frac{1 + i\sqrt{5}}{\sqrt{2}}\, d \right) = \sqrt{2}\,(ac - bc - 3bd) + \frac{1 + i\sqrt{5}}{\sqrt{2}}\,(ad + 2bc + bd), \]
hence
\[ f_+(a, b) f_-(c, d) = f_-(ac - bc - 3bd, ad + 2bc + bd). \]
Thus we have obtained the following "multiplication table":

          $f_+$    $f_-$
$f_+$     $f_+$    $f_-$
$f_-$     $f_-$    $f_+$

The entries are understood to mean, for example, that the product of two numbers represented by $f_+$ may also be represented by $f_+$. In fact, this relation holds on the level of equivalence classes; that is, if $f \sim f_+$, $g \sim f_-$, then $f(a, b) g(c, d) = h(x, y)$ for some $x, y$ that are bilinear expressions in $a, b, c, d$, and $h \sim f_-$.

In general, the set of equivalence classes of positive definite binary quadratic forms of a given negative discriminant is a group under the operation of "multiplication" alluded to above. This is known as the class group.

This ends our discussion of binary quadratic forms; next, we will discuss arithmetic functions; that is, complex-valued functions whose domain is $\mathbb{N}$.
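Composition identities of this kind are easy to get wrong by a sign or a missing term, so a brute-force check is a useful safeguard. The sketch below tests each identity on a small grid; the substitutions are written out explicitly, as obtained by expanding the complex products term by term. The function names and the grid size are assumptions made for illustration only.

```python
def f_plus(x, y):
    """f_+(X, Y) = X^2 + 5Y^2, discriminant -20."""
    return x * x + 5 * y * y

def f_minus(x, y):
    """f_-(X, Y) = 2X^2 + 2XY + 3Y^2, discriminant -20."""
    return 2 * x * x + 2 * x * y + 3 * y * y

def f7(x, y):
    """X^2 + XY + 2Y^2, the reduced positive definite form of discriminant -7."""
    return x * x + x * y + 2 * y * y

R = range(-4, 5)   # arbitrary small grid
quads = [(a, b, c, d) for a in R for b in R for c in R for d in R]

disc7_ok = all(
    f7(a, b) * f7(c, d) == f7(a*c - 2*b*d, a*d + b*c + b*d)
    for a, b, c, d in quads)
plus_plus_ok = all(
    f_plus(a, b) * f_plus(c, d) == f_plus(a*c - 5*b*d, a*d + b*c)
    for a, b, c, d in quads)
minus_minus_ok = all(
    f_minus(a, b) * f_minus(c, d)
    == f_plus(2*a*c + a*d + b*c - 2*b*d, a*d + b*c + b*d)
    for a, b, c, d in quads)
plus_minus_ok = all(
    f_plus(a, b) * f_minus(c, d)
    == f_minus(a*c - b*c - 3*b*d, a*d + 2*b*c + b*d)
    for a, b, c, d in quads)

# 2 f_-(x, y) = (2x + y)^2 + 5y^2 = 12 forces |y| <= 1 and |x| <= 2,
# so this finite grid already covers every candidate.
six_not_represented = all(
    f_minus(x, y) != 6 for x in range(-10, 11) for y in range(-10, 11))

print(disc7_ok, plus_plus_ok, minus_minus_ok, plus_minus_ok, six_not_represented)
```

A grid check is not a proof, but each identity is a polynomial identity of degree 4 in four variables, so agreement on a 9 × 9 × 9 × 9 grid leaves no room for a discrepancy.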
9.2 Lecture Twenty-Four

§4.2 – Arithmetic functions

Notation: Let $\tau(n)$ denote the number of positive divisors of $n$ (also used is the notation $d(n)$).

Lemma 9.2.1 Let $n$ have prime factorization $n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$. A positive integer $d$ divides $n$ if and only if $d = p_1^{s_1} p_2^{s_2} \cdots p_k^{s_k}$, with $0 \le s_j \le e_j$ for every $j$.

Proof: Clearly, with $n$ and $d$ as above we see that $n = d\,(p_1^{e_1 - s_1} p_2^{e_2 - s_2} \cdots p_k^{e_k - s_k})$. Conversely, if $d \mid n$ and $p \ne p_j$ is a prime with $p \mid d$, then $p \mid n$, contradicting the factorization of $n$. Finally, if $s_j > e_j$ and $p_j^{s_j} \mid d$, then $p_j \mid \frac{d}{p_j^{e_j}}$; but $p_j \nmid \frac{n}{p_j^{e_j}}$, hence $\frac{d}{p_j^{e_j}} \nmid \frac{n}{p_j^{e_j}}$, that is, $d \nmid n$, and we are done.

One consequence of this lemma is that if $n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$, then
\[ \tau(n) = \#\{(s_1, s_2, \ldots, s_k) : 0 \le s_j \le e_j\} = (1 + e_1)(1 + e_2) \cdots (1 + e_k), \]
or more succinctly written,
\[ \tau(n) = \prod_{p^{\alpha} \,\|\, n} (\alpha + 1). \]

Proposition 9.2.2 If $(m, n) = 1$, then $\tau(mn) = \tau(m)\tau(n)$.

This statement is false if $(m, n) > 1$; for example, $\tau(8) = 4 \ne 6 = \tau(2)\tau(4)$.

Proof: We give two sketches, left as exercises.
1. The assertion follows from the multiplicative formula found above.
2. When $(m, n) = 1$, divisors $d$ of $mn$ are in one-to-one correspondence with pairs of integers $(b, c)$ where $b \mid m$, $c \mid n$ and $bc = d$.

Definition: An arithmetic function $f : \mathbb{N} \to \mathbb{C}$ which is not identically zero is called multiplicative if, whenever $(m, n) = 1$, we have $f(mn) = f(m) f(n)$.

Proposition 9.2.2 shows that $\tau(n)$ is multiplicative, and from previous work we know that $\varphi(n)$ is also multiplicative. Indeed, we used this property to prove the formula
\[ \varphi(n) = n \prod_{p \mid n} \left( 1 - \frac{1}{p} \right). \]
A similar example is given by the function $\sigma_f(n) = \#\{x \bmod n : f(x) \equiv 0 \bmod n\}$, where $f(X) \in \mathbb{Z}[X]$. The Chinese remainder theorem tells us that $\sigma_f(n)$ is multiplicative, and indeed we observe that $\varphi(n) = \sigma_{X^{\varphi(n)} - 1}(n)$.

Properties of multiplicative functions: Suppose $f$ is a multiplicative function.
• For every $n$, we have the formula
\[ f(n) = \prod_{p^{\alpha} \,\|\, n} f(p^{\alpha}). \]
In particular, $f$ is determined by its values on prime powers.
Conversely, any set map $f : \{p^k : p \text{ prime},\ k \in \mathbb{N}\} \to \mathbb{C}$ induces a multiplicative function.
• $f(1) = 1$. Indeed, since there must be some $n$ with $f(n) \ne 0$, we have $f(n) = f(1 \cdot n) = f(1) f(n)$.

Definition: If an arithmetic function $f$, not identically zero, satisfies $f(mn) = f(m) f(n)$ for every pair of numbers $m, n$, then $f$ is said to be totally multiplicative (or completely multiplicative). Clearly, any totally multiplicative function is also multiplicative.

Example: For any $\lambda \in \mathbb{R}$, the function $f_\lambda(n) = n^{\lambda}$ is totally multiplicative. In particular, when $\lambda = 0$ we have $f_\lambda(n) = 1$ for all $n$, and for $\lambda = 1$ we have $f_\lambda(n) = \mathrm{id}(n) = n$ for every $n$.

Example: The iota function $\iota(n)$, defined by
\[ \iota(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n \ne 1, \end{cases} \]
is totally multiplicative.

Example: Let $f(n) = (-1)^{n-1}$, so that $f(n) = 1$ if $n$ is odd and $-1$ if $n$ is even. Then $f$ is not totally multiplicative, as for example $f(8) = -1 \ne 1 = f(2) f(4)$; however, $f(n)$ is multiplicative, and indeed $f$ is induced by the map
\[ f(p^{\alpha}) = \begin{cases} 1 & \text{if } p \text{ is odd}, \\ -1 & \text{if } p = 2. \end{cases} \]

Example: The function $f(n) = (-1)^n$ is not multiplicative (for instance, $f(1) = -1 \ne 1$), and so in particular is not totally multiplicative.

Theorem 9.2.3 (Theorem 4.4, Niven) Let $f(n)$ be a multiplicative function and let
\[ F(n) = \sum_{d \mid n} f(d). \]
Then $F(n)$ is also multiplicative.

Proof: As alluded to in the proof of Proposition 9.2.2, when $(m, n) = 1$ the divisors $d$ of $mn$ are in one-to-one correspondence with ordered pairs $(b, c)$, with $bc = d$, $b \mid m$, $c \mid n$. Thus, if $(m, n) = 1$, we have
\[ F(mn) = \sum_{d \mid mn} f(d) = \sum_{b \mid m} \sum_{c \mid n} f(bc) = \sum_{b \mid m} \sum_{c \mid n} f(b) f(c) = \Big( \sum_{b \mid m} f(b) \Big) \Big( \sum_{c \mid n} f(c) \Big) = F(m) F(n), \]
and we are done.

Example: Let $f(n) = n^0 = 1$. Then
\[ F(n) = \sum_{d \mid n} f(d) = \tau(n), \]
giving another proof of the fact that $\tau$ is multiplicative. Note that $f$ is totally multiplicative, while $F(n)$ is not.

9.3 Lecture Twenty-Five

Recall: Theorem 9.2.3.

Motivating questions:
• Is the converse of Theorem 9.2.3 true? That is, if $F(n) = \sum_{d \mid n} f(d)$ is multiplicative, must $f(n)$ also be multiplicative?
• Given $F(n)$, how can we get information about $f(n)$?

Remark: Given any arithmetic function $F$, there is exactly one function $f$ so that $F(n) = \sum_{d \mid n} f(d)$. Indeed, we set $f(1) = F(1)$ and recursively define the other values via
\[ f(n) = F(n) - \sum_{d \mid n,\ d < n} f(d). \]

Example: We find the function $f(n)$ satisfying
\[ \sum_{d \mid n} f(d) = \iota(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases} \]
Write $F = \iota$. We calculate the first couple of values: $f(1) = 1$, $f(2) = F(2) - f(1) = 0 - 1 = -1$. Clearly, for any prime $p$ we have
\[ f(p) = F(p) - f(1) = -1, \qquad f(p^2) = F(p^2) - f(p) - f(1) = 0, \]
and indeed $f(p^k) = 0$ for $k > 1$. For composite numbers of the form $pq$ where $p, q$ are distinct primes, we have
\[ f(pq) = F(pq) - f(p) - f(q) - f(1) = 0 - (-1) - (-1) - 1 = 1 = f(p) f(q), \]
while for $n = p^2 q$ we have
\[ f(p^2 q) = F(p^2 q) - f(p) - f(p^2) - f(q) - f(pq) - f(1) = 0 = f(p^2) f(q). \]
The above calculations suggest that $f$ is multiplicative, which motivates the following definition.

Definition: The Möbius function $\mu(n)$ is the multiplicative function satisfying, for every prime $p$,
\[ \mu(p^{\alpha}) = \begin{cases} -1 & \text{if } \alpha = 1, \\ 0 & \text{if } \alpha > 1. \end{cases} \]
Equivalently: if $n$ is not squarefree, then $\mu(n) = 0$. Otherwise, writing $n = p_1 p_2 \cdots p_k$ with $p_j$ distinct primes, one has $\mu(n) = (-1)^k$.

Notation: Denote by $\omega(n)$ the number of distinct prime divisors of $n$, and by $\Omega(n)$ the number of prime factors of $n$ counted with multiplicity. For example, with $n = 720 = 2^4 \cdot 3^2 \cdot 5$, we have $\omega(n) = 3$, $\Omega(n) = 4 + 2 + 1 = 7$. With this notation, we may define
\[ \mu(n) = \begin{cases} (-1)^{\omega(n)} & \text{if } n \text{ is squarefree}, \\ 0 & \text{otherwise}. \end{cases} \]

Theorem 9.3.1 (Theorem 4.7, Niven) One has
\[ \sum_{d \mid n} \mu(d) = \iota(n). \]
This theorem is much more widely invoked than is the definition of $\mu(n)$.

Proof: We give two proofs.
1. Both sides of the equation are multiplicative by Theorem 9.2.3, and we already know that both sides agree when $n$ is a prime power, from which we deduce the result.
2.
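By definition,
\[ \sum_{d \mid n} \mu(d) = \sum_{d \mid n,\ d \text{ squarefree}} (-1)^{\omega(d)}, \]
and if $\omega(n) = k$ then there are exactly $\binom{k}{j}$ squarefree divisors $d$ of $n$ with $\omega(d) = j$. Thus
\[ \sum_{d \mid n} \mu(d) = \sum_{j=0}^{k} \binom{k}{j} (-1)^j = (1 - 1)^k = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1, \end{cases} \]
and we are done.

Theorem 9.3.2 (Theorem 4.8, Niven; the Möbius inversion formula) Let $f(n)$ be an arithmetic function and let $F(n) = \sum_{d \mid n} f(d)$. Then
\[ f(n) = \sum_{d \mid n} \mu(d) F\!\left( \frac{n}{d} \right). \]
For example, for any arithmetic function $f(n)$, we have $f(12) = F(12) - F(6) - F(4) + F(2)$.

Proof: The right-hand side of the equation is
\[ \sum_{d \mid n} \mu(d) F\!\left( \frac{n}{d} \right) = \sum_{d \mid n} \mu(d) \sum_{\delta \mid \frac{n}{d}} f(\delta) = \sum_{d\delta \mid n} \mu(d) f(\delta) = \sum_{\delta \mid n} f(\delta) \sum_{d \mid \frac{n}{\delta}} \mu(d) = \sum_{\delta \mid n} f(\delta)\, \iota\!\left( \frac{n}{\delta} \right) = f(n), \]
where we have used the result of Theorem 9.3.1, and the result follows.

10 Week Ten

10.1 Lecture Twenty-Six

Recall: The Möbius inversion formula.

Example: We have proven the identity
\[ n = \mathrm{id}(n) = \sum_{d \mid n} \varphi(d), \]
and so Möbius inversion implies that
\[ \varphi(n) = \sum_{d \mid n} \mu(d)\, \mathrm{id}\!\left( \frac{n}{d} \right) = \sum_{d \mid n} \mu(d) \frac{n}{d}; \]
that is,
\[ \frac{\varphi(n)}{n} = \sum_{d \mid n} \frac{\mu(d)}{d}. \]
Note that $\frac{\mu(d)}{d}$ is multiplicative, thus by Theorem 9.2.3 we know that $\frac{\varphi(n)}{n}$ is multiplicative. Indeed, checking on prime powers, we see for $\alpha \ge 1$ that
\[ \frac{\varphi(p^{\alpha})}{p^{\alpha}} = \frac{p^{\alpha - 1}(p - 1)}{p^{\alpha}} = \frac{p - 1}{p} = 1 - \frac{1}{p}, \]
and similarly
\[ \sum_{d \mid p^{\alpha}} \frac{\mu(d)}{d} = \frac{\mu(1)}{1} + \frac{\mu(p)}{p} + \frac{\mu(p^2)}{p^2} + \cdots + \frac{\mu(p^{\alpha})}{p^{\alpha}} = 1 + \frac{(-1)}{p} + 0 + \cdots + 0 = 1 - \frac{1}{p}. \]

Theorem 10.1.1 (Theorem 4.9, Niven) Let $F(n)$ be an arithmetic function and define
\[ f(n) = \sum_{d \mid n} \mu(d) F\!\left( \frac{n}{d} \right). \]
Then
\[ F(n) = \sum_{d \mid n} f(d). \]

Proof: We have
\[ \sum_{d \mid n} f(d) = \sum_{d \mid n} \sum_{\delta \mid d} \mu(\delta) F\!\left( \frac{d}{\delta} \right). \]
With $d$ fixed, as $\delta$ ranges over the divisors of $d$, so does $\frac{d}{\delta}$. Thus
\[ \sum_{d \mid n} f(d) = \sum_{d \mid n} \sum_{\delta \mid d} \mu\!\left( \frac{d}{\delta} \right) F(\delta) = \sum_{\delta \mid n} \sum_{\delta \mid d \mid n} \mu\!\left( \frac{d}{\delta} \right) F(\delta). \]
Writing $d = \delta \cdot \frac{d}{\delta}$, we have
\[ \sum_{d \mid n} f(d) = \sum_{\delta \mid n} F(\delta) \sum_{\frac{d}{\delta} \mid \frac{n}{\delta}} \mu\!\left( \frac{d}{\delta} \right) = \sum_{\delta \mid n} F(\delta)\, \iota\!\left( \frac{n}{\delta} \right) = F(n), \]
and we are done.

Definition: Let $f(n)$, $g(n)$ be two arithmetic functions. Their Dirichlet convolution, denoted $f * g$, is defined by
\[ (f * g)(n) = \sum_{d \mid n} f(d)\, g\!\left( \frac{n}{d} \right). \]
Note that Dirichlet convolution is commutative, as
\[ (g * f)(n) = \sum_{d \mid n} g(d)\, f\!\left( \frac{n}{d} \right) = \sum_{d \mid n} g\!\left( \frac{n}{d} \right) f(d) = (f * g)(n). \]

Example: If $g(n) = 1$ for every $n$, then
\[ (f * g)(n) = \sum_{d \mid n} f(d). \]
(The function $g$ is sometimes written $\mathbf{1}$.) In particular, this means that $\mathrm{id} = \varphi * \mathbf{1}$, $\iota = \mu * \mathbf{1}$, and $\tau = \mathbf{1} * \mathbf{1}$. With this notation, we may restate the Möbius inversion formula as: $F = f * \mathbf{1}$ if and only if $f = F * \mu$.

Theorem 10.1.2 If $f$ and $g$ are multiplicative functions, then $f * g$ is multiplicative.

Note that this theorem is a generalization of Theorem 9.2.3.

Proof: If $(m, n) = 1$, then
\[ (f * g)(mn) = \sum_{d \mid mn} f(d)\, g\!\left( \frac{mn}{d} \right). \]
For each divisor $d$ of $mn$, we may uniquely factor $d = d_1 d_2$ with $d_1 \mid m$ and $d_2 \mid n$. Thus
\[ (f * g)(mn) = \sum_{d_1 \mid m} \sum_{d_2 \mid n} f(d_1 d_2)\, g\!\left( \frac{mn}{d_1 d_2} \right) = \sum_{d_1 \mid m} \sum_{d_2 \mid n} f(d_1)\, g\!\left( \frac{m}{d_1} \right) f(d_2)\, g\!\left( \frac{n}{d_2} \right) \]
\[ = \Big( \sum_{d_1 \mid m} f(d_1)\, g\!\left( \frac{m}{d_1} \right) \Big) \Big( \sum_{d_2 \mid n} f(d_2)\, g\!\left( \frac{n}{d_2} \right) \Big) = (f * g)(m)\,(f * g)(n), \]
as claimed.

[Structural remarks: Let $A = \{f : \mathbb{N} \to \mathbb{C}\}$ be the set of arithmetic functions and let $A^{\times} = \{f \in A : f(1) \ne 0\}$; then $(A^{\times}, *)$ forms an abelian group. In this group, $\iota$ is the identity and $\mathbf{1}^{-1} = \mu$, which yields yet another statement of the Möbius inversion formula:
\[ F = f * \mathbf{1} \iff \mu * F = \mu * (f * \mathbf{1}) = f * (\mu * \mathbf{1}) = f * \iota = f. \]
Moreover, by Theorem 10.1.2, the set of multiplicative functions forms a subgroup.]

Example: Let
\[ s(n) = \begin{cases} 1 & \text{if } n \text{ is a perfect square}, \\ 0 & \text{otherwise}; \end{cases} \]
we will identify $s * (\mu^2)$. Note that $s$ is multiplicative, and is characterized by
\[ s(p^{\alpha}) = \begin{cases} 1 & \text{if } 2 \mid \alpha, \\ 0 & \text{if } 2 \nmid \alpha. \end{cases} \]
Moreover, $\mu^2$ is multiplicative, as the product of two multiplicative functions; hence $f = s * (\mu^2)$ is also multiplicative. We compute:
\[ f(p^{\alpha}) = \sum_{d \mid p^{\alpha}} s\!\left( \frac{p^{\alpha}}{d} \right) \mu^2(d) = s(p^{\alpha})\mu^2(1) + s(p^{\alpha - 1})\mu^2(p) + \cdots + s(1)\mu^2(p^{\alpha}) = s(p^{\alpha}) + s(p^{\alpha - 1}) = 1. \]
So $f(p^{\alpha}) = 1$ for every $\alpha \ge 1$, and it follows that $s * (\mu^2) = \mathbf{1}$. Note that $\mu^2$ is the characteristic function of squarefree numbers, and indeed we see
\[ (s * \mu^2)(n) = \sum_{ab = n} s(a)\mu^2(b) = \#\{a, b \in \mathbb{N} : ab = n,\ a = s^2 \text{ for some } s,\ b \text{ squarefree}\} = 1. \]
Thus there is a unique way to factor any $n \in \mathbb{N}$ as $n = n_0 s^2$ where $n_0$ is squarefree.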
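The convolution identities collected in this lecture can be checked numerically for small $n$. The sketch below is a naive illustration only: the helper names (`divisors`, `mobius`, `phi`, `convolve`, and so on) are hypothetical, the divisor search is deliberately brute force, and the range of $n$ tested is arbitrary.

```python
from math import gcd, isqrt

def divisors(n):
    """Positive divisors of n, by brute force."""
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius(n):
    """Moebius function via trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:        # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                     # one remaining prime factor
        result = -result
    return result

def phi(n):
    """Euler's totient, counted directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def convolve(f, g):
    """Dirichlet convolution (f * g)(n) = sum over d | n of f(d) g(n / d)."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))

one = lambda n: 1
iota = lambda n: 1 if n == 1 else 0
ident = lambda n: n
tau = lambda n: len(divisors(n))
s = lambda n: 1 if isqrt(n) ** 2 == n else 0     # indicator of perfect squares
mu_squared = lambda n: mobius(n) ** 2            # indicator of squarefree numbers

N = range(1, 201)   # arbitrary test range
checks = {
    "mu * 1 = iota": all(convolve(mobius, one)(n) == iota(n) for n in N),
    "phi * 1 = id": all(convolve(phi, one)(n) == ident(n) for n in N),
    "1 * 1 = tau": all(convolve(one, one)(n) == tau(n) for n in N),
    "id * mu = phi": all(convolve(ident, mobius)(n) == phi(n) for n in N),
    "s * mu^2 = 1": all(convolve(s, mu_squared)(n) == 1 for n in N),
}
print(checks)
```

Each entry of `checks` should come out `True`; the fourth one is the Möbius inversion of the second.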
For example, the factorization into a squarefree part times a square, applied to $n = 2 \cdot 3^2 \cdot 5^3 \cdot 7^4$, gives $n = (2 \cdot 5)(3 \cdot 5 \cdot 7^2)^2$.

10.2 Lecture Twenty-Seven

Properties of Möbius inversion:
• We do not assume multiplicativity of the functions; that is, the inversion formula holds for any arithmetic functions.
• If $F(n) = \sum_{d \mid n} f(d)$ and $F(n)$ is multiplicative, then so is $f(n)$, as $f = F * \mu$.

Recall: Dirichlet convolution. When $n = p^{\alpha}$ is a prime power, then
\[ (f * g)(p^{\alpha}) = \sum_{d \mid p^{\alpha}} f(d)\, g\!\left( \frac{p^{\alpha}}{d} \right) = f(1) g(p^{\alpha}) + f(p) g(p^{\alpha - 1}) + \cdots + f(p^{\alpha}) g(1). \]
Let us assign names to these values, so that $f(1) = a_0$, $f(p) = a_1$, $f(p^2) = a_2, \ldots$, and similarly $g(1) = b_0$, $g(p) = b_1$, $g(p^2) = b_2, \ldots$ We obtain the following table:

α     f(p^α)    g(p^α)    (f * g)(p^α)
0     a_0       b_0       a_0 b_0
1     a_1       b_1       a_0 b_1 + a_1 b_0
2     a_2       b_2       a_0 b_2 + a_1 b_1 + a_2 b_0
3     a_3       b_3       a_0 b_3 + a_1 b_2 + a_2 b_1 + a_3 b_0

We observe the similarity with the coefficients of the product of power series:
\[ \Big( \sum_{\alpha = 0}^{\infty} f(p^{\alpha}) X^{\alpha} \Big) \Big( \sum_{\alpha = 0}^{\infty} g(p^{\alpha}) X^{\alpha} \Big) = \sum_{\alpha = 0}^{\infty} (f * g)(p^{\alpha}) X^{\alpha}. \]

Example: Find an arithmetic function $f$ such that
\[ \frac{\varphi(n)}{n} = \sum_{d \mid n} f(d), \]
forgetting that we found it in the previous lecture. Let $F(n) = \frac{\varphi(n)}{n}$, so that $F = f * \mathbf{1}$. By Möbius inversion we know that $f = F * \mu$ and that $f$ is multiplicative, since $F$ is. Thus we have a table as before:

α     F(p^α)      μ(p^α)    f(p^α)
0     1           1         1
1     1 - 1/p     -1        -1/p
2     1 - 1/p     0         0
3     1 - 1/p     0         0

We see that $f$ is the multiplicative function generated by
\[ f(p^{\alpha}) = \begin{cases} -\frac{1}{p} & \text{if } \alpha = 1, \\ 0 & \text{if } \alpha > 1. \end{cases} \]
That is, $f(n) = \frac{\mu(n)}{n}$, as before.

Example: Define a multiplicative function $r$ via
\[ r(p^{\alpha}) = \begin{cases} 2 & \text{if } p \equiv 1 \bmod 4, \\ 0 & \text{if } p \equiv 3 \bmod 4, \\ 1 & \text{if } p = 2 \text{ and } \alpha = 1, \\ 0 & \text{if } p = 2 \text{ and } \alpha > 1. \end{cases} \]
Now, define $R = r * s$, where $s$ is the indicator function of the perfect squares from lecture twenty-six; note that $R$ is multiplicative. Determine the values of $R(p^{\alpha})$.

[Aside: Theorem 3.2.2 of Niven tells us that the number of proper representations of $n$ by the binary quadratic form $X^2 + Y^2$ equals $4r(n)$.
In the statement of theorem 6.3.3 originally given, there was an error, in that we forgot the necessary condition that $4 \nmid n$. Note also that any representation $x^2 + y^2 = n$ corresponds to a proper representation $\left(\frac{x}{d}\right)^2 + \left(\frac{y}{d}\right)^2 = \frac{n}{d^2}$, where $d = (x, y)$. Thus if $S_n$ denotes the set of representations of $n$ by $X^2 + Y^2$, and $S_n^p \subset S_n$ denotes the subset of proper representations, then
\[ \#S_n = \sum_{g^2 \mid n} \#S^p_{n/g^2} = \sum_{g^2 \mid n} 4r\!\left(\frac{n}{g^2}\right) = 4 \sum_{d \mid n} r\!\left(\frac{n}{d}\right) s(d) = 4(r * s)(n) = 4R(n). \]
Note in particular that Niven's functions $R$ and $r$ correspond to our $4R$ and $4r$, respectively.]

First, we assume that $p \equiv 1 \bmod 4$. We get the table

  α    r(p^α)   s(p^α)   R(p^α)
  0    1        1        1
  1    2        0        2
  2    2        1        3
  3    2        0        4
  4    2        1        5
  5    2        0        6

In fact, we can prove that $R(p^\alpha) = \alpha + 1$ for any $p \equiv 1 \bmod 4$: if $\alpha$ is even then $s(p^{\alpha - j}) = 1$ exactly when $j$ is even, so
\[ R(p^\alpha) = \sum_{j=0}^{\alpha} r(p^j) s(p^{\alpha - j}) = r(1)s(p^\alpha) + \sum_{\substack{1 \le j \le \alpha \\ j \text{ even}}} r(p^j) = 1 + \sum_{\substack{1 \le j \le \alpha \\ j \text{ even}}} 2 = 1 + 2 \cdot \frac{\alpha}{2} = \alpha + 1. \]
A similar proof works for $\alpha$ odd, and is left as an exercise.

Now, suppose $p \equiv 3 \bmod 4$; we obtain

  α    r(p^α)   s(p^α)   R(p^α)
  0    1        1        1
  1    0        0        0
  2    0        1        1
  3    0        0        0
  4    0        1        1
  5    0        0        0

On these prime powers, $r$ acts like the identity for Dirichlet convolution (the indicator function of $n = 1$), so the restriction of $r * s$ to the powers of primes congruent to 3 modulo 4 is simply $s$.

Finally, suppose $p = 2$; the table this time is

  α    r(p^α)   s(p^α)   R(p^α)
  0    1        1        1
  1    1        0        1
  2    0        1        1
  3    0        0        1
  4    0        1        1
  5    0        0        1

On these prime powers, $r$ acts like $\mu^2$, so $R$ acts like $\mu^2 * s = 1$. Thus we conclude that $R$ is the multiplicative function generated by
\[ R(p^\alpha) = \begin{cases} \alpha + 1 & \text{if } p \equiv 1 \bmod 4, \\ 1 & \text{if } p \equiv 3 \bmod 4 \text{ and } \alpha \text{ is even}, \\ 0 & \text{if } p \equiv 3 \bmod 4 \text{ and } \alpha \text{ is odd}, \\ 1 & \text{if } p = 2. \end{cases} \]
One consequence of this fact is that either $R(n) = 0$, or $R(n) = \#\{d : d \mid n \text{ and } p \mid d \Rightarrow p \equiv 1 \bmod 4\}$.

10.3 Lecture Twenty-Eight

Example: Let $R(n)$ be the multiplicative function from the last lecture, generated by
\[ R(p^\alpha) = \begin{cases} \alpha + 1 & \text{if } p \equiv 1 \bmod 4, \\ 1 & \text{if } p \equiv 3 \bmod 4 \text{ and } \alpha \text{ is even}, \\ 0 & \text{if } p \equiv 3 \bmod 4 \text{ and } \alpha \text{ is odd}, \\ 1 & \text{if } p = 2. \end{cases} \]
Find a function $g$ such that
\[ R(n) = \sum_{d \mid n} g(d). \]
nb. We defined
\[ R(n) = \sum_{g^2 \mid n} r\!\left(\frac{n}{g^2}\right) = \sum_{d \mid n} r\!\left(\frac{n}{d}\right) s(d). \]
Note that, since $R = g * 1$, the Möbius inversion formula implies that $g = R * \mu$, and since $R$ and $\mu$ are both multiplicative, we know that $g$ is as well. We observe that
\[ g(p^\alpha) = \sum_{d \mid p^\alpha} R\!\left(\frac{p^\alpha}{d}\right) \mu(d) = R(p^\alpha)\mu(1) + R(p^{\alpha-1})\mu(p) + \cdots + R(1)\mu(p^\alpha) = R(p^\alpha) - R(p^{\alpha-1}). \]
Thus:
• If $p \equiv 1 \bmod 4$ then $g(p^\alpha) = (\alpha + 1) - \alpha = 1$.
• If $p \equiv 3 \bmod 4$ then $g(p^\alpha) = 1 - 0 = 1$ if $\alpha$ is even, and $g(p^\alpha) = 0 - 1 = -1$ if $\alpha$ is odd.
• If $p = 2$ then $g(p^\alpha) = 1 - 1 = 0$.

Remarks:
• Since $g(p^\alpha) = g(p)^\alpha$ for every prime $p$ and positive integer $\alpha$, it follows that $g$ is totally multiplicative.
• On odd primes, $g(p)$ equals the Legendre symbol $\left(\frac{-1}{p}\right)$, and hence on odd $n$, $g(n)$ equals the Jacobi symbol $\left(\frac{-1}{n}\right)$. Thus, for odd $n$, $g(n) = (-1)^{(n-1)/2}$.

Consequently,
\[ R(n) = \sum_{d \mid n} g(d) = \#\{d \mid n : d \equiv 1 \bmod 4\} - \#\{d \mid n : d \equiv 3 \bmod 4\}. \]

Some miscellany: Recall that $\sigma(n) = \sum_{d \mid n} d = (1 * \mathrm{id})(n)$. The Greeks defined a perfect number to be a number $n$ whose proper divisors sum to $n$ itself; that is, a number satisfying $n = \sigma(n) - n \Leftrightarrow \sigma(n) = 2n$. For example, 6 is perfect, as $6 = 1 + 2 + 3$, as is $28 = 1 + 2 + 4 + 7 + 14$. The next perfect number is 496, then 8128. Note that $\sigma(n)$ is multiplicative, and that
\[ \sigma(p^\alpha) = 1 + p + p^2 + \cdots + p^\alpha = \frac{p^{\alpha+1} - 1}{p - 1}. \]
We see equivalently that $n$ is a perfect number if and only if
\[ 2 = \frac{\sigma(n)}{n} = \prod_{p^\alpha \| n} \frac{p^{\alpha+1} - 1}{p^\alpha (p - 1)}. \]
Let us factor the first three perfect numbers:
\[ 6 = 2 \cdot 3 = 2^1(2^2 - 1), \quad 28 = 2^2 \cdot 7 = 2^2(2^3 - 1), \quad 496 = 2^4 \cdot 31 = 2^4(2^5 - 1). \]
This motivates our next result.

Theorem 10.3.1 If $q = 2^p - 1$ is prime, then $n = 2^{p-1} q$ is a perfect number.

Recall from a homework problem that if $2^k - 1$ is prime, then $k$ must be prime, although this is not a sufficient condition, as e.g. $2^{11} - 1 = 2047 = 23 \cdot 89$.

Proof : We give two.
(1) By multiplicativity,
\[ \sigma(2^{p-1} q) = \sigma(2^{p-1})\sigma(q) = (2^p - 1)(q + 1) = 2^p(2^p - 1) = 2(2^{p-1})(2^p - 1) = 2(2^{p-1} q), \]
and we are done.
(2) We simply verify that the divisors of $2^{p-1} q$, namely $1, 2, 2^2, \ldots, 2^{p-1}, q, 2q, 2^2 q, \ldots$
, $2^{p-1} q$, sum to $2(2^{p-1} q)$.

We know exactly 48 numbers of this form, and note that all such numbers by construction are even. The following theorem gives the converse statement.

Theorem 10.3.2 If $n$ is an even perfect number, then $n = 2^{p-1}(2^p - 1)$, where both $p$ and $2^p - 1$ are prime.

Proof : Write $n = 2^{k-1} m$ where $k \ge 2$ and $m$ is odd. If $n$ is perfect, then
\[ 2^k m = 2n = \sigma(n) = \sigma(2^{k-1})\sigma(m) = (2^k - 1)\sigma(m). \]
Hence $(2^k - 1) \mid 2^k m$, so by Euclid's lemma we have that $(2^k - 1) \mid m$. Writing $m = (2^k - 1)l$, we have $2^k l = \sigma(m)$; but $l$ and $m$ are both divisors of $m$, so $\sigma(m) \ge m + l = (2^k - 1)l + l = 2^k l$. Thus we have the equality
\[ \sigma(m) = \frac{2^k m}{2^k - 1} = 2^k l = (2^k - 1)l + l = m + l, \]
so $m$ has exactly two divisors $m$ and $l$, which are distinct because $k \ge 2$, and we must have $l = 1$. It follows that $m = 2^k - 1$ is prime.

Some open conjectures:
1. There are infinitely many Mersenne primes (that is, primes of the form $2^p - 1$ with $p$ prime), and hence infinitely many even perfect numbers.
2. There are no odd perfect numbers.

11 Week Eleven

11.1 Lecture Twenty-Nine

Diophantine approximation is the technique of finding rational numbers near given real numbers. One fundamental fact of Diophantine approximation that we will use frequently is that, if $n \in \mathbb{Z}$ and $n \ne 0$, then $|n| \ge 1$.

Example: Define
\[ e = \sum_{n=0}^{\infty} \frac{1}{n!}; \]
we will prove that $e$ is irrational. Indeed, assume not, and choose $a, b \in \mathbb{Z}$, $b > 0$, such that $e = \frac{a}{b}$. Then $be \in \mathbb{Z}$ and so in particular $b!\,e \in \mathbb{Z}$. Thus we define
\[ m = b!\,e - \sum_{n=0}^{b} \frac{b!}{n!} = b! \sum_{n=b+1}^{\infty} \frac{1}{n!} \in \mathbb{Z}. \]
Clearly $m > 0$, and moreover in the last sum we see that every term is at most half the previous term, thus
\[ m = b! \sum_{n=b+1}^{\infty} \frac{1}{n!} < b! \cdot \frac{1}{(b+1)!} \sum_{n=b+1}^{\infty} \frac{1}{2^{\,n-(b+1)}} = \frac{2\, b!}{(b+1)!} = \frac{2}{b+1} \le 1. \]
That is, $m \in \mathbb{Z}$ and $0 < m < 1$, which is a contradiction. Thus $e \notin \mathbb{Q}$.

Lemma 11.1.1 If $\frac{a}{b}$, $\frac{c}{d}$ are distinct rational numbers, then $\left| \frac{a}{b} - \frac{c}{d} \right| \ge \frac{1}{|bd|}$.

Proof : This follows from the basic rules of arithmetic, since $ad - bc$ is a nonzero integer:
\[ \left| \frac{a}{b} - \frac{c}{d} \right| = \left| \frac{ad - bc}{bd} \right| \ge \frac{1}{|bd|}. \]
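As a small sanity check of Lemma 11.1.1, the following Python sketch (not part of the lecture; the function name `min_scaled_gap` is made up for illustration) computes the smallest value of $|a/b - c/d| \cdot bd$ over all pairs of distinct reduced fractions in $[0, 1]$ with denominators up to a chosen bound. The lemma predicts that this minimum is never below 1.

```python
from fractions import Fraction
from itertools import combinations


def min_scaled_gap(max_den):
    """Smallest |a/b - c/d| * b * d over distinct reduced fractions in [0, 1]
    whose denominators are at most max_den (hypothetical helper)."""
    fracs = sorted({Fraction(a, b)
                    for b in range(1, max_den + 1)
                    for a in range(b + 1)})
    return min(abs(p - q) * p.denominator * q.denominator
               for p, q in combinations(fracs, 2))


# The lemma says the scaled gap is at least 1; the pair 0/1, 1/1 attains it.
print(min_scaled_gap(8))
```

Exact rational arithmetic is used so that the comparison with 1 is not affected by rounding.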
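Theorem 11.1.2 (Theorem 6.8, Niven; Dirichlet's theorem on Diophantine approximation) Let $x \in \mathbb{R}$, $n \in \mathbb{N}$. Then there exists $\frac{a}{b} \in \mathbb{Q}$ with $1 \le b \le n$ and $\left| x - \frac{a}{b} \right| \le \frac{1}{b(n+1)}$.

nb. It is slightly easier to prove the bound $\left| x - \frac{a}{b} \right| < \frac{1}{bn}$ or $\frac{1}{b(n-1)}$, but the inequality in the theorem statement is the best possible result; indeed, we attain equality with $x = \frac{c}{n+1}$, $(c, n+1) = 1$.

Proof : Define the fractional part of $y$ to be $\{y\} = y - \lfloor y \rfloor \in [0, 1)$. Consider the $n$ real numbers $\{x\}, \{2x\}, \ldots, \{nx\}$ and the $n + 1$ subintervals
\[ \left[0, \tfrac{1}{n+1}\right), \left[\tfrac{1}{n+1}, \tfrac{2}{n+1}\right), \ldots, \left[\tfrac{n}{n+1}, 1\right), \]
whose disjoint union is $[0, 1)$. If some $\{jx\} \in \left[0, \frac{1}{n+1}\right)$, then let $\frac{a}{b} = \frac{\lfloor jx \rfloor}{j}$; we have
\[ \left| x - \frac{a}{b} \right| = \left| \frac{jx}{j} - \frac{\lfloor jx \rfloor}{j} \right| = \frac{\{jx\}}{j} < \frac{1}{j(n+1)} = \frac{1}{b(n+1)}. \]
Similarly, if some $\{jx\} \in \left[\frac{n}{n+1}, 1\right)$ then we may take $\frac{a}{b} = \frac{\lfloor jx \rfloor + 1}{j}$, and we have
\[ \left| \frac{a}{b} - x \right| = \frac{\lfloor jx \rfloor + 1}{j} - \frac{jx}{j} = \frac{1 - \{jx\}}{j} \le \frac{\frac{1}{n+1}}{j} = \frac{1}{b(n+1)}. \]
Finally, if neither of these cases occurs, then the $n$ numbers lie in the remaining $n - 1$ subintervals, so by the pigeonhole principle there exists some subinterval containing $\{jx\}$ and $\{kx\}$ with $j < k$ (say), so that $|\{jx\} - \{kx\}| < \frac{1}{n+1}$. Then, with $a = \lfloor kx \rfloor - \lfloor jx \rfloor$, $b = k - j$, we have
\[ \left| x - \frac{a}{b} \right| = \left| \frac{(k - j)x}{b} - \frac{\lfloor kx \rfloor - \lfloor jx \rfloor}{b} \right| = \frac{|\{kx\} - \{jx\}|}{b} < \frac{\frac{1}{n+1}}{b}, \]
and we are done.

Corollary 1: If $x \in \mathbb{R} \setminus \mathbb{Q}$, then there exist infinitely many $\frac{a}{b} \in \mathbb{Q}$ such that $\left| x - \frac{a}{b} \right| < \frac{1}{b^2}$.

Proof : Theorem 11.1.2 gives, for every $n \in \mathbb{N}$, a rational number $\frac{a_n}{b_n}$ with $1 \le b_n \le n$ and
\[ 0 < \left| x - \frac{a_n}{b_n} \right| \le \frac{1}{b_n(n+1)} < \frac{1}{b_n^2}. \]
Since $x \notin \mathbb{Q}$, we know that $\left| x - \frac{a_n}{b_n} \right| \ne 0$, so any given $\frac{a}{b}$ can equal only finitely many of the terms $\frac{a_n}{b_n}$, since
\[ \lim_{n \to \infty} \left| x - \frac{a_n}{b_n} \right| = 0. \]
Hence infinitely many distinct fractions occur among the $\frac{a_n}{b_n}$.

We may generalize lemma 11.1.1 as follows:

Lemma 11.1.3 Let $p(X) \in \mathbb{Z}[X]$ have degree $d$ and let $\frac{a}{b} \in \mathbb{Q}$ with $b > 0$. If $p\!\left(\frac{a}{b}\right) \ne 0$, then $\left| p\!\left(\frac{a}{b}\right) \right| \ge \frac{1}{b^d}$.

Proof : If $p(X) = c_d X^d + c_{d-1} X^{d-1} + \cdots + c_1 X + c_0$, where $c_i \in \mathbb{Z}$, $c_d \ne 0$, then
\[ b^d\, p\!\left(\frac{a}{b}\right) = c_d a^d + c_{d-1} a^{d-1} b + \cdots + c_1 a b^{d-1} + c_0 b^d \in \mathbb{Z}. \]
Hence if $p\!\left(\frac{a}{b}\right) \ne 0$, then $\left| b^d\, p\!\left(\frac{a}{b}\right) \right| \ge 1$, and the result is immediate.

Definition: Let $\alpha \in \mathbb{R}$.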
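We say that $\alpha$ is algebraic of degree $d$ if there exists an irreducible polynomial $p(X) \in \mathbb{Z}[X]$ of degree $d$ such that $p(\alpha) = 0$. If $\alpha$ is not algebraic, then $\alpha$ is said to be transcendental.

For example, $\sqrt{2}$ is algebraic of degree 2, as it is a root of $X^2 - 2$. Furthermore, $\alpha$ is algebraic of degree 1 if and only if $\alpha \in \mathbb{Q}$.

Theorem 11.1.4 (Liouville's theorem on Diophantine approximation) Let $\alpha$ be algebraic of degree $d$. Then there exists some constant $C = C(\alpha) > 0$ such that, for any $\frac{a}{b} \in \mathbb{Q}$, $\frac{a}{b} \ne \alpha$, we have
\[ \left| \alpha - \frac{a}{b} \right| \ge \frac{C(\alpha)}{b^d}. \]

Proof : By taking $C(\alpha) \le 1$ we may assume that $\frac{a}{b}$ satisfies $\left| \alpha - \frac{a}{b} \right| \le 1$, since otherwise the inequality is trivial. Choose $p(X) \in \mathbb{Z}[X]$ to be irreducible of degree $d$ and such that $p(\alpha) = 0$. Then we must have $p\!\left(\frac{a}{b}\right) \ne 0$, and so by lemma 11.1.3 that $\left| p\!\left(\frac{a}{b}\right) \right| \ge \frac{1}{b^d}$. But
\[ p\!\left(\frac{a}{b}\right) = p\!\left(\frac{a}{b}\right) - p(\alpha) = \left( \frac{a}{b} - \alpha \right) p'(t), \]
for some $t$ between $\alpha$ and $\frac{a}{b}$, by the mean value theorem. Thus, taking
\[ C(\alpha) = \frac{1}{\max\{|p'(t)| : t \in [\alpha - 1, \alpha + 1]\}} \]
(replaced by 1 if this quantity exceeds 1), we obtain
\[ \frac{1}{b^d} \le \left| p\!\left(\frac{a}{b}\right) \right| = \left| \frac{a}{b} - \alpha \right| |p'(t)| \le \left| \frac{a}{b} - \alpha \right| \cdot \frac{1}{C(\alpha)}, \]
and we are done.

It was using this theorem that Liouville first demonstrated (1844) the existence of transcendental numbers. This work preceded by several decades Cantor's investigation of uncountable sets, which yields a simpler albeit non-constructive proof of the existence of transcendental numbers.

11.2 Lecture Thirty

Recall: Theorem 11.1.4. It is an easy consequence of this theorem that the number
\[ \alpha = \sum_{n=1}^{\infty} 10^{-n!} = 0.11000100\ldots \]
is transcendental. Indeed, define
\[ \frac{a_k}{b_k} = \sum_{n=1}^{k} 10^{-n!}, \]
so that $b_k = 10^{k!}$ and thus
\[ \alpha - \frac{a_k}{b_k} = \sum_{n=k+1}^{\infty} 10^{-n!}. \]
We note that each summand is at most half the previous one, thus
\[ \alpha - \frac{a_k}{b_k} = \sum_{n=k+1}^{\infty} 10^{-n!} \le 10^{-(k+1)!} \sum_{n=k+1}^{\infty} \frac{1}{2^{\,n-(k+1)}} = \frac{2}{10^{(k+1)!}} = \frac{2}{b_k^{k+1}}. \]
If $\alpha$ were algebraic of degree $d$, then for some constant $C(\alpha) > 0$ we would have
\[ \frac{C(\alpha)}{b_k^d} \le \left| \alpha - \frac{a_k}{b_k} \right| \le \frac{2}{b_k^{k+1}}, \]
and thus $b_k^{k+1-d} \le \frac{2}{C(\alpha)}$. Taking $k \to \infty$ yields a contradiction, and so we see that $\alpha$ cannot be algebraic.

Recall: Last lecture we showed that for all $\alpha \in \mathbb{R} \setminus \mathbb{Q}$ there are infinitely many $\frac{a}{b} \in \mathbb{Q}$ such that $\left| \alpha - \frac{a}{b} \right| < \frac{1}{b^2}$.

Theorem 11.2.1 (Roth's theorem) If $\alpha$ is algebraic, then for any $\varepsilon > 0$ there exists some constant $C = C(\alpha, \varepsilon) > 0$ such that
\[ \left| \alpha - \frac{a}{b} \right| \ge \frac{C(\alpha, \varepsilon)}{b^{2+\varepsilon}} \quad \text{for all } \frac{a}{b} \in \mathbb{Q},\ \frac{a}{b} \ne \alpha. \]

§6.1 – Farey sequences

Given $n \in \mathbb{N}$, the Farey fractions of order $n$ are those $\frac{a}{b} \in \mathbb{Q}$ such that $1 \le b \le n$ and $0 \le a \le b$; that is,
\[ F_n = \left\{ \tfrac{a}{b} : 1 \le b \le n,\ 0 \le a \le b \right\} \subset \mathbb{Q} \cap [0, 1]. \]
Usually the set is listed in increasing order. For example,
\[ F_5 = \left\{ \tfrac{0}{1}, \tfrac{1}{5}, \tfrac{1}{4}, \tfrac{1}{3}, \tfrac{2}{5}, \tfrac{1}{2}, \tfrac{3}{5}, \tfrac{2}{3}, \tfrac{3}{4}, \tfrac{4}{5}, 1 \right\}. \]
If we know the first few elements of $F_n$, how can we compute the next?

Proposition 11.2.2 Let $\frac{a}{b} \in F_n$ be in lowest terms with $a \ne b$. The next element of $F_n$ after $\frac{a}{b}$ is $\frac{x}{y}$, where $y \equiv -a^{-1} \bmod b$, $n - b < y \le n$, and $x = \frac{ay + 1}{b}$.

Proof : Since $ay + 1 \equiv a(-a^{-1}) + 1 \equiv 0 \bmod b$, we know that $x \in \mathbb{Z}$. Moreover since $y \le n$ and $1 \le y(b - a)$, we know
\[ \frac{x}{y} = \frac{ay + 1}{by} \le \frac{by}{by} = 1, \]
and thus $\frac{x}{y} \in F_n$. Now, suppose $\frac{c}{d} \in F_n$ with $\frac{a}{b} < \frac{c}{d} < \frac{x}{y}$. Then
\[ \frac{x}{y} - \frac{a}{b} = \frac{bx - ay}{yb} = \frac{1}{yb}. \]
But by lemma 11.1.1, together with $y + b \ge n + 1$ and $d \le n$, we know that
\[ \frac{x}{y} - \frac{a}{b} = \left( \frac{x}{y} - \frac{c}{d} \right) + \left( \frac{c}{d} - \frac{a}{b} \right) \ge \frac{1}{yd} + \frac{1}{db} = \frac{y + b}{ybd} \ge \frac{n+1}{ybd} \ge \frac{1}{yb} \cdot \frac{n+1}{n} > \frac{1}{yb}, \]
which is a contradiction, and we are done.

Corollary 1: If $\frac{a}{b} < \frac{x}{y}$ are consecutive Farey fractions (for any fixed $n$), then $xb - ay = 1$.

Corollary 2: If $\frac{a}{b} < \frac{c}{d} < \frac{x}{y}$ are consecutive Farey fractions, then $\frac{c}{d} = \frac{a+x}{b+y}$.

For example,
\[ F_4 = \left\{ \tfrac{0}{1}, \tfrac{1}{4}, \tfrac{1}{3}, \tfrac{1}{2}, \tfrac{2}{3}, \tfrac{3}{4}, 1 \right\}. \]
The fractions of $F_5 \setminus F_4$ are exactly
\[ \frac{1}{5} = \frac{0+1}{1+4}, \quad \frac{2}{5} = \frac{1+1}{3+2}, \quad \frac{3}{5} = \frac{1+2}{2+3}, \quad \frac{4}{5} = \frac{3+1}{4+1}, \]
which are seen to lie in the respective intervals
\[ \left( \tfrac{0}{1}, \tfrac{1}{4} \right), \quad \left( \tfrac{1}{3}, \tfrac{1}{2} \right), \quad \left( \tfrac{1}{2}, \tfrac{2}{3} \right), \quad \left( \tfrac{3}{4}, \tfrac{1}{1} \right). \]
Next lecture, we will use the Farey fractions to give an alternate proof of Dirichlet's theorem.

11.3 Lecture Thirty-One

In the Farey fractions $F_n$ of order $n$, we have that if $\frac{b}{r} < \frac{c}{s}$ are consecutive, then $rc - sb = 1$ and
\[ \frac{b}{r} < \frac{b+c}{r+s} < \frac{c}{s} \quad \text{with } r + s \ge n + 1. \]
Indeed, the condition $r + s \ge n + 1$ must hold, since otherwise the middle fraction would itself be a Farey fraction of order $n$ lying strictly between two consecutive ones, a contradiction.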
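The Farey facts above (Proposition 11.2.2 and its two corollaries) can be checked by brute force for small $n$. The sketch below is only an illustration, not part of the lecture: the function names `farey` and `next_farey` are made up, and `next_farey` finds $y$ by a simple search over the window $n - b < y \le n$ rather than by computing a modular inverse.

```python
from fractions import Fraction


def farey(n):
    """Farey fractions of order n, in increasing order."""
    return sorted({Fraction(a, b)
                   for b in range(1, n + 1)
                   for a in range(b + 1)})


def next_farey(a, b, n):
    """Successor x/y of a/b in F_n, following Proposition 11.2.2.

    Assumes a/b is in lowest terms with a != b and 1 <= b <= n.
    """
    # y is the unique integer with n - b < y <= n and a*y + 1 divisible by b.
    y = next(y for y in range(n - b + 1, n + 1) if (a * y + 1) % b == 0)
    x = (a * y + 1) // b
    return x, y


F5 = farey(5)
for lo, hi in zip(F5, F5[1:]):
    a, b = lo.numerator, lo.denominator
    x, y = hi.numerator, hi.denominator
    assert x * b - a * y == 1             # Corollary 1
    assert next_farey(a, b, 5) == (x, y)  # Proposition 11.2.2
for lo, mid, hi in zip(F5, F5[1:], F5[2:]):
    # Corollary 2: the middle fraction is the mediant of its neighbours.
    assert mid == Fraction(lo.numerator + hi.numerator,
                           lo.denominator + hi.denominator)
```

Because consecutive fractions satisfy $xb - ay = 1$, the pair returned by `next_farey` is automatically in lowest terms, which is why it can be compared directly with the reduced numerator and denominator.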
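Recall: Dirichlet's theorem on Diophantine approximation (theorem 11.1.2), which states that if $\alpha \in \mathbb{R}$, $n \in \mathbb{N}$, then there exists $\frac{a}{q} \in \mathbb{Q}$ with $1 \le q \le n$ and $\left| \alpha - \frac{a}{q} \right| \le \frac{1}{q(n+1)}$.

Proof : If $\alpha \in F_n$, then take $\frac{a}{q} = \alpha$. Otherwise, choose $\frac{b}{r} < \frac{c}{s}$ to be consecutive in $F_n$ such that
\[ \frac{b}{r} < \alpha < \frac{c}{s}, \]
by replacing $\alpha$ with $\{\alpha\}$ if necessary. We now have two cases.
1. Suppose
\[ \frac{b}{r} < \alpha \le \frac{b+c}{r+s}, \]
and take $\frac{a}{q} = \frac{b}{r}$. We have
\[ \left| \alpha - \frac{b}{r} \right| \le \frac{b+c}{r+s} - \frac{b}{r} = \frac{cr - bs}{r(r+s)} = \frac{1}{r(r+s)} \le \frac{1}{r(n+1)}, \]
and by assumption $1 \le r \le n$.
2. If instead we have
\[ \frac{b+c}{r+s} \le \alpha < \frac{c}{s}, \]
we take $\frac{a}{q} = \frac{c}{s}$, and the proof unfolds in the same way.

§7.1 – The Euclidean algorithm

We can think of continued fractions as a consequence of the Euclidean algorithm.

Example: We find $(73, 26)$. Simple calculation shows
\[ 73 = 2 \cdot 26 + 21, \quad 26 = 1 \cdot 21 + 5, \quad 21 = 4 \cdot 5 + 1, \quad 5 = 5 \cdot 1 + 0. \]
Note also that
\[ \frac{73}{26} = 2 + \frac{21}{26} = 2 + \frac{1}{26/21} = 2 + \cfrac{1}{1 + \cfrac{5}{21}}. \]
Continuing in this fashion, we have
\[ \frac{73}{26} = 2 + \cfrac{1}{1 + \cfrac{1}{21/5}} = 2 + \cfrac{1}{1 + \cfrac{1}{4 + \cfrac{1}{5}}}. \]
This is an example of the type of expression we will now study.

Definition: A continued fraction is an expression of the form
\[ x_0 + \cfrac{1}{x_1 + \cfrac{1}{x_2 + \cfrac{1}{\ddots + \cfrac{1}{x_j}}}}, \]
where $x_i \in \mathbb{R}$ and $x_0, x_1, \ldots, x_j > 0$; we will mostly be interested in the situation when $x_i \in \mathbb{Z}$ for every $i$. We have the shorthand notation $\langle x_0; x_1, x_2, \ldots, x_j \rangle$. For example,
\[ \frac{73}{26} = \left\langle 2; \tfrac{26}{21} \right\rangle = \left\langle 2; 1, \tfrac{21}{5} \right\rangle = \langle 2; 1, 4, 5 \rangle. \]

Example: Find a simple expression for $\langle 1; 3, 1, 5, x \rangle$ as a function of $x > 0$. We have
\[ \langle 1; 3, 1, 5, x \rangle = 1 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{5 + \cfrac{1}{x}}}} = 1 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{x}{5x+1}}} = 1 + \cfrac{1}{3 + \cfrac{5x+1}{6x+1}} = 1 + \frac{6x+1}{23x+4} = \frac{29x+5}{23x+4}. \]
We may write the above calculation more compactly as
\[ \langle 1; 3, 1, 5, x \rangle = \left\langle 1; 3, 1, \tfrac{5x+1}{x} \right\rangle = \left\langle 1; 3, \tfrac{6x+1}{5x+1} \right\rangle = \left\langle 1; \tfrac{23x+4}{6x+1} \right\rangle = \frac{29x+5}{23x+4}. \]

Some useful identities:
• $\langle x_0; x_1, x_2, \ldots, x_j \rangle = x_0 + \dfrac{1}{\langle x_1; x_2, x_3, \ldots, x_j \rangle}$.
• $\langle x_0; x_1, x_2, \ldots, x_j \rangle = \left\langle x_0; x_1, x_2, \ldots, x_{j-2}, x_{j-1} + \dfrac{1}{x_j} \right\rangle$.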
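The two ideas of this section, reading partial quotients off the Euclidean algorithm and collapsing a continued fraction from the right via the first identity, can be sketched in Python as follows. This is an illustrative sketch only; the function names `cf_expand` and `cf_value` are not from the lecture.

```python
from fractions import Fraction


def cf_expand(u0, u1):
    """Partial quotients of u0/u1 (u1 > 0), read off the Euclidean algorithm."""
    quotients = []
    while u1 != 0:
        q, r = divmod(u0, u1)   # u0 = u1 * q + r with 0 <= r < u1
        quotients.append(q)
        u0, u1 = u1, r
    return quotients


def cf_value(xs):
    """Evaluate <x0; x1, ..., xj> using <x0; x1, ...> = x0 + 1 / <x1; ...>."""
    value = Fraction(xs[-1])
    for x in reversed(xs[:-1]):
        value = x + 1 / value
    return value


print(cf_expand(73, 26))                   # partial quotients of 73/26
print(cf_value([2, 1, 4, 5]))              # recovers 73/26
x = Fraction(2)
print(cf_value([1, 3, 1, 5, x]), (29 * x + 5) / (23 * x + 4))  # the two agree
```

Working with `Fraction` keeps every intermediate value exact, so the last line compares the nested expression with the closed form $\frac{29x+5}{23x+4}$ without rounding error.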
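Example: We find a fraction between $\frac{14}{5} = 2.8$ and $\frac{73}{26} = 2.8076923\ldots$, with minimal denominator. Note that $\frac{14}{5} = \langle 2; 1, 4 \rangle$ and $\frac{73}{26} = \langle 2; 1, 4, 5 \rangle$. The function $f : x \mapsto \langle 2; 1, 4, x \rangle$ for $x > 0$ is a decreasing function of $x$ and satisfies
\[ f(5) = \frac{73}{26}, \qquad \lim_{x \to \infty} f(x) = \frac{14}{5}. \]
Thus taking $x = 6$ we have
\[ f(6) = \langle 2; 1, 4, 6 \rangle = \frac{87}{31} = 2.8064\ldots \]
It is no coincidence that this is the Farey mediant $\frac{14+73}{5+26}$ of $\frac{14}{5}$ and $\frac{73}{26}$ in $F_{31}$.

It is not difficult to see that $f(x_0, x_1, \ldots, x_k) = \langle x_0; x_1, x_2, \ldots, x_k \rangle$ is an increasing function of $x_j$ for every even $j$ and a decreasing function of $x_j$ for every odd $j$. Thus if $a_i, b_i \in \mathbb{Z}$, we have that $\langle a_0; a_1, a_2, \ldots, a_k \rangle < \langle b_0; b_1, b_2, \ldots, b_k \rangle$ if and only if
• $a_0 < b_0$, or
• $a_0 = b_0$ and $a_1 > b_1$, or
• $a_0 = b_0$ and $a_1 = b_1$ and $a_2 < b_2$, or . . .
Thus we have an alternating lexicographic ordering on the integral continued fractions. To compare $\langle a_0; a_1, a_2, \ldots, a_k \rangle$ to $\langle a_0; a_1, a_2, \ldots, a_l \rangle$ with $k < l$, we write, formally,
\[ \langle a_0; a_1, a_2, \ldots, a_k \rangle = \langle a_0; a_1, a_2, \ldots, a_k, \infty \rangle. \]
Finally since we may always write, for example,
\[ 4 = 3 + \frac{1}{1} \Rightarrow \langle 2; 1, 4 \rangle = \langle 2; 1, 3, 1 \rangle, \]
we remark on the special case
\[ \langle a_0; a_1, a_2, \ldots, a_k \rangle = \langle a_0; a_1, a_2, \ldots, a_k - 1, 1 \rangle. \]

Notation: For the Euclidean algorithm applied to the pair $(u_0, u_1)$, we write
\[ \begin{aligned} u_0 &= u_1 a_0 + u_2, & 0 < u_2 < u_1, \\ u_1 &= u_2 a_1 + u_3, & 0 < u_3 < u_2, \\ &\ \ \vdots \\ u_{k-1} &= u_k a_{k-1} + u_{k+1}, & 0 < u_{k+1} < u_k, \\ u_k &= u_{k+1} a_k + u_{k+2}, & 0 = u_{k+2} < u_{k+1}. \end{aligned} \]
We call the coefficients $a_i$ partial quotients. We have equivalently
\[ \begin{aligned} \frac{u_0}{u_1} &= a_0 + \frac{1}{u_1/u_2}, & a_0 &= \left\lfloor \frac{u_0}{u_1} \right\rfloor, \\ \frac{u_1}{u_2} &= a_1 + \frac{1}{u_2/u_3}, & a_1 &= \left\lfloor \frac{u_1}{u_2} \right\rfloor, \\ &\ \ \vdots \\ \frac{u_k}{u_{k+1}} &= a_k, & a_k &= \left\lfloor \frac{u_k}{u_{k+1}} \right\rfloor = \frac{u_k}{u_{k+1}}. \end{aligned} \]
Similarly, we have for example
\[ \frac{u_1}{u_2} = \frac{1}{\left\{ \frac{u_0}{u_1} \right\}} = \frac{1}{\frac{u_0}{u_1} - a_0}. \]

12 Week Twelve

12.1 Lecture Thirty-Two

The Process: Given $\xi \in \mathbb{R}$, define $\xi_0 = \xi$ and set
\[ a_0 = \lfloor \xi_0 \rfloor, \quad \xi_1 = \frac{1}{\xi_0 - a_0} = \frac{1}{\{\xi_0\}}, \qquad a_1 = \lfloor \xi_1 \rfloor, \quad \xi_2 = \frac{1}{\xi_1 - a_1} = \frac{1}{\{\xi_1\}}, \]
and so on.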
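The Process translates directly into a short loop. The Python sketch below is an illustration under stated assumptions, not part of the lecture: the name `the_process` and the `max_terms` cutoff are made up, and the loop stops early if some $\xi_j$ is an integer (at which point the next step would divide by zero). With an exact `Fraction` input it reproduces the Euclidean algorithm; with a floating-point input only the first few partial quotients can be trusted, since rounding errors grow at each inversion.

```python
from fractions import Fraction
from math import floor


def the_process(xi, max_terms=20):
    """Partial quotients a_0, a_1, ... of xi, at most max_terms of them."""
    quotients = []
    for _ in range(max_terms):
        a = floor(xi)
        quotients.append(a)
        fractional_part = xi - a
        if fractional_part == 0:   # xi_j is an integer: The Process stops
            break
        xi = 1 / fractional_part
    return quotients


print(the_process(Fraction(73, 26)))          # exact: terminates
print(the_process(2 ** (1 / 3), max_terms=4))  # float: first terms only
```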
We saw in our last lecture that if $\xi = \frac{m}{n}$, then The Process is exactly the Euclidean algorithm applied to find $(m, n)$; in particular, The Process eventually terminates. Conversely, if $\xi \in \mathbb{R} \setminus \mathbb{Q}$, then The Process never terminates. Furthermore, we see that
\[ \xi = \langle \xi \rangle = \langle a_0; \xi_1 \rangle = \langle a_0; a_1, \xi_2 \rangle = \cdots \]
The numbers $a_j$ are called the partial quotients of $\xi$.

Example: Let $\xi = \sqrt[3]{2} = 1.25992\ldots$ We have $\xi_0 = \xi$, and
\[ \begin{aligned} a_0 &= \lfloor \sqrt[3]{2} \rfloor = 1, & \xi_1 &= \frac{1}{\xi_0 - 1} = 3.84732\ldots \\ a_1 &= \lfloor \xi_1 \rfloor = 3, & \xi_2 &= \frac{1}{\xi_1 - 3} = 1.18019\ldots \\ a_2 &= \lfloor \xi_2 \rfloor = 1, & \xi_3 &= \frac{1}{\xi_2 - 1} = 5.54974\ldots \\ a_3 &= \lfloor \xi_3 \rfloor = 5, & \xi_4 &= \frac{1}{\xi_3 - 5} = 1.81905\ldots \end{aligned} \]
We have that
\[ \sqrt[3]{2} = \langle 1; 3, 1, 5, \xi_4 \rangle = \frac{29\xi_4 + 5}{23\xi_4 + 4}; \]
solving this expression for $\xi_4$, we obtain
\[ \xi_4 = \frac{4\sqrt[3]{2} - 5}{-23\sqrt[3]{2} + 29}. \]

Definition: Given $a_0 \in \mathbb{Z}$, $a_1, a_2, \ldots \in \mathbb{N}$, define recursively the sequences
\[ \begin{aligned} h_{-2} &= 0, & h_{-1} &= 1, & h_j &= a_j h_{j-1} + h_{j-2} \text{ for } j \ge 0, \\ k_{-2} &= 1, & k_{-1} &= 0, & k_j &= a_j k_{j-1} + k_{j-2} \text{ for } j \ge 0. \end{aligned} \]
Furthermore for $j \ge 0$ define $r_j = \frac{h_j}{k_j}$; if the coefficients $a_j$ are those found in The Process applied to $\xi \in \mathbb{R}$, then $r_j$ is called the $j$th convergent to $\xi$.

Continuing from our last example, the partial quotients of $\sqrt[3]{2}$ are $1, 3, 1, 5, \ldots$ We have the following table:

  j      −2    −1    0     1      2      3
  a_j                1     3      1      5
  h_j    0     1     1     4      5      29
  k_j    1     0     1     3      4      23
  r_j    0     ∞     1     4/3    5/4    29/23

Note that $r_0 = 1$, $r_1 = 1.3333\ldots$, $r_2 = 1.25$, $r_3 = 1.26087\ldots$, so that the convergents are indeed good rational approximations to $\sqrt[3]{2} = 1.25992\ldots$

Theorem 12.1.1 (Theorem 7.3, Niven) For any $x > 0$, we have that
\[ \langle a_0; a_1, a_2, \ldots, a_{j-1}, x \rangle = \frac{x h_{j-1} + h_{j-2}}{x k_{j-1} + k_{j-2}}. \]
In particular,
\[ \langle a_0; a_1, a_2, \ldots, a_{j-1}, a_j \rangle = \frac{a_j h_{j-1} + h_{j-2}}{a_j k_{j-1} + k_{j-2}} = \frac{h_j}{k_j}. \]

Proof : We use induction. In the $j = 0$ case we have that $\langle x \rangle = \frac{x \cdot 1 + 0}{x \cdot 0 + 1}$, which is clearly so, and thus we may assume the claim holds up to $j$. We have
\[ \langle a_0; a_1, a_2, \ldots, a_j, x \rangle = \left\langle a_0; a_1, a_2, \ldots, a_{j-1}, a_j + \tfrac{1}{x} \right\rangle = \frac{(a_j + \frac{1}{x}) h_{j-1} + h_{j-2}}{(a_j + \frac{1}{x}) k_{j-1} + k_{j-2}} = \frac{(a_j h_{j-1} + h_{j-2})x + h_{j-1}}{(a_j k_{j-1} + k_{j-2})x + k_{j-1}} = \frac{x h_j + h_{j-1}}{x k_j + k_{j-1}}. \]

Example: Suppose $a_j = 1$ for all $j \ge 0$. Then $h_j = F_{j+2}$, $k_j = F_{j+1}$, where $F_n$ are the Fibonacci numbers $F_n = F_{n-1} + F_{n-2}$ normalized so that $F_0 = 0$, $F_1 = 1$. In particular,
\[ \langle \underbrace{1; 1, 1, \ldots, 1}_{j \text{ copies}} \rangle = \frac{F_{j+1}}{F_j} \xrightarrow{\ j \to \infty\ } \varphi, \]
where $\varphi = \frac{1 + \sqrt{5}}{2} = 1.618033\ldots$ is the golden ratio.

Theorem 12.1.2 (Theorem 7.5, Niven) For $j \ge -1$ one has
\[ h_j k_{j-1} - k_j h_{j-1} = (-1)^{j-1}. \]
In particular, this means that $(h_j, k_j) = 1$ for every $j$ and that
\[ r_j - r_{j-1} = \frac{(-1)^{j-1}}{k_j k_{j-1}}. \]

Proof : Exercise. (hint: use induction)

From the last equation, we know that $r_j > r_{j-1}$ if and only if $j$ is odd.

Theorem 12.1.3 (Convergence of convergents) Let $\xi \in \mathbb{R}$ and let $a_0, a_1, a_2, \ldots$ be its partial quotients, with $\xi_j, h_j, k_j, r_j$ defined as above. Then
\[ \xi - r_j = \frac{(-1)^j}{k_j(\xi_{j+1} k_j + k_{j-1})}, \]
and in particular
\[ \lim_{j \to \infty} r_j = \xi. \]

Proof : We apply theorems 12.1.1 and 12.1.2 to obtain
\[ \xi - r_j = \langle a_0; a_1, a_2, \ldots, a_j, \xi_{j+1} \rangle - r_j = \frac{\xi_{j+1} h_j + h_{j-1}}{\xi_{j+1} k_j + k_{j-1}} - \frac{h_j}{k_j} = \frac{h_{j-1} k_j - h_j k_{j-1}}{k_j(\xi_{j+1} k_j + k_{j-1})} = \frac{(-1)^j}{k_j(\xi_{j+1} k_j + k_{j-1})}, \]
and we are done.

Note that $a_{j+1} \le \xi_{j+1} < a_{j+1} + 1$. Given $n \in \mathbb{N}$, choosing $j$ so that $k_j \le n < k_{j+1}$, we can show that
\[ \left| \xi - \frac{h_j}{k_j} \right| \le \frac{1}{k_j(n+1)}. \]
Thus every convergent $r_j$ confirms Dirichlet's theorem on Diophantine approximation. We may also restate the theorem thus:
\[ \left| \xi - \frac{h_j}{k_j} \right| = \frac{1}{k_j^2} \cdot \frac{1}{\xi_{j+1} + k_{j-1}/k_j}, \quad \text{where } a_{j+1} \le \xi_{j+1} + \frac{k_{j-1}}{k_j} \le a_{j+1} + 2. \]
Hence, the greater $a_{j+1}$, the better the approximation $r_j = \langle a_0; a_1, a_2, \ldots, a_j \rangle$ is to $\xi$.

12.2 Lecture Thirty-Three

Recall: Theorem 12.1.1 tells us that
\[ \xi = \frac{\xi_j h_{j-1} + h_{j-2}}{\xi_j k_{j-1} + k_{j-2}}, \]
from which it follows that
\[ \xi_j = \frac{\xi k_{j-2} - h_{j-2}}{-\xi k_{j-1} + h_{j-1}}. \]

Example: Let $\xi = \xi_0 = \sqrt{41} = 6.40312\ldots$ We see that
\[ \begin{aligned} a_0 &= \lfloor \sqrt{41} \rfloor = 6, & \xi_1 &= \frac{1}{\xi_0 - 6} = 2.48062\ldots \\ a_1 &= \lfloor \xi_1 \rfloor = 2, & \xi_2 &= \frac{1}{\xi_1 - 2} = 2.08062\ldots \\ a_2 &= \lfloor \xi_2 \rfloor = 2, & \xi_3 &= \frac{1}{\xi_2 - 2} = 12.40312\ldots \end{aligned} \]
We have the table:

  j      −2    −1    0     1      2
  a_j                6     2      2
  h_j    0     1     6     13     32
  k_j    1     0     1     2      5

Thus
\[ \begin{aligned} \xi_1 &= \frac{\xi k_{-1} - h_{-1}}{-\xi k_0 + h_0} = \frac{1}{\sqrt{41} - 6}, \\ \xi_2 &= \frac{\xi k_0 - h_0}{-\xi k_1 + h_1} = \frac{\sqrt{41} - 6}{-2\sqrt{41} + 13}, \\ \xi_3 &= \frac{\xi k_1 - h_1}{-\xi k_2 + h_2} = \frac{2\sqrt{41} - 13}{-5\sqrt{41} + 32}. \end{aligned} \]
Rationalizing denominators, we obtain
\[ \begin{aligned} \xi_1 &= \frac{1}{\sqrt{41} - 6} \cdot \frac{\sqrt{41} + 6}{\sqrt{41} + 6} = \frac{\sqrt{41} + 6}{5}, \\ \xi_2 &= \frac{\sqrt{41} - 6}{-2\sqrt{41} + 13} \cdot \frac{2\sqrt{41} + 13}{2\sqrt{41} + 13} = \frac{4 + \sqrt{41}}{5}, \\ \xi_3 &= \frac{2\sqrt{41} - 13}{-5\sqrt{41} + 32} \cdot \frac{5\sqrt{41} + 32}{5\sqrt{41} + 32} = 6 + \sqrt{41}. \end{aligned} \]
We see that $\sqrt{41} = \langle 6; 2, 2, 6 + \sqrt{41} \rangle$, hence
\[ 6 + \sqrt{41} = \langle 12; 2, 2, 6 + \sqrt{41} \rangle = \langle 12; 2, 2, 12, 2, 2, 6 + \sqrt{41} \rangle = \cdots \]
Thus $\sqrt{41} = \langle 6; \overline{2, 2, 12} \rangle$; that is, $\sqrt{41}$ has a periodic continued fraction.

Lemma 12.2.1 If the continued fraction of $\xi \in \mathbb{R}$ is eventually periodic, then $\xi$ is a quadratic irrational, i.e. it is an irrational root of some quadratic polynomial with integer coefficients.

Proof : For simplicity we will assume that the continued fraction is purely periodic, although the stronger claim is true; that is, assume $\xi = \langle \overline{a_0; a_1, a_2, \ldots, a_{j-1}} \rangle$. Then
\[ \xi = \langle a_0; a_1, a_2, \ldots, a_{j-1}, \xi \rangle = \frac{\xi h_{j-1} + h_{j-2}}{\xi k_{j-1} + k_{j-2}}, \]
hence $\xi(\xi k_{j-1} + k_{j-2}) = \xi h_{j-1} + h_{j-2}$, and so
\[ k_{j-1} \xi^2 + (k_{j-2} - h_{j-1})\xi - h_{j-2} = 0. \]

Lemma 12.2.2 Every real quadratic irrational $r + s\sqrt{c}$, where $r, s \in \mathbb{Q}$ and $c \in \mathbb{N}$ is not a perfect square (written $c \in \mathbb{N} \setminus \mathbb{N}^2$), can be written $\frac{m + \sqrt{d}}{q}$, where $m, q \in \mathbb{Z}$, $d \in \mathbb{N} \setminus \mathbb{N}^2$, and $q \mid (d - m^2)$.

Proof : Taking a common denominator for $r$ and $s$, we may write
\[ r + s\sqrt{c} = \frac{a + b\sqrt{c}}{e} = \frac{a + \sqrt{cb^2}}{e} = \frac{ae + \sqrt{cb^2e^2}}{e^2}, \]
and the claim is now immediate.

The Quadratic Irrational Process: Let $\xi = \xi_0 = \frac{m_0 + \sqrt{d}}{q_0}$, where $d$, $m_0$, and $q_0$ satisfy the conditions of lemma 12.2.2. For $j \ge 0$, define
\[ a_j = \lfloor \xi_j \rfloor, \quad m_{j+1} = a_j q_j - m_j, \quad q_{j+1} = \frac{d - m_{j+1}^2}{q_j}, \quad \xi_{j+1} = \frac{m_{j+1} + \sqrt{d}}{q_{j+1}}. \]
The $a_j$ and $\xi_j$ so produced are the same as those produced in The Process.

Example: $\xi = \xi_0 = \sqrt{41}$, so that $m_0 = 0$, $d = 41$, $q_0 = 1$.
\[ \begin{aligned} j = 0 &: & a_0 &= \lfloor \sqrt{41} \rfloor = 6, & m_1 &= 6 \cdot 1 - 0 = 6, & q_1 &= \frac{41 - 6^2}{1} = 5, & \xi_1 &= \frac{6 + \sqrt{41}}{5}. \\ j = 1 &: & a_1 &= \left\lfloor \frac{6 + \sqrt{41}}{5} \right\rfloor = 2, & m_2 &= 2 \cdot 5 - 6 = 4, & q_2 &= \frac{41 - 4^2}{5} = 5, & \xi_2 &= \frac{4 + \sqrt{41}}{5}. \\ j = 2 &: & a_2 &= \left\lfloor \frac{4 + \sqrt{41}}{5} \right\rfloor = 2, & m_3 &= 2 \cdot 5 - 4 = 6, & q_3 &= \frac{41 - 6^2}{5} = 1, & \xi_3 &= \frac{6 + \sqrt{41}}{1}. \end{aligned} \]

Theorem 12.2.3 (Theorem 7.19, Niven) Given a quadratic irrational $\xi_0$, we have:
1. The $q_j$ from The Quadratic Irrational Process are integers which are eventually positive.
2. The $q_j$ and the $m_j$ are bounded.
3. The continued fraction for $\xi_0$ is eventually periodic.

Example: The quadratic irrational $-\frac{1}{2} - \frac{3}{4}\sqrt{5}$ has continued fraction $\langle -3; 1, 4, \overline{1, 1, 1, 5, 3, 5} \rangle$.

Proof : (sketch)
(1) ⇒ (2): Since $q_j > 0$ for all $j$ sufficiently large, and $q_{j+1} q_j + m_{j+1}^2 = d$, we see that there are only finitely many choices for the $q_j$, $m_j$.
(2) ⇒ (3): There are only finitely many pairs $(m_j, q_j)$, and so by the pigeonhole principle there must eventually occur a duplicate. The pair $(m_j, q_j)$ determines the values for the next step of The Quadratic Irrational Process.
(3) ⇒ (1): Highly nontrivial, and omitted.

Theorem 12.2.4 (Theorem 7.21, Niven) Let $d \in \mathbb{N} \setminus \mathbb{N}^2$ and set $c = \sqrt{d}$. Then $\lfloor c \rfloor + c$ has a purely periodic continued fraction $\langle \overline{a_0; a_1, a_2, \ldots, a_{r-1}} \rangle$ with $a_0 = 2\lfloor c \rfloor$. Hence $c = \langle \lfloor c \rfloor; \overline{a_1, a_2, \ldots, a_r} \rangle$ where $a_r = 2\lfloor c \rfloor$.

We refer to our earlier example, where we found that $6 + \sqrt{41}$ has a purely periodic continued fraction.

Proof : (omitted)

Facts: If $\xi = \sqrt{d}$ and $q_j$ are defined as above, then:
• For every $j$ we have $q_j \ne -1$.
• If $r$ is the period of the continued fraction of $\xi$, then $q_j = 1$ if and only if $r \mid j$.

12.3 Lecture Thirty-Four

Notation: Throughout this lecture, $d$ denotes a positive integer that is not a perfect square. The symbols $a_j, h_j, k_j$ denote the terms from The Process applied to $\sqrt{d}$, and similarly for $m_j, q_j$.

Pell's equation: We are interested in integer solutions to the equation $x^2 - dy^2 = N$ for some fixed $N \in \mathbb{Z}$; in particular, we seek solutions where both $x$ and $y$ are positive.
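Since the rest of this lecture leans on the quantities $a_j, m_j, q_j$ attached to $\sqrt{d}$, here is a minimal Python sketch of The Quadratic Irrational Process from Lecture Thirty-Three, specialized to $\xi_0 = \sqrt{d}$ (so $m_0 = 0$, $q_0 = 1$). The function names `sqrt_cf` and `period` are made up for illustration. The sketch uses integer arithmetic only, and relies on the assumption that every $q_j$ is positive in this special case, so that $\lfloor (m_j + \sqrt{d})/q_j \rfloor = \lfloor (m_j + \lfloor \sqrt{d} \rfloor)/q_j \rfloor$.

```python
from math import isqrt


def sqrt_cf(d, steps):
    """Run The Quadratic Irrational Process on sqrt(d) for `steps` steps.

    Returns the lists (a_j), (m_j), (q_j) for j = 0, ..., steps - 1.
    Assumes d is a positive integer that is not a perfect square.
    """
    root = isqrt(d)
    if root * root == d:
        raise ValueError("d must not be a perfect square")
    m, q = 0, 1
    a_list, m_list, q_list = [], [], []
    for _ in range(steps):
        a = (m + root) // q      # floor((m + sqrt(d)) / q), valid for q > 0
        a_list.append(a)
        m_list.append(m)
        q_list.append(q)
        m = a * q - m            # m_{j+1} = a_j q_j - m_j
        q = (d - m * m) // q     # q_{j+1} = (d - m_{j+1}^2) / q_j, exact
    return a_list, m_list, q_list


def period(d):
    """Smallest j >= 1 with q_j = 1, i.e. the period r of sqrt(d)
    according to the Facts stated in Lecture Thirty-Three."""
    m, q, root, j = 0, 1, isqrt(d), 0
    while True:
        a = (m + root) // q
        m = a * q - m
        q = (d - m * m) // q
        j += 1
        if q == 1:
            return j


print(sqrt_cf(41, 4))
print(period(41), period(45))
```

For $d = 41$ the first call reproduces the values $a_j = 6, 2, 2, 12$, $m_j = 0, 6, 4, 6$, $q_j = 1, 5, 5, 1$ worked out by hand above.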
Theorem 12.3.1 (Theorem 7.24, Niven) If $|N| < \sqrt{d}$, then for any positive solution $(x, y)$ to Pell's equation we must have that $\frac{x}{y}$ is a convergent to $\sqrt{d}$. In particular, if $(x, y) = 1$ then we must have that $x = h_j$ and $y = k_j$ for some $j$.

Proof : (omitted)

Example: Every solution of $x^2 - 41y^2 = -1$ must come from a convergent of $\sqrt{41}$. We saw in our last lecture that in this case $h_2 = 32$, $k_2 = 5$, and indeed
\[ (32)^2 - 41(5)^2 = 1024 - 1025 = -1. \]

Theorem 7.22 of Niven gives us the following key identity: for $j \ge -1$, one has
\[ h_j^2 - d k_j^2 = (-1)^{j+1} q_{j+1}. \]
At the end of our last lecture we saw that $q_j = 1$ if and only if $r \mid j$, where $r$ is the period of the continued fraction of $\sqrt{d}$. It is a corollary (Corollary 7.23) that, for every $l \ge 0$, we have
\[ h_{lr-1}^2 - d k_{lr-1}^2 = (-1)^{lr}. \]

Example: We solve Pell's equation for $d = 45$. We have
\[ \sqrt{45} = \langle 6; \overline{1, 2, 2, 2, 1, 12} \rangle, \]
so $r = 6$. Then with $l = 1$, we have by corollary 7.23 that $h_5 = 161$, $k_5 = 24$, hence
\[ 161^2 - 45(24)^2 = q_6 = 1. \]
So a solution to $x^2 - 45y^2 = 1$ is $x = 161$, $y = 24$. Note that
\[ \frac{h_5}{k_5} = r_5 = \langle 6; 1, 2, 2, 2, 1 \rangle. \]
Another solution is given by $l = 2$; we have
\[ \frac{h_{11}}{k_{11}} = r_{11} = \langle 6; 1, 2, 2, 2, 1, 12, 1, 2, 2, 2, 1 \rangle = \frac{51841}{7728}, \]
and indeed we have that $51841^2 - 45(7728)^2 = 1$.

Theorem 12.3.2 (Theorem 7.25, Niven) All positive solutions to $x^2 - dy^2 = \pm 1$ are of the form $x = h_{lr-1}$, $y = k_{lr-1}$, where $l \ge 1$ and $r$ is the period of the continued fraction of $\sqrt{d}$. Furthermore:
• If $r$ is even, then there are no positive solutions to $x^2 - dy^2 = -1$, and the positive solutions to $x^2 - dy^2 = 1$ are exactly $x = h_{lr-1}$, $y = k_{lr-1}$ with $l \ge 1$.
• If $r$ is odd, then the positive solutions to $x^2 - dy^2 = -1$ are exactly $x = h_{lr-1}$, $y = k_{lr-1}$ where $l$ is odd and positive, and the positive solutions to $x^2 - dy^2 = 1$ are exactly $x = h_{lr-1}$, $y = k_{lr-1}$ where $l$ is even and positive.
In every case, $y = y(l)$ is a strictly increasing function of $l$.

This is the main result of our foregoing work.

Remark: Suppose $s^2 - dt^2 = A$, $u^2 - dv^2 = B$.
Factoring over the reals gives
\[ A = (s - t\sqrt{d})(s + t\sqrt{d}), \qquad B = (u - v\sqrt{d})(u + v\sqrt{d}), \]
from which it follows that
\[ AB = \left( (su + dtv) - \sqrt{d}(sv + tu) \right) \left( (su + dtv) + \sqrt{d}(sv + tu) \right) = (su + dtv)^2 - d(sv + tu)^2. \]
In particular, if $A = 1$, then we get new solutions to the equation $x^2 - dy^2 = A$ by considering $(s + t\sqrt{d})^l$ with $l \ge 2$.

Example: Suppose $d = 45$. Set $s = 161$, $t = 24$ so that $s^2 - dt^2 = 1$. We have
\[ (161 + 24\sqrt{45})^2 = 51841 + 7728\sqrt{45}, \qquad (161 + 24\sqrt{45})^3 = 16{,}692{,}641 + 2{,}488{,}392\sqrt{45}, \]
and indeed
\[ 16{,}692{,}641^2 - 45 \cdot 2{,}488{,}392^2 = 1, \qquad h_{17} = 16{,}692{,}641, \quad k_{17} = 2{,}488{,}392. \]

Proof : (omitted)

Theorem 12.3.3 (Theorem 7.26, Niven) Set $x_1 = h_{r-1}$, $y_1 = k_{r-1}$, where $r$ is the period of the continued fraction of $\sqrt{d}$. Define $x_l, y_l$ recursively via
\[ x_l + y_l\sqrt{d} = (x_1 + y_1\sqrt{d})^l. \]
Then $x_l = h_{lr-1}$ and $y_l = k_{lr-1}$.

Proof : (omitted)

Theorems 12.3.2 and 12.3.3 together tell us that the smallest (in terms of $y$) positive solution to $x^2 - dy^2 = \pm 1$ is given by $x_1 = h_{r-1}$, $y_1 = k_{r-1}$, and moreover that all positive solutions may be found by taking powers of $x_1 + y_1\sqrt{d}$.

Example: Suppose $d = 41$; then the smallest positive solution to $x^2 - 41y^2 = -1$ is $x_1 = h_2 = 32$, $y_1 = k_2 = 5$. Thus
\[ x_2 + y_2\sqrt{41} = (32 + 5\sqrt{41})^2 = 2049 + 320\sqrt{41}. \]
By theorem 12.3.3, $(2049, 320)$ is the smallest positive solution to $x^2 - 41y^2 = 1$.

13 Week Thirteen

13.1 Lecture Thirty-Five

Miscellany about continued fractions: Given an arbitrary continued fraction, must it correspond to a real number? Let $a_0 \in \mathbb{Z}$, $a_1, a_2, \ldots \in \mathbb{N}$, and define
\[ L = \langle a_0; a_1, a_2, \ldots \rangle = \lim_{n \to \infty} \langle a_0; a_1, a_2, \ldots, a_n \rangle. \]

Theorem 13.1.1 The limit $L$ always exists and is irrational. Moreover, the partial quotients of $L$ are exactly $a_0, a_1, a_2, \ldots$

Recall: If $r_n$ denotes the $n$th convergent of $L$, we have $r_n = \langle a_0; a_1, \ldots, a_n \rangle$ and moreover
\[ r_n - r_{n-1} = \frac{(-1)^{n-1}}{k_n k_{n-1}}. \]
This implies that the convergents oscillate around $L$.
Indeed, define $\alpha_n = \frac{1}{k_n k_{n-1}}$, so that
\[ r_n = a_0 + \sum_{j=1}^{n} (-1)^{j-1} \alpha_j; \]
since this is an alternating series whose terms decrease to zero, we know that it converges and thus that the convergents also converge.

Example: Define $x = \langle 1; 1, 1, \ldots \rangle$ so that $x = 1 + \frac{1}{x}$. This yields the quadratic equation $x^2 - x - 1 = 0$, and since $x > 0$ we deduce that $x = \frac{1 + \sqrt{5}}{2} = \varphi$, as introduced in lecture thirty-two. With the Fibonacci numbers as defined there, we have
\[ F_n = \frac{1}{\sqrt{5}} \left( \varphi^n - (-\varphi)^{-n} \right), \quad \text{and} \quad m \mid n \Rightarrow F_m \mid F_n. \]

Definition: A real number is called simply normal in base-10 if, for every $i \in \{0, 1, \ldots, 9\}$, the probability of randomly selecting an $i$ in its decimal expansion is 0.1 (that is, the digit $i$ occurs with asymptotic frequency $\frac{1}{10}$). There is an analogous definition for simple normality in base-$b$. A real number is normal base-$b$ if it is simply normal base-$b$, base-$b^2$, base-$b^3$, and so on. For example, $0.\overline{0123456789}$ is simply normal base-10, but not normal.

Theorem 13.1.2 Almost all real numbers are normal base-10.

Champernowne's number: Let $c = 0.12345678910111213\ldots$ D.G. Champernowne showed in 1933 that $c$ is normal base-10. It is conjectured that the following numbers are normal: $\pi$, $e$, $\log 2$, and any algebraic number $q \in \overline{\mathbb{Q}}$ of degree at least 3. It is a simple consequence of theorem 13.1.2 (applied in each base) that almost all real numbers are normal in every base simultaneously.

Back to continued fractions: given $\xi \in \mathbb{R}$ with partial quotients $a_n$, define
\[ \delta_k(\xi) = \lim_{x \to \infty} \frac{\#\{n \le x : a_n = k\}}{x}. \]
Aleksandr Khinchin showed that, for almost all $\xi \in \mathbb{R}$, $\delta_k(\xi)$ exists and equals $\log_2\!\left(1 + \frac{1}{k(k+2)}\right)$, thus
\[ \delta_1 \approx 0.415, \quad \delta_2 \approx 0.170, \quad \delta_3 \approx 0.093, \ldots \]
One number which fails this test is
\[ e = \langle 2; 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, 1, 1, 10, 1, \ldots \rangle \]
Furthermore, any number of the form $\frac{me + n}{re + s}$ also fails Khinchin's theorem. It is conjectured that the following numbers satisfy Khinchin's theorem: $\pi$, $\log 2$, and any algebraic number $q \in \overline{\mathbb{Q}}$ of degree at least 3.

Khinchin also proved (1934) that, for almost all $\xi \in \mathbb{R}$, one has
\[ \lim_{n \to \infty} (a_1 a_2 \cdots a_n)^{1/n} = \prod_{k=1}^{\infty} \left( 1 + \frac{1}{k(k+2)} \right)^{\log_2 k} = 2.6854520010\ldots \]
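The densities $\delta_k = \log_2\!\left(1 + \frac{1}{k(k+2)}\right)$ quoted above are easy to evaluate numerically. The following Python sketch (the function name `digit_density` is made up for illustration) prints the first three values and checks that the densities sum to 1, as they must for a probability distribution on the partial quotients; the partial sum telescopes to $\log_2 \frac{2(K+1)}{K+2}$, so it approaches 1 slowly.

```python
from math import log2


def digit_density(k):
    """Limiting frequency of the partial quotient k for almost all reals,
    per the formula quoted from Khinchin's theorem above."""
    return log2(1 + 1 / (k * (k + 2)))


for k in (1, 2, 3):
    print(k, round(digit_density(k), 3))

# Partial sum over k = 1..K equals log2(2(K+1)/(K+2)), which tends to 1.
K = 10_000
print(sum(digit_density(k) for k in range(1, K + 1)))
```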
Theorem 13.1.3 (Theorem 7.17, Niven) For all $\xi \in \mathbb{R} \setminus \mathbb{Q}$, there exist infinitely many $\frac{a}{b} \in \mathbb{Q}$ such that $\left| \xi - \frac{a}{b} \right| < \frac{1}{\sqrt{5}\, b^2}$, and moreover $\sqrt{5}$ is the best possible such bound.

By discarding the (countable) set of real numbers $\xi$ for which the bound $\sqrt{5}$ is necessary, we may improve the bound to $\sqrt{8}$; repeating this process we obtain bounds of $\frac{\sqrt{221}}{5}$, $\frac{\sqrt{1517}}{13}$, . . . These numbers arise naturally in the study of the Markov spectrum.

Theorem 13.1.4 (Theorem 7.14, Niven) If $\left| \xi - \frac{a}{b} \right| < \frac{1}{2b^2}$, then $\frac{a}{b}$ is a convergent to $\xi$.

13.2 Lecture Thirty-Six

Numerical examples of continued fractions

Let $y = 365.242199\ldots$ be the number of solar days in a year; it has been a challenge for centuries to construct a calendar which takes into account this lack of integrality. Numa Pompilius devised a calendar (ca. 713 BCE) in which occasional and irregular leap months would be added into the middle of February. Julius Caesar (48 BCE) devised the Julian calendar, in which every year has 365 days, except for every fourth year which has 366. While divergence from the true count is slow in the Julian calendar (amounting to about 11 days over 1800 years) it is noticeable; in 1582, Pope Gregory XIII introduced the Gregorian calendar as a replacement. In this calendar, every year divisible by 4 is a leap year, except years divisible by 100 and not 400. This is the most widely-used calendar in contemporary Western society; it averages 365.2425 days per year, and so diverges by about 3 days every 10,000 years.

The continued fraction of $y$ is $\langle 365; 4, 7, 1, 3, 5, 20, \ldots \rangle$, and the convergents to $y - 365$ are
\[ \frac{1}{4}, \frac{7}{29}, \frac{8}{33}, \frac{31}{128}, \frac{163}{673}, \ldots \]
To get a good rational approximation, we need to truncate before a large partial quotient. Using the convergent $\frac{31}{128}$, we might say that we have a leap year every year which is divisible by 4, except years that are divisible by 128. In hexadecimal: a year is a leap year if it ends in 0, 4, 8, or C, unless its last two digits are 00 or 80.
This diverges by about one day every 87,000 years, and we have $365\frac{31}{128} = 365.2421875$.

Now, let $m = 29.53059\ldots$ be the number of days in a lunar month (that is, from one new moon to the next), so that we have $\frac{y}{m} = 12.3683\ldots$ Taking the continued fraction,
\[ \frac{y}{m} = \langle 12; 2, 1, 2, 1, 1, 17, \ldots \rangle, \]
and the convergents of $\frac{y}{m} - 12$ are
\[ \frac{1}{2}, \frac{1}{3}, \frac{3}{8}, \frac{4}{11}, \frac{7}{19}, \ldots \]
Modern lunisolar calendars have 7 leap months every 19 years, diverging by one month every 6800 years.

In modern western music, the A above middle C is assigned the frequency 440Hz. By doubling this frequency, we obtain a note one octave higher; tripling it, we obtain a perfect fifth between 880Hz and 1320Hz. Unfortunately, much like the alignment of months and years, the alignment of octaves and fifths is out of sync; indeed,
\[ \frac{(3/2)^{12}}{2^7} \approx 1.014. \]
However, an equally-tempered tuning divides each octave into 12 equal segments, so each semitone is an increase by a factor of $2^{1/12}$; in this case we see $2^{7/12} \approx 1.498$. We take the continued fraction:
\[ \log_2(3/2) = \frac{\log(3/2)}{\log 2} = 0.58496\ldots = \langle 0; 1, 1, 2, 2, 3, 1, 5, \ldots \rangle, \]
with convergents
\[ 1, \frac{1}{2}, \frac{3}{5}, \frac{7}{12}, \frac{24}{41}, \ldots \]
So if we wanted to divide the octaves into $x$ notes so that an interval of $y$ of them makes a perfect fifth, the next better choice after $x = 12$, $y = 7$ would be $x = 41$, $y = 24$.

Pythagorean triplets: What are all positive integer solutions to the equation $x^2 + y^2 = z^2$? A primitive triplet is a solution to this equation in which $(x, y) = 1$.

Theorem 13.2.1 (Theorem 5.5, Niven) The positive, primitive Pythagorean triplets (with $y$ even) are parameterized by:
\[ x = r^2 - s^2, \quad y = 2rs, \quad z = r^2 + s^2, \]
where $r > s > 0$, $(r, s) = 1$, and $r$ and $s$ have opposite parity.

nb. For any primitive $(x, y, z)$, exactly one of $x$ and $y$ is even.

Proof : We give two sketches.
1. We may factor $y^2 = (z - x)(z + x)$, hence
\[ \left( \frac{y}{2} \right)^2 = \frac{z + x}{2} \cdot \frac{z - x}{2}, \quad \text{with } \left( \frac{z + x}{2}, \frac{z - x}{2} \right) = 1. \]
By Euclid's lemma, we must have that $\frac{z + x}{2} = r^2$, $\frac{z - x}{2} = s^2$.
2. We have $\left( \frac{x}{z} \right)^2 + \left( \frac{y}{z} \right)^2 = 1$, and so we seek to find the rational points $q$ of the unit circle. The line joining any rational point $q$ to $(-1, 0)$ has rational slope; conversely, any line through $(-1, 0)$ with rational slope intersects the circle in a rational point:
\[ y = m(x + 1),\ m \in \mathbb{Q} \ \Rightarrow\ x^2 + (m(x + 1))^2 = 1 \ \Leftrightarrow\ (x + 1)\left( (m^2 + 1)x + (m^2 - 1) \right) = 0. \]
So, all rational points on the circle other than $(-1, 0)$ have the form
\[ \left( \frac{1 - m^2}{1 + m^2}, \frac{2m}{1 + m^2} \right), \quad m \in \mathbb{Q}. \]
The approach of proof (2) generalizes to arbitrary conic sections.

13.3 Lecture Thirty-Seven

Final exam review

At least half of the problems on the final will be taken from homework problems. No calculators are permitted. Below is a brief overview of the important topics covered.

Chapter One – Divisibility
• The Euclidean algorithm: calculating the gcd, Bézout's identity, calculating inverses modulo $m$.
• The Fundamental theorem of arithmetic.
• Euclid's theorem.

Chapter Two – Congruences
• The Chinese remainder theorem.
• Euler's theorem; Fermat's little theorem.
• The Euler $\varphi$-function.
• Primitive roots; the structure of $\mathbb{Z}_n^\times$.
• Hensel's lemma.
• Solving linear congruences $ax \equiv b \bmod m$.
• The number of solutions of $x^n \equiv a \bmod p$.
Example problems: Find all $n \in \mathbb{Z}$ such that $3^n \equiv n \bmod 7$. Show that $a^{\varphi(n)} \equiv a^{2\varphi(n)} \bmod n$ for all $a \in \mathbb{Z}$, $n \in \mathbb{N}$. Prove that a squarefree integer $n$ is a Carmichael number if and only if $(p - 1) \mid (n - 1)$ for every $p \mid n$.

Chapter Three – Quadratic Reciprocity and Quadratic Forms
• Sums of two squares.
• The law of quadratic reciprocity.
• Jacobi symbols, Legendre symbols; special known values of the same.
• Quadratic residues and nonresidues.
• Euler's criterion.
• Binary quadratic forms.
Example problem: In $\mathbb{Z}_n^\times$, prove that at most half of the elements are quadratic residues, and that exactly half of them are quadratic residues if and only if $n$ has a primitive root.

Chapter Four – Some Functions of Number Theory
• Multiplicative functions, totally multiplicative functions.
• Dirichlet convolution.
• Möbius inversion.
Chapters Six and Seven – Farey Fractions and Irrational Numbers; Simple Continued Fractions
• Dirichlet's theorem on Diophantine approximation.
• Farey fractions.
• Diophantine approximations to rational and algebraic numbers.
• Continued fractions.
• Pell's equation.