MAT3166: Elementary Number Theory Fall 2011 Course Notes

MAT3166: Elementary Number Theory
Fall 2011
Course Notes, University of Ottawa
Prof: Monica Nevins
These course notes have been developed over several years, using material from many sources, regretfully
only some of which are directly acknowledged.
Last update: September 16, 2011
Contents
1 Introduction
2 Integers and divisibility
2.1 Divisibility
2.2 The greatest common divisor
2.3 The Fundamental Theorem of Arithmetic
2.4 Some applications of the Fundamental Theorem of Arithmetic
2.5 Applications of the GCD: Linear Diophantine Equations
2.6 Prime numbers
2.7 Further notes on primes
2.8 Alternate proof of the infinitude of primes: the zeta function
2.9 Exercises
3 Modular Arithmetic
3.1 The ring Z/nZ
3.1.1 The set Z/nZ
3.1.2 Arithmetic on Z/nZ
3.1.3 Cancellation and invertibility in Z/nZ
3.2 Exercises
Chapter 1
Introduction
Number theory is an ancient area of mathematics. Carl Friedrich Gauss (1777-1855), whose work defines
where elementary number theory turned into modern number theory, is quoted (by G.H. Hardy, in “A
Mathematician’s Apology” (1940)) as saying “Mathematics is the queen of the sciences and number theory
is the queen of mathematics.”
The famous quote by Kronecker, “God made the integers, all else is the work of man”, captures where number
theory begins: with the integers, and questions arising from arithmetic. What makes number theory so
fascinating, and such a vibrant area of mathematical research, is how seemingly simple questions can end up
being incredibly difficult, or incredibly deep; how much new mathematics has been generated in the pursuit
of proofs in number theory.
In 1900, the world-renowned mathematician David Hilbert proposed a list of 23 important problems in
mathematics. History has proven Hilbert’s wisdom and insight: these were in many cases fundamental
problems to be tackled, and in all cases interesting problems to try to resolve. Four of them concerned
number theory, the most celebrated of which is Hilbert’s 8th problem, which included as a first step the
Riemann hypothesis (that the nontrivial zeros of the Riemann zeta function all have real part equal to 1/2), as well as Goldbach’s conjecture (that every even integer > 2 is the sum of two primes) and the twin
primes conjecture (that there are infinitely many pairs of prime numbers whose difference is two). Hungarian
mathematician George Pólya quotes Hilbert as saying, in response to what he’d ask upon awakening, if he
went to sleep for a thousand years: “I’d ask if anyone had proved the Riemann hypothesis.” So far: no one
has.
In 2000, the Clay Mathematics Institute drew up a set of 7 Millennium Prize problems, offering a million-dollar
prize for their solutions. (One problem, the Poincaré conjecture, was solved in 2003 by Grigori Perelman,
but he declined the prize. He also declined the Fields medal.) Two of these problems are in number theory:
one is the Riemann hypothesis, and another is the Birch and Swinnerton-Dyer (BSD) conjecture, about the number of rational points on
an elliptic curve.
So why the excitement about conjectures in number theory? In part: prime numbers seem to at once possess
incredible regularity and exhibit so many special properties, and yet they defy any patterns or formulas;
it’s completely tantalizing. For another: so many conjectures are easy to state and (with computers) easy
to verify for the first million, billion or more cases, and yet that proves absolutely nothing! Think of the
famous Pólya conjecture (below).
Conjecture 1.1 (Pólya conjecture). For each n > 1, the number of numbers less than n with an odd number
of prime factors is always greater than or equal to the number of numbers less than n with an even number
of prime factors.
• n = 5: 2, 3 have 1 prime factor, 4 has two.
• n = 6: 2, 3, 5 have 1 prime factor, 4 has two.
• n = 10: 2, 3, 5, 7, 8 have an odd number of prime factors, 4, 6, 9 have two.
This conjecture was postulated in 1919 but proven false in 1958. The first proof was of the existence of a
counterexample, but did not calculate one explicitly; an explicit one was found 2 years later and in 1980 it
was proven that the smallest counterexample is n = 906,150,257.
On the other hand, there are many conjectures which haven’t been proven but which “everyone” believes
are true, and so each is called a “hypothesis”. Examples include the Riemann hypothesis, the hypothesis
that P ≠ NP, and the continuum hypothesis (whose resolution as being unresolvable was yet another
incredible outcome from Hilbert’s list of problems). Mathematicians prove results based on these hypotheses,
understanding that if ever a hypothesis were proven false, the results depending on it would be meaningless.
Other conjectures and results in number theory are absolutely tantalizing but completely useless; they’re
just really fascinating. A great example is “Fermat’s Last Theorem”, which was known by this name for
over 350 years even though it was only proven in 1995 (by Andrew Wiles).
Theorem 1.2 (Fermat’s Last Theorem). For any integer n > 2, the equation
$$a^n + b^n = c^n$$
has no integer solution $(a, b, c) \in \mathbb{Z}^3$ with $abc \neq 0$.
It’s a result that begs you to try your hand at finding a counterexample, and thousands of attempts were
made at its proof over the years. In the end, it was proven using incredibly sophisticated mathematics; its
proof was brilliant mathematics which has advanced many areas of number theory and beyond, even though
the result itself has “no” mathematically interesting consequences!
What’s also exciting about number theory is that although it’s just about numbers and primes and patterns and things that really don’t seem to have anything to do with the real world — in fact, G.H. Hardy
(1877-1947), a brilliant number theorist (and pacifist), celebrated number theory as being truly “pure mathematics”, with no applications to the real world — it has turned out to provide the key building block for
our very modern real world, through its use in public-key cryptography, which is the backbone to all secure
connections on the internet.
Our goal in this course is to explore the basic constructs, and some famous and interesting problems,
of elementary number theory (where you can roughly define “elementary number theory” as “everything
discovered before Gauss stepped in”). This also sets the stage for you to continue exploring related ideas in
algebra, or in any of the modern subfields into which number theory has evolved (combinatorial, analytic,
algebraic, etc).
More precisely, we’ll cover much of the first 9 chapters of our text [KR], with a few topics from chapters
thereafter as time permits. Our starting point is the integers, picking up where you left off in Grade 5 or so.
Chapter 2
Integers and divisibility
The integers:
Z = {· · · , −3, −2, −1, 0, 1, 2, 3, · · · }.
They form a ring¹ under addition and multiplication.
• positive integers have been understood since ancient times
• zero has two roles: a place-holder in our number system, and as a number representing the absence of a
quantity; we credit Indian mathematicians in the 7th century AD (including principally Brahmagupta)
with this discovery, which was spread westward by Arab mathematicians and eventually made it to
Europe for use by Fibonacci around 1200 AD. (Independently, the Mayans of south Central America
had discovered zero as part of their own base-20 number system.)
• negative numbers predate zero (the concept of debt and credit, for example); Chinese mathematicians
were using negative numbers as early as 200 BC, and Indian mathematicians set out rules for arithmetic
with negative numbers also in the 7th century AD.
From the integers, the next reasonable step was to construct the rational numbers
$$\mathbb{Q} = \left\{ \frac{m}{n} \;\Big|\; m, n \in \mathbb{Z},\ n \neq 0 \right\} / \sim$$
where $\frac{a}{b} \sim \frac{c}{d}$ whenever $ad = bc$; this is an abstraction where we have created a multiplicative inverse of every
nonzero integer and extended the operations of arithmetic from Z to Q in the only consistent way. From
now on we simply write $\frac{a}{b} = \frac{c}{d}$, thinking of the fraction symbol as the equivalence class of this relation. Q
is a field².
The next historical step is a giant leap: by thinking of (positive) rational numbers as geometric quantities
(lengths), one interpolates and thus infers the real numbers, R. This was put on firm footing with the advent
of analysis. (In fact, there are other ways to complete the rational numbers, leading to the field of p-adic
numbers for each prime p, and these are quite interesting for number theory.)
Now the question: Q and R are so easy to work with; what’s so interesting about Z? Here let’s mention just
two reasons.
1 A (commutative) ring (with unity) is a set R endowed with two operations, say + and ×, such that (R, +) is an abelian
group, and (R, ×) is a commutative monoid, such that distributivity holds (a(b + c) = ab + ac). A monoid is a set which
is closed under an operation such that there is a unit for the operation in that set, and the operation is associative (and in this
case, commutative). A group is a monoid such that every element has an inverse in that set.
2 A field is a commutative ring F with unit such that additionally (F \ {0}, ×) is an abelian group.
• One of the key properties of the integers is the Well-Ordering Principle:
Every nonempty set of positive integers contains a least element.
The corresponding statement in Q or R is false. Notice that it’s the Well-Ordering Principle which
allows mathematical induction to work: if we have a statement Pn which we conjecture is true for
every positive integer n, then either it’s true or else there’s a least n at which it is false, and induction
is the process of proving there is no such least counterexample.
• Divisibility is an interesting question in Z: sometimes we can solve ax = b and sometimes we can’t.
Once we are in the context of R or Q, divisibility is trivial: the solution to ax = b, when a ≠ 0, is
x = b/a. Questions of divisibility form the core of elementary number theory.
2.1 Divisibility
Definition 2.1. Let a and b be integers, with a ≠ 0. We say that a divides b if there exists an integer c
such that ac = b. We write a|b in this case and say “a is a divisor of b” and “b is divisible by a.”
Example 2.2. Every nonzero integer divides 0, and −2 divides 12. We excluded zero as a divisor: clearly zero
does not divide any nonzero number, and we don’t want to accept the statement “0 divides 0” because it’s
a weird case, the only one where the quotient is not unique.
Lemma 2.3. Let a, b, c, d, x, y ∈ Z. Then the following statements are true.
(a) a divides b iff −a divides b.
(b) If a|b and c|d then ac|bd.
(c) Transitivity: if a|b and b|c then a|c.
(d) If a|b and a|c then a|(bx + cy).
Proof. (a) Suppose a|b. Then there exists c ∈ Z such that ac = b. We know −c ∈ Z; and it follows that
(−a)(−c) = ac = b, so (−a)|b. The converse holds since −(−a) = a.
(b) If a|b then there exists x ∈ Z such that ax = b. If c|d then there exists y ∈ Z such that cy = d. Then
bd = (ax)(cy) = (ac)(xy), so ac|bd.
(c) If a|b then there exists x ∈ Z such that ax = b. If b|c then there exists y ∈ Z such that by = c. Then
c = by = (ax)y = a(xy) so a|c.
(d) If a|b then there exists r ∈ Z such that ar = b. If a|c then there exists s ∈ Z such that as = c. Then for
any x, y ∈ Z, we have bx + cy = (ar)x + (as)y = a(rx + sy) so a|(bx + cy).
Lemma 2.4. If a|b and b ≠ 0 then |a| ≤ |b|.
Proof. If a|b then there is some c ∈ Z such that ac = b. Therefore |b| = |ac| = |a| |c|. Now c ∈ Z and c ≠ 0 (since b ≠ 0),
so |c| ≥ 1. Thus |b| ≥ |a|.
What can we say when a ∤ b?
Lemma 2.5 (Division theorem). Suppose a, b ∈ Z and a ≠ 0. Then there exist unique q and r in Z such
that
b = qa + r
and
0 ≤ r < |a|.
Proof. WLOG we may assume a > 0; the case a < 0 is left as an exercise.
Consider the set S = {b − qa | q ∈ Z, b − qa ≥ 0}. If 0 ∈ S, then b = qa for some q ∈ Z and we may set
r = 0. Otherwise, we first claim that S is nonempty. If b > 0 then since b ∈ S we are done; otherwise b < 0
(as b = 0 would give 0 ∈ S), and since a ≥ 1, choosing q = b gives a nonnegative value b − qa = b(1 − a) ∈ S.
Since S is a nonempty set of positive integers, it contains a least element r, which by definition is equal to
b − qa for some q ∈ Z. We have only to show that r < |a| = a. For suppose r ≥ a. Then r − a ≥ 0 so
b − (q + 1)a = b − qa − a = r − a ≥ 0.
Thus b − (q + 1)a ∈ S, but by construction b − (q + 1)a < r, a contradiction of the minimality of r. Hence
r < a.
Now suppose there were two such remainders r1 and r2, satisfying 0 ≤ r1 < r2 < a. We could then write
b = aq1 + r1 = aq2 + r2
whence
a(q1 − q2) = r2 − r1.
Thus a|(r2 − r1), which means, since r2 − r1 > 0, that a ≤ r2 − r1 by Lemma 2.4. But this is impossible, since by construction
the difference r2 − r1 is strictly less than a. Hence we must have r1 = r2, which implies also q1 = q2, and
uniqueness follows.
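As a small illustration of the division theorem, here is a sketch in Python. The function name `divide` is ours, not from the notes or [KR]; it relies on the built-in `divmod`, whose remainder takes the sign of the divisor, and adjusts the result so that 0 ≤ r < |a| also holds when a < 0.

```python
def divide(b, a):
    """Return (q, r) with b == q*a + r and 0 <= r < |a|, as in Lemma 2.5."""
    if a == 0:
        raise ValueError("a must be nonzero")
    q, r = divmod(b, a)      # Python's remainder has the sign of a
    if r < 0:                # only possible when a < 0
        q, r = q + 1, r - a  # shift by one multiple of a to make r nonnegative
    return q, r


for b, a in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
    q, r = divide(b, a)
    print(f"{b} = ({q})({a}) + {r}")
```

In each case the printed remainder lies in the range 0 ≤ r < |a|, which is the normalization used throughout these notes.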
We are immediately led to the notion of prime numbers.³
Definition 2.6. An integer p > 1 is called a prime number if its only positive factors are 1 and p.
End of lecture # 1
2.2 The greatest common divisor
Given two nonzero integers a and b, we say that an integer c is a common divisor of a and b if c|a and c|b. By Lemma 2.4,
we have that such a c satisfies |c| ≤ |a| and |c| ≤ |b|. So the set of common divisors is bounded above, and
so has a maximal element. We call this value, say g, the greatest common divisor of a and b and we write
g = (a, b).
Lemma 2.7. Let a, b be nonzero integers and g = (a, b). Then g is the least element of the set S = {ax + by |
x, y ∈ Z, such that ax + by > 0}.
3 In ring theory, we define a prime to be a nonunit p with the property that whenever p divides a product, it divides one
of the factors. We show that in many rings, this is equivalent to the notion of an irreducible, which is a nonunit p with the
property that whenever p = ac, one of a or c is a unit. These two notions agree over the integers (which we’ll show eventually),
and so for example 5 and −5 are prime. For tradition (and ease of use), in number theory we restrict our definition to positive
integers.
Proof. Since 1 is a common divisor of a and b, clearly g ≥ 1.
S is a nonempty set of positive integers so has a least element s. Say
s = ax0 + by0
Dividing s into a by the division algorithm gives us
a = qs + r
with 0 ≤ r < s. Thus r = a − qs = a − q(ax0 + by0) = a(1 − qx0) − b(qy0). If 0 < r < s then r ∈ S and
r < s, contradicting the definition of s; therefore r = 0 and s divides a. Similarly s divides b, so s is a common
divisor of a and b. Thus by definition s ≤ g.
On the other hand, since g|a and g|b, g|(ax0 + by0 ) so g|s. By Lemma 2.4, this gives g ≤ s so g = s.
Corollary 2.8. If d is a common divisor of a and b, then d|(a, b).
Proof. Set g = (a, b). Then there exist x, y such that g = ax + by. Since d|a and d|b, d|(ax + by) so d|g.
How do we compute the gcd of two numbers?
Lemma 2.9. If a = bq + r where 0 ≤ r < b, then (a, b) = (b, r).
Proof. Let d = (a, b) and e = (b, r). Then d|r and d|b so d|e. Similarly, e|a and e|b so e|d. Hence since gcds
are positive, d = e.
Corollary 2.10. One can compute the gcd of a and b by repeated application of the division algorithm. We
call this the Euclidean Algorithm. [KR, Algorithm 2.5.16]
Example: 81 = 1 × 57 + 24, 57 = 2 × 24 + 9, 24 = 2 × 9 + 6, 9 = 1 × 6 + 3, 6 = 2 × 3 + 0, so (81, 57) = 3.
Example: (15, 11) = 1, and we can find x, y so that 15x + 11y = 1; for instance, 15(3) + 11(−4) = 1.
The Extended Euclidean Algorithm is the process of recovering the equation ax + by = (a, b) by reversing
the steps of the Euclidean algorithm. (See [KR, Algorithm 2.6.4] for a very efficient implementation.) The
simple (but inefficient) interpretation of the extended Euclidean algorithm is: at each step of the division
algorithm, you have an equation of the form $r_i = q_{i+1} r_{i+1} + r_{i+2}$, and therefore can solve for $r_{i+2}$ in terms
of $r_i$ and $r_{i+1}$. Repeating this process must eventually yield an equation for $r_{i+2}$ in terms of a and b. In
particular this holds when $r_{i+2} = (a, b)$, that is, when $r_{i+2}$ is the last nonzero remainder in the repeated division algorithm.
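The two algorithms can be sketched in Python as follows. This is a minimal illustration for positive inputs, not the implementation of [KR, Algorithm 2.6.4]; the names `gcd_steps` and `extended_gcd` are ours. The extended version carries the coefficients forward alongside the remainders instead of back-substituting at the end, which yields the same kind of equation ax + by = (a, b).

```python
def gcd_steps(a, b):
    """Euclidean algorithm for positive a, b.

    Returns the gcd together with the list of division steps (a, q, b, r),
    each recording one equation a = q*b + r.
    """
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, q, b, r))
        a, b = b, r              # (a, b) = (b, r) by Lemma 2.9
    return a, steps


def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == (a, b), for positive a, b."""
    # Invariant: current a == A*x0 + B*y0 and current b == A*x1 + B*y1,
    # where A, B are the original inputs.
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0


g, steps = gcd_steps(81, 57)
for (a, q, b, r) in steps:
    print(f"{a} = {q} x {b} + {r}")
print("gcd:", g)                  # 3, as in the example above
print(extended_gcd(15, 11))       # (1, 3, -4): 15*3 + 11*(-4) = 1
```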
2.3 The Fundamental Theorem of Arithmetic
We begin with a definition and a lemma.
Note that if p is prime then (p, n) = p if p|n and is 1 otherwise. This motivates the definition: two nonzero
integers a and b are relatively prime if (a, b) = 1.
Lemma 2.11. If p is prime and p|ab then p must divide one of a or b or both.
Proof. Suppose p|(ab). If p|a then we are done. So now suppose p does not divide a. Then (p, a) = 1. This
means there exist x, y such that
1 = px + ay
and so
b = pbx + aby.
Now p divides pbx, and p|(ab) gives p|(aby), so p|b by Lemma 2.3(d), and we’re done.
End of lecture # 2
Theorem 2.12 (Fundamental Theorem of Arithmetic). Every positive integer can be factored as a product
of primes in a unique way (up to ordering of the factors).
(We accept that n = 1 factors as an empty product of primes.)
Proof. Let S be the set of integers n > 1 which cannot be factored into a product of prime numbers. By
the Well-Ordering Principle, if S is nonempty we may let N be the least element of S. So N cannot be
prime, since then it admits the factorization N = N , a contradiction since N ∈ S. Thus N must have
some factorization N = qr, with 1 < q, r < N (using Lemma 2.4 to deduce that both factors are less than
N). But then neither q nor r is in S, since N was the least element of S; thus each of q and r admits a
factorization into prime factors, whence N does (as the product of all the factors of q and r). This contradicts
the definition of N, whence we conclude that S was empty.
Thus every integer n > 1 can be factored into a product of primes.
Now suppose to the contrary that the set T of all positive integers admitting at least two distinct factorizations
into prime numbers is nonempty, and let N be the smallest element of T .
Since N ∈ T, there exist primes $p_1 \le p_2 \le \cdots \le p_k$ and $q_1 \le q_2 \le \cdots \le q_\ell$, such that these lists of primes
are distinct and
$$N = p_1 \cdots p_k = q_1 \cdots q_\ell. \tag{2.1}$$
So $p_1$ divides N. By repeated application of Lemma 2.11 we deduce $p_1$ must divide $q_j$ for at least one j. But
$q_j$ is also prime, so $p_1 = q_j$. It follows that $N/p_1 = N/q_j$ is a positive integer, strictly smaller than N, which
admits two different factorizations into primes, and hence lies in T, contradicting the minimality of N. So
T is empty, and factorization is unique.
Corollary 2.13. If N > 1 is not prime then there exists a factor q of N with $1 < q \le \sqrt{N}$.
Proof. If N is not prime, let $N = p_1 \cdots p_k$ be a prime factorization of N, with k ≥ 2, and where $p_1 \le p_2 \le \cdots \le p_k$.
If $p_1 > \sqrt{N}$ then $p_1 p_2 > \sqrt{N}\sqrt{N} = N$, which is a contradiction.
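Corollary 2.13 justifies primality testing by trial division: it is enough to look for factors up to √N. A minimal Python sketch (the function names are ours, chosen for illustration):

```python
from math import isqrt


def smallest_factor(n):
    """Least factor q of n with 1 < q <= sqrt(n), or None if there is none."""
    for q in range(2, isqrt(n) + 1):
        if n % q == 0:
            return q
    return None


def is_prime(n):
    """n > 1 is prime exactly when it has no factor q with 1 < q <= sqrt(n)."""
    return n > 1 and smallest_factor(n) is None


print(smallest_factor(91))   # 7, since 91 = 7 x 13
print(is_prime(97))          # True: no factor up to isqrt(97) = 9
```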
2.4 Some applications of the Fundamental Theorem of Arithmetic
The Fundamental Theorem of Arithmetic has a number of useful applications. From now on we make the
convention that if we say
$$n = p_1^{e_1} p_2^{e_2} \cdots p_m^{e_m}$$
is a prime factorization of n, then this means that the $p_i$ are distinct and each $e_i \ge 1$ (unless specified
otherwise).
Lemma 2.14. Suppose $n = p_1^{e_1} p_2^{e_2} \cdots p_m^{e_m}$ is a prime factorization of n > 1. Then n is a perfect square if
and only if each exponent $e_i$ is even.
Proof. We say n is a square if there exists some m ∈ Z such that $n = m^2$. Clearly if $n = m^2$ and $q_1^{f_1} \cdots q_r^{f_r}$
is a prime factorization of m, then $n = m^2 = q_1^{2f_1} \cdots q_r^{2f_r}$ is a prime factorization of n (hence, by the F.T.A.,
the unique such factorization) such that each prime occurs with even exponent.
Conversely, given a prime factorization $n = p_1^{2f_1} p_2^{2f_2} \cdots p_m^{2f_m}$ in which every exponent is even, we see that
$n = a^2$, where $a = p_1^{f_1} p_2^{f_2} \cdots p_m^{f_m}$.
Lemma 2.15. Suppose n, m ≥ 1 have prime factorizations
$$n = p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r} \quad\text{and}\quad m = p_1^{f_1} p_2^{f_2} \cdots p_r^{f_r},$$
where this time we have only that each $e_i, f_i \ge 0$ so that we may assume the set of primes $\{p_1, \ldots, p_r\}$ is
the same.
1. We have that n|m if and only if for each i, $e_i \le f_i$.
2. For each i, set $g_i = \min\{e_i, f_i\} \ge 0$. Then
$$(m, n) = p_1^{g_1} p_2^{g_2} \cdots p_r^{g_r}.$$
The proof is an exercise.
This lemma gives an alternate means of computing the gcd, but this is very inefficient in practice, unless the
prime factorization is already known.
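To make part 2 of Lemma 2.15 concrete, here is a hedged Python sketch that computes the gcd from prime factorizations. The helper names `factorize` and `gcd_from_factorizations` are ours; the factorization is found by naive trial division, which is exactly the expensive step that makes this approach impractical compared with the Euclidean algorithm.

```python
from collections import Counter


def factorize(n):
    """Prime factorization of n >= 1 as a Counter {prime: exponent}."""
    exponents = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            exponents[p] += 1
            n //= p
        p += 1
    if n > 1:                 # whatever remains is a single prime factor
        exponents[n] += 1
    return exponents


def gcd_from_factorizations(m, n):
    """(m, n) as the product of p^min(e, f) over the primes common to both."""
    fm, fn = factorize(m), factorize(n)
    g = 1
    for p in fm.keys() & fn.keys():   # primes with min exponent >= 1
        g *= p ** min(fm[p], fn[p])
    return g


print(dict(factorize(360)))               # {2: 3, 3: 2, 5: 1}
print(gcd_from_factorizations(360, 84))   # 12 = 2^2 x 3
```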
2.5 Applications of the GCD: Linear Diophantine Equations
An equation of the form ax + by = n for which a, b, n are integers and we seek an integer solution (x, y) ∈ Z²
is called a linear Diophantine equation. These were first systematically solved by Brahmagupta.
We begin with a lemma, generalizing a result we noted for prime numbers earlier.
Lemma 2.16. Let a, b, c ∈ Z. If c|ab and (b, c) = 1, then c|a.
Proof. One could argue from prime factorization; alternately (and perhaps more elegantly):
Since (b, c) = 1, there exist x, y ∈ Z such that bx + cy = 1. Thus abx + acy = a. Since c divides each term
of the left side, it divides the right; that is, c|a.
Lemma 2.17. Suppose d = (a, b). Then (a/d, b/d) = 1.
Proof. We have d = ax + by for some x, y ∈ Z; since d|a and d|b we may divide to obtain
1 = (a/d)x + (b/d)y
and thus (a/d, b/d) = 1, by minimality of this expression.
Theorem 2.18. The equation ax + by = n has a solution if and only if (a, b)|n. In this case, if $(x_0, y_0)$ is
any solution, then all other solutions are of the form
$$x = x_0 + \frac{b}{(a, b)}\, t, \qquad y = y_0 - \frac{a}{(a, b)}\, t$$
for t ∈ Z.
Proof. Set d = (a, b). If d|n, say n = dk, then writing d = ax′ + by′ (Lemma 2.7) gives the solution (kx′, ky′); conversely, if there is a solution (x, y) then d|(ax + by) = n by Lemma 2.3(d).
Given one solution $(x_0, y_0)$, any other satisfies
$$a(x - x_0) + b(y - y_0) = 0$$
or
$$a(x - x_0) = -b(y - y_0).$$
We divide through by d to get
$$\frac{a}{d}(x - x_0) = \frac{b}{d}(y_0 - y),$$
but since these coefficients are relatively prime by Lemma 2.17, Lemma 2.16 yields
$$\frac{a}{d} \,\Big|\, (y_0 - y) \quad\text{and}\quad \frac{b}{d} \,\Big|\, (x - x_0).$$
Let t be such that $y_0 - y = t\,\frac{a}{d}$; by simplifying the equation above we get $x - x_0 = t\,\frac{b}{d}$. This yields the
result.
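Theorem 2.18 translates directly into a procedure. The sketch below is an illustration under the assumption that a, b > 0; the function name and return convention are ours. It finds one solution with the extended Euclidean algorithm and returns it together with the step sizes b/d and a/d that generate all other solutions.

```python
def solve_linear_diophantine(a, b, n):
    """Solve a*x + b*y = n over the integers, for a, b > 0.

    Returns (x0, y0, dx, dy) such that every solution has the form
    (x0 + dx*t, y0 - dy*t) for t in Z, or None if (a, b) does not divide n.
    """
    # Extended Euclidean algorithm: maintain old_r == a*old_x + b*old_y.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    d = old_r                      # d = (a, b) = a*old_x + b*old_y
    if n % d != 0:
        return None                # no solution unless d | n
    k = n // d
    return old_x * k, old_y * k, b // d, a // d


print(solve_linear_diophantine(6, 4, 7))    # None, since (6, 4) = 2 does not divide 7
print(solve_linear_diophantine(6, 4, 10))   # (5, -5, 2, 3): 6*5 + 4*(-5) = 10
```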
Example 2.19. Here’s a problem posed by Euler:
A farmer lays out the sum of 1770 crowns in purchasing horses and oxen. He pays 31 crowns for each horse
and 21 crowns for each ox, and buys more horses than oxen. How many horses and oxen did the farmer
buy?
The equation we wish to solve is
31x + 21y = 1770
which we know from linear algebra has infinitely many solutions in the real numbers. By Brahmagupta’s
theorem, since (31, 21) = 1, we know this equation has infinitely many solutions over Z. However, the only
correct solutions would be those in N × N.
Solution:
We can either apply the extended Euclidean algorithm, or else just look really hard, to find
31(−2) + 21(3) = 1
is one solution to 31x + 21y = 1. Therefore, a solution (x0 , y0 ) to our sought equation is
x0 = −2(1770) = −3540,
y0 = 3(1770) = 5310.
By the theorem, all solutions have the form
x = x0 + 21k,    y = y0 − 31k.
We want nonnegative solutions; so we need
−3540 + 21k ≥ 0, that is, 21k ≥ 3540, or k ≥ 3540/21 ≈ 168.6,
as well as
5310 − 31k ≥ 0, that is, 31k ≤ 5310, or k ≤ 5310/31 ≈ 171.3.
We conclude that the valid answers are for 169 ≤ k ≤ 171.
These give:
k = 169: x = 9, y = 71
k = 170: x = 30, y = 40
k = 171: x = 51, y = 9
and we deduce the unique solution is this last one (since there were more horses than oxen).
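Because the numbers are small, the answer can be double-checked by brute force. The following Python sketch (the function name `purchases` is ours) simply tries every possible number of horses:

```python
def purchases(total=1770, horse=31, ox=21):
    """All (horses, oxen) with nonnegative counts and horse*x + ox*y == total."""
    return [
        (x, (total - horse * x) // ox)
        for x in range(total // horse + 1)
        if (total - horse * x) % ox == 0
    ]


solutions = purchases()
print(solutions)                                  # [(9, 71), (30, 40), (51, 9)]
print([(x, y) for (x, y) in solutions if x > y])  # [(51, 9)]
```

The three pairs found agree with k = 169, 170, 171 above, and only the last has more horses than oxen.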
2.6 Prime numbers
So how many primes are there?
Theorem 2.20 (Euclid, ∼ 300BC). There are infinitely many prime numbers.
Proof. Suppose to the contrary that there were only finitely many prime numbers. Then we could enumerate
them all as
p1 , · · · , pn .
Consider N = p1 p2 · · · pn + 1. Since N ∈ Z and N > 1 it can be factored as a nontrivial product of primes.
Let pk be a prime factor of N . Then pk |N and by construction pk |(p1 · · · pn ), so pk |(N − p1 p2 · · · pn ). Thus
pk |1, implying pk = 1, a contradiction. Hence there must be infinitely many primes.
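It is worth noting that the proof does not claim N itself is prime, only that N has a prime factor outside the list. A quick Python check with the first six primes (a sketch; the helper name is ours) shows both points at once:

```python
from math import prod


def least_prime_factor(n):
    """Smallest prime factor of n > 1, by trial division."""
    q = 2
    while q * q <= n:
        if n % q == 0:
            return q
        q += 1
    return n


primes = [2, 3, 5, 7, 11, 13]
N = prod(primes) + 1
p = least_prime_factor(N)
print(N, p, N // p)          # 30031 59 509
print(p in primes)           # False: 59 is a prime missing from the list
```

Here N = 30031 = 59 × 509 is composite, yet each listed prime leaves remainder 1 when dividing N, so its prime factors are necessarily new.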
Since ancient times, people have been trying to find a pattern in the set of prime numbers — a formula
that could produce a prime number, or predict if a number were prime. But primes stubbornly refused to
present any regularity, until Gauss, at age 14, got a book of logarithms and of prime numbers. He tried to
find patterns; and his key idea was to consider their distribution as a probability. That is, he defined the
following function.
Definition 2.21. Let x ∈ R and define
π(x) = #{n ∈ N | n ≤ x, n is prime}
to be the number of primes up to x.
For example:
x        π(x)              log10(x) × π(x)     ln(x) × π(x)
10^2     25                50                  ∼ 115
10^3     168               504                 ∼ 1160
10^4     1229              4916                ∼ 11319
10^6     78498             470,988             ∼ 1,084,489
10^8     5,761,455         46,091,640          ∼ 1.06 × 10^8
10^12    37,607,912,018    ∼ 4.5 × 10^11       ∼ 1.04 × 10^12
Gauss noticed that
$$\pi(x) \sim \frac{x}{\log(x)}$$
where here log is the natural logarithm (base e). More precisely, he conjectured that
$$\frac{1}{2}\,\frac{x}{\log(x)} < \pi(x) < 2\,\frac{x}{\log(x)}$$
for all sufficiently large x.
Chebyshev proved this result (and a slight refinement of it, with 1/2 and 2 replaced by some constants) in
1850.
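The first rows of the table, and the bounds above, are easy to reproduce. The sketch below counts primes by naive trial division (the function name `prime_count` is ours), so it is only meant for small x:

```python
from math import isqrt, log


def prime_count(x):
    """pi(x): the number of primes n <= x, by naive trial division."""
    return sum(
        1
        for n in range(2, x + 1)
        if all(n % d != 0 for d in range(2, isqrt(n) + 1))
    )


for x in (10**2, 10**3, 10**4):
    pi_x = prime_count(x)
    ratio = pi_x / (x / log(x))      # should lie strictly between 1/2 and 2
    print(x, pi_x, round(pi_x * log(x)), round(ratio, 3))
```

The third printed column is ln(x) × π(x), which stays close to x, as Gauss observed.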
Remark 2.22. In fact, Gauss refined his conjecture by defining the logarithmic integral
$$\mathrm{Li}(x) = \int_2^x \frac{1}{\log(t)}\, dt$$
and suggesting that $\pi(x) \simeq \mathrm{Li}(x)$. This is a better estimate than the above quotient if you are considering
the difference |π(x) − Li(x)| as x → ∞ (rather than the quotient of π(x) by its estimate).
End of lecture # 3
2.7 Further notes on primes
Theorem 2.23 (Prime Number Theorem). (Proved by Hadamard and de la Vallée-Poussin in 1896; reproved by “elementary” methods in 1949 by Selberg and Erdős)
$$\lim_{x \to \infty} \frac{\pi(x)}{x/\log(x)} = 1$$
The interpretation of this theorem is, for example: consider $x = 10^9$. Then log(x) ≈ 21 so about 1 out of
every 21 numbers near x is prime.
Unfortunately (or fortunately), this gives you no information whatsoever about the primality or factorization
of any given number near $10^9$.
Eratosthenes (around 200 BC) discovered a sieve method for producing a list of all prime numbers up to N,
now called the sieve of Eratosthenes: given a table of all numbers from 2 to N, circle 2 and cross off all
multiples of 2; then choose the smallest noncrossed entry, circle it and cross off all of its multiples; repeat
until the smallest noncrossed entry exceeds √N; circle all remaining uncrossed numbers. The circled numbers are all the primes up to N.
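The procedure just described can be sketched in a few lines of Python. The names `crossed` and `circled` mirror the description and are our own; the one liberty taken is that crossing off starts at n² rather than 2n, since the smaller multiples of n have already been crossed off by smaller primes.

```python
from math import isqrt


def sieve(N):
    """List of all primes up to N, by the sieve of Eratosthenes."""
    crossed = [False] * (N + 1)
    circled = []
    for n in range(2, N + 1):
        if crossed[n]:
            continue
        circled.append(n)                 # smallest uncrossed entry is prime
        if n <= isqrt(N):                 # beyond sqrt(N), nothing new gets crossed
            for multiple in range(n * n, N + 1, n):
                crossed[multiple] = True
    return circled


print(sieve(30))         # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(len(sieve(100)))   # 25, matching pi(100)
```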
This sieve method is quite effective for finding primes, up to a point. Nowadays, you can find lists of prime
numbers on the internet, up to quite large numbers.
Another strong interest is to produce formulas or sequences which generate prime numbers. Some exist (see
[R]) but are of limited use (since they depend on real constants whose exact value is unknown, for example,
or because the effort of using them to compute new primes is worse than using the sieve of Eratosthenes).
We discuss some interesting failed examples next.
Mersenne primes
Proposition 2.24. Let n ≥ 2, a ≥ 2. If $a^n - 1$ is prime then a = 2 and n is prime.
Proof. We have that
$$a^n - 1 = (a - 1)(a^{n-1} + \cdots + a + 1);$$
so if $a^n - 1$ is prime, one of these factors must be 1. If n ≥ 2 the second factor is not 1, so we have
a − 1 = 1, or a = 2.
Now suppose n = rs is composite. Setting $a = 2^r$ (and replacing n by s) in the expression above we have that
$$2^{rs} - 1 = (2^r - 1)(2^{r(s-1)} + \cdots + 2^r + 1)$$
which, for r, s > 1, is a nontrivial factorization of $2^n - 1$. So if $2^n - 1$ is prime then necessarily n is prime.
Mersenne guessed: $M_p = 2^p - 1$, for p a prime, would always be prime. We see that $M_p$ is prime for
p ∈ {2, 3, 5, 7}. But $M_{11}$ is composite; and so far there are only some 47 known Mersenne primes.
Definition 2.25. A number of the form $a = 2^n - 1$, n ≥ 1, is called a Mersenne number and if a is prime
then it’s called a Mersenne prime.
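The small cases are easy to check by machine. This Python sketch (with a naive trial-division `is_prime` of our own) tests the Mersenne numbers with prime exponent up to 13:

```python
from math import isqrt


def is_prime(n):
    """Naive primality test by trial division up to sqrt(n)."""
    return n > 1 and all(n % d != 0 for d in range(2, isqrt(n) + 1))


for p in (2, 3, 5, 7, 11, 13):
    M = 2**p - 1
    print(p, M, is_prime(M))

# M_11 = 2047 is the first failure of Mersenne's guess:
print(2047 == 23 * 89)   # True
```

The output shows that M_p is prime for p = 2, 3, 5, 7 and 13, while M_11 = 2047 = 23 × 89 is composite.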
Fermat primes
Proposition 2.26. Let n ≥ 1, a ≥ 2. If $a^n + 1$ is prime, then a is even and $n = 2^r$ for some r ≥ 0.
Proof. Suppose $a^n + 1$ is prime; then (since $a^n + 1 \ge 3$) it is odd, so $a^n$ must be even, so a is even.
A variant on our usual telescoping formula for factorization yields, for odd n, the factorization:
$$a^n + 1 = (a + 1)(a^{n-1} - a^{n-2} + a^{n-3} - \cdots - a + 1)$$
so if $a^n + 1$ is prime and n > 1, then n is even. Furthermore, if $n = 2^q m$ for some odd m > 1, then we could apply the
above identity with a replaced by $a^{2^q}$ and n replaced by m to again derive a contradiction; hence $n = 2^q$ for
some q.
Fermat guessed that all $F_n = 2^{2^n} + 1$ were prime. We see that $F_n$ is prime for n ∈ {0, 1, 2, 3, 4}. No other
prime $F_n$ have been found to date.
Definition 2.27. $F_n$ is called a Fermat number; when $F_n$ is prime it is called a Fermat prime.
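Fermat's guess already fails at n = 5: Euler found that 641 divides F_5. A short Python sketch (helper names ours) confirms this by trial division, which is fast here because the least factor is small:

```python
def fermat(n):
    """The Fermat number F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


def least_factor(N):
    """Smallest factor of N greater than 1 (equal to N when N is prime)."""
    d = 2
    while d * d <= N:
        if N % d == 0:
            return d
        d += 1
    return N


for n in range(6):
    F = fermat(n)
    print(n, F, "prime" if least_factor(F) == F else f"divisible by {least_factor(F)}")
```

The last line of output reports that F_5 = 4294967297 is divisible by 641; indeed F_5 = 641 × 6700417.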
Other formulas
The number $n^2 + n + 41$ is prime for all n up to 39, but n = 40 gives a composite answer. In fact, no
nonconstant polynomial in one variable with integer coefficients can always yield primes (since for any n, p, h, we have f(n + ph) = f(n) + px for
some integer x).
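Both claims about this polynomial can be checked directly. In the Python sketch below (helper names ours), the second check takes p = f(0) = 41 and n = 0, so that every value f(41h) must be divisible by 41:

```python
from math import isqrt


def is_prime(n):
    """Naive primality test by trial division up to sqrt(n)."""
    return n > 1 and all(n % d != 0 for d in range(2, isqrt(n) + 1))


def f(n):
    return n * n + n + 41


first_failure = next(n for n in range(100) if not is_prime(f(n)))
print(first_failure, f(first_failure))   # 40 1681, and 1681 = 41 x 41

# f(0 + 41h) = f(0) + 41x for some integer x, so 41 divides f(41h):
print(all(f(41 * h) % 41 == 0 for h in range(1, 10)))   # True
```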
So we don’t have formulas for primes, but:
Finding huge primes isn’t too tough these days: by the Prime Number Theorem, you know about how many
odd numbers of a given size you ought to generate at random to ensure that at least one is prime; and then
there are many efficient probabilistic algorithms (even a deterministic polynomial time algorithm!) that tell
you if a given number is prime.
We’ll come back to this a bit later, when we see some of the ways that prime numbers essentially self-identify
on tests.
2.8 Alternate proof of the infinitude of primes: the zeta function
Definition 2.28. The zeta function is defined on all real numbers s with s > 1 by:
$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.$$
We know from analysis that this series converges for all s > 1 but when s = 1 it is the harmonic series
$$\sum_{n=1}^{\infty} \frac{1}{n}$$
which diverges.
In some cases its explicit value has been computed; for example Euler proved that $\zeta(2) = \pi^2/6$.
The zeta function is related to primes by the following theorem of Euler:
Theorem 2.29. For any s > 1,
$$\sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}}.$$
Proof. Recall the geometric series:
$$\sum_{n=0}^{\infty} x^n = \frac{1}{1 - x}$$
whenever |x| < 1. Setting $x = p^{-s} < 1$, we have
$$\frac{1}{1 - p^{-s}} = 1 + p^{-s} + p^{-2s} + \cdots \ge 1.$$
So the right hand side is
$$\prod_{p \text{ prime}} \left(1 + p^{-s} + p^{-2s} + \cdots\right)$$
but what is an infinite product of an infinite sum? Well, denote the kth prime by $p_k$ (so $p_1 = 2$, $p_2 = 3$,
etc). Then we define this expression as:
$$\lim_{m \to \infty} \prod_{k=1}^{m} \left(1 + p_k^{-s} + p_k^{-2s} + \cdots\right)$$
Since the series on the right are absolutely convergent, we can manipulate at will:
$$\prod_{k=1}^{m} \left(1 + p_k^{-s} + p_k^{-2s} + \cdots\right) = \sum_{i_1, \ldots, i_m \ge 0} p_1^{-i_1 s} p_2^{-i_2 s} \cdots p_m^{-i_m s} = \sum_{i_1, \ldots, i_m \ge 0} \frac{1}{(p_1^{i_1} p_2^{i_2} \cdots p_m^{i_m})^s} = \sum_{n \in N_m} \frac{1}{n^s}$$
where $N_m$ is the set of all positive integers whose unique factorization includes only the first m primes.
Now (a) since the elements of $N_m$ are distinct and (b) since the numbers 1 up to m are all in the set $N_m$,
we have the two inequalities:
$$\zeta(s) \ge \sum_{n \in N_m} \frac{1}{n^s} \ge \sum_{n=1}^{m} \frac{1}{n^s},$$
which yields
$$\zeta(s) \ge \prod_{k=1}^{m} \left(1 + p_k^{-s} + p_k^{-2s} + \cdots\right) \ge \sum_{n=1}^{m} \frac{1}{n^s}$$
so that upon taking the limit as m → ∞, we have the sought equality by the squeeze theorem.
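The squeeze can be observed numerically at s = 2, where Euler's value ζ(2) = π²/6 is available for comparison. In this Python sketch (function names ours) the partial Euler product is taken over all primes up to a bound B rather than over the first m primes; the same argument applies, since every n ≤ B has all of its prime factors at most B.

```python
from math import isqrt, pi


def primes_up_to(bound):
    """All primes <= bound, by naive trial division."""
    return [n for n in range(2, bound + 1)
            if all(n % d != 0 for d in range(2, isqrt(n) + 1))]


def euler_product(s, bound):
    """Partial Euler product over primes p <= bound of 1 / (1 - p^(-s))."""
    product = 1.0
    for p in primes_up_to(bound):
        product *= 1.0 / (1.0 - p ** (-s))
    return product


def partial_zeta(s, m):
    """Partial sum of 1/n^s for n = 1, ..., m."""
    return sum(1.0 / n ** s for n in range(1, m + 1))


s, bound = 2, 1000
lower = partial_zeta(s, bound)
middle = euler_product(s, bound)
upper = pi ** 2 / 6
print(lower, middle, upper)   # increasing: partial sum <= partial product <= zeta(2)
```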
Corollary 2.30. There are infinitely many primes.
Proof. If the number of primes is finite, then the Euler product E(s) is a finite product, and hence computes
to a finite number, for any s ≥ 1. The geometric series used in the proof above is convergent
even for s = 1, so our argument holds, and we could conclude that ζ(1) = E(1) < ∞. But the zeta function
diverges at s = 1, being the harmonic series; this is a contradiction. Hence E(1) must also diverge, and there are infinitely many primes.
Riemann (1826-1866) lived about 100 years after Euler (1707-1783). He considered the function ζ(s) for
s ∈ C. The formula for ζ(s) converges for all s ∈ C such that Re(s) > 1. One can show that ζ can be
extended (analytically continued) to a complex-valued function on all of C in the sense that there exists
a meromorphic function on C which agrees with ζ(s) wherever ζ(s) converges. We call this meromorphic
function the Riemann zeta function.
2.9 Exercises
1. Let S be a set of integers which is bounded above, that is, there exists a such that for all x ∈ S, x ≤ a.
Deduce from the Well Ordering Principle that S has a maximal element.
2. Find the prime factorization of 13!.
3. Show that if $n = p_1^{a_1} \cdots p_k^{a_k}$ with $p_1 < p_2 < \cdots < p_k$ and $a_1, a_2, \dots, a_k > 0$ (for example, $12 = 2^2 3^1$) then
the total number of positive divisors of n is $\nu(n) = (a_1 + 1)(a_2 + 1) \cdots (a_k + 1)$.
4. Prove Lemma 2.15.
5. Define the least common multiple of a and b to be the least n ≥ 1 such that a|n and b|n. Show that
lcm(a, b) divides any other common multiple of a and b, and that lcm(a, b)|ab.
6. Prove that the sieve of Eratosthenes works, that is, prove that all the circled numbers at the end of
the algorithm are prime, and that these exhaust all primes up to N .
7. Prove that for any k > 0, there exist k consecutive composite integers.
8. Show that for any n, p, h, if f is a polynomial with integer coefficients we have f(n + ph) = f(n) + px
for some integer x. The binomial theorem is useful here. Use this to deduce that, if f is nonconstant, f(m) cannot be prime
for all m.
Chapter 3
Modular Arithmetic
A key tool in elementary number theory is modular arithmetic. For material in this chapter, see [KR,
Ch 3].
3.1 The ring Z/nZ
3.1.1 The set Z/nZ
Definition 3.1. Let a, b ∈ Z, and n ≥ 1. We say a is congruent to b mod n, and write a ≡ b (mod n),
whenever n|(a − b).
Thus, for example, 1 ≡ 5 (mod 4), and a ≡ 0 (mod n) exactly when n|a. For all a, b ∈ Z, a ≡ b (mod 1), so
we often exclude this trivial case.
Lemma 3.2. Congruence is an equivalence relation, that is, it satisfies:
1. (reflexivity) a ≡ a (mod n)
2. (symmetry) if a ≡ b (mod n) then b ≡ a (mod n).
3. (transitivity) if a ≡ b (mod n) and b ≡ c (mod n) then a ≡ c (mod n).
The proof is an exercise.
Since congruence is an equivalence relation, we deduce that we can partition the integers into equivalence
classes according to this relation. Fix n and write $\overline{a}$ for the class of a (mod n). Then for example if n = 10
then $\overline{1} = \{\dots, -19, -9, 1, 11, 21, \dots\}$; and this is $\overline{11} = \overline{-9}$ as well.
We define
\[ \mathbb{Z}/n\mathbb{Z} = \{\overline{a} \mid a \in \mathbb{Z}\}. \]
Suppose a ∈ Z; then we can apply the division algorithm to write a = qn + r with 0 ≤ r < n. Since n|(a − r),
we have that a ≡ r (mod n), and so $\overline{a} = \overline{r}$.
We conclude that every equivalence class contains a smallest nonnegative representative r, with 0 ≤ r < n.
No two of these are in the same congruence class. Hence we may write
\[ \mathbb{Z}/n\mathbb{Z} = \{\overline{0}, \overline{1}, \dots, \overline{n-1}\}. \]
End of lecture # 4
3.1.2 Arithmetic on Z/nZ
Let us consider some properties and operations on Z, and see which ones descend to define properties and
operations on Z/nZ. Throughout, let us assume n > 1 to avoid the trivial case.
Proposition 3.3. Let a, b, c, d ∈ Z and suppose n ≥ 2. If a ≡ b (mod n) and c ≡ d (mod n) then
• a + c ≡ b + d (mod n), and
• ac ≡ bd (mod n).
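Consequently, the operations of addition and multiplication mod n are well-defined, that is, we may define
\[ \overline{a} + \overline{b} := \overline{a + b} \quad\text{and}\quad \overline{a} \cdot \overline{b} := \overline{ab} \]
because this is independent of the representatives (a and b) chosen.
The proof is an exercise. So we write
\[ \overline{4} + \overline{5} = \overline{2} \in \mathbb{Z}/7\mathbb{Z}, \]
or 1 + 1 ≡ 0 (mod 2).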
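To see what "independent of the representatives" means in practice, here is a small sketch that adds and multiplies classes mod 7 using two different representatives of each class. The helper names add_mod and mul_mod are ours; Python's % operator returns the least nonnegative representative when the modulus is positive.

```python
def add_mod(a, b, n):
    """Least nonnegative representative of the class of a + b modulo n."""
    return (a + b) % n

def mul_mod(a, b, n):
    """Least nonnegative representative of the class of a * b modulo n."""
    return (a * b) % n

n = 7
# 25 is another representative of the class of 4, and -9 of the class of 5 (mod 7).
for a, b in [(4, 5), (25, 5), (4, -9), (25, -9)]:
    print(a, b, add_mod(a, b, n), mul_mod(a, b, n))
```

Every row gives the same pair of results, as Proposition 3.3 predicts.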
Corollary 3.4. Let a, b, c ∈ Z and n ≥ 2. The following properties of arithmetic hold in Z/nZ.
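1. addition is commutative: $\overline{a} + \overline{b} = \overline{b} + \overline{a}$
2. addition is associative: $(\overline{a} + \overline{b}) + \overline{c} = \overline{a} + (\overline{b} + \overline{c})$
3. there is a zero element: $\overline{a} + \overline{0} = \overline{a}$
4. every element has an additive inverse: $\overline{a} + (-\overline{a}) = \overline{0}$, where $-\overline{a} = \overline{-1} \cdot \overline{a}$
5. multiplication is commutative: $\overline{a} \cdot \overline{b} = \overline{b} \cdot \overline{a}$
6. multiplication is associative: $(\overline{a}\,\overline{b})\,\overline{c} = \overline{a}\,(\overline{b}\,\overline{c})$
7. there is a multiplicative unit: $\overline{a} \cdot \overline{1} = \overline{a}$
8. multiplication distributes over addition: $\overline{a}(\overline{b} + \overline{c}) = \overline{a}\,\overline{b} + \overline{a}\,\overline{c}$.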
However, in general not every nonzero element of Z/nZ has a multiplicative inverse, nor does the cancellation
property of multiplication necessarily hold.
Proof. These properties all hold for integer arithmetic, and hence hold in modular arithmetic since we may
check both sides using any representative of each equivalence class.
To prove the final assertions, it suffices to give examples.
So for example, $\overline{2} \neq \overline{0}$ in Z/4Z has no multiplicative inverse since this would require the existence of an integer
a such that 2a ≡ 1 (mod 4), meaning 4|(2a − 1). But 2a − 1 is odd; impossible.
We also have in Z/15Z that $\overline{5} \cdot \overline{4} = \overline{20} = \overline{5} = \overline{5} \cdot \overline{1}$; hence since $\overline{4} \neq \overline{1}$, cancellation fails.
This corollary identifies Z/nZ as a ring, which is a well-behaved algebraic object, and will be pleasant to
work with (particularly since it is finite, a nice advantage over Z).
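Example 3.5. Let n = 9. Then we calculate
10 ≡ 1 (mod 9)
100 ≡ 1 (mod 9)
1000 ≡ 1 (mod 9)
Hence
\[
\begin{aligned}
\overline{486} &= \overline{4}\,(\overline{100}) + \overline{8}\,(\overline{10}) + \overline{6} \\
&= \overline{4} + \overline{8} + \overline{6} \\
&= \overline{18} \\
&= \overline{10} + \overline{8} \\
&= \overline{1} + \overline{8} \\
&= \overline{9} = \overline{0}
\end{aligned}
\]
which is just the familiar rule of casting out nines.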
One can formulate the rules for divisibility by 2 and 5 in the same way.
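The digit-sum rule of Example 3.5 can be sketched in a few lines. The function names below are ours, and the sketch assumes a nonnegative integer written in base 10; it relies only on the fact that every power of 10 is congruent to 1 mod 9.

```python
def digit_sum(n):
    """Sum of the base-10 digits of a nonnegative integer n."""
    return sum(int(d) for d in str(n))

def residue_mod_9_by_digits(n):
    """Reduce n mod 9 by repeatedly replacing it with its digit sum."""
    while n >= 10:
        n = digit_sum(n)
    # A single digit 9 represents the class of 0.
    return n % 9

print(residue_mod_9_by_digits(486))  # 486 -> 18 -> 9, which is the class of 0
```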
Example 3.6. Let n = 4. Then 100 ≡ 0 (mod 4), so
234248972 ≡ 72 (mod 4)
whence divisibility by 4 is determined by the last two digits.
More generally, one can use the properties to simplify what at first seem monstrous calculations.
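Example 3.7. Compute $2^{32}$ mod 11. By this we mean: find the least nonnegative representative of the
mod 11 congruence class of $2^{32}$.
One option: compute $2^{32}$ (around 4 billion) and then divide by 11 to find the remainder.
Second option: we note that $2^5 = 32 \equiv -1$ (mod 11) and so
\[ \overline{2^{10}} = \left(\overline{2^5}\right)^2 = \left(\overline{-1}\right)^2 = \overline{1} \]
whence
\[ \overline{2^{32}} = \left(\overline{2^{10}}\right)^3\, \overline{2^2} = \overline{1}^3 \cdot \overline{4} = \overline{4} \]
so the answer is 4.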
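On a computer one would usually not look for a clever relation such as $2^5 \equiv -1$; a common alternative is square-and-multiply, which reduces mod n after every step so that intermediate values stay small. The sketch below is that general technique, not the shortcut used in the example, and the name power_mod is ours. Python's built-in three-argument pow performs the same computation.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by square-and-multiply (exponent >= 0)."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                  # current binary digit of the exponent is 1
            result = (result * base) % modulus
        base = (base * base) % modulus    # square for the next binary digit
        exponent >>= 1
    return result

print(power_mod(2, 32, 11))  # 4, agreeing with Example 3.7
print(pow(2, 32, 11))        # the built-in gives the same value
```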
We can also use modular arithmetic to change questions about all integers to questions about a finite set,
which can then be answered exhaustively.
Example 3.8. Prove that if n is an odd integer then $n^2 - 1$ is a multiple of 8.
One option: consider prime factorization.
Another option: Let’s rephrase this as a question in modular arithmetic. We wish to show that for any odd
integer n, $n^2 \equiv 1$ (mod 8). Thus it suffices to show that in Z/8Z, $\overline{n}^2 = \overline{1}$ holds for $\overline{n} \in \{\overline{1}, \overline{3}, \overline{5}, \overline{7}\}$. It does,
and so we are done.
Example 3.9. Show that all integer solutions to $x^2 + y^2 + z^2 = xyz$ are such that x, y and z are divisible
by 3.
Solution: It suffices to show that if equality holds mod 3, then $\overline{x}$, $\overline{y}$ and $\overline{z}$ are all $\overline{0}$.
We can fill in a table of all possible combinations. If any of $\overline{x}$, $\overline{y}$ or $\overline{z}$ is $\overline{0}$, then the right side is $\overline{0}$ and the
left side is a sum of 0, 1 or 2 nonzero squares. But the only squares mod 3 are $\overline{0}$ and $\overline{1}$, so equality can only
hold if all are $\overline{0}$.
Now suppose that none are $\overline{0}$; then the right side is nonzero while the left side is $\overline{1} + \overline{1} + \overline{1} = \overline{0}$, a contradiction.
Hence the only solution was all $\overline{0}$, as required.
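Both Example 3.8 and Example 3.9 reduce a statement about all integers to a finite check, and such a check is easy to automate. The following sketch simply enumerates the classes; the function names are ours.

```python
from itertools import product

def odd_squares_mod_8():
    """Set of values n^2 mod 8 as n runs over the odd classes 1, 3, 5, 7."""
    return {(n * n) % 8 for n in range(8) if n % 2 == 1}

def solutions_mod_3():
    """All triples (x, y, z) of classes mod 3 with x^2 + y^2 + z^2 = xyz in Z/3Z."""
    return [
        (x, y, z)
        for x, y, z in product(range(3), repeat=3)
        if (x * x + y * y + z * z - x * y * z) % 3 == 0
    ]

print(odd_squares_mod_8())  # {1}
print(solutions_mod_3())    # [(0, 0, 0)]
```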
Exercise: Show that if m ≡ 3 (mod 4), then m cannot be the sum of two squares.
Exercise: Find a formula for $2^{4k}$ mod 5.
Remark 3.10. Let us conclude this section by listing some properties which do not descend to properties
of Z/nZ. That is, the following statements and symbols are undefinable:
• $\overline{a}$ is prime
• $\overline{a}$ is even/odd (unless n is even)
• $(\overline{a}, \overline{b})$
• $\overline{a}^{\overline{b}}$
And there are many more.
3.1.3 Cancellation and invertibility in Z/nZ
So far we have held n fixed and considered operations on the set of classes mod n.
Lemma 3.11. If a ≡ b (mod n) and d|n then a ≡ b (mod d).
This is clear from the definition, and the transitivity of divisibility.
Proposition 3.12. Suppose a, b, c ∈ Z with c ≠ 0, and n ≥ 1. Then
\[ ac \equiv bc \pmod{n} \;\Rightarrow\; a \equiv b \pmod{\frac{n}{(n, c)}} \]
Proof. Set d = (c, n) and suppose ac ≡ bc (mod n). Then n|(ac − bc) which implies n|c(a − b). Consequently,
\[ \frac{n}{d} \,\Big|\, \frac{c}{d}\,(a - b) \]
since these fractions are integers. Now by a previous lemma, (n/d, c/d) = 1, which allows us to conclude
that
\[ \frac{n}{d} \,\Big|\, (a - b) \]
whence the result.
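As a sanity check on Proposition 3.12, the sketch below tests the implication by brute force for small moduli. The names reduced_modulus and check_cancellation are ours, and the ranges searched are arbitrary small choices, so this is an experiment and not a proof.

```python
from math import gcd

def reduced_modulus(n, c):
    """The modulus n / (n, c) appearing in Proposition 3.12."""
    return n // gcd(n, c)

def check_cancellation(n):
    """Check that ac = bc (mod n) forces a = b (mod n/(n, c)) for small a, b, c."""
    for c in range(1, 2 * n):
        m = reduced_modulus(n, c)
        for a in range(n):
            for b in range(n):
                if (a * c - b * c) % n == 0 and (a - b) % m != 0:
                    return False
    return True

# In Z/15Z we have 5*4 = 5*1, and indeed 4 = 1 modulo 15/(15, 5) = 3.
print(reduced_modulus(15, 5), (4 - 1) % reduced_modulus(15, 5))
print(all(check_cancellation(n) for n in range(1, 13)))
```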
We deduce that the cancellation law holds for those c for which (n, c) = 1 holds. (Exercise: check that
this property is independent of the choice of representative c; this is implicit in the proof of the preceding
lemma.) The subset of such c is of particular importance to us.
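Definition 3.13. Let n ≥ 2 and set
\[ U_n = \{\overline{a} \in \mathbb{Z}/n\mathbb{Z} \mid (a, n) = 1\}; \]
we call this the group of units of Z/nZ, and it is sometimes denoted $(\mathbb{Z}/n\mathbb{Z})^*$.
For example,
\[ U_8 = \{\overline{1}, \overline{3}, \overline{5}, \overline{7}\} \]
whereas
\[ U_5 = \{\overline{1}, \overline{2}, \overline{3}, \overline{4}\}. \]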
Definition 3.14. Let n ≥ 2. The number of elements in $U_n$ is denoted φ(n). We set φ(1) = 1. The function
φ is called Euler’s totient function, or sometimes just Euler’s phi function.
For example, we have
φ(2) = 1, φ(3) = 2, φ(4) = 2, φ(5) = 4, φ(6) = 2, φ(7) = 6, φ(8) = 4, . . .
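These values can be reproduced directly from the definition by listing the units. A minimal sketch, with function names of our own choosing, using least nonnegative representatives for the classes:

```python
from math import gcd

def units(n):
    """Least nonnegative representatives of the classes in U_n, for n >= 2."""
    return [a for a in range(n) if gcd(a, n) == 1]

def phi(n):
    """Euler's totient by brute force, with phi(1) = 1 by convention."""
    return 1 if n == 1 else len(units(n))

print(units(8))                        # [1, 3, 5, 7]
print([phi(n) for n in range(2, 9)])   # [1, 2, 2, 4, 2, 6, 4]
```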
Is there an easier way to compute this function, rather than listing the elements of $U_n$?
Lemma 3.15. We have
• If p is prime, then φ(p) = p − 1 and for any k ≥ 1, $\varphi(p^k) = p^k - p^{k-1}$.
• If m, n ≥ 1 and (m, n) = 1 then φ(mn) = φ(m)φ(n).
Proof. First suppose that p is prime. Then p is relatively prime to all positive integers less than p, whence
$|U_p| = p - 1 = \varphi(p)$.
Now let k ≥ 1. A number $x < p^k$ fails to be relatively prime to $p^k$ if and only if it admits p as a prime factor.
There are exactly $p^{k-1}$ multiples of p less than $p^k$ (including 0), so $\varphi(p^k) = p^k - p^{k-1}$.
Finally, suppose m, n ≥ 2 and (m, n) = 1. (The case that either is 1 is trivial.) Note that (a, mn) = 1 iff
(a, m) = 1 and (a, n) = 1. Form a matrix of all the integers from 0 to mn − 1, as follows:
\[
\begin{array}{ccccc}
0 & 1 & 2 & \cdots & m-1 \\
m & m+1 & m+2 & \cdots & 2m-1 \\
2m & 2m+1 & 2m+2 & \cdots & 3m-1 \\
3m & 3m+1 & 3m+2 & \cdots & 4m-1 \\
\vdots & \vdots & \vdots & & \vdots \\
(n-1)m & (n-1)m+1 & (n-1)m+2 & \cdots & nm-1
\end{array}
\]
Consider the jth column, whose entries are
j, m + j, 2m + j, · · · , (n − 1)m + j
• We see that either each of these numbers is relatively prime to m, or else none of them are. Therefore
there are φ(m) columns in which all elements are relatively prime to m.
• Also, no two of these elements can be congruent modulo n, since km + j ≡ ℓm + j (mod n) implies
km ≡ ℓm (mod n) which, since (m, n) = 1, implies k ≡ ℓ (mod n). But since 0 ≤ k, ℓ < n, this is
impossible unless k = ℓ.
Hence the n elements in the jth column represent the n different congruence classes modulo n. Of
these, there are φ(n) which are relatively prime to n (but we can’t say in which order they appear in
the column).
We have identified φ(m) columns in which all elements are relatively prime to m, and within these, φ(n)
elements are also relatively prime to n. This gives exactly φ(m)φ(n) elements in the table which are relatively
prime to both m and n, and hence to mn.
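The counting argument can be replayed on a concrete table. The sketch below assumes m, n ≥ 2 with (m, n) = 1, builds each column j, m + j, ..., (n − 1)m + j, confirms that it meets every class mod n exactly once, and counts the entries relatively prime to n. The function name count_via_table is ours.

```python
from math import gcd

def count_via_table(m, n):
    """Return (number of columns coprime to m, number of entries coprime to mn).

    Assumes m, n >= 2 and gcd(m, n) == 1, as in the proof.
    """
    good_columns = [j for j in range(m) if gcd(j, m) == 1]
    total = 0
    for j in good_columns:
        column = [k * m + j for k in range(n)]
        # The column represents each congruence class modulo n exactly once.
        assert sorted(x % n for x in column) == list(range(n))
        total += sum(1 for x in column if gcd(x, n) == 1)
    return len(good_columns), total

print(count_via_table(5, 3))  # (4, 8): phi(5) columns, phi(5) * phi(3) entries
```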
Example 3.16. φ(49) = 49 − 7 = 42; φ(15) = φ(5)φ(3) = 8.
Exercise: find φ(360) and φ(100).
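Lemma 3.15 gives a way to compute φ(n) from the prime factorization of n, without listing units. The sketch below factors n by trial division, which is our own choice and is only reasonable for small n, and then multiplies the factors $p^k - p^{k-1}$; the function names are ours.

```python
def factorize(n):
    """Prime factorization of n >= 1 by trial division, as {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                       # whatever remains is a prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors

def phi(n):
    """Euler's totient via phi(p^k) = p^k - p^(k-1) and multiplicativity."""
    result = 1
    for p, k in factorize(n).items():
        result *= p**k - p**(k - 1)
    return result

print(phi(49), phi(15))  # 42 8, as in Example 3.16
```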
Now that we know how many elements are in the group of units $U_n$, what else can we say about this set?
Proposition 3.17. For n ≥ 2, $U_n$ is a group under the multiplication in Z/nZ. That is, it is closed under
multiplication, contains $\overline{1}$, and each element $\overline{a}$ of $U_n$ has a multiplicative inverse.
We note that commutativity and associativity of multiplication in $U_n$ follow from that of multiplication in
Z/nZ.
Proof. Clearly (1, n) = 1 so $\overline{1} \in U_n$.
Closure: if (a, n) = 1 and (b, n) = 1 then (ab, n) = 1, whence if $\overline{a}, \overline{b} \in U_n$, we have $\overline{a}\,\overline{b} \in U_n$.
Multiplicative inverse: If (a, n) = 1 then there exist x, y ∈ Z such that ax + ny = 1. This means that modulo
n we have
\[ \overline{ax + ny} = \overline{1} \]
whence
\[ \overline{a}\,\overline{x} = \overline{1}. \]
The equation ax + ny = 1 assures us that (x, n) = 1, so $\overline{x}$ is the multiplicative inverse we sought.
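Example 3.18. In $U_5$, we have $\overline{1}^{-1} = \overline{1}$ and $\overline{2}^{-1} = \overline{3}$; the rest follow from the identity $(\overline{a}^{-1})^{-1} = \overline{a}$.
In $U_{31}$ we have $\overline{21}^{-1} = \overline{3}$, by a previous calculation.
In $U_8$, we have $\overline{5}^{-1} = \overline{5}$; in fact all elements are self-inverse.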
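The proof of Proposition 3.17 is constructive: solving ax + ny = 1 produces the inverse. One standard way to find x and y is the extended Euclidean algorithm, sketched below under the assumption that n ≥ 2; the function names are ours.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g, for a, b >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def inverse_mod(a, n):
    """Least nonnegative representative of the inverse of a in U_n."""
    g, x, _ = extended_gcd(a % n, n)
    if g != 1:
        raise ValueError("a is not a unit modulo n")
    return x % n

print(inverse_mod(2, 5), inverse_mod(21, 31), inverse_mod(5, 8))  # 3 3 5
```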
3.2 Exercises
1. Give examples to show that each of the properties in Remark 3.10 and each of the symbols are undefined,
by showing that these concepts depend on the choice of representative of the given class.
2. Is the sum of three consecutive cubes always divisible by 9?
3. If a ≡ b (mod n), does it follow that (a, m) = (b, m)?
4. Prove the formula for n > 1: $\varphi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right)$. That is, take all primes dividing n, compute
1 − 1/p for each of these, multiply them all together, and multiply the result by n. Here $\prod$ is the product
symbol and the subscript indicates the product runs over the set of prime divisors of n.
Bibliography
[KR] Ramanujachary Kumanduri and Cristina Romero, Number Theory with Computer Applications, Prentice Hall, 1998.
[R] Paulo Ribenboim, The Little Book of Big Primes, Springer, 1991.