Computational discrete mathematics with Python

advertisement
Computational discrete mathematics with
Python
Fred Richman
Florida Atlantic University
March 17, 2013
Contents
0 Python
2
1 Primes
1.1 The smallest prime factor . . . .
1.2 Using an if-statement . . . . . .
1.3 Using a for-statement . . . . . .
1.4 Testing the algorithm . . . . . . .
1.5 De…ning a function . . . . . . . .
1.6 Speeding the algorithm up . . . .
1.7 Goldbach’s conjecture . . . . . .
1.8 Sophie Germain primes . . . . . .
1.9 There are in…nitely many primes
1.10 Largest prime factor . . . . . . .
1.11 Complete factorization . . . . . .
1.12 More on recursive functions . . .
1.13 Working with lists . . . . . . . .
2 The
2.1
2.2
2.3
.
.
.
.
.
.
.
.
.
.
.
.
.
3
4
6
6
7
7
8
9
11
12
16
17
18
19
Euclidean algorithm
The Euler '-function . . . . . . . . . . . . . . . . . . . . . . .
The extended Euclidean algorithm . . . . . . . . . . . . . . .
Orders of units modulo n . . . . . . . . . . . . . . . . . . . . .
20
21
24
25
1
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
3 Sieving for primes
27
3.1 Counts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4 Mersenne primes
31
4.1 The Lucas-Lehmer test . . . . . . . . . . . . . . . . . . . . . . 33
4.2 Repunits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.3 Emirps and palindromic primes . . . . . . . . . . . . . . . . . 34
5 Fermat’s theorem
34
5.1 Carmichael numbers . . . . . . . . . . . . . . . . . . . . . . . 36
5.2 The Miller-Rabin test . . . . . . . . . . . . . . . . . . . . . . . 36
6 The sum of the squares of the digits of a number
38
7 Finding orbits for sums of e-th powers
41
8 Aliquot sequences
41
9 Ciphers
42
10 Stu¤
43
0
Python
You can’t do computational mathematics without writing programs. In this
course, all of the programs are to be written in Python. There is a lot of
material on Python on the web. The particular version I’m going to use is
Python 2.7.3. The …rst thing you need to do is to download Python 2.7.3.
You can do this at
http://www.python.org/getit/
After you have downloaded Python, open IDLE, the Python GUI (graphical user interface). This gives you an interactive window in which you can
play with Python. The prompt line looks like
>>>
You can try various arithmetic operations …rst. If you type 2+7, and then
“Enter”, you should see
2
9
>>>
Try typing 2*7 and 2/7 and 2**7. The last one should give you 27 which is
128. Now try 2**100. The L at the end of the number means that it is a
“long integer”.
Now type range(10). You should get a list of the …rst ten numbers,
starting with 0. Lists in Python are enclosed with square brackets, and the
entries are separated by commas. Now try range(5,10) and range(-5,10).
Try range(a,b) with various values of a and b until you understand how it
works. When you understand, write down a description of what range(a,b)
is for arbitrary integers a and b.
Now type range(0,100,7) and range(-20,20,3). What do these look
like? Try a few more and then write down a description of what range(a,b,c)
is for arbitrary integers a, b, and c.
1
Primes
We will be working mostly with the integers:
:::
5; 4; 3; 2; 1; 0; 1; 2; 3; 4; 5; : : :
A prime number is an integer that is greater than 1, but is not the product
of two integers that are greater than 1. The two smallest prime numbers are
2 and 3. The number 4 is not prime because 4 = 2 2; the number 51 is not
prime because 51 = 3 17. The …rst twenty prime numbers are
2; 3; 5; 7; 11; 13; 17; 19; 23; 29; 31; 37; 41; 43; 47; 53; 59; 61; 67; 71
Your …rst project regarding primes is to familiarize yourself with the Python
operator %. Get into the Python interactive mode (shell or console). One
way to do this is just to open IDLE from the start menu. You should see the
prompt:
>>>
If you type 2+3, and then “Enter”, you should see
5
>>>
3
If you type 100%17, and then “Enter”, you should see
15
>>>
That’s because when you divide 100 by 17 you get a remainder of 15. That
is, 100 = 5 17 + 15. I have no idea why they use the percent sign to indicate
that function— the same thing is done in the language C.
What happens when you type 100%-17 or -100%17 or -100%-17? Your
…rst assignment is to come up with a description of a%b when a and b are
arbitrary integers.
The second part of that assignment is to describe b/a for arbitrary integers a and b. If you type 100/17, and then “Enter”, you should see
5
>>>
That’s because when you divide 100 by 17 you get a quotient of 5 (and a
remainder of 15). So 100 is equal to (100/17)*17+(100%17). Type that last
expression into Python, in the interactive mode, and press “Enter”. What
do you get?
The third part of the assignment has to do with the division algorithm.
One way of stating the division algorithm is as follows:
If a and b are integers, and a 6= 0, then b = qa + r for unique
integers q and r, with 0
r < jaj. The integer q is called the
quotient, and the integer r, the remainder.
Your question is, what is the relationship between q and a/b, and between
r and b%a? You can only …nd out by experimenting with Python. You are
trying to describe how Python behaves. Experiment with both negative and
positive values of a and b.
1.1
The smallest prime factor
We want to write a Python program that computes the smallest prime factor
p of a given number n. That is, we want to implement Proposition 31 of
Book VII of Euclid’s Elements which says that any number greater than 1
is divisible by a prime.
You can do this in the interactive mode, but it is less harrowing to do it
in the IDLE mode. Here we work with a text editor on a …le containing a
4
Python program. You can create such a …le from the IDLE interactive mode:
open a new window by clicking on the “File”button and then choosing “New
Window”. An IDLE editing window comes up. You are now set to write some
Python code which you can save afterwards by clicking on the “File”button
and then choosing “Save As”. At that point you will have to choose where
you want to save your …le, and what its name will be. Call it something like
“…ddle.py”, or something more descriptive, but with the “.py”extension.
In the IDLE editing window, type the short version of the classic …rst
program:
print "Hello"
If you are writing a program in the IDLE editing window, you have to use
the word print in order to get any output. The quotation marks indicate
that the word Hello is text. Without the quotation marks, Python would
assume that it was a variable. You can use single quotes or double quotes.
The single quote on my keyboard is on the same key as the double quote. I
can’t seem to get an appropriate single quote with the word processor I’m
using. The best I can do is
print ’Hello’
To run the program, press the F5 key. If you haven’t named the program
…le before, you will be asked to do so now. Otherwise, Python will simply
ask you if it is okay to save the …le, which you will normally agree to. The
output should then appear in another window called "Python shell". This
window is essentially the interactive window you worked with above. So you
will normally have two windows going: one where you write your program,
and one where you see your output. The output for this program should be
Hello.
Try running the program after removing the quotation marks. You should
get an ugly error message in red saying that Hello is not de…ned. Without
putting back the quotation marks, insert the line
Hello = 3
at the beginning of the …le, before the print command. This will set the variable named Hello equal to 3. Run the program again to see what happens.
The smallest prime factor of n is the smallest positive integer p greater
than 1 which divides n. We might as well assume that n 0 because if n < 0,
we could look at n instead. If n = 0, then I guess its smallest prime factor
5
is 2; what do you think? If n = 1, then it has no prime factors. (Nobody
considers 1 to be a prime, although one can argue that it is.) Those are the
only exceptional cases: The number 0 is exceptional because it is smaller
than its smallest prime factor; the number 1 is exceptional because it has no
prime factor.
So, for the computation, you may assume that n > 1. You want to try
to divide n successively by 2; 3; 4; 5; : : : until you are successful. You can do
that with the function n % m. So you want to see when n % m is …rst equal
to zero. The condition that n % m is equal to zero is written
n % m == 0
with two equal signs (as in C). One equal sign is used in an assignment
statement: x = a + b says to set the value of x equal to a + b. The
expression x == a + b doesn’t say to do anything— it has the value True if x
is equal to a + b, and the value False if x is not equal to a + b. An expression
that takes on the values True or False is called a boolean expression.
Another boolean expression is x < y.
1.2
Using an if-statement
The typical use of boolean expressions is in an if statement. The statement
if n % m == 0: print ’Hello’
prints out ‘Hello’if m divides n. Note the colon after the if-statement. You
need that.
1.3
Using a for-statement
Suppose n > 1 is an integer. We want to run through the integers m, starting
at 2, to see if any of them divide the positive integer n. When we come to the
…rst one that does, we print it out. There is at least one that does, namely
n itself. We can do this as follows for the integer n = 97:
n = 97
for m in range(2,n+1):
if n % m == 0:
break
6
print m
Note the colons after the for-statement and after the if-statement. The
“break” instruction says “get out of this for-statement”. The indentations
are very important in Python. They tell you that the for-statement goes on
until the statement print m. The list range(2,n+1) consists of those integers
m such that 2 m < n + 1. Note that 2 is an allowable value of m, but n + 1
is not. In general, the set range(a,b) consists of those integers m such that
a m < b. The value of m when we exit the for-loop will be the smallest
integer greater than 1 that divides n.
Write this program in your …le …ddle.py, or write it in a another …le with
a name like temp.py. Run this program using di¤erent values of n. What
can you say about n if the program prints out n?
1.4
Testing the algorithm
You should test the program to see if it works. Of course you just could try
it on a few selected numbers, and you should do that. Another thing you
could do is print out what it does on the …rst 100 numbers greater than 1
(for example) and eyeball the results to see if they look okay. This, of course,
would be done with another for-loop, which would start:
for n in range(2,102):
Inside that for-loop we have to put our original for-loop. Moreover, you
should not just print out m, you should print out both n and m, so you will
get a list of pairs: the number, and its smallest prime factor. Something like
print n,m
which prints out n followed by a space followed by m.
1.5
De…ning a function
Better yet, you can de…ne a function that takes a natural number n > 1 and
returns its least prime factor. Then you can call that function instead of
having to worry about nested for-loops. You already have the required lines
of code, you just have to write it down as a function:
def spf(n):
for m in range(2,n+1):
7
if n % m == 0:
return m
The …rst line says that we are de…ning a function called spf that will act
on a variable n. The command “return m” makes m the result of applying
the function spf to n. You don’t have to write break to get out of the loop
because a return statement takes you out of the whole function.
To test this function, write a for-loop that runs n from 2 to 50, say, and
print out n together with the smallest prime dividing n. Your print statement
will look something like
print n, spf(n)
1.6
Speeding the algorithm up
This is a dumb algorithm. For one thing, it’s pointless to see if m divides
n when m2 is greater than n. Why is that? Well, if m2 > n and m divides
n, then n=m < m also divides n so we would have already broken out of the
loop when we tested the smaller number n=m. Once m2 is greater than n, we
already know that n is prime, so we should just return n. Notice that if n is
around 1000000, this means that you will be doing around 1000 tests rather
than 1000000 tests, so this improvement is well worth doing. It doesn’t take
much, just (essentially) one more line:
def spf(n):
for m in range(2,n+1):
if n % m == 0:
return m
if m*m > n:
return n
You can write this in fewer lines by putting the return-statements up on the
same lines as the if-statements:
def spf(n):
for m in range(2,n+1):
if n % m == 0: return m
if m*m > n: return n
8
The upper limit n + 1 is only necessary when n = 2, otherwise we could use
the upper limit n (why is that?). We could add a line after the for-loop that
returns 0, so that the function will return something even if n is less than 2,
but we are really only interested in what happens when n 2.
1.7
Goldbach’s conjecture
Goldbach’s conjecture is that you can write any even number greater than 2
as the sum of two primes. Here is a short table showing how that works:
4=2+2
6=3+3
8=3+5
10 = 3 + 7 12 = 5 + 7
14 = 3 + 11 16 = 3 + 13 18 = 5 + 13 20 = 3 + 17 22 = 3 + 19
24 = 5 + 19 26 = 3 + 23 28 = 5 + 23 30 = 7 + 23 32 = 3 + 29
Some even numbers can be written as sums of two primes in more than one
way. For example, 10 = 3 + 7 = 5 + 5 and 22 = 3 + 19 = 5 + 17 = 11 + 11.
Goldbach’s conjecture is one of the big unsolved problems in mathematics.
Nobody knows whether it is true or false.
About twenty years ago, Apostolos Doxiadis wrote a novel called Uncle
Petros and Goldbach’s conjecture. The main character of this novel was
determined to settle Goldbach’s conjecture. The publishers o¤ered a prize of
one million dollars to anyone who proved Goldbach’s conjecture within two
years. As I recall, Uncle Petros goes o¤ the deep end when he …nds out that
it is conceivable that Goldbach’s conjecture is true, yet not provable. That’s
from a famous result in logic by Kurt Gödel, who was a bit crazy himself.
Be that as it may, we can write a program that tries to write even numbers
as sums of two primes. The number 4 is an anomaly, as it is the only even
number that can be written as the sum of two primes where one of the
primes is 2. In fact, Goldbach’s conjecture is often stated in the form: any
even number greater than 4 is the sum of two odd primes.
So we want to run through some even numbers and try to write each one
as the sum of two odd primes. An even number is a multiple of 2, so we want
to run through numbers of the form 2n for n = 2; 3; 4; 5; 6; : : :. Let’s start
out by running n from 2 to 16, so that we will duplicate the results in the
table above. We can do that with the line
for n in range(2,17):
If 2n is the sum of two primes, then the smaller prime is at most n, so we
want to run through the primes p with p n. We also want 2n p to be a
9
prime. So we run through the numbers p with p n, and test to see if p and
2n p are both primes. That’s why we write the second line as follows:
for n in range(2,17):
for p in range(2,n+1):
To test whether p and 2n p are both primes, we use an if-statement and
our function spf. Here is what a third line might look like:
for n in range(2,17):
for p in range(2,n+1):
if spf(p) == p and spf(2*n-p) == 2*n-p:
print 2*n,"=",p,"+",2*n-p
Notice the use of our function spf to check whether p and 2n p are primes.
Not also the use of the word “and” to make the if-statement dependent on
two conditions. Finally, notice that we have to write 2*n, not 2n to indicate
the number 2n.
Run this program. If everything is working right, it should print out all
possible ways of writing each even number as the sum of two primes. Try
using di¤erent numbers for 17.
An interesting exercise would be to write a program that, instead of
printing out the di¤erent ways each even number can be written as the sum
of two primes, it would print the number of ways each even number can be
written as the sum of two primes. So next to the number 90 it would print
the number 9, because 90 can be written in 9 ways as the sum of two primes.
At least according to my calculations.
What is the smallest even number that can be written as a sum of two
primes in 11 ways? What is the smallest even number that can be written
as the sum of two primes in exactly 11 ways?
It’s a little cryptic to use the function spf to see if a number is prime.
Here is an function that does this directly.
def prime(n):
if n < 2: return False
for m in range(2,n):
if n % m == 0: return False
return True
The purpose of the …rst line is to avoid returning True for negative numbers
and for the numbers 0 and 1. The next two lines test to see if n is divisible
by a number m such that 2 m < n, and returns False if it is. Note that if
10
n = 2, then there are no such numbers m. If n does not get eliminated this
way, then the last line returns True. A re…nement on this function is to add
a line that returns True as soon as m2 > n, because in that case we already
know, without further testing, that n must be prime. So we could write
def prime(n):
if n < 2: return False
for m in range(2,n):
if n % m == 0: return False
if m*m > n: return True
return True
This re…nement helps if we are testing a very large prime.
1.8
Sophie Germain primes
Sophie Germain, a French mathematician who lived from 1776 to 1831,
worked on one of the most famous problems in mathematics: Fermat’s last
theorem. One of her results concerning this problem involved primes p such
that 2p + 1 is also a prime. These primes have come to be known as Sophie
Germain primes. The …rst few Sophie Germain primes are 2, 3, 5, 11, and
23. You can see that 2 2 + 1 = 5 is a prime, as are 2 3 + 1 = 7, 2 5 + 1 = 11,
2 11 + 1 = 23, and 2 23 + 1 = 47. The primes 7 and 13 are not Sophie
Germain primes because 2 7 + 1 = 15 and 2 13 + 1 = 27 are not primes.
Fermat’s last theorem says that if p is an odd prime, then there are no
solutions to the equation
xp + y p = z p
with x, y, and z positive integers. The so-called “…rst case”of Fermat’s last
theorem is to show that there are no solutions with x, y, and z, not divisible
by p. Sophie Germain proved that the …rst case of Fermat’s last theorem is
true if p is a Sophie Germain prime.
Fermat’s last theorem was not the last theorem that Fermat proved, or
the last theorem that Fermat claimed to be true. For a long time it was
the only theorem of Fermat that remained unproved, among the theorems
that he had claimed were true. In 1995, Andrew Wiles published a proof of
Fermat’s last theorem, 358 years after it was stated.
If p is a Sophie Germain prime, then the prime q = 2p + 1 is said to be
a safe prime. This term has its origin in cryptography. It refers to the fact
that q 1 does not have many prime factors— in fact it has only two: 2 and
11
p. That makes the prime q useful for certain encryption algorithms. A prime
q is safe exactly when the number (q 1) =2 is prime.
The largest known Sophie Germain prime is
18543637900515 2666667
1
A Cunningham chain (of the …rst kind) is a sequence of primes such that
each one is twice the preceding one plus 1. So the sequence 2, 5, 11, 23, 47 is
a Cunningham chain. This chain cannot be extended because 2 47 + 1 = 95
is not a prime. Another Cunningham chain is the sequence 89, 179, 359, 719,
1439, 2879. The longest known Cunningham chain consists of 17 primes, the
…rst of which is 2759832934171386593519.
A Cunningham chain is complete if it cannot be extended, in either direction, to form a larger Cunningham chain. So the …rst term in a complete
Cunningham chain is not a safe prime, and the last term is not a Sophie
Germain prime. Note that in the Cunningham chain 89, 179, 359, 719, 1439,
2879, the number 89 is not safe because (89 1) =2 = 44 is not a prime, and
2879 is not a Sophie Germain prime because 2 2879 + 1 = 5759 = 13 443
is not a prime.
Here is a simple function that returns True when applied to numbers that
are prime but not safe (unsafe primes), and False otherwise.
def unsafe(n):
if prime(n) and not prime((n-1)/2): return True
return False
Write a program to …nd the longest Cunningham chain whose …rst prime
is less than 1000. This chain will necessarily be complete.
One can think of the primes as being partitioned into complete Cunningham chains. The chains of length one are the primes that are neither Sophie
Germain primes nor safe primes. The …rst such chain is 13. What is the …rst
chain of length two?
It would be interesting to write a program that lists all the Sophie Germain primes p less than 1000 that are not safe primes, together with the
length of the complete Cunningham chain starting at p.
1.9
There are in…nitely many primes
In Book VII, Proposition 37, Euclid proves that any number greater than 1
is divisible by a prime number. The function spf(n) is an implementation of
12
that proposition: you give it a number n > 1, and it returns a prime number
that divides n. Of course what spf(n) actually does is to return the smallest
number p > 1 that divides n. This number p is necessarily prime because if
p had a nontrivial factor, that factor would be smaller than p and would also
divide n.
There is a great presentation of Euclid’s elements by David Joyce on the
web, complete with Java applets to illustrate the propositions. The url is
aleph0.clarku.edu/~djoyce/java/elements/elements.html
Click on the number of the book that you want, then click on the proposition
that you are interested in.
In Book IX, Proposition 20, Euclid proves that prime numbers are more
than any assigned multitude of prime numbers. By that he meant that if
you are given any (…nite) list of prime numbers, then you can construct a
prime number that is not on that list. This is a positive formulation of the
statement that there are in…nitely many primes. We can implement this
proposition of Euclid also.
Euclid’s proof says to multiply all the primes in the given list together to
get a product P , and then consider the number P + 1. The number P + 1
cannot be divisible by any of the primes on the list, because P is divisible by
all those primes. But Proposition 37 of Book VII says that P + 1 is divisible
by some prime. So that prime cannot be on the list.
This procedure can be used to generate a sequence of primes, starting
from nothing. To get a prime, we have to start with a list of primes, but we
can take the initial list to be empty! What is the product P of the primes on
an empty list? As we shall see, when we actually write down the program,
the product of an empty list of numbers is equal to 1. That might seem a
little surprising, but it turns out to be very natural. So P + 1 is equal to 2,
the …rst prime generated by Euclid’s construction.
So our list starts out empty, which in Python is the list [ ]. Then we
get the list [2]. Applying Euclid’s construction to that list gives P = 2 and
P +1 = 3. So the next prime we get is 3, and our list of primes becomes [2,3].
Continuing in this way, you can see that we get the prime 7 = 2 3 + 1, then
the prime 43 = 2 3 7+1. The next number we consider is 2 3 7 43+1 = 1807.
But 1807 = 13 139 is not prime. So, since we are taking the smallest prime
dividing P + 1, the next prime on our list is 13.
Here is a program that multiplies all the numbers on a list L together:
13
P=1
for i in L: P = P*i
After this program has executed, the variable P will be the product of the
numbers in the list L. For example, if L = [2,3,5], then P will start out as 1,
then become P i = 1 2 = 2, then P i = 2 3 = 6, then 6 5 = 30. You
can see that if the list L is empty, then P remains equal to 1, its initial value,
because there are no numbers i in L.
So how do we implement Book IX, Proposition 20? First we decide how
many times we want to repeat the basic construction. Let’s call that number
m, and set it equal to 5 at …rst. So our …rst line will be m = 5. Then we set
up the list L which is initially empty. So our second line will be L = [ ]. Then
have some variable run through the numbers from 0 to m 1. It makes no
di¤erence what we call that variable, the idea is simply to repeat the basic
construction m times. Then we multiply all the numbers in the list L, as
before. So our …rst few lines will be
m=5
L = []
for n in range(m):
P=1
for i in L: P = P*i
Now what? The construction says to …nd a prime that divides P + 1. So our
next line should be to compute spf(P+1). Call that prime p and print it out
so we can see what’s happening. Then, and this is crucial, we have to tack
p onto the end of the list L. Perhaps the easiest way to do that is with the
command L.append(p). Another way is L = L + [p], or even L += [p]. We
can use a plus sign to put two lists together. I …nd that easier to remember,
but you have to be sure you are putting two lists together, so we need to
write [p] rather than p. So our program becomes
m=5
L = []
for n in range(m):
P=1
for i in L: P = P*i
p = spf(P+1)
print p
L.append(p)
14
Notice that this program calls on the function spf, so the de…nition of that
function has to be in our …le also.
When I try to run this program for m = 9, Python, or my computer,
gets angry because it can’t hold a list of the size required in the function spf.
Because of that, I modi…ed the program for spf so that it doesn’t use the
range operation, which is what creates the big list. I rewrote it this way:
def spf(n):
i=2
while i*i < n+1:
if n%i == 0: return i
i = i+1
return n
Do you see what that does? I got rid of the for-statement and replaced it
by a while-statement. The indented commands below the while-statement
keep repeating as long as the condition i2 < n + 1 holds. So we have to
change i by hand, so to speak, whereas before the for-statement changed it
automatically. If we get to the point where the condition i2 < n + 1 fails to
hold, then we know that n is prime because we have tried to divide n by on
all numbers whose squares are at most n.
With this new de…nition, I managed to get the program to run with
m = 14. The fourteenth prime that it generated was 5471. With a little more
patience on my part, it ran with m = 15. The …fteenth prime generated was
52662739. What is the eighth prime in this sequence?
If your program is taking a long time to factor some large number, there
is help on the web. A good site is www.alpertron.com.ar/ECM.HTM which
will factor very large numbers in very little time.
There are at least two other ways to implement Book IX, Proposition 20.
The …rst is to multiply the …rst few primes together, add one, and …nd a
prime factor of the result. So we would start with 2. As 2 + 1 = 3 is prime,
the next prime we get is 3. As 2 3 + 1 = 7 is prime, the next prime we get
is 7. As 2 3 5 + 1 = 31 is prime, the next prime we get is 31. See how this
algorithm di¤ers from our …rst in which we looked at 2 3 7+1 = 43, omitting
the prime 5. The next prime we get in this new sequence is 2 3 5 7+1 = 211,
which is also prime. If 211 were not a prime, we would calculate its smallest
prime factor. Write a program that computes this sequence. Do you always
get primes without factoring, or do we sometimes get a composite number
which we have to factor to get our new prime?
15
Another way to show there are in…nitely many primes is to show that
for each number n, prime or not, there is a prime greater than n. In a way,
this is the simplest algorithm: we simply …nd a prime factor of n! + 1. That
number cannot be divisible by a prime i n, because if i n, then i divides
n!. The …rst few numbers we get that way are 1! + 1 = 2, 2! + 1 = 3, and
3! + 1 = 7. The next number is 4! + 1 = 25, which is not a prime but all of
its prime factors are bigger than 4. The next number is 5! + 1 = 121 which (I
think) is a prime. Write a program that uses this procedure to …nd a prime
greater than n for as many numbers n as you can. Print out n, followed by
the prime that is greater than n, for each number n.
1.10
Largest prime factor
Suppose we want to calculate the largest prime factor of a number n. Here
is a plan. First compute the smallest prime factor p of n. We already have a
program for that. If p = n, then p is the largest prime factor of n. Otherwise,
the largest prime factor of n is equal to the largest prime factor of n=p. That’s
because p is the smallest prime factor of n, and n = p n=p.
How to we turn this plan into a program? The …rst couple of lines are
clear:
def Lpf(n):
p = spf(n)
if p == n: return n
Now what? What we want to do now is …nd the largest prime factor of n=p.
That is the function that we are in the middle of de…ning! In fact, we can
use that function as part of its own de…nition. This is known as a recursive
de…nition. So we simply add one more line:
def Lpf(n):
p = spf(n)
if p == n: return n
return Lpf(n/p)
This works because n=p is always smaller than n. What actually happens
when we compute Lpf(50)? The …rst line of the de…nition calculates the
smallest prime factor p = 2 of 50. As 2 is not equal to 50, we go to the
16
third line which has us compute Lpf(25). Now we are at the …rst line of the
program with n = 25, which calculates the smallest prime factor p = 5 of 25.
As 5 is not equal to 25, we go to the third line which computes Lpf(5). To
do that, we …nd the smallest prime factor p = 5 of 5. But then n = p, so,
in accordance with the second line of the de…nition, we return 5, the largest
prime factor of 50.
1.11
Complete factorization
How do we get a program to tell us what all the prime factors of a number
are? We want to enter a number n and have the output be a list of the primes
dividing n. Probably we want to allow repetitions on this list, so when we
enter the number 72, the list that gets returned is [2; 2; 2; 3; 3] rather than
just [2; 3]. Note that a list is Python is written as a sequence of elements,
separated by commas, and enclosed in square brackets. If we “add”two lists
in Python, we get the concatenation of the two lists. Thus
[1] + [2] = [1; 2]
[1; 2; 10] + [2; 13] = [1; 2; 10; 2; 13]
[1] + [] = [1]
The natural way to handle this problem is via a recursive function, one
that calls on itself. Let’s denote the function we want to de…ne by allpf (n).
Then what we want to do is …rst compute
m = spf (n)
and then form a list allpf (n) whose …rst element is m and whose remaining
elements form the list allpf (n=m). The list allpf (n) can be constructed in
Python as the sum of two lists:
allpf (n) = [m] + allpf (n=m)
This last equation is the recursion: we de…ne allpf in terms of itself. How
does it work in practice? If n = 12, then m = spf (12) = 2, and n=m = 6, so
we have
allpf (12) = [2] + allpf (6)
Proceeding, we get
allpf (6) = [2] + allpf (3)
17
so
allpf (12) = [2] + [2] + allpf (3)
Finally,
allpf (3) = [3] + allpf (1) = [3] + [] = [3]
so
allpf (12) = [2] + [2] + [3] = [2] + [2; 3] = [2; 2; 3]
This works because of two things. One is that the number n=m that we
apply the function allpf to on the right, is always smaller than the number
n that we apply allpf to on the left. The other is that if n is small enough
(equal to 1), then we know directly what allpf (n) is.
So here is the general idea. If n = 1, then return the empty list. If n > 1,
then set m = spf (n), and return the list [m] + allpf (n=m). Here is a formal
de…nition in Python:
def allpf(n):
if n == 1: return []
m = spf(n)
return [m] + allpf(n/m)
1.12
More on recursive functions
The prototype recursive function is one that computes the factorial function,
the product of the integers from 1 to n:
n! = n (n
1) (n
2)
3 2 1
The key fact is that n! is equal to n times (n 1)!, at least if n is positive, as
you can see by looking at the product. Thus if we could compute (n 1)! we
would know how to compute n!. The computation for 4! would go as follows:
4!
3!
2!
1!
=
=
=
=
4
3
2
1
3!
2!
1!
0!
but now we can’t go further, we have to know what 0! is. Well, 0! is equal
to 1. That’s our exit condition. Then we work our way backwards through
18
the equations to …nd that
1!
2!
3!
4!
=
=
=
=
1
2
3
4
0! = 1
1! = 2
2! = 3
3! = 4
1=1
1=2
2=6
6 = 24
The program would look like:
def fac(n):
if n == 0: return 1
return n*fac(n-1)
What happens when it runs on n = 4? When we run fac(4), it tries to
return 4*fac(3), so it has to run fac(3). We now have two programs running
at once: fac(4) and fac(3), with fac(4) waiting for the result from fac(3).
Similarly fac(3) tries to return 3*fac(2), so it has to run fac(2). We now have
three programs running at once. And so on. We can indicate this by a table:
program
fac(4)
fac(3)
fac(2)
fac(1)
fac(0)
returns
4*fac(3)
3*fac(2)
2*fac(1)
1*fac(0)
1
At the end we have …ve fac programs running at once. But fac(0) knows
what to do without calling fac any more: it returns 1. Now fac(0) = 1 is
passed to the fac(1) program giving the result fac(1) = 1*1 = 1. Then fac(1)
= 1 is passed to the fac(2) program giving the result fac(2) = 2*1 = 2. This
result is passed to the fac(3) program, yielding fac(3) = 3*2 = 6, and …nally
this result is passed to the fac(4) program yielding fac(4) = 4*6 = 24.
1.13
Working with lists
A list is a sequence indexed by the integers 0; 1; 2; : : : ; n 1. The n elements
of the list a are denoted a[0], a[1], a[2], . . . , a[n-1]. The number n is the
length of the list and can be accessed by writing len(a).
A string s can be thought of as a special type of list whose elements are
alphanumeric characters. Its length is also denoted len(s), but a string is
handled a little bit di¤erently from a list.
19
The simplest way to create a list is by the statement
a = []
which creates a list with no elements— an empty list. An empty list has
length 0. So, if after de…ning the list a above, you printed out len(a), you
should see 0.
You can put an element m onto the end of an array a by writing a.append(m).
If a is an empty list, then a.append(17) changes a into the list [17]. If you
want a list containing the squares of the numbers from 0 to 9, you can get
one by …rst setting a = [] and then running the loop
for i in range(0,10): a.append(i*i)
We can stick an element m onto the beginning of the list a by writing
a.insert(0,m). If take the list a generated by the loop above, and then write
a.insert(0,22), the list a would become [22,0,1,4,9,16,25,36,49,64,81].
Actually, I have trouble remembering how to use append and insert. I …nd
it easier to use addition of lists. So instead of writing a.append(m), I write
a = a + [m], and instead of writing a.insert(0,m), I write a = [m] + a. The
general idea is that when you add two lists, the result is a list consisting of
the two lists put together. The list
[22,0,1,4,9] + [16,25,36,49,64,81]
is the list
[22,0,1,4,9,16,25,36,49,64,81]
2
The Euclidean algorithm
The Euclidean algorithm is one of the oldest and best algorithms we have.
It computes the greatest common divisor of two positive integers a and b.
You can …nd it in Book VII Proposition 2 of Euclid’s Elements. Euclid also
showed that any number that divides both a and b divides their greatest
common divisor. I think of that fact as being the “algebraic gcd property”.
Here is a short program that computes the gcd:
def gcd(a,b):
while a!=0: a,b = b%a,a
20
return abs(b)
The Euclidean algorithm is most naturally written recursively. Euclid
pointed out that the common divisors of a and b (those numbers that divide
both a and b) are the same as the common divisors of a and b a. So if
0 < a b, we can reduce the problem of …nding gcd (a; b) to that of …nding
gcd(a; b a). The latter is a simpler problem because the numbers are smaller
(certainly their sum is smaller). We can re…ne that observation a bit for our
purposes by noticing that we can repeatedly subtract a from b until we get
a number that is smaller than a. Actually, Euclid noted that also: he talked
about repeatedly subtracting the smaller number from the larger number. If
you repeatedly subtract a from b until b becomes smaller than a, you end up
with the remainder you get when you divide b by a. (Recall that division of
positive integers can be thought of as repeated subtraction.) So using the
Python % notation for the remainder, the key equation is
gcd (a; b) = gcd (b % a; a)
Here the b % a is written …rst because it is (strictly) smaller than a. Of
course b % a might be equal to zero, in which case we are done because
gcd (0; a) = a. This gives us our exit condition from the recursion: if a is 0,
then gcd (a; b) = b. The usual convention is that gcd (0; 0) = 0 even though,
strictly speaking, there is no greatest common divisor of 0 and 0. However,
zero is obviously the algebraic gcd of 0 and 0 because any number that divides
both 0 and 0 divides zero (and zero is the only number with that property).
Rather than making your algorithm foolproof, at least at …rst, just make
it work when it is handed integers a and b with 0
a
b. Use the exit
condition
gcd(0; b) = b
and the recursion displayed above. Test your algorithm on a few small pairs
a and b where you know what the answer is.
There is a built-in Python function that will return gcd (a; b). You can
import it from the module fractions. If you put the line from fractions import
gcd at the top, then you can use their function gcd(a,b) in your programs.
2.1
The Euler '-function
Two positive integers a and b are said to be relatively prime, or coprime,
if they have no common divisors except for 1. Notice that 1 is relatively prime
21
to any positive integer. It’s easy to see that two numbers a and b are relatively
prime exactly when gcd (a; b) = 1, so the Euclidean algorithm provides a test
for when a and b are relatively prime. The Euler '-function is de…ned, for
n > 1, by setting ' (n) equal to the number of positive integers less than n
that are relatively prime to n. Thus ' (10) = 4 because the numbers less
than 10 that are relatively prime to 10 are 1, 3, 7, and 9. Usually people
set ' (0) = 0 and ' (1) = 1, but we really don’t care about those values,
especially ' (0). We can justify ' (1) = 1 by modifying the de…nition of
' (n) to read “the number of positive integers less than or equal to n that
are relatively prime to n.”This, in fact, is the usual de…nition. It also gives
' (0) = 0.
Once you get your gcd function working, you can use it to compute the
' function. For each m just run through the numbers 1; 2; : : : ; m, checking
each one to see if it is relatively prime to m, and counting how many times
that happens. After writing such a function, test it by printing out, in two
columns, the values m and ' (m) for m = 2; : : : ; n.
The …rst few values of the Euler '-function are
n 0 1 2 3 4 5 6 7 8 9 10
' (n) 0 1 1 2 2 4 2 6 4 6 4
Verify this table.
As a …nal check on your ' function, you can compute it in a di¤erent
way. The formula is
Q
1
' (n) = n
1
p
p2P
where P is the set of primes that divide n. So
1
2
' (10) = 10 1
1
1
5
and
1
1
1
2
3
If you write it that way, you will be dealing with real numbers, at least as
intermediate values. In order to deal only with integers, you can rewrite the
formula as
' (12) = 12 1
n
Q
p2P
1
1
p
=n
Q
p2P
p
1
p
22
=Q
n
p2P
Q
p p2P
(p
1)
So
Q for n = 10, the set P is f2; 5g and the product
1) is 1 4 = 4 so
p2P (p
' (10) =
p2P
p is 10. The product
10
4=4
10
For
Q n = 12, the set P is f2; 3g and the product
1) is 1 2 = 2 so
p2P (p
' (12) =
Q
Q
p2P
p is 6. The product
12
2=4
6
You can use allpf(n) to …nd the set P . What you need to do is to eliminate the
duplicates from the list that allpf(n) returns. It’s probably better to modify
the function allpf(n) so that it doesn’t return any duplicates. When you …nd
the smallest prime dividing n, keep dividing n by that prime until you can’t
anymore, then go on to …nd the smallest prime dividing what’s left. You
should probably give the modi…ed function a di¤erent name.
Let’s do it. We’ll call the function allpd(n) for “all prime divisors”.
def allpd(n):
if n == 1: return []
#the number 1 has no prime divisors
i=2
#the number 2 is the smallest prime
while i*i <= n:
#looking for the smallest prime divisor of n
if n%i == 0:
#found it!
while n%i == 0: n = n/i
#keep dividing n by i
return [i] + allpd(n)
#put i in front of the list
i = i+1
#increase i and keep trying
return [n]
#n must be prime, so the list is [n]
Once you get a list A whose entries are the elements of the Q
set P with
no
duplicates,
you
can
easily
compute
the
two
required
products
p2P p and
Q
1). Now you can get a third column to your output consisting of
p2P (p
the values of ' (n) computed by this formula, and you can see if it agrees
with the values of ' (n) that you got by counting.
Euler’s Theorem. Fermat’s theorem says that if n is a prime, and n does
not divide a, then an 1 1 (mod n). When n is a prime, then ' (n) = n 1,
and n does not divide a exactly when a and n are relatively prime. So we
can rewrite Fermat’s theorem as “if n is a prime, and a and n are relatively
prime, then a'(n) 1 (mod n). Euler showed that this is true even when n is
23
not prime. So, for example, 26 1 (mod 9), as you can easily check. Indeed,
you should write a program that checks Euler’s theorem for the numbers n
from 2 to 100.
2.2
The extended Euclidean algorithm
The extended Euclidean algorithm is a re…nement of the Euclidean algorithm
that …nds integers s and t so that sa + tb divides both a and b. If sa + tb is
positive, which we can always arrange, it follows that sa + tb is the greatest
common divisor of a and b, which is why this is a re…nement of the Euclidean
algorithm The equation
sa + tb = gcd (a; b)
is known as Bezout’s equation. Usually either s or t is negative.
An interesting aspect of Bezout’s equation is that if the left-hand side,
sa + tb, divides both a and b, then sa + tb must be the greatest common
divisor of a and b. Indeed, if d is any common divisor of a and b, then d must
divide sa + tb (by the distributive law) so, if everything is positive, d must
be smaller (or equal to) sa + tb.
We will assume that 0
a
b. If that’s not the case, we can always
replace a and b by their absolute values, and interchange them if necessary.
The simplest, if not the most understandable, algorithm is a recursive one.
The “exit strategy” is that if a = 0, then we will choose s = 0 and t = 1
because gcd (0; b) = b = 0 0 + 1 b. (Is gcd (0; 0) = 0?)
The idea behind the Euclidean algorithm is that if b = qa + r, where
0 r < a, then gcd(a; b) = gcd(r; a). Thus we can reduce the computation
of gcd (a; b) to the computation of gcd (r; a), and repeat. In Python, or in
C, the number r is equal to b%a. Because r + a < a + b, these reductions
eventually end with gcd (0; b) = b.
The same thing happens with the extended Euclidean algorithm. Suppose
again that b = qa + r with 0 r < a. If
s0 r + t0 a = gcd (r; a)
then, substituting r = b
qa into this equation gives
s0 (b
qa) + t0 a = gcd (r; a)
so
(t0
s0 q) a + s0 b = gcd (r; a) = gcd (a; b)
24
Thus taking s = t0 s0 q and t = s0 solves the Bezout equation. That gives
us the following (informal, not Python) algorithm that accepts two integers
0 a b and returns a pair of integers s and t such that sa + tb = gcd (a; b):
bez (a; b)
If a = 0 return (0; 1)
Set r = b mod a
Set q = (b r) =a
Set (s; t) = bez (r; a)
return (t sq; s)
Of course you have to write that in Python. A pair of integers can be
implemented by a list of length 2 so that you can return the pair at one go.
Thus the function bez takes two integers, a and b, and returns an integer list
of length 2. To return the pair (0; 1), you simply return the list [0,1]. For the
…nal return, you return [t - s*q, s]. What are s and t? If you set a variable y
equal to bez (r; a), then s is y[0] and t is y[1].
Watch out for the “If a = 0” because “a = 0” in Python, as in C, sets
the variable a equal to zero, it does not test for whether a is equal to zero.
That’s done with “a == 0”. Everybody makes that mistake, more than
once. Fortunately, this gives a syntax error in Python so you know when you
make this mistake.
How do you test your function bez once you write it? Run through a
bunch of numbers a and b, and print them out together with the corresponding s and t, and also sa + tb. Maybe run a from 5 to n and b from a to n for
modest values of n. See if the answers look correct.
You can automate the checking also. Compute sa + tb and check to see
if it divides both a and b.
2.3
Orders of units modulo n
If gcd (a; b) = 1, we say that a and b are relatively prime. Now …x a number
n, say n = 15. The numbers that are less than n and relatively prime to
n are 1, 2, 4, 7, 8, 11, 13, and 14. Those are called the units modulo 15.
If we multiply two of them, and reduce modulo n, we get another. The
25
multiplication table for n = 15 is
1
2
4
7
8
11
13
14
1 2 4 7 8 11
1 2 4 7 8 11
2 4 8 14 1 7
4 8 1 13 2 14
7 14 13 4 11 2
8 1 2 11 4 13
11 7 14 2 13 1
13 11 7 1 14 8
14 13 11 8 7 4
13
13
11
7
1
14
8
4
2
14
14
13
11
8
7
4
2
1
If the number a is relatively prime to n, then the order of a modulo n is the
least exponent m 1 such that am 1 (mod n). So the order of 2 modulo
15 is 4 as you can see by the sequence of powers of 2
2, 4, 8, 1
The order of 2 modulo 19 is 18 as you can see by the sequence of powers of 2
2, 4, 8, 16, 13, 7, 14, 9, 18, 17, 15, 11, 3, 6, 12, 5, 10, 1
What is the order of 2 modulo 11?
Here is a Python program order(a,n) that is supposed to compute the
order of a modulo n. We need a to be relatively prime to n, so we check for
that right away. You don’t have to do that if you make sure that you never
invoke order(a,n) except when a is relatively prime to n, but it is safer to
guard against that.
def order(a,n):
if gcd(a,n) != 1: return 0
#the order of a unit is never 0
b = a%n
#b will be the various powers of a
e=1
#a to the e is equal to b
while b != 1:
#we quit when b = 1
b = (a*b)%n
#multiply b by a (modulo n)
e += 1
#increase the exponent by 1
return e
#when b = 1 for the …rst time, e is the order of a
Here are a few projects to use this function on. Find the smallest two
numbers that have over 30 units of order 2. Find the smallest two numbers
26
that have over 20 units of order 3. Find out how many units of order 2 there
are modulo n if n is the product of the …rst k odd primes, for k = 1; 2; 3; 4; 5; 6.
Those numbers are 3; 3 5; 3 5 7; 3 5 7 11; : : :. For what numbers n is
the order of 2 equal to ' (n)? For what numbers n is there a number a with
order ' (n)?
3
Sieving for primes
One way to construct a list of the primes from 1 to 1000 is to test each
number from 1 to 1000 to see if it is a prime. A more e¢ cient way is to use
the “Sieve of Eratosthenes”. This is the same Eratosthenes who, around 240
BC, calculated the circumference of the Earth.
The idea is to list the numbers from 2 to 1000
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 : : :
then eliminate all multiples of 2 (except 2 itself) which is the …rst number
23 5 7 9
11
13
15
17
19
21
23
25 : : :
then eliminate all multiples of 3 which is the next number remaining
23 5 7
11
13
17
19
23
25 : : :
then eliminate all multiples of 5 which is the next number remaining
23 5 7
11
13
17
19
23
:::
If we continue in this way until we eliminate all multiples of 31, then we are
left with exactly the primes from 1 to 1000.
One way to implement this sieve in a program is to create a list with
999 entries, indexed by the numbers from 2 to 1000. Fill this array with the
boolean value True, indicating that, as far as we now know, it is true that
each of these numbers is prime. Now we change the entries whose indexes
are (proper) multiples of 2 to the boolean value False, because we know that
4; 6; 8; 10; 12; : : : are not primes. Next we change the entries whose indexes
are multiples of 3 to False, because we know that 6; 9; 12; 15; : : : are not
primes. Notice that 6 and 12 get eliminated twice, but that is harmless. And
so on.
27
You can de…ne an empty list A with the line
A = []
in your program. Then, to get a list of length 1001 (indexed by the numbers
from 0 to 1000), each entry being True, you could have a line like
for i in range(1001): A.append(True)
Alternatively, you get the desired list at one go with the line
A = [True]*1001
Now set A [0] and A [1] equal to False, because 0 and 1 are not primes.
That’s the initial set up. The core of the program eliminates all the composite
numbers (they fall through the sieve), by setting some entries equal to False.
You will have some prime p, which initially is 2, and you will set A[ip] equal
to False for i = 2; 3; 4; : : :. that is, you will set A[ip] equal to False for
i = 2; 3; 4; : : :. That line might look like
for i in range(p*p,1001,p): A[i] = False
The third argument to range tells it to only include every p-th number:
p2 ; (p + 1) p; (p + 2) p; : : :, where number is less than 1001. Notice that we
can start with p2 because we have already set A [ip] equal to False for i < p.
What do we
p want p to range over? It su¢ ces to look at primes that are
smaller than 1001, because if a number lesspthan 1001 is
p divisible by a
prime, then it is divisible by a prime less than 1001. Now 1001 is a little
over 31, so we can run p from 2 to 31. So we start with the line
for p in range(2,32):
We don’t have to work with p unless it is prime, so our next statement checks
A[p]. The full code for sieving A is
A = [True]*1001
A[0] = False
A[1] = False
for p in range(2,32):
if A[p]:
for i in range(p*p,1001,p): A[i] = False
28
Computing
the number 32 by hand is a little hokey. The Python expression
p
for 1001 is 1001**0.5, but we need an integer to replace 32, not a real
number. The simplest thing to use is int(1001**0.5) + 1, which, in fact, is
32. The Python function int rounds a positive real number down, so we add
1 to make sure the end of the range is large enough.
A more ‡exible arrangement would be to replace both occurrences of 1001
by n+1, and set n = 1000 at the top. Then we can sieve up to whatever we
want, simply by setting n equal to whatever we want at the top. Like 20000,
or 1000000.
3.1
Counts
Once we have the boolean list A set up, we can do various counts. The
simplest is just to count the number of primes between 1 and n. This is
accomplished by counting how many of the entries of A are equal to True.
The following function does that:
def nprimes():
count = 0
for b in A:
if b: count+=1
return count
You may not even want a function here. Just use the middle three lines and
then print count.
Twin primes are two primes that di¤er by 2, like 17 and 19, or 71 and
73. While we have known since Euclid that there are an in…nite number of
primes, no one has succeeded in proving that there are an in…nite number of
twin primes. We can count the number of (pairs of) twin primes in the same
way we counted the number of primes:
def ntwins(n):
count = 0
for m in range(2,len(A)):
if A[m] and A[m-2]: count+=1
return count
The conditional statement count+=1 is executed exactly when both A[m]
and A[m-2] are True, that is, when m 2; m is a pair of twin primes.
29
The gap between consecutive primes p and q is q p 1. The gap
between 2 and 3 is zero (there are no nonprimes between 2 and 3) while the
gap between 7 and 11 is three (there are three nonprimes, 8; 9; 10, between
7 and 11). The gap between twin primes is 1.
Our next task is to count the number of gaps of di¤erent sizes between
consecutive primes from 1 to n. Our result should be an array G such that
G [i] is the number of gaps of size i. It will be convenient to focus on the
larger of the two consecutive primes. So our main loop will run i from 3 to
n. But we have to compare i with the value of the previous prime, so we will
need a variable to take care of that: say lastprime. Since we are starting at
i = 3, we initially set the value of lastprime equal to 2. So our function will
contain the lines:
def gaps(n):
lastprime = 2
for i in range(3,n-1):
return G
Of course we have to set up G as a list at the beginning, and we will only do
something in the loop when i is prime. Also, when we …nd the next prime i,
and we do something in the loop, we have to set lastprime equal to i before
we go looking for the next prime after i. So our function will contain the
lines:
def gaps(n):
G = []
lastprime = 2
for i in range(3,n-1):
if(A[i]):
lastprime = i
return G
Now what do we want to do with the numbers lastprime and the next prime
after it, which will be i? The gap g between those two consecutive primes
is i - lastprime - 1. We want to increase G[g] by 1. A problem here is that
G[g] may not be de…ned because this may be the …rst gap of size g or more
30
that we have encountered. So we test to see if G[g] is de…ned. If it isn’t, we
append zeros to G until it is. So our function becomes:
def gaps(n):
G = []
lastprime = 2
for i in range(3,n+1):
if A[i]:
g = i - lastprime - 1
while g >= len(G): G.append(0)
G[g]+=1
lastprime = i
return G
A prime triple (or triplet) is a sequence of three primes p0 < p1 < p2
such that p2 = p0 + 6. Here are the …rst six examples:
(5; 7; 11); (7; 11; 13); (11; 13; 17); (13; 17; 19); (17; 19; 23); (37; 41; 43)
You can see a list of the …rst elements of the prime triples of the form
(p; p + 2; p + 6), like (5; 7; 11), from 5 to 5477 at oeis.org/A022004, and one
of the …rst elements of the prime triples of the form (p; p + 4; p + 6), like
(7; 11; 13), from 7 to 5437, at oeis.org/A022005.
Write a program that counts the number of prime triples of the form
(p; p + 2; p + 6) in the …rst two thousand numbers. Compare your results
with those of Thomas R. Nicely on his site at http://www.trnicely.net/
triplets/t3a_0000.htm.
4
Mersenne primes
A Mersenne prime is a prime of the form 2n 1. So 3 = 22 1 is a Mersenne
prime, as are 7 = 23 1 and 31 = 25 1. Of course 15 = 24 1 is not a
prime at all, let alone a Mersenne prime. In fact it is not di¢ cult to show
that if 2n 1 is a prime, then so is n. Essentially the argument is
2ab
1 = (2a
1) 2a(b
1)
+ 2a(b
2)
+
if you can call that an argument. It shows that 2a
of 2ab 1 if a is a nontrivial factor of ab.
31
+ 2a + 1
1 is a nontrivial factor
So the question is, for what primes p is 2p 1 a prime. We denote 2p 1 by
Mp , in honor of Mersenne. We have seen that M2 , M3 , and M5 are primes.
What about M7 = 27 1 = 127. Yes, it’s also prime. Marin Mersenne
said, in 1644, that for the primes p
157, the ones for which Mp is prime
are 2; 3; 5; 7; 13; 17; 19; 31; 67; 127; 157. In 1750, Euler showed that M31 was a
prime. In 1883, Pervouchine showed that M61 is prime, so Mersenne missed
one.
Mersenne primes are interesting because, among other things, they are
tied up with perfect numbers. Euclid talked about perfect numbers, and
showed that an even number is perfect exactly when it is of the form (2n 1) 2n 1 ,
and 2n 1 is prime. So we get a perfect number for n = 2; 3; 5; 7; 13; : : :.
These numbers are 6, 28, 496, 8128, 33 550 336, . . . . The …rst four of these
numbers have been known since antiquity. St. Augustine made a big deal
out of the fact that 6 was a perfect number, saying that God created the
world in six days because of this.
No one knows whether or not there are any odd perfect numbers.
Write a program that …gures out for which primes p < 60 the number Mp
is a prime. When Mp is not prime, this program should print out its smallest
prime factor. However, that there is technical problem with the function spf
when applied to very large numbers. Here is the function spf as we de…ned
it before:
def spf(n):
for m in range(2,n+1):
if n % m == 0: return m
if m*m > n: return n
The problem is that the list, range(2,n+1), is just too big. What we need to
do is to de…ne the function spf without forming that list. For this purpose
we use a while-loop instead of a for-loop. The …rst three lines would be
def spf(n):
m=2
while m*m <= n:
First we set m equal to 2, then we run a block of code for as long as m2 does
not exceed n. Of course that might be forever, so we have to be careful when
writing the block that m2 eventually exceeds n. To do that, we will increase
m by 1 at the end of the block, so that the whole thing acts like a for-loop.
32
We are looking for the smallest nontrivial factor m of n, so we test to see if
m divides n, as before. Here is the whole de…nition:
def spf(n):
m=2
while m*m <= n:
if n % m == 0: return m
m = m+1
return n
Notice there are two things we have to do with a while-loop that were taken
care of automatically by the for-loop. We have to de…ne the starting value
of m outside the loop, and we have to increase m inside the loop.
4.1
The Lucas-Lehmer test
There’s a really good test that tells you whether or not Mp is prime for a
given prime p. It is mainly because of this test that the largest known prime
at any given time has always been a Mersenne prime. Lucas himself used
it to show that M127 was prime in 1876. This number is 170 141 183 460
469 231 731 687 303 715 884 105 727.
Here’s how Lucas’s test, as modi…ed by Lehmer, goes. To see if Mp is
prime, let s0 = 4, and sk = s2k 1 2 for k > 0. Compute sp 2 . Then Mp is
prime exactly when it divides sp 2 . In order to keep the numbers reasonably
small, we do all the computations sk = s2k 1 2 modulo Mp , rather than
waiting until the end to divide a monstrous number by Mp . So Mp is prime,
exactly when we end up with sp 2 = 0. Using the Lucas-Lehmer test, you
should be able to …gure out for which primes p < 1000 the number Mp is a
prime. Do it. That will also tell you what all the even perfect numbers are
that are less than . . . what?
4.2
Repunits
A repunit is a number whose digits are all 1’s. That is, they are the numbers
1, 11, 111, 1111, 11111, etc. Sometimes these numbers are written as Rn , so
R7 = 1111111. It is easy to see that
Rn =
10n 1
9
33
because 10n 1 is an n-digit number consisting of only 9’s. These numbers
were named by Albert H. Beiler in his 1964 book, Recreations in the theory
of numbers.
What repunits are primes? The …rst two are 11 and 1111111111111111111,
that is 12 and 119 . In analogy to the notation Mp for Mersenne primes, these
are sometimes written as R2 and R19 . The next three repunit primes are
R23 , R317 and R1031 . These are the only repunits that have been proven
to be primes, although the repunits R49081 and R86453 are almost certainly
primes, as are R109297 and R270343 . Just as for Mersenne primes, the number
Rp can be prime only if p is prime. In fact, the Mersenne primes are just the
repunit primes in base 2, rather than in base 10.
Write a program that …nds the prime factorization of all repunits with
fewer than 19 digits. Observe that if m divides n, then Rm divides Rn .
4.3
Emirps and palindromic primes
An emirp is a prime which forms a di¤erent prime when its digits are reversed. The …rst four emirps are 13, 17, 31, and 37. A prime which reads
the same backwards and forwards, like 11 and 353, or, for that matter, 2, 3,
5, and 7, is called a palindromic prime, not an emirp. Write a program to
…nd all the emirps and palindromic primes that are less than 2000.
5
Fermat’s theorem
Fermat’s theorem says that if n is a prime, and a is an integer not divisible
by n, then
an 1 1 (mod n)
This is sometimes referred to as “Fermat’s little theorem” as opposed to
“Fermat’s great theorem”better known as “Fermat’s last theorem”because
it was the last of Fermat’s theorems to be proved (not by Fermat, as far as we
know). Fermat’s last theorem says that if n > 2, then there are no nonzero
integer solutions x; y; z to the equation xn + y n = z n . Fermat’s last theorem
is very di¢ cult to prove, and remained unproved for over three hundred years
after Fermat stated it. Fermat’s little theorem is fairly easy to prove and you
can use Google to …nd lots of proofs of it on the web.
Our …rst project with Fermat’s theorem will be to check it on lots of
examples. So we want to take some primes n, and some numbers a with
34
1 a < n, and see if an 1
1 (mod n). Of course this equation holds for
a = 1, so we need only look at numbers a such that 1 < a < n.
Just checking the equation on primes n is a little boring. It is also a little
dangerous because the output of the program will just be, “yes, it checks”,
so we won’t have any evidence as to whether the program is working right or
not. More interesting would to check the equation not only on primes n but
also on composites. In fact, we will be more interested in what happens with
composites than with primes because we are going to use Fermat’s theorem
to test whether or not n is a prime. The test will be: choose a number a
from f2; 3; : : : ; n 1g and compute an 1 modulo n. If the result is 1, then
n might be a prime, or it might not. But if the result is not 1, then n is
de…nitely not a prime.
So our …rst project will be to run through the integers n from 2 to 100,
and for each n run through the integers a from 2 to n 1 to see whether
or not an 1 1 (mod n). What do we want the output to be? For each n,
we will count the number of integers a from 2 to n 1 such that an 1 is not
equal to 1 modulo n. So if n is prime, this count should be zero according
to Fermat’s theorem. We want to print out the number n together with the
count. A check on the program will be to see if the numbers n where we get
a count of zero are exactly the primes. The …rst few lines of output should
be:
2.
3.
4.
5.
6.
7.
8.
9.
0
0
2
0
4
0
6
6
We can get a probabilistic test for the primality of a number n by choosing,
at random, a number a from 1 to n 1 and seeing if an 1 1 (mod n). To
choose a random number from 1 to n 1 in Python, we put the line
from random import randint
at the top of the …le. Then we can use the Python function randint(c,d) which
returns a random integer a such that c a d. To set a equal to a random
integer between 1 and n 1 we write
35
a = randint(1,n-1)
Try running this test 100 times on the number 15906120889 and counting
the number of times you get 1. Then try factoring this number with spf.
5.1
Carmichael numbers
A Carmichael number is a number n so that an 1 1 (mod n) whenever
gcd(a; n) = 1. So a Carmichael number will pass the Fermat test unless
you happen to choose a number a that has a common factor with n. If
that were easy to do, then you could just choose such a number a, compute
gcd (a; n), which is a quick computation, and you would have a factor of
n. Carmichael numbers are sometimes called pseudoprimes. The …rst few
Carmichael numbers are 561, 1105, 1729, 2465, 2821, 6601, 8911, 10585,
15841, 29341. Two bigger ones are 294409 = 37 73 109 and 56052361 =
211 421 631. Also interesting are 6733693 = 109 163 379 and 17236801 =
151 211 541. The probability that a number less than 56052361 is relatively
prime to 56052361 is 210 420 630=56052361 = 0:99132. So this is the
probability that 56052361 passes the Fermat test. The probability that it
will pass ten Fermat tests is 0:9913210 = 0:91651, pretty good odds.
For all Carmichael numbers less than 1012 , see http://www.kobepharmau.ac.jp/~math/notes/note02.html
See also http://math.fau.edu/richman/carm.htm
Some other big Carmichael numbers are 492559141 = 367 733 1831,
413138881 = 617 661 1013, 16157879263 = 1667 2143 4523, 25749237001 =
1901 2851 4751, 58094662081 = 2633 4513 4889.
What do all these numbers have in common? They are all products of
three primes, n = pqr, and n 1 is divisible by p 1, q 1, and r 1. For
example, 29341 = 13 37 61 and 29340 = 22 32 5 163 which is divisible by
12, 36, and 60. This re‡ects Korselt’s criterion which says that a number
n > 1 is a Carmichael number exactly when n is square free (is not divisible
by the square of a prime) and if p is any prime divisor of n, then p 1 divides
n 1.
5.2
The Miller-Rabin test
The most commonly used probabilistic primality test is the Miller-Rabin
test. This test, which is a variant of Fermat’s test, cannot be fooled by
36
Carmichael numbers. If the numbers a that we use to test are chosen at
random, then the probability that a composite n will pass the test is less
than 1=2. So if we test 20 times in succession, choosing the numbers a
independently, then the probability of a composite passing these 20 tests is
less than 1=220 , that is, less than 1 in a million.
There is one new idea in the test: If the square of a number is 1 modulo
n, and n is a prime, then the number has to be either 1 or 1 modulo n.
That’s because z 2 1 = (z 1) (z + 1), so if z 2 1 is divisible by a prime n,
then n has to divide either z 1 or z + 1.
How do we run across a number whose square is equal to 1 modulo n?
When the number n passes Fermat’s test. In that case, an 1 is equal to 1
modulo n, and an 1 is the square of a(n 1)=2 . We know that n 1 is even,
because n were even, we would know that it wasn’t a prime. So we can check
to see if a(n 1)=2 is equal to 1 or 1 modulo n. That already is a better test
than just Fermat’s test.
Here is the Miller-Rabin test. You are given a number n that wish to test
for primality. Write n 1 = 2s d where d is odd. This is always possible:
you just keep dividing n 1 by 2 until you get an odd number. For the test,
you choose a number a at random so that 0 < a < n. Then you compute
x = ad (mod n). The test is to consider, modulo n, the sequence
s
x; x2 ; x4 ; x8 ; : : : ; x2
There are s + 1 terms in this sequence (because the exponents of x are
s
s
20 ; 21 ; 22 ; 23 ; : : : ; 2s ), and the last is x2 = ad2 = an 1 . Of course the last
term should be 1— that’s Fermat’s test.
Now each term in our sequence is the square of the previous term. So if
a term is 1, and the last term must be 1 or we would know that n was not a
prime by Fermat’s test, then the previous term must be 1. If that doesn’t
happen, we know that n is not a prime. So the good sequences look like
1; 1; 1; 1; 1; 1; 1; 1
1; 1; 1; 1; 1; 1; 1
; ; ; 1; 1; 1; 1
where
is a number di¤erent from
1. The bad sequences look like
; ; ; 1; 1; 1; 1
; ; ; ; ; ; 1
; ; ; ; ; ;
37
The bad sequences either don’t end in 1, so the Fermat test fails, or they
contain a 1 immediately following a .
If we denote the sequence by t0 ; t1 ; t2 ; : : : ; ts , then for the sequence to be
good we must have ts = 1 and there must not be an index i such that ti 6= 1
and ti+1 = 1. You don’t need a list to perform this test, but you may …nd
it convenient. Just compute x, keep squaring it, looking for the forbidden
s
z 6= 1 and z 2 = 1, and make sure that x2 = 1. While developing this
program you might want to print out the sequence t0 ; t1 ; t2 ; : : : ; ts .
Try this test out on the largest Carmichael numbers above.
If you …nd that a number passes the Fermat test, but fails the MillerRabin test, then you will have computed a number z so that z 6= 1 (mod n)
but z 2 = 1 (mod n). So n divides (z 1) (z + 1), but n does not divide
either z 1 or z + 1. In that case, gcd (z 1; n) has to be a proper factor of n, showing directly that n is not prime. This gives a quick way of
factoring because the Euclidean algorithm is very fast. For example, for
the number 56052361, and the random number 40202214, the sequence is
55252883; 1; 1; 1, so gcd (55252882; 56052361) = 88831 is a factor of 56052361.
6
The sum of the squares of the digits of a
number
We are interested in the function f that takes a number to the sum of the
squares of its digits. So
f (340779) = 32 + 42 + 02 + 72 + 72 + 92 = 9 + 16 + 0 + 49 + 49 + 81 = 204
After we …gure out how to compute this function, we will want to iterate it
and see what happens. That is, after computing f (340779) = 204, we will
compute f (204) = 20, then f (20) = 4, then f (4) = 16 and so on. The
sequence we get starting with 340779 is
340779, 204, 20, 4, 16, 37, 58, 89, 145, 42, 20
and now, of course, the sequence starts repeating. We have found a periodic
point, 20, of the function f . That is,
f 8 (20) = 20
38
where f 8 indicates the function f applied 8 times, that is
f 8 (m) = f (f (f (f (f (f (f (f (m))))))))
We call f 8 the 8-th iterate of f , and say that 20 has period 8. We want
to investigate the sequence m, f (m), f 2 (m), f 3 (m), . . . for various values
of m. We just did it for m = 340779. Notice that not only is 20 a periodic
point of f , but so are 4, 16, 37, 58, 89, 145, and 42. They each have period
8.
A happy number is a number m so that f k (m) = 1 for some k. Note
that f (1) = 1, so the sequence of iterates is 1 from k on. The number 1
is called a …xed point of f ; it is a periodic point of period 1 Of course 1
is a happy number. The next happy number is 7, which gives the sequence
7; 49; 97; 130; 10; 1. The number 2 is not a happy number because f (2) = 4
which is a periodic point of period 8, so the number 1 will not appear in the
sequence.
If we want to use a program to study these sequences, we will have to
write a function that computes f (m). How do we get hold of the digits of m
so that we can square them and add up the results? Consider m = 340779.
The easiest digit of m to get hold of is the last one, 9, because 9 = m%10.
To get at the next-to-the-last digit, we arrange for it to be the last digit of
another number. To do this, we subtract 9, the last digit, from m to get
340770. Now, dividing by 10, we get 34077, and the next digit we want is the
last digit of this number. We continue this computation as in the following
table
m m%10
m m%10 m m%10
10
340779
9
340770
34077
34077
7
34070
3407
3407
7
3400
340
340
0
340
34
34
4
30
3
3
3
0
0
Notice that we generate the digits in reverse order, the last one …rst. This
makes no di¤erence for the computation of f because we are just going to
square the digits and add up the results, and it doesn’t matter in which order
we add.
39
There is a recursion for f , which you can pretty much see from our computations, namely
f (m) = (m%10)2 + f
m
m%10
10
This tells us what to do with any nonzero m, reducing the computation of
f (m) to computing f of a number smaller than m. What about f (0)? You
should now be able to write a recursive Python function that computes f .
Write it and test it out on the numbers from 0 to n where, say, n = 100.
There is a Python trick that gives a slick way to compute the
sum of the squares of the digits of a number written in base 10.
It is sum(int(i)**2 for i in str(n)). The number n is converted to
a string of characters str(n). So 123 would be converted to the
string "123". Then int(i) converts each character in i in str(n)
into an integer, so we get the integers 1 2 3. Each one of those is
squared by **2, resulting in 1 4 9, and then sum adds them up.
This only works in base 10. Try it.
But, of course, what we want to see are not the sequences f (m), where
m goes from 0 to n, but the sequences f k (m) where m is …xed and k
goes from 1 to n. Write a function g (m; n) that prints out the values
m; f (m) ; f 2 (m) ; : : : ; f n (m) on a single line.
There are two numbers we can change in the de…nition of f (m), the sum
of the squares of the digits of m. One is the number 2 that we use because we
are summing the squares of the digits. We could just as well sum the cubes
of the digits, or the fourth powers of the digits. We can call that number
e, for exponent. For the original f , we have e = 2. It’s easy to change the
recursion for f (m) so that it now computes the sum of the cubes of the digits
of m. How do you do that?
The other number we can change is the number 10. That is the base b
for our system of representing numbers by strings of digits. If we change the
number b = 10 to, say, b = 8, then the digits we get are the digits in the
base-8 representation of m, that is, we are thinking of writing m in octal.
The octal representation of the number 340779 is 1231453, or 12314538 if we
want to indicate that the number is written in base 8. So the sum of the
squares of its digits is
1 + 4 + 9 + 1 + 16 + 25 + 9 = 65 = 1018
40
and if we continue summing the squares of the digits we get 2, 4, 16 = 208 ,
4 and we have run into a periodic point, 4, with period 2. You can easily
change your program for computing f (m) to one that computes f (m; e; b),
the sum of the e-th powers of the digits of m, where m is written in base b.
Your …rst function f (m) was f (m; 2; 10).
7
Finding orbits for sums of e-th powers
We are interested in the function fe that takes a number to the sum of the eth powers its digits. We examined f2 in the previous section. The function fe
maps the positive integers into the positive integers. That gives us a discrete
dynamical system. We are particularly interested in the …xed points of such
a system: numbers m such that fe (m) = m. The …xed points of f3 are 1,
153, 370, 371, and 407. We are also interested in periodic points. These
come in groups called orbits. An orbit is a …nite set S of positive integers
fs1 ; s2 ; : : : ; sk g so that fe (si ) = si+1 , for i < k, and fe (sk ) = s1 . If k = 1,
then the orbit S contains exactly one number, and that number is a …xed
point of fe .
It turns out that if we choose any positive integer whatsoever and keep
applying fe to it, then eventually we end up in an orbit.
8
Aliquot sequences
One very much studied discrete dynamical system is given by the function
s (n) which returns the sum of the proper divisors of n. Here we include the
divisor 1 even though it is rather uninteresting. But we don’t include n itself.
So s (2) = 1, s (3) = 1, s (4) = 1 + 2 = 3, s (5) = 1, and s (6) = 1 + 2 + 3 = 6.
Notice n is prime exactly when s (n) = 1. The number 6 is called perfect
because s (6) = 6. It is a …xed point s.
Technically, s (1) = 0 because 1 doesn’t have any proper divisors. I guess
we could say that s (n) is the sum of the positive divisors of n that are less
than n.
The study of perfect numbers goes back to antiquity. A related concept
is that of a pair of amicable numbers. These are distinct numbers m
and n such that s (m) = n and s (n) = m. That is, the pair fm; ng is
an orbit of size 2. Actually, we say that it is an orbit of period 2. The
41
smallest amicable pair is f220; 284g. Bigger orbits are called sets of sociable
numbers. There is no orbit of period 3. An example of an orbit of period 4
is f1264460; 1547860; 1727636; 1305184g.
Unlike the other functions we have looked at, there is no guarantee that
if we keep applying the function s, we will eventually reach a …nite orbit. In
fact, this is a major unsolved problem. The …rst problematic number is 276.
9
Ciphers
A standard method for enciphering a piece of text is to replace the letters by
other letters according to a …xed scheme. One famous example is attributed
to Julius Caesar. The idea was to replace the letter a by the letter d, the
letter b by the letter e, the letter c by the letter f , and so on until the letter
z is replaced by the letter c. The scheme can be summarized in the following
table
a b c d e f g h i j k l m n o p q r s t u v w x y z
d e f g h i j k l m n o p q r s t u v w x y z a b c
where the letters in the second row are substituted for the letters in the …rst
row. So the word “caesar”would become “fdhvu”. If we think of the letters
of the alphabet as being the numbers 0; 1; 2; : : : ; 25, this amounts to replacing
the number n by the number n + 3 (mod 26).
More brie‡y, we can think of the plain alphabet as being the string
"abcdefghijklmnopqrstuvwxyz" and what we need to do is to specify a cipher
alphabet which is some permutation of these 26 letters. In the Caesar cipher
case, we choose the cipher alphabet to be the string "defghijklmnopqrstuvwxyzabc".
Suppose p is the plain alphabet and c is the cipher alphabet. How do we
encrypt a message t? We want to form a new string r which is the encrypted
message. The i-th character in the message t is t [i].
We need to know the position j of t [i] in the plain alphabet p. (Remember
that positions start at 0.) Then we can set r [i] equal to c [j]. There is a
Python function associated with the string p that does just that. It is called
p.…nd(). If we write p.…nd(’u’), we will get the position of the …rst occurrence
of the letter u in the string p. If the letter u does not appear in the string p,
then we get the number 1. So we want to set r[i] = c[j] where j = p.…nd(t[i]).
To decipher a message, we interchange the roles of p and c.
42
There is a quick way to generate a cipher alphabet from a word, called
a keyword. It’s much easier to remember a keyword than to remember a
whole cipher alphabet. Here is how you generate a cipher alphabet from the
phrase “boca raton”. First you eliminate the space and all the duplicate
letters, so you are left with “bocartn”. Then you form the following table,
with the letters bocartn at the top, and the rest of the alphabet written in
rows underneath:
b o c a r t n
d e f
g h i j
k l m p q s u
v w x y z
Finally, you read o¤ the cipher alphabet by going down each column:
bdkvoelwcfmxagpyrhqztisnju
What happens when you take the keyword to be the one-letter word “a”?
The one-letter word “b”? The one-letter word “z”?
Can you develop a formula, like n + 3 (mod 26) for the Caesar cipher, for
how to encipher using the two-letter keyword “ab”?
10
Stu¤
What are the possible last two digits of a power of 2? Here are the …rst few
powers of 2 modulo 100:
2; 4; 8; 16; 32; 64; 28; 56; 12; 24; 48; 96; 92; 84; 68; 36; 72; 44; 88; 76; 52; 4
Multiplying by 2 modulo 100 is a discrete dynamical system. We have just
found one of the orbits, namely:
4; 8; 16; 32; 64; 28; 56; 12; 24; 48; 96; 92; 84; 68; 36; 72; 44; 88; 76; 52
This orbit has size 20. The numbers in the orbit are all periodic points of
period 20. Notice that these numbers are all divisible by 4— it is not di¢ cult
to show that this must be the case. However not all two-digit numbers that
are divisible by 4 appear in this list. For example, the number 40 doesn’t
appear in this list. Does that mean that a power of two cannot end in the
digits 40?
43
Download