Prime numbers, Large integer arithmetic and a 2PF task farm

Iain Bethune
•  Prime Numbers
•  Trial division
•  Primality Proving
•  Distributed Prime Search Projects
Prime Numbers
What is a prime number?
Prime Numbers
•  The Wikipedia definition:
A natural number
1, 2, 3, 4, 5, 6, ...
is called a prime or a prime number if it is greater than 1 and has
exactly two divisors, 1 and the number itself.
•  i.e. the first few primes are : 2,3,5,7,11, …
•  4 is not prime since 4 = 2 x 2
•  Non-prime integers are called composite
Prime Numbers
•  Why are primes (maybe) interesting?
•  Practical reasons
–  Some cryptographic algorithms e.g. RSA rely on the fact that given a
product of two primes n=pq, it is computationally impractical to find
p and q given only n
–  Currently, 1024 to 2048-bit RSA keys are the norm (2048 believed to
be safe until ~2030 with current factoring algorithms)
•  Primes in nature
–  Cicadas emerge from dormancy every 13 or 17 years
–  Brood XIX (on a 13 year cycle) is currently active in the US
–  Broods on different cycles only co-occur once every 221 years!
Prime Numbers
•  Mathematical reasons
–  Primes are the building blocks of integer arithmetic. The
fundamental theorem of arithmetic states that any integer > 1 may
be expressed as a unique product of prime factors e.g. 2051983 =
31 x 37 x 1789 (as it happens, 251983 is prime!)
–  There are infinitely many primes (proven by Euclid c. 300 BC)
–  (Large) primes are rare: the prime number theorem states that the
number of primes less than n is approximately π(n) ≈ n / ln(n)
–  So the probability that a given integer n is prime is roughly 1 / ln(n),
which decreases with the magnitude of n
–  Studied by Fermat, Euler, Mersenne, Legendre, Gauss, Riemann,
Proth, Lucas …
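The prime number theorem estimate above can be checked numerically. The following is a minimal sketch, not part of the original slides: the helper name count_primes_below is hypothetical, and it uses a plain Sieve of Eratosthenes to compare the exact count with n / ln(n).

```python
import math

def count_primes_below(n):
    """Count the primes strictly less than n with a Sieve of Eratosthenes."""
    if n < 3:
        return 0
    is_prime = bytearray([1]) * n          # is_prime[i] == 1 means "i may be prime"
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(n - 1) + 1):
        if is_prime[i]:
            # Cross out i*i, i*i + i, ... (smaller multiples are already crossed out)
            is_prime[i * i::i] = bytearray(len(range(i * i, n, i)))
    return sum(is_prime)

for exponent in range(2, 7):
    n = 10 ** exponent
    exact = count_primes_below(n)
    estimate = n / math.log(n)
    # The density exact / n shrinks roughly like 1 / ln(n) as n grows
    print(f"n = 10^{exponent}: pi(n) = {exact}, n/ln(n) = {estimate:.0f}, "
          f"density = {exact / n:.4f}")
```

The estimate undercounts slightly at these sizes, but the ratio of the two columns approaches 1 as n grows.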
Prime Numbers
•  Up to 1951 (pre-computer era), the largest prime number
known was (2^148+1)/17 =
20988936657440586486151264256610222593863921 (44 digits)
•  Generated using a mechanical calculator and Proth’s
•  A record of the largest proven prime numbers is kept by Prof.
Chris Caldwell (Univ. Tennessee) – known as the Top 5000
Prime Numbers
•  $150,000 offered by the EFF for the discovery of the first 100
million digit prime number (~2015?)
•  $250,000 for first billion digit prime (~2024?)
•  Current largest known has 12,978,189 digits (2008)
Trial Division
•  So we know what a prime is, how might we find out if a given
number is prime?
•  Simply divide the candidate n by all primes up to √n
–  If no divisor is found, then n is prime!
–  Assumes we know all the smaller primes
–  From the prime number theorem this will require approximately
√n / ln(√n) divisions, plus the overhead of generating a table
of primes, e.g. by the Sieve of Eratosthenes
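As a sketch of the procedure just described (the function names are illustrative, not from the slides), the table of small primes can be generated with the Sieve of Eratosthenes and then used for trial division up to √n:

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: return all primes <= limit."""
    if limit < 2:
        return []
    flags = bytearray([1]) * (limit + 1)
    flags[0] = flags[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if flags[i]:
            flags[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(flags) if flag]

def is_prime_trial_division(n):
    """Divide n by every prime up to sqrt(n); n is prime if none divides it."""
    if n < 2:
        return False
    for q in primes_up_to(math.isqrt(n)):
        if n % q == 0:
            return False        # q is a proper factor, so n is composite
    return True

print(is_prime_trial_division(2051983))   # 31 x 37 x 1789, so composite
print(is_prime_trial_division(7919))
```

This is only practical while √n fits comfortably in memory, which is exactly the limitation the next slide discusses.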
Trial Division
•  Problems occur for large n e.g. for RSA n might be ~2^1024
–  There are around 10^151 candidate primes up to √n
–  Given a computer that can do 10^9 divisions per second, it would take
~10^142 seconds to prove n prime (or find its factors)
–  Clearly better algorithms needed!
•  These numbers are also much bigger than can be stored as
an integer (e.g. 64 bits)
–  require arbitrary precision arithmetic -> even slower than the division rate assumed above
•  Still useful for finding small factors of large n
–  50% of integers have 2 as a factor, 33% have 3 etc.
–  92% of integers have a factor under 1000
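The percentages above follow from a simple product: an integer avoids every prime q below some limit with probability (1 - 1/2)(1 - 1/3)(1 - 1/5)..., so the fraction with at least one such factor is one minus that product. A minimal sketch (the function name is hypothetical):

```python
def fraction_with_factor_below(limit):
    """Fraction of all integers divisible by at least one prime q < limit."""
    survivors = 1.0                      # fraction with no prime factor seen so far
    for q in range(2, limit):
        # Naive primality check of q; fine for the small limits used here
        if all(q % d for d in range(2, int(q ** 0.5) + 1)):
            survivors *= 1 - 1 / q
    return 1 - survivors

for limit in (3, 4, 1000):
    print(f"prime factor below {limit}: {fraction_with_factor_below(limit):.1%}")
```

With a limit of 1000 the result is close to the 92% quoted on the slide, which is why trial division by small primes is a cheap first filter.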
Primality Proving
•  So how might we prove a large n to be prime or composite?
•  Several methods for arbitrary integers developed since
1900s – but fairly slow
–  2002: AKS (Agrawal, Kayal, Saxena) test runs in O(p^6) where
p=log(n) is the number of digits
•  But highly efficient methods exist for special number forms:
–  Lucas-Lehmer test for Mersenne Primes M_p = 2^p – 1 is O(p^2 log(p) log(log(p)))
–  Lucas-Lehmer-Riesel test for Riesel Primes N = k·2^n - 1 (2^n > k)
–  Brillhart–Lehmer–Selfridge test for Proth Primes N = k·2^n + 1 (2^n > k)
Lucas-Lehmer Test
bool isPrime(p)
    s = 4
    m = 2^p − 1
    do i = 1, p − 2
        s = ((s × s) − 2) mod m
    end do
    if s = 0
        return true
    return false
•  p − 2 iterations, each with 1 multiplication (squaring) and 1
subtraction modulo m
•  Does not give factors
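The pseudocode above translates almost line for line into Python, whose built-in integers are already arbitrary precision. This is a sketch for small exponents only; the special case for p = 2 is an addition, because the loop as written assumes an odd prime exponent.

```python
def lucas_lehmer(p):
    """Return True if the Mersenne number 2**p - 1 is prime (p must be prime)."""
    if p == 2:
        return True                 # M_2 = 3; the iteration below needs odd p
    m = (1 << p) - 1                # m = 2^p - 1
    s = 4
    for _ in range(p - 2):          # p - 2 iterations
        s = (s * s - 2) % m         # one squaring and one subtraction modulo m
    return s == 0

small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
print([p for p in small_primes if lucas_lehmer(p)])
```

Python's built-in multiplication is far from the FFT-based methods described in the following slides, so this is only useful for exponents of a few thousand or less.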
Lucas-Lehmer Test
•  s is a large number – much bigger than 2^64 e.g. M ≈
•  So store s as an array of digits in some base e.g. 2^10=1024
•  Multiply using primary school long multiplication method:
Lucas-Lehmer Test
•  Example: 12 x 34
–  Digit products: 2 x 4 = 8, 1 x 4 = 4, 2 x 3 = 6, 1 x 3 = 3
–  Column sums: 3, 10, 8; after carrying: 4 0 8
•  We compute the product of each digit of one number with every
digit of the other – O(p^2)
•  Sum per-digit, then perform the carries
•  In 1971, Schönhage and Strassen observed the analogy with the
convolution operation (e.g. in signal processing)
•  Convolution can be calculated in O(p log(p)) using Fast Fourier
Transform
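The long multiplication described on this slide can be sketched on digit arrays as follows. The helper names and the least-significant-digit-first storage order are assumptions made for this illustration.

```python
def to_digits(n, base):
    """Digits of a non-negative integer, least significant first."""
    digits = []
    while n:
        n, r = divmod(n, base)
        digits.append(r)
    return digits or [0]

def from_digits(digits, base):
    """Inverse of to_digits."""
    return sum(d * base ** i for i, d in enumerate(digits))

def schoolbook_multiply(x, y, base=10):
    """O(p^2) product of two digit arrays, returned as a digit array."""
    columns = [0] * (len(x) + len(y))
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            columns[i + j] += xi * yj      # every digit times every other digit
    carry = 0
    for k in range(len(columns)):          # sum per column, then carry
        carry, columns[k] = divmod(columns[k] + carry, base)
    return columns

product = schoolbook_multiply(to_digits(12, 10), to_digits(34, 10))
print(product, from_digits(product, 10))
```

The inner double loop is exactly a convolution of the two digit sequences, which is the observation that makes the FFT approach possible.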
Lucas-Lehmer Test
•  So multiplication of two p-digit numbers X and Y becomes:
Pad X and Y to 2p digits
X’ = FFT(X)
Y’ = FFT(Y)
XY’ = X’ * Y’ (element-wise product)
XY = FFT^-1(XY’)
•  Still need to propagate carries, do the subtraction and the
modulo operation - O(p)
•  Instead of doing an integer DFT, we do a floating-point FFT,
and round back to integer (subject to errors)
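The steps above can be sketched in pure Python. This is an illustration under stated assumptions rather than a production routine: the recursive radix-2 FFT and the names fft and fft_multiply are written for this example, digits are stored in base 2^10, and the rounding step is only safe while the floating-point error stays well below 0.5, which limits the operand size.

```python
import cmath

BITS = 10                       # digits in base 2^10 = 1024
BASE = 1 << BITS

def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def to_digits(v):
    """Base-1024 digits of a non-negative integer, least significant first."""
    digits = []
    while v:
        digits.append(v & (BASE - 1))
        v >>= BITS
    return digits or [0]

def fft_multiply(x, y):
    """Multiply two non-negative integers via a floating-point FFT convolution."""
    xd, yd = to_digits(x), to_digits(y)
    n = 1
    while n < len(xd) + len(yd):            # pad so the convolution cannot wrap
        n *= 2
    fx = fft(xd + [0] * (n - len(xd)))
    fy = fft(yd + [0] * (n - len(yd)))
    conv = fft([a * b for a, b in zip(fx, fy)], inverse=True)
    result, carry = 0, 0
    for k, c in enumerate(conv):
        # Normalise by n, round back to an integer, then propagate the carry
        carry, digit = divmod(round(c.real / n) + carry, BASE)
        result += digit << (BITS * k)
    return result + (carry << (BITS * n))

print(fft_multiply(12, 34))
```

Comparing the result against Python's own x * y for random inputs is a simple way to see where the rounding assumption starts to fail as the operands grow.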
Lucas-Lehmer Test
•  In 1994, Crandall and Fagin introduced an even faster
method for FFT multiplication modulo Mersenne Numbers –
the Irrational Base Discrete Weighted Transform (IBDWT)
–  Express the large integer in an ‘irrational base’ - i.e. divide the p bits
into digits of approximately the same length
–  Multiply each digit by an element of a weight vector
–  Perform the FFT multiplication as before
–  Divide out the weight vector
•  This has the property that we only require p bits (no
padding), and the result is already correct mod (2^p - 1)
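The weighting idea can be illustrated with a small sketch. It is not the gwnum implementation: the function names are hypothetical, the FFT is a simple recursive one, and the final carry propagation and reduction are done with Python big integers for brevity. Digit j starts at bit ceil(p·j/n), and the weight 2^(ceil(p·j/n) - p·j/n) makes a length-n cyclic convolution wrap around at 2^p, which is 1 modulo 2^p - 1.

```python
import cmath

def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ibdwt_mulmod(x, y, p, n):
    """x*y mod (2**p - 1) for 0 <= x, y < 2**p, using n variable-size digits."""
    # Bit offset of digit j is ceil(p*j/n); digit sizes differ by at most one bit
    offset = [-(-p * j // n) for j in range(n + 1)]
    weight = [2.0 ** (offset[j] - p * j / n) for j in range(n)]

    def split(v):
        return [(v >> offset[j]) & ((1 << (offset[j + 1] - offset[j])) - 1)
                for j in range(n)]

    # Multiply each digit by its weight, then do the FFT multiplication (no padding)
    fx = fft([d * w for d, w in zip(split(x), weight)])
    fy = fft([d * w for d, w in zip(split(y), weight)])
    conv = fft([a * b for a, b in zip(fx, fy)], inverse=True)
    # Divide out the weights (and the FFT normalisation n), round to integers
    coeffs = [round(c.real / (n * w)) for c, w in zip(conv, weight)]
    # Simplification: carries are resolved with big-integer arithmetic here
    total = sum(c << offset[j] for j, c in enumerate(coeffs))
    return total % ((1 << p) - 1)

p, n = 61, 8
m = (1 << p) - 1
s = 4
for _ in range(p - 2):                      # Lucas-Lehmer loop using the sketch
    s = (ibdwt_mulmod(s, s, p, n) - 2) % m
print(s == 0)
```

The transform length is n rather than 2n, which is the saving the slide refers to; a real implementation would also fold the carries back cyclically instead of calling a big-integer modulo.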
Lucas-Lehmer Test
•  This method is implemented in George Woltman’s
gwnum library, and is used by the Prime95 program for
testing Mersenne numbers for primality
•  LL tests for current Mersenne Prime candidates take several
months to run on a single CPU core
•  Prime95 is also included in ‘torture test’ suites for hardware
error checking
Lucas-Lehmer Test
•  Parallelisation
–  Threaded LL tests are possible, but efficiency < 100%
–  Since the range of exponents to test is ∞, better to use cores to run
multiple tests, rather than a single test in parallel
•  GPUs
–  Double precision GPUs (CUDA Compute capability 1.3/2.0+) required
to avoid rounding errors in FFT
–  Faster than a CPU but much less efficient (~4x speedup)
–  Only efficient for large exponents
•  Primality testing every potential prime
could be very time-consuming
•  Better to ‘sieve’ out composite numbers by trial dividing
candidates by a range of ‘small’ primes
–  No time to discuss algorithms here, but time per candidate much
smaller than deterministic tests
–  Also can be very efficiently accelerated by GPUs
•  A (partial) sieve can only prove a number composite, not
prime, so the remaining candidates still have to be tested with a
deterministic test
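The slides do not describe a sieving algorithm, so the following is only one possible sketch, assuming Proth-form candidates k·2^n + 1 with a fixed n. For each small prime q there is exactly one residue of k modulo q that makes the candidate divisible by q, so whole arithmetic progressions of k can be removed without any large-integer division. The function name and parameters are hypothetical.

```python
def sieve_proth_candidates(n, k_max, small_primes):
    """Odd k < k_max for which k*2**n + 1 has no factor among small_primes.

    Assumes every q in small_primes is an odd prime and that 2**n exceeds
    all of them, so no candidate is itself one of the small primes.
    """
    alive = bytearray([1]) * k_max
    for q in small_primes:
        # k*2^n + 1 = 0 (mod q)  <=>  k = -(2^n)^(-1) (mod q)
        k0 = (-pow(2, -n, q)) % q        # modular inverse power, Python 3.8+
        for k in range(k0, k_max, q):
            alive[k] = 0                 # q divides k*2^n + 1, so it is composite
    return [k for k in range(1, k_max, 2) if alive[k]]

# Odd primes below 100, found by naive trial division
ODD_PRIMES_BELOW_100 = [q for q in range(3, 100, 2)
                        if all(q % d for d in range(3, int(q ** 0.5) + 1, 2))]

survivors = sieve_proth_candidates(50, 200, ODD_PRIMES_BELOW_100)
print(len(survivors), "of 100 odd k survive the sieve")
```

The survivors are not known to be prime; they are simply the candidates still worth passing to a deterministic test.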
Distributed Prime Search Projects
•  In 1996, George Woltman set up the GIMPS
(Great Internet Mersenne Prime Search)
project, using his Prime95 software
–  First known distributed computing (DC) project (predating e.g.
SETI@Home, 1999)
•  Client PrimeNet added in 1997 to manage work distribution/
•  Average performance of 61 TFLOP/s over last 30 days (4633
users, 27349 computers)
–  ~5 times HPCx sustained LINPACK performance
•  Searches only for Mersenne Primes
–  Has held No.1 on largest prime list since 1996 (11 new records since)
Distributed Prime Search Projects
•  PrimeGrid launched in 2005, using the BOINC distributed
computing client (from Uni. Calif. Berkeley), derived from
•  Run by Rytis Slatkevicius + a team of volunteer admins and
•  Currently averaging ~1.5-2 PF performance – the highest of
all BOINC projects (7504 users, 15864 computers)
–  ~5 times HECToR sustained LINPACK performance
Distributed Prime Search Projects
Mid 2010 – GPU Apps added
Distributed Prime Search Projects
•  Range of prime search sub-projects
–  Cullen & Woodall primes n·2^n ± 1 (discoverer of the largest known of each form)
–  Proth primes k·2^n + 1
–  Sophie Germain primes p, where 2p+1 is also prime (currently searching
for a 200,000+ digit SG prime, largest known has 79911 digits)
–  Solving the Riesel, Sierpinski and Prime Sierpinski conjectures
–  Odd integers k exist such that k·2^n - 1 is composite for all n (Riesel),
and likewise such that k·2^n + 1 is composite for all n (Sierpinski)
–  The conjectures are that k=509203, 78557, 271129 are the
smallest such ks
–  57, 6 and 11 ks remain for which a prime has not yet been found
–  Twin Prime Search (largest known n±1 pair of primes, 100,355 digits)
–  AP26 (world record 26 primes in arithmetic progression, April 2010)
–  Largest producer of Top5000 primes
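To illustrate why such k can exist at all, the sketch below checks the well-known covering set {3, 5, 7, 13, 19, 37, 73} for the Sierpinski number 78557: every 78557·2^n + 1 is divisible by one of these primes. The function names are hypothetical. Because the residues of 2^n modulo each prime repeat, checking one full period of n is enough to cover all n.

```python
from math import gcd

COVERING_SET = (3, 5, 7, 13, 19, 37, 73)

def order_of_two(q):
    """Smallest e > 0 with 2**e = 1 (mod q), for an odd prime q."""
    order, value = 1, 2 % q
    while value != 1:
        value = value * 2 % q
        order += 1
    return order

def covering_period(primes):
    """Least common multiple of the orders of 2 modulo each prime."""
    period = 1
    for q in primes:
        e = order_of_two(q)
        period = period * e // gcd(period, e)
    return period

def covering_factor(k, n, primes=COVERING_SET):
    """Return a prime from the set dividing k*2**n + 1, or None if none does."""
    for q in primes:
        if (k * pow(2, n, q) + 1) % q == 0:
            return q
    return None

period = covering_period(COVERING_SET)
covered = all(covering_factor(78557, n) is not None for n in range(period))
print(f"period = {period}, every n covered: {covered}")
```

The conjecture being tested by the project is the harder direction: that no smaller k has this property, which requires finding a prime for each remaining smaller k.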
•  Primes are enigmatic - infinite yet rare, easily defined but
difficult to predict
•  Primality proofs for very large numbers (10 million+ digits)
possible using large integer arithmetic techniques on
commodity PCs
•  GPU acceleration still at early stage
–  Possible MSc dissertation project here!
•  Distributed prime search projects have been generating
record primes for ~15 years
•  Chris Caldwell’s Top5000 -
•  Prime Numbers: A Computational Perspective, Crandall &
Pomerance, 2nd ed., 2005
•  Richard Crandall, Barry Fagin: Discrete weighted transforms
and large-integer arithmetic, Mathematics of Computation 62,
205, 305-324, January 1994
•  GIMPS -
•  PrimeGrid -
•  Wikipedia :)
