Prime numbers, large integer arithmetic and a 2PF task farm
Iain Bethune
EPCC
ibethune@epcc.ed.ac.uk

Overview
• Prime Numbers
• Trial division
• Primality Proving
• Distributed Prime Search Projects

Prime Numbers
What is a prime number?

Prime Numbers
• The Wikipedia definition: A natural number 1, 2, 3, 4, 5, 6, ... is called a prime or a prime number if it is greater than 1 and has exactly two divisors, 1 and the number itself.
• i.e. the first few primes are: 2, 3, 5, 7, 11, …
• 4 is not prime since 4 = 2 x 2
• Non-prime integers greater than 1 are called composite

Prime Numbers
• Why are primes (maybe) interesting?
• Practical reasons
  – Some cryptographic algorithms, e.g. RSA, rely on the fact that given a product of two primes n = pq, it is computationally impractical to find p and q given only n
  – Currently, 1024 to 2048-bit RSA keys are the norm (2048 believed to be safe until ~2030 with current factoring algorithms)
• Primes in nature
  – Cicadas emerge from dormancy every 13 or 17 years
  – Brood XIX (on a 13 year cycle) is currently active in the US
  – Broods on different cycles only co-occur once every 13 x 17 = 221 years!

Prime Numbers
• Mathematical reasons
  – Primes are the building blocks of integer arithmetic. The fundamental theorem of arithmetic states that any integer > 1 may be expressed as a unique product of prime factors, e.g. 2051983 = 31 x 37 x 1789; as it happens, 251983 is prime! (http://www.mathsisfun.com/prime-factorization-tool.php)
  – There are infinitely many primes (proven by Euclid c.
    300 BC)
  – (Large) primes are rare: the prime number theorem states that the number of primes less than n is approximately n / ln(n)
  – So the probability that a given integer n is prime decreases as the magnitude of n grows
  – Studied by Fermat, Euler, Mersenne, Legendre, Gauss, Riemann, Proth, Lucas …

Prime Numbers
• Up to 1951 (pre-computer era), the largest prime number known was (2^148+1)/17 = 20988936657440586486151264256610222593863921 (44 digits)
• Generated using a mechanical calculator and Proth’s Theorem
• A record of the largest proven prime numbers is kept by Prof. Chris Caldwell (Univ. Tennessee) – known as the Top 5000

Prime Numbers
[Graph from primes.utm.edu]

Prime Numbers
• $150,000 offered by the EFF for the discovery of the first 100 million digit prime number (~2015?)
• $250,000 for the first billion digit prime (~2024?)
• Current largest known has 12,978,189 digits (2008)

Trial Division
• So we know what a prime is, how might we find out if a given number is prime?
• Simply divide the candidate n by all primes up to √n
  – If no divisor is found, then n is prime!
  – Assumes we know all the smaller primes
  – From the prime number theorem this will require approximately √n / ln(√n) divisions, plus the overhead of generating a table of primes, e.g. by the Sieve of Eratosthenes

Trial Division
• Problems occur for large n, e.g. for RSA n might be ~2^1024
  – There are around 10^151 candidate primes up to √n
  – Given a computer that can do 10^9 divisions per second, it would take ~10^142 seconds to prove n prime (or find its factors)
  – Clearly better algorithms needed!
• These numbers are also much bigger than can be stored as an integer (e.g.
  64 bits)
  – require arbitrary precision arithmetic -> even slower than the assumption above
• Still useful for finding small factors of large n
  – 50% of integers have 2 as a factor, 33% have 3 etc.
  – 92% of integers have a factor under 1000

Primality Proving
• So how might we prove a large n to be prime or composite?
• Several methods for arbitrary integers have been developed since the 1900s, but they are fairly slow
  – 2002: the AKS (Agrawal, Kayal, Saxena) test runs in O(p^6), where p = log(n) is the number of digits
• But highly efficient methods for special number forms are available
  – Lucas-Lehmer test for Mersenne Primes M_p = 2^p – 1 is O(p^2 log(p) log(log(p)))
  – Lucas-Lehmer-Riesel test for Riesel Primes N = k·2^n – 1 (2^n > k)
  – Brillhart–Lehmer–Selfridge test for Proth Primes N = k·2^n + 1 (2^n > k)

Lucas-Lehmer Test
    bool isPrime(p)
      s = 4
      m = 2^p − 1
      do i = 1, p − 2            (p − 2 iterations)
        s = ((s × s) − 2) mod m
      end do
      if s = 0 return true else return false
• 1 multiplication (squaring) and 1 subtraction modulo m per iteration
• Does not give factors

Lucas-Lehmer Test
• s is a large number, much bigger than 2^64, e.g. M ≈ 2^(>10,000,000)
• So store s as an array of digits in some base, e.g. 2^10 = 1024
• Multiply using the primary school long multiplication method:

Lucas-Lehmer Test
Worked example: 12 x 34
    units:    2 x 4 = 8
    tens:     1 x 4 = 4, 2 x 3 = 6
    hundreds: 1 x 3 = 3
    per-digit sums 3, 10, 8; after carries: 4 0 8
• We compute the product of each digit of one number with every digit of the other – O(p^2)
• Sum per-digit, then perform the carries
• In 1971, Schönhage and Strassen observed the analogy with the convolution operation (e.g.
  in signal processing)
• Convolution can be calculated in O(p log(p)) using the Fast Fourier Transform

Lucas-Lehmer Test
• So multiplication of two p-digit numbers X and Y becomes:
    Pad X and Y to 2p digits
    X’ = FFT(X)
    Y’ = FFT(Y)
    XY’ = X’ * Y’   (element-wise product)
    XY = FFT^-1(XY’)
• Still need to propagate carries, do the subtraction and the modulo operation - O(p)
• Instead of doing an integer DFT, we do a floating-point FFT, and round back to integer (subject to rounding errors)

Lucas-Lehmer Test
• In 1994, Crandall and Fagin introduced an even faster method for FFT multiplication modulo Mersenne Numbers: the Irrational Base Discrete Weighted Transform (IBDWT)
  – Express the large integer in an ‘irrational base’, i.e. divide the p bits into digits of approximately the same length
  – Multiply each digit by an element of a weight vector
  – Perform the FFT multiplication as before
  – Divide out the weight vector
• This has the property that we only require p bits (no padding), and the result is already correct mod (2^p – 1)

Lucas-Lehmer Test
• This method is implemented in George Woltman’s gwnum library, and is used by the Prime95 program for testing Mersenne numbers for primality
• LL tests for current Mersenne Prime candidates take several months to run on a single CPU core
• Prime95 is also included in ‘torture test’ suites for hardware error checking
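
The Lucas-Lehmer slides above can be tied together in a short Python sketch. This is an illustration under stated assumptions, not the method used by gwnum or Prime95: it uses a plain zero-padded floating-point FFT on base 2^10 digits followed by an ordinary modulo, rather than the IBDWT. All names (fft, to_digits, fft_multiply, lucas_lehmer, BASE_BITS) are made up for this example, and the rounding step is only trustworthy for small exponents.

```python
import cmath

BASE_BITS = 10            # digits are stored in base 2**10 = 1024
BASE = 1 << BASE_BITS


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.

    With invert=True the result is the unnormalised inverse transform
    (the caller divides by the length).
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def to_digits(x):
    """Split a non-negative integer into base-1024 digits, least significant first."""
    digits = []
    while x:
        digits.append(x & (BASE - 1))
        x >>= BASE_BITS
    return digits or [0]


def fft_multiply(x, y):
    """Multiply two non-negative integers via FFT convolution of their digits."""
    xd, yd = to_digits(x), to_digits(y)
    size = 1
    while size < len(xd) + len(yd):     # pad so the convolution does not wrap
        size *= 2
    fx = fft(xd + [0] * (size - len(xd)))
    fy = fft(yd + [0] * (size - len(yd)))
    prod = fft([u * v for u, v in zip(fx, fy)], invert=True)
    # Round the floating-point result back to integer per-digit sums.
    columns = [round(c.real / size) for c in prod]
    # Propagate carries so that every digit is below BASE again.
    digits, carry = [], 0
    for col in columns:
        carry, digit = divmod(col + carry, BASE)
        digits.append(digit)
    digits.append(carry)
    return sum(d << (BASE_BITS * i) for i, d in enumerate(digits))


def lucas_lehmer(p):
    """Return True if M_p = 2**p - 1 is prime; p is assumed to be an odd prime."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):              # p - 2 iterations
        s = (fft_multiply(s, s) - 2) % m
    return s == 0
```

For example, fft_multiply(12, 34) returns 408, lucas_lehmer(7) returns True (2^7 – 1 = 127 is prime) and lucas_lehmer(11) returns False (2^11 – 1 = 2047 = 23 x 89). Python's built-in integers already handle arbitrary precision, so the digit arrays here exist only to make the convolution and carry steps visible.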

Lucas-Lehmer Test
• Parallelisation
  – Threaded LL tests are possible, but parallel efficiency is below 100%
  – Since the range of exponents to test is unbounded, better to use cores to run multiple tests, rather than a single test in parallel
• GPUs
  – Double precision GPUs (CUDA Compute capability 1.3/2.0+) required to avoid rounding errors in FFT
  – Faster than a CPU but much less efficient (~4x speedup)
  – Only efficient for large exponents

Sieving
• Primality testing every potential prime could be very time-consuming
• Better to ‘sieve’ out composite numbers by trial dividing candidates by a range of ‘small’ primes
  – No time to discuss algorithms here, but time per candidate is much smaller than for deterministic tests
  – Also can be very efficiently accelerated by GPUs
• A (partial) sieve can only prove a number composite, not prime, so the remaining candidates still have to be checked with a deterministic test

Distributed Prime Search Projects
• In 1996, George Woltman set up the GIMPS (Great Internet Mersenne Prime Search) project, using his Prime95 software
  – First known distributed computing (DC) project (predating e.g. SETI@Home, 1999)
• Client PrimeNet added in 1997 to manage work distribution/submission
• Average performance of 61 TFLOP/s over last 30 days (4633 users, 27349 computers)
  – ~5 times HPCx sustained LINPACK performance
• Searches only for Mersenne Primes
  – Has held No.1 on largest prime list since 1996 (11 new records since)

Distributed Prime Search Projects
• PrimeGrid launched in 2005, using the BOINC distributed computing client (from the University of California,
  Berkeley), derived from SETI@Home
• Run by Rytis Slatkevicius + a team of volunteer admins and developers
• Currently averaging ~1.5-2 PF performance, the highest of all BOINC projects (7504 users, 15864 computers)
  – ~5 times HECToR sustained LINPACK performance

Distributed Prime Search Projects
[Figure, annotated: Mid 2010 – GPU Apps added]

Distributed Prime Search Projects
• Range of prime search sub-projects
  – Cullen & Woodall primes n·2^n ± 1 (discoverer of the largest of each form)
  – Proth primes k·2^n + 1
  – Sophie Germain primes p, where 2p+1 is also prime (currently searching for a 200,000+ digit SG prime, largest known has 79911 digits)
  – Solving the Riesel, Sierpinski and Prime Sierpinski conjectures
      – Odd integers k exist such that k·2^n – 1 is composite for all n (Riesel; the Sierpinski conjectures concern k·2^n + 1)
      – The conjectures are that k = 509203, 78557 and 271129 respectively are the smallest such ks
      – 57, 6 and 11 ks remain for which a prime has not yet been found
  – Twin Prime Search (largest known pair of primes n ± 1, 100,355 digits)
  – AP26 (world record 26 primes in arithmetic progression, April 2010)
  – Largest producer of Top5000 primes

Summary
• Primes are enigmatic - infinite yet rare, easily defined but difficult to predict
• Primality proofs for very large numbers (10 million+ digits) are possible using large integer arithmetic techniques on commodity PCs
• GPU acceleration still at early stage
  – Possible MSc dissertation project here!
• Distributed prime search projects have been generating record primes for ~15 years

References
• Chris Caldwell’s Top5000 - http://primes.utm.edu/
• Prime Numbers: A Computational Perspective, Crandall & Pomerance, 2nd ed., 2005
• Richard Crandall, Barry Fagin: Discrete weighted transforms and large-integer arithmetic, Mathematics of Computation 62 (205), 305-324, January 1994
• GIMPS - http://www.mersenne.org/
• PrimeGrid - http://www.primegrid.com/
• Wikipedia :-)