A Solution to POJ1811 Prime Test

STYC
Problem Description
Given an integer N which satisfies the
relation 2 < N < 2^54, determine
whether or not it is a prime number.
If N is not a prime number, find out
its smallest prime factor.
Key Concepts
Prime numbers: A prime number is a
positive integer p > 1 that has no
positive integer divisors other than 1
and p itself.
Framework of the Solution
Determine whether the given N is
prime or not.
If N is prime, print “Prime” and exit.
Factorize N for its smallest prime
factor.
The Brute-force Way
Trial Division
If N is even, then 2 is its smallest prime
factor.
Try dividing N by every odd number k
between 3 and N^(1/2). The smallest k by
which N is divisible is the smallest prime
factor of N. If no such k exists,
then N is prime.
Complexity: O(N^(1/2)) for time, O(1) for
space
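The trial division described above can be sketched in Python as follows. This is a minimal illustration, not the code of any solution tested later; the function name `smallest_factor_trial` is an assumption made for the example.

```python
from math import isqrt

def smallest_factor_trial(n):
    """Return the smallest prime factor of n >= 2 (n itself if n is prime)."""
    if n % 2 == 0:
        return 2
    k = 3
    limit = isqrt(n)          # integer square root, no floating-point rounding
    while k <= limit:
        if n % k == 0:
            return k          # the first odd divisor found is necessarily prime
        k += 2                # skip even candidates
    return n                  # no divisor up to sqrt(n): n is prime
```

For N close to 2^54 with no small factor, the loop runs about 2^26 times, which is why the later slides look for faster methods.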
Modified Brute-force
Construct a table that stores all prime
numbers not greater than N^(1/2). Try dividing
N only by prime numbers.
Complexity: O(N^(1/2) log N) for time, O(N^(1/2))
for space using the Sieve of Eratosthenes
Estimation of space consumption: 2^26 bits =
2^23 bytes = 8,192 kilobytes
Much time is used in the process of sieving
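A rough sketch of this variant: sieve the primes up to N^(1/2) once, then divide only by table entries. The names `primes_up_to` and `smallest_factor_with_table` are hypothetical, and the sketch uses one byte per number rather than the one bit per odd number assumed in the space estimate above.

```python
from math import isqrt

def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes <= limit."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out p*p, p*p + p, ... in one slice assignment.
            count = len(range(p * p, limit + 1, p))
            is_prime[p * p::p] = bytearray(count)
    return [i for i, flag in enumerate(is_prime) if flag]

def smallest_factor_with_table(n, primes):
    """Smallest prime factor of n; primes must cover everything up to sqrt(n)."""
    for p in primes:
        if p * p > n:
            break
        if n % p == 0:
            return p
    return n
```

Usage: `smallest_factor_with_table(221, primes_up_to(isqrt(221)))` divides by 2, 3, 5, 7, 11 and 13 only.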
Modified Brute-force 2
Embed a table of prime numbers smaller
than Nmax^(1/4) into the source. Extend the
table to N^(1/2) by runtime calculation.
Complexity: O(N^(3/4)/log N) for time,
O(N^(1/2)/log N) for space
Estimation of time consumption: Finding all
7,603,553 primes smaller than 2^27 takes
approx. 1.32 x 10^11 divisions or 73 minutes
on a Pentium 1.5 GHz.
Brute-force with Trick
Start from N^(1/2) rather than 2. Do
factorization recursively once a
factor is found.
Efficient in handling N = pq where p
and q are relatively close to each other.
POJ accepts this! - westever’s
solution
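One possible reading of this trick, sketched in Python: search downward from N^(1/2) for any divisor, then recurse on both cofactors and keep the smaller result. This is an interpretation of the slide, not westever's actual code, and the function name is an assumption.

```python
from math import isqrt

def smallest_prime_factor_descending(n):
    """Smallest prime factor of n >= 2, searching divisors downward from sqrt(n)."""
    k = isqrt(n)
    while k > 1:
        if n % k == 0:
            # n = k * (n // k); both parts are smaller than n, so recurse on each.
            return min(smallest_prime_factor_descending(k),
                       smallest_prime_factor_descending(n // k))
        k -= 1
    return n  # no divisor in [2, sqrt(n)], so n is prime
```

For N = 10007 * 10009 the very first candidate divides N, whereas plain trial division would need about 5,000 odd candidates. A large prime N still costs the full N^(1/2) steps.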
Brute-force with Trick 2
Wheel Factorization
Test whether N is a multiple of 2, 3
or 5. If it is, the problem has been
solved.
If not, do trial division using only
integers which are not multiples of 2,
3, and 5.
Saves 7/15 of the work compared with trying every odd number.
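A minimal sketch of the 2-3-5 wheel, assuming the function name `smallest_factor_wheel`. Only 8 residues out of every 30 are coprime to 2, 3 and 5, so the candidate divisor advances by a repeating pattern of gaps.

```python
def smallest_factor_wheel(n):
    """Smallest prime factor of n >= 2 using a wheel built on 2, 3 and 5."""
    for p in (2, 3, 5):
        if n % p == 0:
            return p
    # Gaps between consecutive integers coprime to 30, starting at 7:
    # 7, 11, 13, 17, 19, 23, 29, 31, 37, ...
    gaps = (4, 2, 4, 2, 4, 6, 2, 6)
    k, i = 7, 0
    while k * k <= n:
        if n % k == 0:
            return k
        k += gaps[i]
        i = (i + 1) % 8
    return n  # no divisor found: n is prime
```

Odd-only trial division tests 15 candidates per block of 30; the wheel tests 8.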
Key Concepts (cont.)
Prime factorization algorithms:
Algorithms devised for determining
the prime factors of a given number
Primality tests: Tests to determine
whether or not a given number is
prime, as opposed to actually
decomposing the number into its
constituent prime factors
Primality Tests
Deterministic: Adleman-Pomerance-Rumely Primality Test, Elliptic Curve
Primality Proving…
Probabilistic: Rabin-Miller Strong
Pseudoprime Test…
Rabin-Miller
Strong Pseudoprime Test
Given an odd integer N, let N = 2^r s + 1
with s odd. Then choose a random
integer a between 1 and N - 1. If a^s =
1 (mod N) or a^(2^j s) = -1 (mod N) for
some j between 0 and r - 1, then N
passes the test. A prime will pass the
test for all a.
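The test can be sketched in Python as below. The names `passes_strong_test` and `is_probable_prime`, and the default of 20 rounds, are assumptions made for the example; Python's built-in three-argument `pow` performs the modular exponentiation.

```python
import random

def passes_strong_test(n, a):
    """True if odd n > 2 passes the strong pseudoprime test to base a."""
    s, r = n - 1, 0
    while s % 2 == 0:          # write n - 1 = 2^r * s with s odd
        s //= 2
        r += 1
    x = pow(a, s, n)           # a^s mod n
    if x == 1 or x == n - 1:   # a^s = 1, or the j = 0 case of a^(2^j s) = -1
        return True
    for _ in range(r - 1):     # j = 1 .. r - 1
        x = x * x % n
        if x == n - 1:
            return True
    return False

def is_probable_prime(n, rounds=20):
    """Repeat the test with random bases between 1 and n - 1."""
    return all(passes_strong_test(n, random.randrange(1, n))
               for _ in range(rounds))
```

For example, 2,047 = 23 x 89 passes with base 2 but fails with base 3, so a composite can slip through a single round.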
Rabin-Miller
Strong Pseudoprime Test
Requires no more than (1 + o(1)) log N
multiplications (mod N).
A number which passes the test is not
necessarily prime. But a composite number
passes the test for at most 1/4 of the
possible bases a.
If n independent tests are
performed on a composite number, the
probability that it passes every test is 1/4^n
or less.
Rabin-Miller
Strong Pseudoprime Test
Smallest composite numbers passing the RMSPT
using the first k primes as bases (k = 1, 2, ...): 2,047; 1,373,653;
25,326,001; 3,215,031,751; 2,152,302,898,747;
3,474,749,660,383; 341,550,071,728,321;
341,550,071,728,321; at most
41,234,316,135,705,689,041…
341,550,071,728,321 = 2^48.279…,
41,234,316,135,705,689,041 = 2^65.160…
Tests show that randomized bases may fail
sometimes.
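The first few entries of this list can be checked directly. The sketch below is self-contained; `PRIMES`, `passes_strong_test` and `passes_first_k_primes` are names chosen for the example, not part of any tested solution.

```python
PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23)

def passes_strong_test(n, a):
    """Strong pseudoprime test of odd n > 2 to base a."""
    s, r = n - 1, 0
    while s % 2 == 0:
        s, r = s // 2, r + 1
    x = pow(a, s, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(r - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False

def passes_first_k_primes(n, k):
    """True if n passes the test for each of the first k primes as base."""
    return all(passes_strong_test(n, a) for a in PRIMES[:k])

# Each listed composite passes its own k bases but is caught by one more.
for k, n in ((1, 2047), (2, 1373653), (3, 25326001), (4, 3215031751)):
    print(k, n, passes_first_k_primes(n, k), passes_first_k_primes(n, k + 1))
```

Since 341,550,071,728,321 is below 2^54, using only the first 8 primes as bases would not be enough for this problem's input range.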
Pseudocode of RMSPT (Sprache)
function powermod(a, s, n)
{
    p := 1
    b := a
    while s > 0
    {
        if s & 1 == 1 then p := p * b % n
        b := b * b % n
        s := s >> 1
    }
    return p
}
Pseudocode of RMSPT (cont.)
function rabin-miller(n)
{
    // n odd, n > 2; write n - 1 = 2^r * s with s odd
    s := n - 1
    r := 0
    while s & 1 == 0
    {
        s := s >> 1
        r := r + 1
    }
    for each a in (2, 3, 5, 7, 11, 13, 17, 19, 23)
    {
        if a % n == 0 then continue
        x := powermod(a, s, n)
        j := 1
        while j < r AND x != 1 AND x != n - 1
        {
            x := x * x % n
            j := j + 1
        }
        // pass only if a^s = 1, or some a^(2^j s) = -1 (mod n)
        if x != n - 1 AND NOT (x == 1 AND j == 1) then return FALSE
    }
    return TRUE
}
Prime Factorization Algorithms
Continued Fraction Algorithm,
Lenstra Elliptic Curve Method,
Number Field Sieve, Pollard Rho
Method, Quadratic Sieve, Trial
Division…
Pollard Rho
Factorization Method
Also known as Pollard Monte Carlo
factorization method.
Expected to find a factor in O(p^(1/2))
iterations, where p is the smallest
prime factor of the number to be
factored.
Two aspects to this method: iteration
and cycle detection.
Pollard Rho
Factorization Method
Iteration:
Iterate the formula x_(n+1) = x_n^2 + a
(mod N). Almost any polynomial
formula (two exceptions being x_n^2 and
x_n^2 - 2) for any initial value x_0 will
produce a sequence of numbers that
eventually falls into a cycle.
Pollard Rho
Factorization Method
Cycle detection:
Keep one running copy of x_i. Whenever i is
a power of 2, let y = x_i, and at each
step, compute GCD(|x_i - y|, N). If the
result is neither 1 nor N, then a cycle
modulo a proper factor of N is detected
and GCD(|x_i - y|, N) is a
factor (not necessarily prime) of N.
If the result is N, the method fails;
choose another a and redo the iteration.
Pseudocode of PRFM (Sprache)
function pollard-rho(n)
{
    // n is assumed to be composite
    while TRUE
    {
        do
        {
            a := random() % n
        }
        while a == 0 OR a == n - 2    // n - 2 is -2 (mod n)
        x := random() % n
        y := x
        k := 2
        i := 1
        d := 1
        while d == 1
        {
            i := i + 1
            x := (x * x + a) % n
            d := GCD(abs(x - y), n)
            if i == k then
            {
                y := x
                k := k << 1
            }
        }
        if d != n then return d
        // d == n: the method failed, retry with another a
    }
}
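A runnable Python counterpart of the method, written as a sketch under a few assumptions: the input is composite, even inputs are handled up front, and the random start value and the retry on GCD = N follow the description of the method rather than any particular tested solution.

```python
from math import gcd
import random

def pollard_rho(n):
    """Return a nontrivial factor (not necessarily prime) of composite n."""
    if n % 2 == 0:
        return 2
    while True:
        a = random.randrange(1, n - 2)   # 1 .. n - 3: avoids a = 0 and a = -2
        x = random.randrange(n)
        y, k, i, d = x, 2, 1, 1
        while d == 1:
            i += 1
            x = (x * x + a) % n
            d = gcd(abs(x - y), n)
            if i == k:                   # i is a power of 2: move the checkpoint
                y, k = x, 2 * k
        if d != n:
            return d
        # d == n: this choice of a failed, so pick another one
```

Calling `pollard_rho(8051)` returns 83 or 97, depending on the random choices.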
Overall Efficiency Analysis
Time complexity: O(log N) for any
prime; O(N^(1/4)) for most composites
under average conditions, principally
decided by the factorization process
Space complexity: O(1), space demand
is always independent of N
Notes on Implementation
Multiplication (mod N) in RMSPT:
Requires calculation of 64-bit * 64-bit
% 64-bit, where the full product can
overflow 64 bits. Should be computed as
binary numbers using a “divide and
conquer” method. Use the floating-point
unit for the (mod N) operation. Can be
optimized by coding in assembly.
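One common way to keep every intermediate value small is to multiply by repeated doubling, reducing mod N after each addition. The sketch below shows that idea only; it does not use the floating-point unit, and the name `mulmod` is an assumption. Python integers never overflow, so the code is illustrative: in a fixed-width language the same loop keeps all values below 2N, which fits in 64 bits when N < 2^54.

```python
def mulmod(a, b, n):
    """Compute (a * b) % n using only additions, shifts and reductions."""
    a %= n
    result = 0
    while b > 0:
        if b & 1:
            result = (result + a) % n   # add the current multiple of a
        a = (a + a) % n                 # double a for the next bit of b
        b >>= 1
    return result
```

The loop runs once per bit of b, so a modular exponentiation built on it costs O(log^2 N) additions instead of O(log N) multiplications.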
Notes on Implementation (cont.)
GCD(a, b) (a > b) in PRFM: The following
properties of GCD help avoid divisions:
If a = b, then GCD(a, b) = a.
GCD(a, b) = 2 * GCD(a/2, b/2) with both a
and b even.
GCD(a, b) = GCD(a/2, b) with a even but b
odd.
GCD(a, b) = GCD(a - b, b) with both a and b
odd.
Time complexity: O(log a) shift-and-subtract steps
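These properties lead to the binary GCD algorithm, sketched here in Python with the assumed name `binary_gcd`. It uses only shifts, parity tests and subtraction.

```python
def binary_gcd(a, b):
    """GCD of non-negative integers a and b without division."""
    if a == 0:
        return b
    if b == 0:
        return a
    shift = 0
    while a % 2 == 0 and b % 2 == 0:   # both even: factor out a common 2
        a >>= 1
        b >>= 1
        shift += 1
    while a % 2 == 0:                  # a even, b odd: halve a
        a >>= 1
    while b != 0:
        while b % 2 == 0:              # b even, a odd: halve b
            b >>= 1
        if a > b:
            a, b = b, a                # keep a <= b
        b -= a                         # both odd: subtract the smaller
    return a << shift                  # restore the common powers of 2
```

In Python the parity test is written `b % 2 == 0`; in C-like languages `(b & 1) == 0` does the same and needs the parentheses because `==` binds tighter than `&`.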
Notes on Implementation (cont.)
Combination with brute-force
algorithms: Embed a prime table. Do
brute-force trial division for small
divisors.
Minor optimizations: Use 32-bit
integer division instead of the 64-bit
version when possible…
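Putting the pieces together, a compact sketch of the whole approach might look as follows. It is not the author's submitted solution: trial division by every integer below 100 stands in for the embedded prime table, and all function names are assumptions.

```python
from math import gcd
import random

BASES = (2, 3, 5, 7, 11, 13, 17, 19, 23)  # fixed bases, as in the pseudocode

def is_prime(n):
    """Strong pseudoprime test with fixed bases; n is assumed odd and > 23."""
    s, r = n - 1, 0
    while s % 2 == 0:
        s, r = s // 2, r + 1
    for a in BASES:
        x = pow(a, s, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False          # base a proves n composite
    return True

def find_factor(n):
    """Pollard rho with power-of-two checkpoints; n is assumed odd, composite."""
    while True:
        a = random.randrange(1, n - 2)
        x = y = random.randrange(n)
        i, k, d = 1, 2, 1
        while d == 1:
            i += 1
            x = (x * x + a) % n
            d = gcd(abs(x - y), n)
            if i == k:
                y, k = x, 2 * k
        if d != n:
            return d

def smallest_prime_factor(n):
    """Smallest prime factor of n >= 2 (n itself when n is prime)."""
    for p in range(2, 100):       # brute force for small divisors
        if n % p == 0:
            return p
    if is_prime(n):
        return n
    d = find_factor(n)
    return min(smallest_prime_factor(d), smallest_prime_factor(n // d))
```

Usage: print "Prime" when `smallest_prime_factor(N) == N`, otherwise print the returned factor.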
Actual Timing Performance
Platform: Windows XP SP1 on a
Pentium M 1.5 GHz
Algorithms tested: Adapted versions
of westever’s, TN’s, lh’s and zsuyrl’s
solutions and my later fixed version
Timing method: Process user time
returned by the GetProcessTimes
Windows API
Actual Timing Performance
Test data: Original data set on POJ,
two pseudorandom data sets
generated by Mathematica and one
hand-made data set (all given in
ascending order), new data set on
POJ
Verification: Done with Mathematica
Actual Timing Performance
Solution    Data0 (20)  Data1 (10)  POJ     Data2 (24)  Data3 (20)  Data5 (23)
westever    7,193       117         7,937   24,462      15,935      31,375
TN          3,545       60          W/A     12,450      7,744       15,935
Wheel       2,143       40          2,453   6,932       4,182       8,929
lh          60          613         375     240(-1)     180         197
zsuyrl      100         704         484     287         203         290
STYC        60          350         312     120         67          133
Conclusions
Brute-force methods are too slow for
integers that have large factors. But they
are good complements to complex methods
like Pollard Rho.
The original test data on POJ are too weak to
make slow algorithms fail or to expose
wrong solutions. The new data are
much better but not yet perfect.