>>: Well, welcome back everyone for the final session. The first speaker this afternoon
will be Peter Montgomery. And his title is "ECM - Then and Now."
>> Peter Montgomery: Thank you. "Then" refers to 1985, when the elliptic curve method was discovered by Hendrik Lenstra. Most of you know what it means to factor a number into primes, and uniqueness plays a role: the algorithms find one factor at a time, and no matter which factor we find first, we can go on and look for the rest. So we'll think about the attempts to factor as they were 25 years ago or more. We're given some N, maybe something from the Cunningham tables, which cover numbers b^n plus or minus 1 for small bases b up to 12 and moderate exponents n. There were two main classes of methods back then. Some work on the product itself, with a running time depending on the size of N; others also work modulo the product, but their time depends primarily upon the size of the factor we hopefully find, discounting the cost of the multiple-precision arithmetic modulo N.
For the methods depending primarily on the size of N: this was just as the continued fraction method was being replaced by the quadratic sieve, and the two algorithms compare closely. Both the continued fraction method and the quadratic sieve find many values of y which are congruent to squares mod N and are smooth, meaning they have only small prime factors. If we get several relations x^2 congruent to y mod N, we multiply all the x's together and look at the product of the y's. The square of the product of all the x's is automatically a square on the left side, and the subset of y's has to be chosen so that every prime appears to an even power in their product, which turns out to be a linear algebra problem over GF(2).
To sample the flavor of the quadratic sieve, take our room number, 1919, and subtract it from many different squares near its square root. The closest square above 1919 is 1936, which is 44 squared, and that leaves a difference of 17. We save only the values which factor completely over a factor base, shown on the right. Notice that we have no 3's: that's because this particular polynomial, x^2 - 1919, never takes on a value divisible by 3. When we get enough of them, we notice that the rows for 29 and 37 on the right look somewhat alike in terms of whether the powers are odd or even: between them there are two minus signs, two 2's, two 5's, two 7's and two 11's. The product of those two values is a square, and if we're lucky we can use that to get the factorization. The two values we decide to combine have product 770 squared. From the derivation, these values are f(29) = 29^2 - 1919 and f(37) = 37^2 - 1919, that is, 29^2 - N and 37^2 - N. When we follow the congruences, (29 * 37)^2 is congruent to 770^2 mod N, so we get two squares that are congruent mod N. If we're lucky, as we are here, the two square roots are incongruent: 29 * 37 = 1073, and 1073 - 770 = 303, which shares the factor 101 with 1919, and 1919 = 19 * 101. The methods whose time depends on the factor have a much different flavor than that.
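Here is a minimal Python check of that combination step, using only the numbers from the example above:

```python
from math import gcd

N = 1919
# Two smooth relations from sieving f(x) = x^2 - N:
#   f(29) = 29^2 - 1919 = -1078 = -1 * 2 * 7^2 * 11
#   f(37) = 37^2 - 1919 =  -550 = -1 * 2 * 5^2 * 11
# Every prime (and the minus sign) appears to an even power in the
# product, so f(29) * f(37) is a perfect square.
x = 29 * 37                        # product of the x's: 1073
y_squared = (29**2 - N) * (37**2 - N)
y = 770
assert y * y == y_squared          # 592900 = 770^2
assert (x * x - y * y) % N == 0    # x^2 = y^2 (mod N)
print(gcd(x - y, N))               # 101, a nontrivial factor of 1919
```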
So the quadratic sieve was one whose cost depends upon N. Among the ones that depend on the factor, trial division is the easiest one to program, and the first prime factor 19 is so small you can recognize it even by hand. The p - 1 method relies upon Fermat's little theorem: take any base coprime to our N = 1919 and raise it to a multiple of p - 1, where p is one of the prime factors, so a multiple of 19 - 1 = 18 or of 101 - 1 = 100, and you get 1 modulo that prime. A variation, the p + 1 method, is due to Hugh Williams. And the Pollard rho method iterates a sequence, typically replacing x by x^2 + 1 mod N, feeds that into another copy of the iteration, and hopes to get a duplicate mod p somewhere. So briefly, for
the p - 1 method: we're given our N and we don't know offhand whether the factors are going to be 19 and 101, or, when we subtract one from those, 18 and 100. But we pick a bunch of prime powers whose product E is hopefully divisible by both 18 and by 100; maybe you pick 3600, or a least common multiple as later in my talk. Then we raise some base b0 to the power we picked, using the binary method of exponentiation, either once on one big product or on each prime power within the range separately. So we're getting a new value b0^E out, with E as its exponent. If p - 1 divides E, then b0^E will be 1 mod p, so b0^E - 1 will have our factor of p, and we do a greatest common divisor with N. We'll be lucky unless two primes turn up at one time, which doesn't happen very much in practice beyond the early stages.
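A minimal stage-1 sketch of the p - 1 method in Python, where E is simply the product of all prime powers up to a bound B1 (helper and parameter names are mine):

```python
from math import gcd

def small_primes(bound):
    """Primes up to bound, by a simple sieve of Eratosthenes."""
    is_prime = [True] * (bound + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(bound ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, bound + 1, i):
                is_prime[j] = False
    return [i for i, v in enumerate(is_prime) if v]

def pminus1_stage1(N, B1, b0=3):
    """Stage 1 of the p-1 method: a nontrivial factor of N, or None.

    Raises b0 to q^k for every prime power q^k <= B1, so the accumulated
    exponent E is divisible by p - 1 whenever p - 1 factors entirely
    into prime powers below B1."""
    b = b0 % N
    for q in small_primes(B1):
        qk = q
        while qk * q <= B1:        # largest power of q not exceeding B1
            qk *= q
        b = pow(b, qk, N)          # after the loop, b = b0^E mod N
    g = gcd(b - 1, N)
    return g if 1 < g < N else None

# The toy example from the talk: N = 1919 = 19 * 101.
# With B1 = 10, E = 8 * 9 * 5 * 7 = 2520; 18 divides E but 100 does not,
# so the GCD pulls out 19 alone.
print(pminus1_stage1(1919, B1=10))   # 19
```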
Here is a much bigger example of how the p - 1 method behaves; I picked it because Richard Brent got a record on it: 2^977 - 1. We don't know, at the time we imagine ourselves 30 years ago, what the factorization would be, except that certainly the six- and the seven-digit factors could be found by trial division, and even for those the form of the number ensures that each of the primes dividing it will be 1 mod 977, so there are only a few thousand candidates to try for those two. Then we get up a ways, to 19 and 32 digits, and much higher ones since then. So when we try p - 1, if we didn't already find the small factors by trial division, we might guess to put in a thousand for our upper bound, so we use all the prime powers up to a thousand. We should also be careful here that our base b0 is not 2 or a power of 2, because for this number that base raised to whatever power we pick will give us back a 1 for all the factors at the same time. If you start with b0 = 3 we should be fine, except that when we get up to a thousand and put 977 into the exponent, suddenly both the six-digit and the seven-digit factors pop out at one time, since each is 1 mod 977. We'll still get the product of the six-digit and the seven-digit factors out, and all of the big ones will be in the "do-later" list.
When we do look at the other factorizations: for the 19-digit factor, the largest prime factor of p - 1 is about half a million, and that's not too far to worry about going up to, at least today; these are 300-digit numbers we'd be manipulating while computing this, but it's certainly feasible. The 32-digit factor, with a prime of about 11 million in p - 1, requires us to go up even higher. And there are enough good ones; I think it was John Brillhart who described it as picking plums at waist height. So this one might be found, but not until we go through and do repeat checks with higher values of our exponent bound; on the first pass it is unlikely to be found. A repeat check means a bigger bound B, and we might find it when the old one was unsuccessful.
Now we get that improved by putting in the so-called second step. Little b1 is the output of the first exponentiation. (In practice we also worry about finding multiple factors, and about going back and continuing on the cofactors if that looks promising.) After we've done our exponentiation with all the primes up to big B1, we look for an exponent that only needs one more prime factor before popping out with a 1 mod our prime. There are several variations, but the common idea is one more prime q that we can apply after we hit the B1 bound. The group order, and this is supposed to be the multiplicative group (Z/pZ)* up here, has to divide our extra factor q times the exponent E we already applied. If we are lucky, then our little b1 output raised to the q-th power, which by the definition of the b1 output puts both q and E into the exponent together, will be 1 modulo our prime, and we figure out our prime by our GCD test. So if b1 to the q-th power would give us a 1, then the strategy is to find two different powers of b1, not necessarily one of them being the 0-th power, whose exponent difference is q: those two results will be the same mod p.
The strategy I was using in the p plus or minus 1 code I was developing at the time: split a bunch of subscripts, or indices, beneath our big B2 into two disjoint sets. We don't put everything in them, but every potential q should divide some difference i - j where i is in one set and j is in the other. Then we plan on just doing pairwise comparisons, taking the greatest common divisor of the difference with N every time. We try to be somewhat clever in the selection of the sets. Say any potential difference is at most 50, we know it's prime, and we also know it's bigger than 5. Then notice that every such prime has to be in the congruence classes 1, 3, 7 or 9 mod 10, so we can write it as a multiple of 10 minus 1, 3, 7 or 9. So we're down to 4 times 5 differences to worry about manipulating: one set of values for these four residues and a second set of values for these five multiples of 10. Put them in a table, or arrange them in order if you're going to store them; take the difference of the two powers and just GCD it with N.
And we can do that a bit better if we combine two values, one with the exponent negated: b1^i + b1^(-i) gets paired against the same expression in j. This gives us only half as many GCDs to worry about later on. You can check by hand that if b1^i and b1^j are congruent modulo our prime, then the two sums are also congruent, and the same holds when b1^i is congruent to b1^(-j), as long as we allow denominators that are not divisible by the prime. So now we only need each q to divide some sum i + j or difference i - j, and we reduce the size of the table a small bit.
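Here is a sketch of that second step built on the 4-times-5 table above, in the plain difference form (the toy modulus in the usage lines is mine, chosen so that exactly one factor pops):

```python
from math import gcd

def pminus1_stage2(N, b1):
    """Second step of p-1: catch one extra prime q with 5 < q <= 50.

    Every such q is 1, 3, 7 or 9 mod 10, so q = i - j with i a multiple
    of 10 up to 50 and j in {1, 3, 7, 9}.  If b1 has order q modulo some
    prime p dividing N, then b1^i = b1^j mod p exactly when q divides
    i - j, and the GCD of the difference with N exposes p."""
    I = [10, 20, 30, 40, 50]
    J = [1, 3, 7, 9]
    bi = {i: pow(b1, i, N) for i in I}
    bj = {j: pow(b1, j, N) for j in J}
    for i in I:
        for j in J:
            g = gcd(bi[i] - bj[j], N)
            if 1 < g < N:
                return g
    return None

N = 47 * 107                  # 47 - 1 = 2 * 23: the prime 23 is beyond stage 1
b1 = pow(3, 60, N)            # stage-1 output with E = 4 * 3 * 5 = 60
print(pminus1_stage2(N, b1))  # 47, caught by the pair 30 - 7 = 23
```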
Back to that 2^977 - 1. We got the 19-digit factor earlier, we hope, by going up to half a million on step one. Now we're in maybe our second pass through everything. Maybe we try 2 million for our B1 and some bigger value like 20 million for B2, raise to that stage-1 power, and hope that whatever survives has just one prime left in its order, and that two of our table entries will correspond to a value i - j that it divides. Then we luck out, get our 32-digit factor, and still have the two huge ones at the bottom to figure out another day, or maybe at the end of the decade.
That factor was found by Richard Brent when he was going through many of the tables. I couldn't find the old table records, but I remember this was a record that stood for years; I found out it was found in 1984.
I mentioned this takes a lot of GCDs, and they're rather expensive, whether it's a multiplicative inverse or just a greatest common divisor. One optimization: if we've got several different values to check for a factor with N, test their product against N first, and also make sure that the running product didn't suddenly become zero mod N and discard all your old history information. Essentially you multiply each new value into a running product of what you haven't tested yet, and do one GCD for the whole batch.
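A sketch of that batching trick, with the zero-product fallback (the function name is mine):

```python
from math import gcd

def batched_gcd(values, N):
    """One GCD for a whole batch: keep a running product modulo N.

    If the product ever hits 0 mod N, some value shares every remaining
    factor with N; rather than discard the history, fall back to testing
    the values one at a time."""
    prod = 1
    for v in values:
        nxt = (prod * v) % N
        if nxt == 0:
            for w in values:
                g = gcd(w, N)
                if 1 < g < N:
                    return g
            return None
        prod = nxt
    g = gcd(prod, N)
    return g if 1 < g < N else None
```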
Another topic: for the p + 1 algorithm we'll be using Lucas functions, also called Chebyshev-like polynomials. Very similar identities are held by the doubled cosine function, and if you write the cosine in terms of exponentials, it takes you right back to this form. The polynomial V_n just sends x + 1/x to x^n + 1/x^n: the n-th power of the number plus the n-th power of the reciprocal. For computing these, there are a couple of big identities. You can get the value at the sum of the indices, m + n, by multiplying the values at m and at n and subtracting the value at their difference, V_{m+n} = V_m V_n - V_{m-n}; use the formal definition to check that it holds. Another useful one is the product, V_{mn} = V_m(V_n(x)). It may not be obvious at first what's being computed, but do a small bit of algebra with the m-th and n-th powers and you get it. And the Montgomery ladder was mentioned: in that case you make m and n consecutive everywhere and keep halving, but you can also make use of a Fibonacci-like chain to try to go up to a high index.
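To make the ladder concrete, here is a sketch of V_k modulo N in Python, using just the identities above (the doubling special case, from m = n, is V_{2m} = V_m^2 - 2; the function name is mine):

```python
def lucas_V(k, y, N):
    """V_k(y) mod N, where V_k(x + 1/x) = x^k + 1/x^k.

    Montgomery ladder on the pair (V_m, V_{m+1}), using
    V_{2m} = V_m^2 - 2 and V_{2m+1} = V_m * V_{m+1} - V_1."""
    if k == 0:
        return 2 % N
    vm, vm1 = 2 % N, y % N              # (V_0, V_1)
    for bit in bin(k)[2:]:              # most significant bit first
        if bit == '0':                  # m -> 2m
            vm, vm1 = (vm * vm - 2) % N, (vm * vm1 - y) % N
        else:                           # m -> 2m + 1
            vm, vm1 = (vm * vm1 - y) % N, (vm1 * vm1 - 2) % N
    return vm
```

As a check on the product identity, lucas_V(6, y, N) agrees with lucas_V(2, lucas_V(3, y, N), N).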
So, the p + 1 method: instead of picking a b0 coprime to our N, we pick essentially a value of x + 1/x, but we'll call it y0. We don't have our x and 1/x explicitly, and actually we're hoping that they do not exist in the base field. So we raise y0 to a power, a Lucas power, and come back to look at the result. If our starting value happens to be of the form b + 1/b, then when we subtract off our 2 we get (b - 1)^2 / b in the factorization, and we'll still find the p - 1 factors: that choice of y0 means there's a solution x = b in the base field mod p. If not, we'll go after something else. After we've chosen y0, we look at whether the corresponding x0 is in the ring mod p or not; it will at least be a solution in the quadratic extension, since x0 satisfies x^2 - y0 x + 1 = 0. The product of the two roots of that quadratic is 1, so 1/x0 must be the other root. But we also notice that applying the Frobenius map we get another root, x0^p. We have only two roots over a field, and while we're working over a ring right now, modulo p this means our x0^p must equal either x0 or 1/x0. When we look at what our output V_E(y0) would be, we substitute for y0 through the definition of V_E, the p-th powers come through, and it factors the same way regardless of which root x0^p was. If x0^p is the same as x0, then x0 is in the base field and we've essentially just found the p - 1 method again: we'll be lucky when p - 1 is smooth, with a few more computations if otherwise needed. If x0^p is the reciprocal, then x0^(p+1) = x0 * x0^p = 1, and our luck will occur with p + 1 being nicely smooth instead of p - 1.
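Building on the lucas_V ladder above, a minimal p + 1 stage-1 sketch (it reuses small_primes from the p - 1 sketch; the default y0 is an arbitrary choice of mine):

```python
from math import gcd

def pplus1_stage1(N, B1, y0=7):
    """Stage 1 of the p+1 method: a nontrivial factor of N, or None.

    If y0 = x0 + 1/x0 with x0 in the quadratic extension but not in the
    base field mod p, then x0^(p+1) = 1, so V_E(y0) = x0^E + x0^(-E) = 2
    mod p as soon as p + 1 divides E."""
    y = y0 % N
    for q in small_primes(B1):
        qk = q
        while qk * q <= B1:
            qk *= q
        y = lucas_V(qk, y, N)    # V_a(V_b(y0)) = V_{ab}(y0)
    g = gcd(y - 2, N)
    return g if 1 < g < N else None
```

Whether a given prime p is "seen" through p + 1 or through p - 1 depends on whether the discriminant y0^2 - 4 is a non-residue mod p, which is why the method gets rerun with a few different y0 values.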
So let's say p + 1 divides our E; we're lucky about half the time, depending upon what choice we made for our y0. And when p - 1 happens to divide the E, we're lucky the other half of the time. So the method generally has to be run a few times. If we give p + 1 the earlier number, we at least get rid of the problem we had with the 977 popping out for both factors at the same time, but we don't get anything new. In fact, the 19- and the 32-digit factors don't produce this time, unless we happen to pick the y0 so that p - 1 would be our lucky side.
Yeah, I had to go up to 300 for B1 even though it looks like 79 is the biggest prime needed, because we've got 17 squared and might not have been careful to put enough prime powers in our exponent. But nothing new comes out.
To switch the subject slightly again: Goldwasser talked about Pocklington's test earlier this week, where we try to prove that a given integer is prime. One way of assuring it: if we raise some x to the power N - 1 and get 1, but raising that same x to the powers (N - 1)/q, for the primes q dividing N - 1, always gives us something other than 1, then N is prime. The next page has some illustration of what's happening.
We have 67 dividing 2010, from earlier. As the algorithm runs, we assume that we've already proven that 2, 3 and 11, the prime factors of 66, are prime. The first condition is that x^(N-1) has to be 1 mod N; x = 1 fails the other conditions, because they say "not congruent to." For those we can pick x = 2; for Fermat's criterion we're cheating, since we already know 67 is prime. The order of 2 divides 66, but when we raise 2 to 66 divided by any of these primes, 2, 3 or 11, the result will not be 1. We can see, for example, that 2^22 is 37 mod 67; 37 cubed is 2^66, which is 1, but 37 itself is not 1, so 37 must have order three, and 3 divides the order of 2. The other primes work the same way; it's a slightly harder argument for prime powers. But the big observation is that to apply this we need to factor 66 completely in our assumption. Then we'll notice that we get all these nice powers in the p plus and minus 1 world too; that's going to be a similarity coming up.
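A minimal sketch of that criterion (this is the classical form that needs the complete factorization of N - 1; Pocklington-style refinements need less, and the function name is mine):

```python
def prime_by_nminus1(N, prime_factors_of_Nm1, x=2):
    """Prove N prime from the complete factorization of N - 1.

    Succeeds for the witness x exactly when x has order N - 1 mod N,
    which forces the group (Z/NZ)* to have N - 1 elements."""
    if pow(x, N - 1, N) != 1:
        return False                       # Fermat condition fails
    for q in prime_factors_of_Nm1:
        if pow(x, (N - 1) // q, N) == 1:
            return False                   # order of x is a proper divisor
    return True

# The example from the talk: 66 = 2 * 3 * 11, witness x = 2.
print(prime_by_nminus1(67, [2, 3, 11]))    # True
```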
If you do a p + 1 analog, you can get it from Lucas sequences, which have this form: essentially a difference of powers over a difference of values, like the definition of the Fibonacci numbers, for example, or the explicit formula for them. If these sequences satisfy some tests similar to the Pocklington ones, then we can prove that our number is prime. We've got to check that the bottom coefficient of our quadratic polynomial is nonzero, and there's a test on the discriminant of the polynomial, and some other conditions that let us prove, even with these sequences, that we get a prime out. Here we're needing all the factors of N + 1, written as a product of primes q to various powers; we need to know the full factorization of N + 1 in this case.
Some work by Selfridge, Lehmer, Brillhart and probably others came up with improved tests where we don't need the full factorization of either one, just enough factors; an example was given on the Number Theory list this last week. The factors we find have to account for about one third of N^2 - 1, and then some simple auxiliary tests replace the complete factorization. So the big question, with the research to come: if we can mix N + 1 and N - 1 here, can we mix them in the p + 1 and p - 1 algorithms? That's the first wish. Maybe we can go to p plus or minus 2, or get something else that might be smooth after we do the p plus or minus 1. Or make some fundamental change to the algorithm that allows another step after we get little b1 and are searching the tables for our match; maybe we can find a way to look even further. So this was the scene as I remember the world, before Andrew Odlyzko [ph] sent me Hendrik's write-up.
For the elliptic curves, assume the characteristic is not 2 or 3; a prime that small would be easy to find anyway.
The Weierstrass equation is a cubic, and in short form it has the cubic term plus linear and constant terms on the right: y^2 = x^3 + Ax + B. The other coordinate systems we'll get to briefly later. We haven't had the picture on the board, but most of you have probably seen how we draw this; the picture looks different depending on whether we've got one real root or three real roots on the right. We put a straight line through two points of the curve, find where it meets the curve again, and reflect. So we get an abelian group out whose order is approximately p, but we have more than just the two values p + 1 or p - 1 that can pop out: the order is almost random, Las Vegas style. That group order varies with A and B, and there are some strategies for selecting A and B, like the six-torsion that Dan Bernstein mentioned yesterday.
To add two points on a curve, you find the third point where that same line intersects the curve. Given P and Q, draw the line through them, take the third intersection, and reflect it along the x axis (this line in the picture should be more vertical); that's defined to be the sum. All the algebraic operations, determining the slope here and getting the third root of the equation after we substitute the line equation to eliminate one of the variables from the curve, are plain arithmetic: add, subtract, multiply, divide. They carry over to finite fields, and, in fact, we'll be wanting to do it in the ring of integers modulo the number we're factoring.
Here's an example the picture doesn't correspond to. Two points that you can check are on the curve, and we see by hand that the y-coordinates in the example are two more than the x-coordinates, so the line passing through them is y = x + 2. When we eliminate y between the cubic equation and the formula for y, we get a cubic in x, and we know the x-coordinates 0 and -2 are roots of it, if we haven't made an arithmetic error. Letting this fall through, we get the new x3 from the slope squared minus the other two roots, 1^2 - 0 - (-2) = 3; we plug our new x into x + 2, negate, and we've got our new y, which is -5.
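Filling in coefficients, those two points sit on y^2 = x^3 - 2x + 4; that curve is my reconstruction from the x-coordinates 0 and -2 with the line y = x + 2, and a few lines of Python confirm the chord arithmetic:

```python
# Curve y^2 = x^3 + a*x + b through (0, 2) and (-2, 0)  =>  a = -2, b = 4.
a, b = -2, 4
P, Q = (0, 2), (-2, 0)
assert all(y * y == x ** 3 + a * x + b for x, y in (P, Q))

m = (Q[1] - P[1]) // (Q[0] - P[0])      # slope of the chord: 1 (exact here)
x3 = m * m - P[0] - Q[0]                # third root: 1 - 0 - (-2) = 3
y3 = -(m * (x3 - P[0]) + P[1])          # reflect across the x axis: -5
assert y3 * y3 == x3 ** 3 + a * x3 + b  # (3, -5) is on the curve
print((x3, y3))                         # (3, -5)
```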
Based upon these curves, Hendrik Lenstra announced the elliptic curve factorization algorithm. We select some curve and a point on it, as simply as picking the coefficient A in y^2 = x^3 + Ax + B, then picking an x0 and y0, solving for B, and proceeding from there. We pick our exponent and then multiply our initial point P0 by that scalar. The case hoped for in Lenstra's note is when we fail during the exponentiation: when we try to compute the slope, we divide by zero modulo one prime but not modulo another prime. Even though it may be hard to program accurately, that failure is a lucky charm: the GCD with N reveals a factor. If we finish without striking any divide-by-zero for any prime, we can go back to step one and try again on the same number, picking a new curve for the input.
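Here is a compact stage-1 sketch of ECM in that short Weierstrass form, done the naive way with modular inverses so the lucky divide-by-zero shows up as a non-invertible denominator (curve selection follows the A, x0, y0 recipe above; names and bounds are illustrative, and small_primes is the sieve from earlier):

```python
from math import gcd
import random

class FoundFactor(Exception):
    """Raised when a slope denominator is not invertible mod N."""
    def __init__(self, g):
        self.g = g

def inv_mod(d, N):
    g = gcd(d % N, N)
    if g != 1:
        raise FoundFactor(g)          # the 'lucky' divide-by-zero
    return pow(d, -1, N)

def ec_add(P, Q, A, N):
    """Chord/tangent addition on y^2 = x^3 + A*x + B over Z/NZ (None = infinity)."""
    if P is None:
        return Q
    if Q is None:
        return P
    if P[0] == Q[0] and (P[1] + Q[1]) % N == 0:
        return None                   # vertical line: point at infinity
    if P == Q:
        m = (3 * P[0] * P[0] + A) * inv_mod(2 * P[1], N) % N
    else:
        m = (Q[1] - P[1]) * inv_mod(Q[0] - P[0], N) % N
    x3 = (m * m - P[0] - Q[0]) % N
    return (x3, (m * (P[0] - x3) - P[1]) % N)

def ec_mul(k, P, A, N):
    """Scalar multiple k*P by double-and-add."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P, A, N)
        P = ec_add(P, P, A, N)
        k >>= 1
    return R

def ecm_stage1(N, B1):
    """One random curve of ECM stage 1: a factor of N, or None to retry."""
    A, x0, y0 = (random.randrange(N) for _ in range(3))
    P = (x0, y0)      # B = y0^2 - x0^3 - A*x0 mod N is implied, never needed
    try:
        for q in small_primes(B1):
            qk = q
            while qk * q <= B1:
                qk *= q
            P = ec_mul(qk, P, A, N)
    except FoundFactor as e:
        return e.g if e.g < N else None
    return None
```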
For step 2, where before we were comparing b1 to some power with b1 to the negative of that power, catching q whenever it divided either the sum or the difference of two subscripts, we do something similar here. We hope the order of the point modulo one of the primes is divisible by a prime q, not known yet, that is larger than our first bound but no bigger than our second bound. Scalar multiplication, the analog of exponentiation, gives an output point which under our assumptions has order q. My proposal for doing the comparisons was simply to take i times our output point and j times our output point, plus the analogous values, and do the big comparison the way it was done in the p plus or minus 1 method; we can't succeed unless q divides one of our sums or differences.
The Cunningham tables keep track of all these results, and they're a measure of how successful an algorithm is. What you'd call a page, I think, and its length, seems to vary over the seasons; the pages have been awfully short recently, because it takes a lot of room to write down the 70-digit factors and such that are being found by the number field sieve. Just a month or so after ECM came out, the first page had ten examples done with it; Atkin and Rickert [ph] got an early implementation too. P minus 1 was well into the lead at that time, but over the next three months or so ECM had a bit more, and one number was done two ways, half and half; there were still plenty of p - 1, p + 1 and Pollard rho results. The other big surprise was the quadratic sieve, which had just come out in the multiple polynomial version, with Silverman pretty fast on it; the continued fraction method had been on the previous pages. In the small print, the pages from before ECM was known had been receiving new factors somewhat more slowly than in the previous years, but after both new methods had been put in, by page 31 the quadratic sieve and the curve method had been getting great success, of course.
On the hardware side affecting the elliptic curve method: in the time since '85, some of the big things have been 64-bit hardware, where we can typically do multiple precision cheaper today than back then, and the algorithm can be adapted for multicore. I'll show on the next page where memory can come into use; so far it's been very nominal memory, just the two tables to compare, and we've gone from megabytes to gigabytes in a typical office configuration.
Basically, the memory enables fast polynomial arithmetic. The algorithms were known well before 1985, but they weren't practical in the limited memories of the day: the two sets we compare might have a million items each, too big to fit, and we need several times that much data to hold all our intermediates. But my dissertation, as well as Paul Zimmermann's "20 Years of ECM", has made use of the space.
On the coordinate systems that were mentioned: the twisted Edwards coordinates that Dan talked about yesterday, with the torsion group of order six, are used instead of the Weierstrass form. And this 2^977 - 1 got completed a few years ago, 23 years from the early factors to getting the last one, and the group order was lucky. I think I read that B1 on that run was 110 million, so we're going up fairly high today. After Bruce Dodson [ph] did this, they checked the cofactor and it finally passed one of the primality tests.
So now we're trying to look for more factors of numbers that size. There was a record a few years ago where we passed the 1,039-bit point with the so-called special number field sieve, and the announcement earlier this year about RSA-768 was with the general number field sieve, where we don't have a nice algebraic expression for the number we're factoring. But we aim for the next record to be back to the special type: if it's a factorization of 2^n minus 1, we're looking for some exponent in the Cunningham tables below about 1,200 but above 1,039 for the next run. The number field sieve takes so much longer that we want to invest the ECM time now to get rid of anything that might be easy.
This is at EPFL in Switzerland, and it's using PlayStations, which are single-instruction multiple-data: essentially the thing is working on four different data streams at a time. They all have to branch together at the same place, and they should all be indexing the same address relative to their bases. So things like exponentiation have some nice patterns going, as long as everybody's doing a double at the same time, everybody doing an add. We have to be careful with the modular-add-type code, where some curves need to subtract off the modulus and others don't; a branch-free select, as sketched below, keeps the lanes in step.
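A minimal Python rendering of that branch-free conditional subtraction (real SPU code would use vector compare-and-select instructions; the lane layout here is just illustrative):

```python
def simd_mod_add(xs, ys, N):
    """Lane-parallel modular addition with no data-dependent branch.

    Each lane computes t = x + y - N, then adds N back exactly in the
    lanes where t went negative, selecting with an arithmetic mask
    instead of an 'if' so every lane runs the same instruction stream."""
    out = []
    for x, y in zip(xs, ys):        # conceptually one vector op across lanes
        t = x + y - N
        mask = -(t < 0)             # 0 when t >= 0, else -1 (all one bits)
        out.append(t + (N & mask))  # add N back only where it's needed
    return out

# Four lanes, same modulus: some lanes need the correction, others don't.
print(simd_mod_add([5, 9, 2, 8], [6, 9, 1, 3], 11))   # [0, 7, 3, 0]
```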
After we get that worked out, the algorithm runs nicely, so we used the PlayStation 3s, and it was the group in Switzerland that got a half-dozen successes, like a 68-digit factor with a large left-over prime cofactor. And that's it. Okay.
>>: Questions?
>>: I just wanted a quick look at the PlayStation stats.
>> Peter Montgomery: Well, I don't have the CPU speed and such, although that's in some ePrint paper.
>>: You show the factors on the next page.
>> Peter Montgomery: Yeah, over a nine-month interval so far, so it's a total of one half million curves run.
>>: Other questions? Let's thank Peter again. The next talk is at 4:00.