>>: This afternoon's session is, as I say, a tribute in honor of Oliver Atkin. For those of
you who never had the good fortune to meet Oliver, he was sort of a singular point in
mathematics, I can say. Oliver was quite entertaining, and he was very obsessed, shall I
say, with the numerology of number theory. In fact, I was thinking during Kristin's talk
that he would have really loved looking at those factorizations of the coefficients that
she gave. He probably -- undoubtedly he would have gone off muttering to himself and
come up with a good modular explanation of what's going on, but that's only my
speculation.
Oliver was one of the real pioneers in computing in number theory. I mean, from the
time when computers were laughably primitive by today's standards, Oliver was doing
some rather non-trivial computation. And he didn't have all that many students, but I
think he did have actually a great influence on the field of elliptic curves and modular
forms, each of which were his personal friends, I think.
He was interested in lots of other, related things. In particular, he was
interested in primality testing, which the shy and retiring Dan Bernstein will speak about
[laughter].
>>Daniel J. Bernstein: All right. Thanks.
Let's do a microphone test. Am I audible from the back? I see a thumbs up. Okay.
My first encounter with Oliver Atkin was in '95. I was applying for a job at the University
of Illinois at Chicago, and I was nervous. I was 23. I was giving a talk and realizing as I
was giving the talk, oh, my God, there's all these people in the audience who don't do
number theory, and maybe I should define a number field. And so I quickly give a
definition of a number field instead of just saying some things about it, that, for instance,
q adjoin square root of minus 1 has degree 2 and q adjoin zeta 19 has degree 19.
Now, as soon as that second 19 came out of my mouth, instantly, very loudly from the
back of the room, in some sort of British accent, there was a [inaudible] [laughter]. I
said, 18, excuse me, and I continued with my talk, and I got the job.
A few months later Oliver had his retirement conference, and as Victor mentioned, he
was very funny. He stood up at some point and explained that retirement -- of course
he would continue working, retirement simply meant that he would no longer have to
talk with students about anything less than cubic reciprocity [laughter].
I have a few of his papers mentioned here. I actually thought I might spend the hour
just quoting things he said. I will resist that, but I will give one entertaining quote here.
The whole subject of primality and factorization has had an extraordinary fascination for
me since the late 1960s when John Brillhart, John Selfridge, Dan Shanks, Dick Lehmer
and others had introduced me to it, both in person and in print. I was no stranger, he
wrote, to primes in computation, but these had previously arisen only as the
eigenvalues of Hecke operators, and were certainly all less than 1 million.
He goes on to say how things stood in primality and factorization -- again, he was writing
this in '95; this is from the 'Intelligent primality test offer' paper. The major influences on
the subject in the last two decades have been the use of elliptic curves by Lenstra, and
the increasing number of applications, in particular to cryptography. And then he says
these influences increased
the audience for the subject and so necessarily decreased the level of judgment and
professionalism [laughter]. For some peripheral observers this fact has obscured the
novelty, beauty, and often simplicity of the ideas.
I figured that I would spend the hour giving a few examples of things that -- maybe in a
few cases I'll go a bit beyond what he said in his papers, and I hope that the things that I
say that are beyond the papers are things that if he were here, he would have enjoyed.
So, first of all, recognizing primes. Oops. Skip the Carmichael for the moment.
If you want to prove, for instance, that 314159265358979323 is composite, you can
just apply the contrapositive of Fermat's little theorem: compute 2 to the n -- this
number is n -- compute 2 to the n modulo n, subtract 2 from it, and you end up with,
well, some number which is visibly not zero modulo n. And Fermat tells you that if n
were prime, then 2 to the n minus 2 or, in general, w to the n minus w for any integer w
would have to be zero modulo n.
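Just to make that concrete, here is a minimal Python sketch of that Fermat check; the function name is only for illustration, and the printed result simply reflects the claim above that this particular number is composite.

```python
# A minimal sketch of the Fermat compositeness test described above.
# If w^n - w is nonzero mod n for some integer w, then n is certainly composite.
def fermat_witness(n, w=2):
    """Return True if w proves n composite via Fermat's little theorem."""
    return (pow(w, n, n) - w) % n != 0

n = 314159265358979323
print(fermat_witness(n))   # True: 2^n - 2 is nonzero mod n, so n is composite
```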
This is a great way of proving most composite numbers to be composite. But there are
some numbers which seem prime from the perspective of Fermat's little theorem, and
these are the Carmichael numbers, which, thanks to Alford, Granville and Pomerance
we know there are infinitely many of these Carmichael numbers: numbers where w to
the n minus w is zero modulo n for every w, even though n is in fact not prime.
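For reference, here is a small Python sketch of the Korselt-style characterization behind that statement; the lists of prime factors passed in are assumed known, not computed.

```python
# A small sketch of Korselt's criterion: w^n - w is 0 mod n for every integer w
# exactly when n is squarefree and p - 1 divides n - 1 for every prime p dividing n.
from math import prod

def korselt(n, prime_factors):
    """prime_factors must be the complete list of distinct primes dividing n."""
    squarefree = prod(prime_factors) == n
    return squarefree and all((n - 1) % (p - 1) == 0 for p in prime_factors)

print(korselt(561, [3, 11, 17]))    # True: 561 is the smallest Carmichael number
```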
So what do you do to get a more reliable test? Well, you start factoring w to the n minus
w. For instance, if w is any integer and n is, let's say, an odd prime -- pretty easy to tell
whether an even number is prime -- if n is an odd prime, then either w, or w to the n
minus 1 over 2 plus 1, or w to the n minus 1 over 2 minus 1, or more than
one of those has to be zero modulo n. And the proof is simply, well, Fermat's little
theorem says w to the n minus w is zero mod n, and w to the n minus w is a product of
the three factors that I just mentioned. So at least one of the factors has to be zero
modulo n, and, well, that's the conclusion.
Now, this is more reliable than Fermat. You can keep going. For instance, if n is 1 mod
4, then you can factor w to the n minus 1 over 4 minus -- sorry, factor w to the n minus 1
over 2 minus 1, which is where I got to a moment ago, factor that into 2 pieces. It's,
again, a difference of squares if n is congruent to 1 mod 4, and then you get a more
reliable test.
The end result of continuing in this way is from Artjuhov in 1966, who said, in general, if u
is the number of powers of 2 in n minus 1, then you keep factoring -- well, you factor
w to the n minus 1, minus 1, as far as you can go with the powers of 2 in n minus 1.
For instance, here's a proof that 2821 is not prime. If you take 2 to the 1410 plus 1, 2 to
the 705 plus 1, 2 to the 705 minus 1 modulo 2821, then they're all not zero. But, wait a
minute, the product of those is 2 to the 2820 minus 1, which would have to be zero if
2821 were prime. So 2821 is not prime.
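Here is a short Python sketch of Artjuhov's test in the form it is usually implemented (the strong probable-prime test), together with the 2821 example; the function name is just for illustration.

```python
# A sketch of Artjuhov's test: write n - 1 = 2^u * d with d odd and look at
# w^d and its repeated squares, i.e. the factorization of w^(n-1) - 1 described above.
def is_strong_probable_prime(n, w):
    u, d = 0, n - 1
    while d % 2 == 0:
        u += 1
        d //= 2
    x = pow(w, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(u - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False      # w is a witness: n is definitely composite

# The 2821 example: 2^1410 + 1, 2^705 + 1 and 2^705 - 1 are all nonzero
# mod 2821, yet their product is 2^2820 - 1, so 2821 cannot be prime.
print(is_strong_probable_prime(2821, 2))   # False
```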
All right. That's Artjuhov's test, and it's actually very, very reliable. The standard
theorem is that if n is an odd prime and you apply Artjuhov's test for a random choice of
w between 1 and n minus 1, it's got at least a 75 percent chance of proving that n is not
prime.
Of course, if you apply it to a prime, it will never prove that n is not prime.
Try a bunch of choices of w, enough that the 75 percent chance keeps piling up. If you
try, say, log base 2 of n or ceiling of log base 2 of n choices of w, then the standard
conjecture is that this reliably recognizes primes.
If you try all these choices of w in any reasonable pattern and that fails to prove that n is
composite, the only way that can happen is if n is prime.
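Here is a hedged sketch of that whole procedure, reusing the is_strong_probable_prime function from the sketch above; the number of trials and the random choice of bases are just one reasonable reading of what was described.

```python
# A sketch of the standard procedure: run the strong test for about log2(n)
# randomly chosen bases w. Any failing base proves n composite; otherwise,
# conjecturally, n is prime.
import random

def looks_prime(n):
    if n < 4:
        return n in (2, 3)
    if n % 2 == 0:
        return False
    for _ in range(n.bit_length()):             # about ceil(log2 n) trials
        w = random.randrange(2, n - 1)
        if not is_strong_probable_prime(n, w):  # from the sketch above
            return False                        # proven composite
    return True                                 # conjecturally prime
```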
There's all sorts of people -- I'm not going to try to trace who exactly is responsible for
the pieces of this. This is the current typical way of checking that a number is prime or
proving that it's composite.
How long does this take? Well, I've told you to try log n, log base 2 of n choices of w,
choices of potential witnesses to n being composite. How long does each of these w's
take? Well, you have to do some exponentiation modulo n. You have to do something
like log n bit operations to multiply mod n, and then you have to do log n
multiplications to do an nth power, and then you have to do that log n times for log n
values of w. So that's log n cubed time to do all of these exponentiations.
You can try to speed that up. Maybe log n cubed is not the fastest way to reliably
distinguish prime numbers from composite numbers. For instance, you could try doing
only square root of log n choices of w, and that would reduce the time to log n to the
2.5. Quite a lot faster than log n cubed. Except it doesn't work. There are certainly
composite numbers that pass this test with only square root of log n choices of w.
The reason is that you can easily write down lots and lots of numbers where that
75 percent is actually quite realistic. For instance, here's one of the Atkin-Larson
examples. And I think they were the first to write this down, although the whole paper
was about three pages long essentially saying that all the previous papers on the topic
were stupid, but one of the points that they made in this paper was that if you have any
n of the form 4k plus 3 times 8k plus 5 where those are both primes, then you will have
about a quarter of the possible w's in fact making n seem to be prime when in fact it is
clearly composite.
If you look at how many n's there are and you think about how many w's you'd have to
try to get rid of all of these n's to make the test succeed for all these n's, you see you
have to have something at least close to linear in log n for the number of w's to try to
exclude all of these composites.
So what do you do instead if you want to try to improve on log n cubed? Well, you
could try a quadratic extension of z mod n. Instead of looking at the multiplicative group
of z mod n, let's look at, for instance, z mod n adjoin t -- in the middle here, z mod n adjoin t
where t is a root of t squared minus wt plus 1.
Now, I've put a hypothesis on w here to force this to be a field, namely w squared minus
4 having Jacobi symbol minus 1 modulo n. If n is prime, then, as the [inaudible] symbol says,
w squared minus 4 is not a square, and then, well, that polynomial, t squared minus wt
plus 1, has discriminant w squared minus 4, which is not a square mod n, so this is in
fact a field extension. The test you do is, well, if n is an odd prime and you compute t to
the n plus 1 over 2 in this field, then you will get 1 or minus 1, again, assuming w has
the right symbol -- w squared minus 4 has the right symbol.
And the proof, to be complete about it, first, well, as I just said, from w you know that
that extension is in fact the quadratic field extension of z mod n, and now what does that
tell you about t to the n? Well, t is certainly a root of this polynomial u squared minus
wu plus 1 -- by construction t squared minus wt plus 1 is zero in this field. Now taking nth
powers -- the nth-power map is a ring homomorphism when n is prime -- t to the n is also
a root, but certainly t to the n is different from t, because in this
field we know all the numbers whose nth powers are themselves, and t is not one of them.
So this polynomial has two roots, t and t to the n. Therefore it factors as u minus t times u
minus t to the n, and then looking at the constant coefficient you see t to the n plus 1 is
1. And therefore t to the n plus 1 over 2 is 1 or minus 1. And that's exactly this test,
which is a typical Lucas-style test.
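As a concrete illustration, here is a Python sketch of that computation, assuming elements of (Z/n)[t]/(t^2 - wt + 1) are stored as pairs; the function names are only for illustration.

```python
# Arithmetic in (Z/n)[t] / (t^2 - w*t + 1), with a + b*t stored as the pair (a, b).
def mul(p, q, w, n):
    a, b = p
    c, d = q
    # (a + b t)(c + d t) = ac + (ad + bc) t + bd t^2, and t^2 = w t - 1
    return ((a * c - b * d) % n, (a * d + b * c + b * d * w) % n)

def power_of_t(e, w, n):
    """Compute t^e in (Z/n)[t]/(t^2 - w t + 1) by square-and-multiply."""
    result, base = (1, 0), (0, 1)          # the elements 1 and t
    while e:
        if e & 1:
            result = mul(result, base, w, n)
        base = mul(base, base, w, n)
        e >>= 1
    return result

# If n is an odd prime and w^2 - 4 has Jacobi symbol -1 mod n, then
# power_of_t((n + 1) // 2, w, n) should be (1, 0) or (n - 1, 0), i.e. 1 or -1.
```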
Reinterpreting the computation here, this is counting the number of points on a certain
curve. I'm working with powers of t, which has norm 1. Well, let's look at all the
elements of norm 1 in this extension. Let's look at all the y plus xt's that have norm 1, in
other words, that have x squared minus wxy plus y squared equals 1. That's some
curve. It's a shifted, twisted circle, clock, if you like. On this curve, well, the
computation I just did is counting the number of points on this curve. It's exactly n plus
1 under the same assumption about w.
The number of points on this group scheme evaluated at z mod n is n plus 1 by this
hypothesis on w. So if you multiply n plus 1 by any point, for instance (1, 0) -- this is if you
take t to the n plus 1 power -- you will get the identity, which is (0, 1). And now that -- well,
okay, aside from dividing by 2, getting the neutral element or the obvious point of order
2, this is exactly the same test that I wrote down here which -- okay, it's fun to have
curves running around, but what's the point? Is this actually better than the original
test? It's not, actually. It's certainly not faster. It's somewhat slower. It's maybe more
reliable? Well, if you look at it, no, it's not more reliable. There are just as many failure
cases for this test as there are for the usual tests.
So then you say, well, okay that attempt to apply mathematics was not improving the
situation. Let's put in some more. Let's have an elliptic curve. For instance, let's take x
squared plus y squared equals 1 minus 30 x squared y squared. There's a nice genus
1 curve, and, hey, genus 1 must be better than genus 0.
Then assuming you know the number of points, which was -- the critical calculation here
was figuring out the number of points on this x squared plus y squared minus wxy
equals 1. Assuming you can figure out the number of points on this new curve, modulo n,
then you can do the same kind of test and take some random element of this group, this
group scheme at z mod n, and multiply it by the known number of points, the known
number assuming that n is prime, and then, well, if n were composite, it would have an
awfully difficult time having the presumed number of points times some point here
coming out to be the identity element.
This is what the Chudnovsky brothers and Gordon proposed in the mid '80s, building
the elliptic curve E with complex multiplication, only in the class number 1 case. Of course,
we now know how to do this very efficiently for higher class numbers. I'll come back to
exactly how fast that is.
But, again, there's no point in doing this. This is not better than z mod n star, it's not
more reliable -- well, if you look at how reliable it is, then you see that these elliptic
pseudoprimes for doing an elliptic curve primality test are just as frequent as regular
pseudoprimes or quadratic pseudoprimes, so there's no point.
What do you do to make a better, faster primality test? Well, this is the subject of Atkin's
'95 paper. You try to combine different tests. You try to say instead of doing a lot of w's
for z mod n star or doing a lot of w's for this x squared minus wxy plus y squared equals
1 or for some elliptic curve, you start varying which groups you're working with.
The first proposal along these lines was from Baillie, and then Pomerance, Selfridge and
Wagstaff, who said take one quadratic test and one linear -- well, one z mod n star
test. The total time to do those two tests, each one of them takes quadratic time, so
doing two of them takes quadratic time essentially. If you compare that to doing two w's
in the original test, it's much, much, much more reliable. Even now there are no
counterexamples known. There are no examples known of numbers n which are
composite and which are not proven to be composite by their tests, filling in the details
of exactly which w's they take.
If you can find an example, you get $620 of which I believe $20 are from Pomerance
because he thinks that there are lots and lots of counterexamples -- I'll come back to
that -- and then Atkin said well, okay, okay, linear and quadratic is not enough. Here's
a really confidence-inspiring test. Do a linear test, a quadratic extension of z mod n and
a cubic extension of z mod n. And he goes to some effort to make a cubic extension
which allows really fast computations, and he offered $2,500 -- this is no longer open -- for a
counterexample. I mean, we don't know any counterexamples, but if you find one, you
don't get $2,500.
Pomerance's argument about the linear and quadratic test was published in '84.
Actually, it was at, I believe, Arjen Lenstra's Ph.D. defense. He wrote a little paper
saying here's how Arjen can make some money, can make $620 -- at the time it was a
slightly smaller amount -- but to get Arjen off on a good financial footing, he could try to
construct counterexamples to this test. And Pomerance explained how to do this, and
the same explanation also gives lots and lots of counterexamples to Atkin's test. So
there should be lots and lots of counterexamples. But, actually, the obvious thing to
do is keep going just a little bit, where the little bit grows really, really slowly with n.
I think if you take something much smaller than log n -- I'll quantify this a little more
precisely in a moment -- if you take something far smaller than log n tests, well, log n to
the epsilon tests where epsilon converges to zero with n, then I believe that this
sequence of tests becomes perfectly reliable. So if you take Atkin's intelligent primality
test and keep going to a super-intelligent test and a quartic, quintic, et cetera, then you will get
something which is a perfectly reliable test for primality that takes only essentially
quadratic time instead of essentially cubic time.
Now, I'm not sure if this analysis, the analysis I'll show you in a moment, has been done
before. I've put new in question marks for this conjecture. It's a pretty easy analysis to
do. At the same time, I've seen people who are speculating that the best possible
primality recognition algorithm takes essentially cubic time, so the quadratic time
conjecture does seem to be new at least to a bunch of people writing papers in this
area.
Further comment, which I'll also come back to, is that if you want to make this run as
quickly as possible, not just get the exponent down to 2 but get the little o of 1 as small
as possible, then you certainly should not be doing degree 20, degree 21, et cetera,
extensions, you should be doing a bunch of those elliptic curves, being careful not to
combine a bunch of curves which all have the same number of points. Gordon's test
always had n plus 1 points.
No point in combining those. You want to have a lot of orders which have a large
least-common multiple. But that's easy to do.
Where does this conjecture come from? Well, Erdos in 1956 -- this was the basis for
Pomerance's analysis -- Erdos said there should be infinitely many Carmichael numbers
because there should be infinitely many numbers n for which n minus 1 is a multiple of p
minus 1 for every prime p dividing n. The way you force w to the n minus 1 to be 1
modulo p is to force p minus 1 to divide n minus 1. So unless w is a multiple of p,
certainly w to the n minus 1 will be, well, w to the p minus 1 raised to some power, which is 1
modulo p. And if you manage to do that for every p dividing n, then, well, you've made
an n for which w to the n minus 1 has a very good chance of being 1 and then a good
chance of passing all the tests you might do with z mod n star.
What's the chance that n actually gets through this? Well, suppose I've got
n where I know it's got a p times some other stuff: fix a p and then say I've got n as p
times some q times whatever. Then what's the chance that n minus 1 will be a multiple
of p minus 1, or at least close enough that you'll have a good chance of w to the n
minus 1 being 1? Well, basically you want the event of n being 1 modulo p minus 1,
which has chance 1 over p minus 1, maybe a little bit more if you allow, say, p minus 1
over 2.
What if you allow p to vary? Well, these aren't independent chances, because if you
look at the 1 over p minus 1 chance for each p, the chance of all of those happening is
not 1 over the product of the p minus 1's, it's 1 over the least common multiple of the p minus 1's.
This was Erdos's central insight into why there are going to be infinitely many Carmichael
numbers: the least common multiple of the p minus 1's does not have to be very big. You
can have a whole lot of primes p where p minus 1 is a product of very small primes.
If you start with a set q1, which is all the primes up to 100, 1,000, a million -- pick some
number which grows slowly -- and then take all the primes p up to some bound such
that p minus 1 is a product of a subset of those small primes, say primes up to 1,000. Now
you've got a bunch of primes p for which the least common multiple of the p minus 1's is
actually -- I'll guarantee it to be at most the product of all of the elements of q1.
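Here is a toy Python sketch of that construction; the bounds 100 and 100000 are arbitrary illustration choices, not anything from the talk.

```python
# Collect primes p for which p - 1 is a product of a subset of a fixed set Q1 of
# small primes, so that the lcm of all the (p - 1)'s is at most the product of Q1.
def small_primes(bound):
    sieve = [True] * (bound + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(bound ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, bound + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]

def is_subset_product(m, Q):
    """True if m is a product of distinct primes, all taken from Q."""
    for q in Q:
        if m % q == 0:
            m //= q
            if m % q == 0:        # repeated factor: not a subset product
                return False
    return m == 1

Q1 = small_primes(100)
good_p = [p for p in small_primes(100000) if p > 2 and is_subset_product(p - 1, Q1)]
print(len(good_p), good_p[:10])
```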
Now, that's not very big, and that actually gives a good chance that if you form a lot of
different n's from these lots of different p's and then ask for each of those n's whether n minus
1 is divisible by the least common multiple of the p minus 1's -- is it divisible by the
product of all the primes in q1 -- there's actually a very good chance of that happening, at
least enough of a chance that when you look at all the n's over all the p's, then it
actually does happen very frequently. So Erdos conjectured there are h to the 1 minus
epsilon Carmichael numbers up to h. And that still hasn't been proven, but at least we
know there's h to some constant power.
Pomerance attacked the linear and quadratic test by saying, well, let's -- instead of just
having one set of small primes q1, let's have one set of small primes q1, say, every
prime that's 3 mod 4 will be in q1 up to, say, 1,000, and then every prime that's 1 mod 4
up to 1,000, we'll put those into q2, and we'll have p minus 1 be a product of any subset
of q1 and p plus 1 be a product of any subset of q2. And there's actually quite a few
primes that satisfy both of these conditions.
And then if you take n to be a product of a lot of these different p's, then there's a pretty
good chance that n minus 1 will have -- will be divisible by all of the elements of q1
and that n plus 1 will be divisible by all of the elements of q2, which guarantees that n
will pass at least the simplest forms of the linear and quadratic test and has a good
chance of passing even the fancier linear and quadratic tests you might write down.
If you look at Atkin's test -- three tests, a linear and a quadratic and a cubic -- then you
can, of course, apply Pomerance's argument, but if you quantify -- as the number of
tests goes up, if you quantify how big the numbers are that you have to write down for
Pomerance's argument to kill this test, to exhibit a composite number that passes the
test, then you see -- well, at least I did a pretty solid job, I think, for Pomerance's original
analysis, but he wasn't trying for a lower bound, he was trying for an upper bound. Still,
I think this is going to be pretty close to the truth, that t is going to be bounded by
something times log log n. So the answers you get from Pomerance's argument are something like
doubly exponential in t.
If you have two tests, then already it's so big nobody's found it yet. If you have three
tests, it should be ridiculously large. As t goes up, the size of n you need to fool this test
becomes, well, doubly exponential in t. I wouldn't be surprised if it's actually, say, log
log n times log log log n or something else that makes analytic number theorists happy,
but I'm certainly very comfortable conjecturing that I haven't missed so much in the
analysis that t is certainly less than log n to the epsilon. That's actually a very weak
conjecture compared to what seems to be the case.
Yes?
>>: [inaudible].
>>Daniel J. Bernstein: Well, this is coming from n. N is going to be a product of p's
where the p minus 1's all have just -- each p is exploring a bunch of different primes
from the same set q1, which is only, say, the primes up to 1,000.
Now, there's a lot of different p's that have p minus 1 being a product of various subsets
of those primes, but then the least common multiple of all the p minus 1's is not very big.
It's just the product of the primes up to 1,000. So it's just e to the 1,000. So what's the
chance that n minus 1 is divisible by that particular product of all the q1's? It's 1 over a huge
number. It's like e to the minus 1,000, which on the scale of everything else happening
here means you only have to look at e to the thousand different numbers n before you
get one that passes that, and only e to the thousand for passing that.
So that was Erdos's argument. And this is maybe not the most computationally
effective way to construct an n which passes these tests, but it does convince analytic
number theorists that there should be infinitely many counterexamples.
At the same time, there are quantitative limits on how far this can go, so I do believe
that there is an essentially quadratic time primality test.
What if you don't believe these conjectures? Well, I'll get to that in a moment. I first
promised that I would get back to constructing elliptic curves, because certainly you
don't want to use very high degree extensions. They're much slower to do
computations in than working with elliptic curves.
So let's say you want to do t tests with t different elliptic curves, or maybe t minus 5
tests with elliptic curves and five tests with degree 1, 2, 3, 4, 5 extensions.
Let me contrast this with what happens in ECPP. In ECPP we're trying to construct
something like log n different curves so that we can find 1 that has its order being prime
or 2 times the prime, 4 times the prime, something like that.
In this context, we don't need that condition. We don't need orders which are
essentially prime. That's important for proving primality of n, but that also slows things
down dramatically by having such a big t. Here t -- well, I think it's log n to the epsilon.
Let's assume it's log n to the .3 at most.
Then you can easily generalize the standard Shallit ideas for making ECPP construct a
curve quickly. You start with a bunch of square roots of small numbers, say numbers
up to t to the one half -- anything that's substantially less than t will make the
asymptotics work; that's kind of the reasoning for the time to do that -- then there's a good chance, if
you look at discriminants up to t squared or 10t squared, they have a good chance of
being t to the one half smooth, that is, factoring into integers up to square root of t, which
means that -- well, the square roots are relatively slow. Writing down a
single square root already takes log n squared time.
Okay. Doing t to the one half square roots, that's t to the one half times log n squared
time, once you have some square roots, square roots of all the numbers up to t to the 1
half, you can multiply them together much more efficiently to get square roots of
discriminants up to t squared, or I should say negative discriminants down to minus t
squared.
Now, the time to do all those multiplications, instead of the t to the one half times log n
squared, is more like t squared times log n, which -- well, for the range of t that I'm
talking about -- is much, much smaller than the something times log n squared. That's the
bottleneck.
What do you do next? Well, do some lattice basis reduction to figure out which of your
discriminants is actually happy with your prime, which of your primes is happy with a
class group, and then that gives you something like t discriminants -- discriminants up to t squared,
roughly, give you something like t discriminants that are good for n. Maybe it will be
only t over 10 or t over log t or some such, so instead of t squared I should be saying t
to the 2 plus epsilon for some suitable epsilon, but it's about -- t squared is about the
right number.
And then fast CM -- I think Drew has left, but let me point to his very recent paper on
speeding up CM. I believe that the run time that he gets heuristically under various
assumptions, applied to this situation, looks like t squared times log n plus t times log n
squared. In other words, the time per curve that we're writing down is something like
log n squared. For this range of t, the dominant part is the last part of this fast CM
algorithm, which is kind of merging the class polynomial construction with writing down
the smallest possible part of the class polynomial and then finding roots of it.
Maybe there's something better here. I don't know how far this is going to go. Certainly
this result from Sutherland is faster than the previous results. It seems to me that this
will be the bottleneck in actually running this primality test for very large numbers, so it's
actually a legitimate excuse for doing class polynomial computations; figuring out
better class polynomial computations, or moving from j to [inaudible], for instance,
should actually seriously speed up this primality recognizing algorithm. I don't think I
can say the same about ECPP as an application of class polynomial computations, but I
think this -- it really is the bottleneck. I think the most important step in this
algorithm really is doing interesting elliptic curve computations.
All right. Suppose you're not happy with all these conjectures and you actually want to
prove something. Well, then you have to increase t. You have to look around more and
find a curve for which the number of points on the curve is something that you can
factor, so that you don't just check that some point has the order you expect in this
group -- you want to check that the point has not just order dividing what you expect, but
you want to verify that the order is exactly what you expect, so that tells you that the group has
to be at least a certain size. And that's what ECPP does. The fast ECPP takes time log
n to the fourth, verifying an ECPP certificate takes time log n cubed, and the current project
is getting that time down. I don't think it will be possible to do better than cubic, but at
least you can look at little o of 1, things like log log n factors and try to get those out, try
to improve the constants.
So what actually takes the time here? Well, in ECPP, you've heard something already,
but just to briefly review, an ECPP proof looks like an elliptic curve modulo n together with
some point -- so w is now a point on this curve mod n -- which has prime order q. And part
of this proof is recursively verifying that q is itself prime. Q can't be too small. The proof
breaks down if q is too small, but the q's that we actually find are pretty close to n. So
this is not a serious restriction.
What does a verifier do with this proof? Well, the verifier checks, first of all, that w looks
like it has order q: checks q times w, sees that it's the neutral element on this curve.
Because elliptic curve computations are compatible with base change, you get to
reduce this modulo p and you've done a computation on the elliptic curve modulo p. For
any prime p dividing n, you know that q times w is zero. So the order of w in e of z mod
p is either 1 or q once you've checked that q recursively is prime.
You check that w is non-zero and also non-zero after base change, so w is -- for
instance, for Weierstrass coordinates you check it's an affine point; for other coordinate
systems, you check that each of the coordinates is different -- that the difference of
coordinates is invertible modulo n. That's what this boils down to.
So you check w in each E of z mod p, even without knowing what p is -- you do some
very fast tests to see that w is going to be non-zero, doesn't have order 1, in
the elliptic curve modulo p -- and so now you know for every p dividing n, every prime p
dividing n, that the order of w is exactly q. But that means that the size of the elliptic
curve group is, well, at least q. And now knowing that q is pretty big, that tells you that p
has to be pretty big, by [inaudible]. Specifically, every p dividing n has to be bigger than
the square root of n, which immediately implies n is prime.
What slows this algorithm down is, first of all, the recursion. You've got this recursive
proof that q is prime. Q is pretty close to n. You can put more work into trying to find
q's and slightly decrease the q's that you find, but you still have to go through something
close to log n, maybe log n over log log n levels of recursion to actually prove that n is
prime. Just because there's all these subproofs involved in it, you have to know that q
is prime.
The other thing that makes this algorithm slow is that doing arithmetic in the elliptic
curve modulo n is slow. For instance, if you take the Goldwasser-Kilian definition,
which I've written here as the engineer's definition, of e of z mod n, this is follow your
nose and say, well, I've got points on the elliptic curve mod n. I don't even know if n is
prime. I'll just go ahead and use the formulas, use the addition formulas. X1 is
different from x2. Well, I'll compute lambda equals y2 minus y1 over x2 minus x1, and,
whoops, I just divided by something which was not invertible. That's the GCD
computation: doing that inverse -- that inversion.
And, hey, I've just found a factor of n. And if that never happens, if nothing goes wrong,
then you know that the computation you've done reduces modulo p, so that this
computation -- sort of looking at the algorithm you're doing -- is retroactively defining some
piece of e of z mod n which is compatible with e
of z mod p for every p dividing n, and then you know something about e of z mod p.
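Here is a Python sketch of that engineer's definition for affine Weierstrass points; it follows the usual formulas and simply reports the gcd when a denominator fails to be invertible, which is one reasonable reading of what was described.

```python
# Affine addition on y^2 = x^3 + a*x + b over Z/n, following the usual formulas;
# if a denominator is not invertible mod n, report the gcd you stumbled on.
from math import gcd

def ec_add(P, Q, a, n):
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if (x1 - x2) % n == 0 and (y1 + y2) % n == 0:
        return None                                     # P + (-P) = infinity
    if (x1 - x2) % n == 0 and (y1 - y2) % n == 0:
        num, den = (3 * x1 * x1 + a) % n, (2 * y1) % n  # doubling
    else:
        num, den = (y2 - y1) % n, (x2 - x1) % n         # generic addition
    g = gcd(den, n)
    if g != 1:
        # the division failed mod some prime dividing n; unless g == n,
        # g is a nontrivial factor of n, so n is proven composite
        raise ArithmeticError(f"non-invertible denominator, gcd with n is {g}")
    lam = num * pow(den, -1, n) % n
    x3 = (lam * lam - x1 - x2) % n
    return (x3, (lam * (x1 - x3) - y1) % n)
```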
It actually is a legitimate proof, and you don't have to think what e of z mod n actually is.
You could alternatively come along and say oh, this is so ridiculous having e of z mod n
defined implicitly by what some algorithm is doing. Let's give a proper definition
compatible with how algebraic geometers would think about group schemes of what e of
z mod n actually is.
I'd like to show you the definition. I'll do that in a few minutes because I think it really is
a very nice definition and fun to work with, but it still requires that GCD for every
computation that you're doing.
You could try other things. For instance, in one of Francois's [phonetic] papers there's a
way of using division polynomials which, as written, don't involve any inversions, but it's
something like 20 multiplications per bit of n to do that computation, and that's kind of
ridiculous compared to what you've heard for even Jacobian coordinates. You can do an
nth multiple in 9 plus some -- something that converges to zero, 9 plus little o of 1 -- times log n
multiplications, where some of these are the time you have to spend for batching a bunch of
GCDs, checking that everything that you were implicitly dividing by in the Jacobian
coordinate formulas is actually invertible mod n. The fastest way to do that is to multiply them
all together, and you have to do that something like log n times, so doing a
multi-inversion modulo n costs a significant chunk of this computation.
I thought a few years ago that I could do better than this with a Montgomery
Ladder-type computation which almost kills the little o of 1, gets rid of the multi GCD,
reduces the 9 to 8 at the end of the day, essentially by killing that multi GCD, but I
wouldn't trust this proof, and if somebody came along to me and said that's a proof of
primality, I'd be kind of skeptical.
So fortunately we know more now. In particular, we know how to do curve
computations without exceptional cases and with incredible speed.
So instead of using old-fashioned curve shapes, let's use x squared plus y squared
equals 1 plus d x squared y squared where d is not a square and then we know that the
addition law, the very fast addition law on this curve, always works. You never end up
dividing by anything zero modulo p. So you don't have to bother checking anything.
You do the fastest computation you can think of and it just -- it always works. You've
heard what the fastest computations are, and it's only 7 times log n multiplications plus
some little o of 1 for the occasional additions you have to do. That's much, much faster
than certainly division polynomials or doing GCDs all the time. Well, this is -- this
seems to be the state of the art. Maybe somebody will come up with something better,
but this is already pretty good except you might object, wait a minute, how do I know
that this d is not a square?
We're trying to do computations in e of z mod n in order to do computations in e of z mod p
for every prime p that divides n. Now, how do I know that d is not a square in z mod p?
I don't know anything about p. I mean, I think p equals n. We're going to prove p
equals n. But that proof can't assume that p equals n. We don't know in advance. We
can't assume anything about p. How do you know d is not a square?
Well, the easy fix is to say, okay, at least there will be some p that works, because you
take a d whose symbol mod n is minus 1, and that means there's some p dividing n
where d is not a square modulo p, and that tells you that for some p -- all of your elliptic
curve computations have worked, so for some p dividing n -- p is at least, well, what you
get from Hasse, assuming you've verified that something has order q.
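For concreteness, here is a Python sketch of the complete Edwards addition being discussed, written in affine form; the inversion-free projective formulas mentioned in the talk are what you would use for speed, but this shows the shape of the computation.

```python
# Complete Edwards addition on x^2 + y^2 = 1 + d x^2 y^2 over Z/n. When d is a
# non-square mod p, the denominators 1 +- d x1 x2 y1 y2 are nonzero mod p, so
# the one formula handles every pair of points; over Z/n, a failed inversion
# can only happen if n is composite.
def edwards_add(P, Q, d, n):
    x1, y1 = P
    x2, y2 = Q
    t = d * x1 * x2 * y1 * y2 % n
    x3 = (x1 * y2 + y1 * x2) * pow(1 + t, -1, n) % n
    y3 = (y1 * y2 - x1 * x2) * pow(1 - t, -1, n) % n
    return (x3, y3)

# The neutral element is (0, 1), and negation is (x, y) -> (-x, y).
```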
Now, depending exactly how close q is to n, you have to do a little bit more work to
check whether n has any small prime factors. If you know that some prime dividing n is, say,
bigger than n over a million and you know that n has no prime factors up to a million, n
has to be prime.
You can try to balance, okay, what you will allow q to be, whether you want to do more order
verification versus doing less trial division, use better methods in trial division. There's
all sorts of things to make this run even faster. But the basic idea certainly works, and
that's what we're exploring right now.
I promised I would tell you the mathematician's definition of e of z mod n. I'll only do
this for -- I'll do e of r in some generality, but only for r's with class number 1, but z mod n
has class number 1.
This goes back effectively to a [inaudible] paper by Lange and Ruppert -- different
Lange -- saying for any abelian variety over any algebraically closed field --
Were you writing papers in '85?
>>: [inaudible].
>>Daniel J. Bernstein: -- for any abelian variety over any algebraically closed field,
there's a low degree complete system of addition laws. So addition laws are polynomial
expressions which are compatible with addition except they're allowed to sometimes
give all zeros, which is not a projective point. They specifically showed that if you have a
symmetric elliptic curve embedding, then you get a degree 2 in each variable system of
addition laws. I'll say what this means quite concretely in a moment.
They commented that this proof does not let you write down the addition laws. To
determine explicitly a complete system of addition laws requires tedious computations
already in the easiest case of an elliptic curve in Weierstrass normal form.
But, okay, they were not deterred by tedious computations. In the same paper they
actually did it. They wrote down a complete system of three addition laws for short
Weierstrass curves and then a couple years later did it for long Weierstrass curves. I'm
not going to show you the formulas. Just to give you an idea of how complicated they
are, if you give names to some cross products, then you end up with only 53 monomials
in the complete system of addition laws -- quite a mess -- until Bosma and Lenstra
came along and made things much simpler.
So what they did -- first of all, they reduced the three addition laws to two. So they
wrote down six polynomials, x3, y3, z3, x3 prime, y3 prime, z3 prime -- the primes are
just different polynomials, no derivatives -- in this generic polynomial ring with variables
x1, y1, z1, x2, y2, z2 and generic curve coefficients a1 through a6 in Weierstrass form.
What I've shown you on the previous slide in a ridiculous font here is not the system of
two addition laws. This is two of the six polynomials. So they had y3 prime and z3
prime. Actually, this is the result -- so what I'm showing you is a scan from the printed
publication, and the printed publication is not the same as what Bosma had inside his
computer from an early version of magma. This is what happens when you feed
magma output through the publisher and you get published y3 prime and published z3
prime. These are incorrect formulas as they actually appeared in print.
I said I would say concretely what this means. Well, these polynomials have the
following very explicit addition property. If you take any Weierstrass curve, any point p1
on this Weierstrass curve over whatever field, any point p2 on the same curve over the
same field, then the first three polynomials from Bosma and Lenstra, evaluated there, will be
either the sum of the points or 000, and the second system of polynomials they wrote
down will also give you either p1 plus p2 or 000, and they won't both give 000. So
between the two of them, at least one of them will add any particular pair of points that
you feed in as input.
Okay. Here's a similar theorem for Edwards curves. Instead of in p2, this is in p1 times
p1. This is also geometric, outside characteristic 2. This is for arbitrary elliptic curves.
So the same level of generality as this theorem. The formal expression is -- well, it's the
same thing except it's all p1 times p1. Instead of p2, there's some explicit formulas,
some explicit polynomials that we wrote down x3, z3, y3, t3 and x3 prime, z3 prime, y3
prime, t3 prime which always add any pair of points. There might be some occasional
zero divided by zeros, but that will be made up for by something else not being zero
divided by zero.
The difference between these kinds of formulas and what you get from a, well,
engineering approach to adding points on an elliptic curve is that these formulas will
never give you anything bad other than 0 divided by 0, or 000 in more variables.
For the normal formulas, if you try applying, say, the doubling formulas in textbooks,
then those don't work for adding most pairs of points. They give you actual wrong
answers. These are all valid on some open subsets of e times e or e of k times e of k.
Here are the formulas for Edwards curves. That's the complete system of addition laws
for addition on Edwards curves with all the extra variables to put it in p1 times p1 and
show it to undergraduates. For comparison here, again, is the Bosma-Lenstra complete
system of addition laws which -- I mean, these are both, you know, finite computations.
In principle, there's no difference. Just big o of 1, right?
Okay. This is your brain on Edwards curves. This is your brain on Weierstrass curves.
And what does this have to do with defining e of r? Well, here's the general setup again
for rings of class number 1. You take projective space over r to be the set of lines through
the origin in 3-dimensional space. So you take all -- for any xyz define x, colon, y,
colon, z as all the multiples of xyz, same multiple of x, y, and z, and that's some line
through the origin in 3-dimensional space. And then the set of those lines is -- well, I
should say this is supposed to be a non-trivial line in the sense that x, y and z are
supposed to generate the whole ring. Take all of the non-trivial lines through the origin
and that is projective 2 space over r. And now define e of r for, say, Weierstrass form.
This is what Lenstra did in '87. Define e of r as, well, the set of xyz in this projective
space that satisfy the curve equation for, say, a short Weierstrass curve.
How do you add points? How do you add these elements of e of r? Well, this is where
the complete system of addition laws comes in. And you really need it to define e of r in
this generality.
You take the complete addition laws from -- well, back in '87 Lenstra's paper only had
Lange-Ruppert to refer to. He said take those three addition laws for Weierstrass
curves, add the points that you're trying to add, the x1, y1, z1 and x2, y2, z2 with those
formulas, and that gives you three different choices for lines through the origin which are
supposed to be the sum of your points projectively.
And now they're all supposed to be the same point in some sense or maybe 0, 0, 0, but
if r is -- say you have z mod n being z mod p times z mod q for two different primes p
and q. Then it might be that one of the formulas is working mod p and another one is
working mod q. To get a general formula that always works you add these three lines
through the origin, and that always gives you a proper line through the origin which is
some xyz.
This is the GCD computation. This is the inversion. You have to do something mod n.
You have to find one generator for this module. I mean, you start out seeing it as a
projective module, you know it's rank 1, and because r is assumed to have trivial class
group, you know that there's a single generator for it, and that computation is exactly a
GCD computation.
Okay. So that's the right definition of e of r. Of course, if you allow r to have a bigger
class group, then you need to allow more terms in this, not just a single xyz.
Okay. Next mini talk, factoring integers into primes.
Here's a quote from Atkin-Morain in '93, finding suitable curves for the elliptic curve
method of factorization. They said for practical application -- they constructed a whole
bunch of curves. The elliptic curve method of factorization we'll hear much, much more
about tomorrow in Peter Montgomery's talk. Plus you've heard a bit about it before. In
the context of ECM, well, it's good to start with a curve over q. You need to have the
curve over q with a known non-torsion point over q, and then you'd like to have the
curve having a big torsion curve. And that's what this whole paper is about is
constructing curves over q with rank known shown explicitly to be at least 1 by
exhibiting a point and with big torsion groups all the way up to the maximum you can
have over q, namely, z mod 8 times z mod 2. And they say you may as well use this
16-torsion-point curve, family of curves.
Giving a prescribed factor of 16, well, inside the context of the elliptic curve method,
whatever groups you write down, if you know that they have, say, four torsion points for
the clock or if you know they have two torsion points for z mod n star, if you know
something about your group, then that effectively divides the size of your group by that
little torsion. It improves the chances of the elliptic curve method factoring, and so they
say, yeah, this is the biggest groups we can give you. Use those curves. Except it's
actually not true. These are not the best curves to use for ECM.
Together with Tanya and Peter Birkner, this paper here is called Starfish on Strike.
You'll have to look at the paper to understand the title. The result is there's sort of an
expected part of this and then there was a surprising part of this.
We were looking at all the results from Huseyin Hisil [phonetic] that you've heard
about, of how fast Edwards curves are, and in particular this kind of twist of Edwards curves,
minus x squared plus y squared equals 1 plus d x squared y squared, how fast these
curves are. It's certainly much faster than anything you can do with Jacobian
coordinates or the other coordinates that have been considered for ECM. We were
certainly expecting these curves to be very, very fast. There's a little problem that these
curves are incompatible with having 12 or 16 torsion points. With this particular shape,
the minus x squared prohibits having 10 or 12 or 16 torsion points. The best you can do
is, say, z mod 6 or z mod 8. And we constructed all these -- I use the word we
loosely -- my coauthors constructed all these. I just did the computer experiment.
We were expecting that the z mod 6 and z mod 8 cases would be very fast but would
lose some effectiveness inside ECM. They don't have the maximum torsion. And
everybody working with ECM knows we want big torsion except it's -- again, it's just not
true. These curves actually find more primes than the previous curves do even though
the torsion is smaller.
Now, there's reasons -- there's easy reasons to explain why they might find the same
number of primes inside ECM, why they might be as good in terms of effectiveness,
how many primes they find, and then better for speed. And that's what we were hoping
for, that they would be at least as good or maybe not so much worse for effectiveness
and then so fast that they would be worthwhile.
But actually they're very fast, faster than anything else, and they find more primes. The
combined effect of that is illustrated in the following diagrams which are -- this is for
1,000 curves in five different families for finding different sizes of primes using
parameters that were optimized for the now known to be not best possible families.
What you see here is taking, for example, 7200 modular multiplications to find an
average 25-bit prime. So feed all 25-bit primes to your ECM program and then see how
many of the primes you find, compare that to the number of multiplications that you took
which was maybe 4,000 for finding half the primes comes out to 8,000 per prime
actually found. The curves are then sorted in order of the lowest cost curves on the left,
the highest on the right. The red curve at the bottom here, the lowest cost curve, is the
z mod 6 which actually has 2 pieces because there are two different families that we
were looking at and they apparently have different performance despite having the
same torsion and all other obvious algebraic features being the same. Up here is the
12 and 16 and then some other curve shapes that we tried. Very stable. As we
increased the size of primes we're looking for, it is very clear that ECM is happier with
these z mod 6 curves than with the previous z mod 12 and z mod 2 times z mod 8
curves.
>>: [inaudible].
>>Daniel J. Bernstein: I'm sorry?
>>: [inaudible].
>>Daniel J. Bernstein: The order does slightly change between 12 and 2 times 8. The
parameters, again, were optimized for a certain slice of ECM parameter space and
then -- if you're trying to optimize ECM parameters then you end up being kind of limited
by -- you're looking at small numbers and you have a limit to how many different cutoffs
you can use for, say, b1 in ECM and the stage 2 parameters. And so you often get kind
of discontinuities, and so for some of those parameters actually the 12 and 16 reversed
slightly.
But those were the best parameters found in a fairly comprehensive search for those
curves, and this -- just using the parameters optimized for those is quite a lot better.
So what's going on here? We don't know, but maybe somebody can figure it out.
All right. Last little section of my talk. I have officially 10 minutes, but I only have a few
slides. Sometimes you've been going for a while doing some serious mathematics and
then you kind of degenerate and you end up saying, all right, what are all the primes up
to 1,000? I'm really bored. 2, 3, 5. And you would think a problem like this, there's
nothing to say about it because Eratosthenes figured it all out thousands of years ago.
And what the Sieve of Eratosthenes does is, well, evaluate -- enumerate systematically
all the small values of some quadratic forms. In the traditional expression, you're
enumerating products. You take, say, all the multiples of 2, all the multiples of 3, you can
skip all the multiples of 4, all the multiples of 5, skip all the multiples of 6. You're
enumerating products, i times j. But for lots of reasons I prefer to re-express those
products. Let's ignore what happens at 2. Just look at odd products i times j. You
can think of i as x plus y and j as y minus x and then, well, i times j is minus x squared
plus y squared. Y squared minus x squared is a generic way, again ignoring even
numbers, to write an odd product, which --
I remember a wonderful book called Category Theory Made Difficult. This is the Sieve
of Eratosthenes made difficult, and there's kind of a limit to how far you can go compared
to category theory, but, okay.
Y squared minus x squared, that is the norm of the same kind of thing I was writing
down before, y plus xt from the ring z adjoined t mod t squared minus 1. Hey, that's not
irreducible. Don't worry about it. You can take norms from z adjoin t by t squared
minus 1 down to z, and in particular the norm of y plus xt is y squared minus x squared.
You take y plus xt times y minus xt, you get y squared minus x squared t squared, which is y
squared minus x squared.
So that's what the Sieve of Eratosthenes is doing: systematically enumerating norms
from this reducible ring. And then somehow, by knowing something about the number
of ways to write n as a value of these norms, it figures out whether n is prime. If you
can write n in several different ways as a product, then it's not prime.
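Here is a toy Python sketch of that way of looking at the Sieve of Eratosthenes; it is deliberately naive and only meant to illustrate the counting of representations.

```python
# An odd n > 1 is prime exactly when it has a single representation
# n = y^2 - x^2 with y > x >= 0, namely the trivial one with
# y = (n+1)/2 and x = (n-1)/2. The prime 2 is handled separately.
def odd_primes_up_to(h):
    reps = [0] * (h + 1)
    y = 1
    while 2 * y - 1 <= h:                 # smallest value for this y is 2y - 1
        x = y - 1
        while x >= 0 and y * y - x * x <= h:
            reps[y * y - x * x] += 1
            x -= 1
        y += 1
    return [n for n in range(3, h + 1, 2) if reps[n] == 1]

print(odd_primes_up_to(50))   # [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```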
Like I said, you kind of degenerate after a while, but it's still kind of fun to look at this.
All right. If you do this computation for all small numbers n, say all numbers n up to h,
then it's actually very, very fast, because enumerating all the values of y squared
minus x squared that are up to h, well, if you just take some y in some range and x in
some range, then you're not going to get any particular n with a very good chance, but if
you're looking at all the n's, just zoom through all the x's and all the y's that could
possibly be relevant and you make a table of all the n's that you care about, and that's
what the Sieve of Eratosthenes does, and it's very, very efficient.
But you can actually do better. I was on the way to a conference with Oliver, and he
mentioned that he actually uses something different to check whether a number is
prime, whether a small number is prime. Namely, evaluating -- well, enumerating values
of x squared plus y squared or 4x squared plus y squared, 3x squared plus y squared.
Okay, the complete Sieve of Atkin -- there's, of course, many choices you can make
here, but what is now widely known as the Sieve of Atkin is enumerating -- instead of y
squared minus x squared values you enumerate y squared plus 4x squared values for
some n's, y squared plus 3x squared values for some other n's and y squared -- well, 3x
squared minus y squared values for some other n's. And this covers all possible n's.
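Here is a compact Python sketch of the Sieve of Atkin in its usual textbook form, which is one concrete instance of the choices just described.

```python
# Toggle n according to the parity of its number of representations by the
# three forms, then strike out numbers divisible by the square of a prime.
def atkin_primes_up_to(h):
    flags = [False] * (h + 1)
    x = 1
    while x * x <= h:
        y = 1
        while y * y <= h:
            n = 4 * x * x + y * y
            if n <= h and n % 12 in (1, 5):
                flags[n] = not flags[n]
            n = 3 * x * x + y * y
            if n <= h and n % 12 == 7:
                flags[n] = not flags[n]
            n = 3 * x * x - y * y
            if x > y and n <= h and n % 12 == 11:
                flags[n] = not flags[n]
            y += 1
        x += 1
    r = 5
    while r * r <= h:                       # remove non-squarefree survivors
        if flags[r]:
            for k in range(r * r, h + 1, r * r):
                flags[k] = False
        r += 1
    return [p for p in (2, 3) if p <= h] + [n for n in range(5, h + 1) if flags[n]]

print(atkin_primes_up_to(60))   # prints the primes up to 60
```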
Now, this is better than the Sieve of Eratosthenes. There are fewer values of these
forms than there are of y squared minus x squared because there are fewer elements of
these number fields than there are of q times q. If you take q adjoin -- what people
sometimes call q adjoin square root of 1, q adjoin t mod t squared minus 1. That's q
adjoin square root of 1. It's not a number field, it's a product of two number fields. Its
zeta function is a product of zeta functions, so it has a double pole at 1. You've got more
ideals, you've got more elements of this -- well, of a product of
two number fields -- than you do of an actual authentic number field like the ones that are
showing up in the Sieve of Atkin, q adjoin square root of 3, square root of minus 3 or
whatever square root you want to put in except for square root of a square like square
root of 1.
Now, as a result of this, if you ask how long does it take to write down all the values of x
squared plus y squared, say, it's just less time than writing down all the values of y
squared minus x squared.
I have a parenthetical note that I don't know the answer to, namely, can you do
something similar enumerating points on elliptic curves. I heard him mentioning this
and said, well, that's funny. That actually answers an open question in prime
enumeration -- you'd be surprised how many papers there actually are on this topic --
namely, can you enumerate primes in what seems to be the best time possible: not quite
h over log h time for all the primes up to h, but h over log log h is the best anybody's
been able to do. That's the number of additions of numbers that are up to about h or h
squared or so. Can you enumerate primes with that minimum amount of time using a lot
less than h space? It was previously known how to do this enumeration using
something like h or h over log h space, but can you do it, some previous papers asked,
in only, say, square root of h space? That was previously known to be doable only
with much, much more time, like h times log log h.
And, well, Atkin's sieve immediately answers that question. And so we wrote a little
paper, and then Will Galway came along and said actually you can push the same kinds of
techniques further and allow a better space reduction, down to h to the one-third, and you
should be able to get this down to h to the one fourth.
But more recently I've been looking back at this and wondering is this actually a
sensible kind of optimization to do? This is saying -- it's saying go for the absolute
minimum time, paying attention to, like, log log h factors, not willing to do h or h times
log h or any such thing and then asking can we make these huge memory reductions,
but still not willing to compromise at all in the amount of time.
I don't actually think this is a meaningful gain. I think that meaningful gains are
ones you get on current state-of-the-art graphics cards like the Radeon 5970 graphics
card which has 3200 parallel multipliers, all running at 725 megahertz. It can do 2.3
times 10 to the 12th multiplications per second, draws 300 watts, costs about $600.
This is a picture of something running at even higher speed doing more multiplications
per second but needs more cooling, and you have to make sure to plug it into an even
better power supply.
Now, this is the future of computation. This is the fastest computational machinery you can buy today. Considering its price, it's the fastest -- it's the best price-performance ratio you can get for computation, by far. And it's not what we're optimizing for. If you think
about the 3,000 parallel multipliers here and you try to put your typical number theoretic
algorithm onto this graphics card, then you see it's actually incredibly slow because
those 3,000 parallel multipliers, they can all operate at once with very small amounts of
memory. They can't talk to huge amounts of memory, they can't sieve very quickly. If
you tell them oh, we're going to have a large amount of memory and just access that,
it's incredibly slow. And physically it wouldn't make sense for it to be faster.
So to take advantage of this, we should be willing to trade some time for reducing
memory consumption much, much further. And it's actually --

Yeah, go ahead.
>>: [inaudible].
>>Daniel J. Bernstein: Yeah.
>>: [inaudible].
>>Daniel J. Bernstein: Yeah, this is not counting the printer's paper. So the bits of memory are --

Sorry?
>>: [inaudible].
>>Daniel J. Bernstein: Yeah, you keep spitting out of this machine the primes in order:
2, 3, 5.
Somebody else might write them down --

>>: [inaudible].
>>Daniel J. Bernstein: Sorry. Say again.
>>: [inaudible].
>>Daniel J. Bernstein: Yeah, something like h bits. This is -- the operations are h over
log log h operations, each of which is working on integers of log h bits. So the total
number of bit operations is h log h divided by log log h.
Now, you're talking about a much, much smaller number, namely, h, of bits that you
have to print. It's an important question -- yeah, it's important that these are operations
on -- you can count bit operations as well. It takes just a bit more work. Okay.
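Spelling that count out:

\[
\frac{h}{\log\log h}\ \text{operations} \ \times\ \log h\ \text{bits per integer} \ =\ \frac{h\log h}{\log\log h}\ \text{bit operations},
\qquad
\text{output} \approx \pi(h)\cdot\log h = \Theta(h)\ \text{bits}.
\]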
All right. So back to here. And this is actually my last slide.
If we want to optimize number theoretic algorithms for real computers, then we really
have to reduce the amount of memory we're using and even do that if it means
increasing the time somewhat. A great example of this is a paper by John Sorenson a
few years ago on the pseudosquares prime sieve which always prints the primes 1
through h in order and is conjectured to take, well, h times log h operations on the log h
bit integers and uses only log h squared bits of memory. And, okay, nobody's put it onto
this machine yet, but it will run much, much, much faster on modern computer
architecture and computers of the future than any of the other algorithms that I've
mentioned possibly will.
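To illustrate the shape of that trade in the simplest possible way -- this is not Sorenson's pseudosquares sieve, just a toy sketch that spends a primality test on every candidate instead of sieving, so the working memory is a handful of machine words; the Miller-Rabin bases are a standard deterministic choice for n below about 3.4 times 10 to the 14, not anything from the talk:

```python
def is_probable_prime(n, bases=(2, 3, 5, 7, 11, 13, 17)):
    """Miller-Rabin; with these bases it is deterministic for n < 3.4 * 10**14."""
    if n < 2:
        return False
    for p in bases:                      # trial division by the base primes
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:                    # write n - 1 = d * 2**s with d odd
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # a witnesses that n is composite
    return True

def primes_low_memory(h):
    """Yield 2, 3, 5, ... up to h in order; memory stays O(1) integers,
    at the cost of one primality test per candidate instead of a sieve."""
    for n in range(2, h + 1):
        if is_probable_prime(n):
            yield n

if __name__ == "__main__":
    print(list(primes_low_memory(50)))   # [2, 3, 5, 7, 11, ..., 47]
```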
I think it might be possible to improve this a little bit, maybe get rid of almost a log h factor, using elliptic curve primality tests, using class polynomials and so on, putting together everything that I've said. So maybe the problem of enumerating primes is not actually as simple as I once thought.
In any case, even if this is the best possible, I think Oliver really would have enjoyed
playing with these computers, and I hope that you will enjoy in the future playing with
them too.
Thanks for your attention.
[applause]
>>: Are there any questions? Dan's blown you away, hasn't he?
>>: I just wanted to hear conjecture on putting primality the log n squared [inaudible]
are there any quadratic [inaudible] lurking in that prediction?
>>Daniel J. Bernstein: There are tons of conjectures that are -- I think that would be
harder to prove than that. And, yeah, I guess that in particular is one of the pieces that's
needed. So it's certainly relying on being able to construct all sorts of stuff that we have
no way to prove can actually be constructed. So I think you might actually be one of the
culprits in conjecturing that the best is log n cubed, and this log n squared relies on
many, many more conjectures than would go into previous stuff. It relies on quite a bit
of stuff that's way beyond what anybody can prove, but nevertheless I believe this is
correct. I think I could even write down an explicit algorithm which I conjecture to
reliably determine the primality of n and which takes this amount of time. But, of course,
it's way beyond current technology.
>>: [inaudible].
>>Daniel J. Bernstein: The actual run time -- that's the easy part to analyze. The actual
run time is log n squared times a certain lower order. The hard part is convincing
somebody that it actually reliably determines the primality. So that's what Oliver went on about for 10 pages in his paper on the linear, quadratic and cubic tests being put together, and then when you try to put together more and more tests, doing the analysis of when they should first fail is actually -- again, I think I've done a reasonable
first job, but I wouldn't be surprised if I'm off by some noticeable factor, even log
squared log n, I would believe. But I think that the final t, the number of tests that I
need, is much, much smaller than log n, so log n to the epsilon.
>>: Any other questions?
>>: I have a question. I'm not sure if I should be asking one of the other speakers
[inaudible] on that machine that you showed us if anyone does highly parallelized
[inaudible].
>>Daniel J. Bernstein: So the question was whether anybody's tried a parallel pairing
on a machine with 3,000 multipliers, and I'm trying to get forward to the picture of it. I
have no idea how to actually use this PDF viewer. Flip, flip, flip. This is one of the
things that reminds me that I have too many slides.
I guess it doesn't help answer your question to show the machine. I just think it's a
really -- it's really, really cool, I think -- I've never actually seen this in operation, but all
the fans are supposed to be going [inaudible].
>>: You don't actually have one?
>>Daniel J. Bernstein: I have one of the lower-clocked ones. I don't have -- this is one of the more expensive limited editions -- it's so cool. I mean, [laughter] they only produced like a thousand of them, and they run at something like 900 megahertz instead of 725 megahertz and --

>>: It's only cool if you have enough fans.
>>Daniel J. Bernstein: Yeah. To answer the question, I --

>>: [inaudible].
>>Daniel J. Bernstein: I do believe that there is a group that has started looking at this
question. I don't know if they're public about it, and I don't know if they're happy to
cooperate with other people about it, but speaking for myself, I'm mostly looking at much
simpler things. I see all the interesting activity in pairings, and it's very cool to watch
and interesting to see how fast things are going. And, yeah, if you believe, as I do, that
these are the computers of the future, then it makes perfect sense to optimize pairings
for these. But if you're looking for people who might have done work in this direction and
would be willing to talk about it, then you might have to look around for whether they're
willing to say anything.
>>: What are the multipliers?
>>Daniel J. Bernstein: They're single-precision floating point, approximately IEEE.
>>: [inaudible].
>>Daniel J. Bernstein: About 24-bit [inaudible].
>>: Does that machine double as also a cook top range [laughter]?
>>Daniel J. Bernstein: Sorry? Double as?
>>: Could you cook on that machine [laughter]?
>>Daniel J. Bernstein: I assume you could, yeah. You just have to take the fans off.
>>: So if you can flip back to the graph you had [laughter] --

>>Daniel J. Bernstein: This is what I get for promising a series of mini talks.
>>: I was just curious. So I guess the Edwards curves with lesser torsion, you expect
fewer bits because the torsion is -- it's not as large, but you're being more efficient
because the divisions are faster, and it averages out that you're doing better. But how
much -- how do the effects cancel? I mean, how much worse is the torsion factor and
then how much better is the [inaudible].
>>Daniel J. Bernstein: Okay. So this is what I was trying to address. So what you said
is exactly what we expected at the beginning, namely, that these curves would be faster
but find fewer primes. And the actual effect is that they're faster and find more primes.
Each curve --

>>: But you're counting [inaudible].
>>Daniel J. Bernstein: That's true. So this is the combined effect. So out of this, some
part of the distance here is the speedup which is some percentage -- I mean the gap
between, like, the 7250 or so and the 7500 -- I don't know offhand what the answer is
for this. This is showing the overall effect of switching to the -- from x squared plus y
squared equals 1 plus dx squared y squared with torsion say z mod 12, this is -- these
guys, I think this one and this one, is minus x squared plus y squared equals 1 plus dx
squared y squared in a particular family with torsion z mod 6. Part of the gap here -- so
I do know for each of these that it's not a cancellation. So for each of these it is an
improvement in the number of primes found and an improvement in [inaudible]. So
each curve is running faster than the previous curve. You can go to the paper. It's online -- it appeared at [inaudible] just recently -- and you'll get tables which say what
the number of primes actually found is in each of these cases. That's not what this
graph is showing, so you might think that, yeah, the graph is saying oh, well there's a
speedup, but then a loss in the -- from the primes found. But, no, it's actually there's
some speedup and there's an improvement in the number of primes found. So the surprising part of looking at this was that these curves are actually better, for reasons that are still not known.
You might think -- if you start looking at what happens with the 2 torsion and 4 torsion
over appropriate quadratic extensions, then you can argue why it should be as good to
do these as the previous z mod 12s. But that still doesn't say why they're better. And
they really are better. It's just -- it's a certain percentage improvement, not a huge
improvement, but it's quite a surprise. Certainly something that should be explored.
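The group law in question, as a minimal affine sketch (both families mentioned here are twisted Edwards curves a x squared plus y squared equals 1 plus d x squared y squared, with a = 1 for the z mod 12 family and a = -1 for the z mod 6 family; the modulus, constants, and base point below are placeholders, and real EECM code works in projective coordinates to avoid the inversions):

```python
# Affine twisted Edwards arithmetic a*x^2 + y^2 = 1 + d*x^2*y^2 mod n.
# In ECM the interesting event is an inversion that fails mod n: the gcd of
# the denominator with n then reveals a factor (not handled in this sketch).

def edwards_add(P, Q, a, d, n):
    x1, y1 = P
    x2, y2 = Q
    t = d * x1 * x2 * y1 * y2 % n
    x3 = (x1 * y2 + y1 * x2) * pow(1 + t, -1, n) % n   # pow(., -1, n): Python 3.8+
    y3 = (y1 * y2 - a * x1 * x2) * pow(1 - t, -1, n) % n
    return (x3, y3)

def scalar_mul(k, P, a, d, n):
    """Double-and-add; ECM stage 1 multiplies the starting point by a number
    with many small prime factors, e.g. lcm(1, ..., B1)."""
    R = (0, 1)                    # neutral element of the Edwards addition law
    while k:
        if k & 1:
            R = edwards_add(R, P, a, d, n)
        P = edwards_add(P, P, a, d, n)
        k >>= 1
    return R
```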
>>: A small addition to this screen, the 2 times 4 and the 8 are [inaudible].
>>Daniel J. Bernstein: Yeah, that's right. So up at the top, the -- so these are, you can
see again, multiple levels for multiple different families that we were actually trying.
Except for some sporadic curves that do substantially better, most of the curves in the 2
times 4 -- excuse me, most of the curves in the 2 times 4 and 8 families are worse in the
number of primes found, I mean much, much worse than the z mod 12 or z mod 2 times
z mod.
In z mod 6 some interesting things are happening, some of which we understand, and those could, again, explain why this gets to be a little better than the previous ones, just by being faster and about as effective; but why they're more effective, no idea.
>>: [inaudible].
>>Daniel J. Bernstein: The x axis is an inverse error function distribution. So if you
want to turn a normal distribution into a straight line, then you use this distribution.
>>: [inaudible].
>>Daniel J. Bernstein: Oh, sorry. This is 1,000 curves sorted in performance order,
and the scale is chosen as inverse so that if it were a normal distribution, you would get
a straight line.
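A sketch of the plotting convention being described (not the authors' actual code; the normal quantile used below differs from the inverse error function only by rescaling, and the random data is a stand-in for the per-curve measurements):

```python
# Sort one measurement per curve, then place the i-th of N points at the
# normal quantile of (i + 0.5)/N, so normally distributed data falls on a line.
import random
from statistics import NormalDist
import matplotlib.pyplot as plt

measurements = sorted(random.gauss(0.0, 1.0) for _ in range(1000))  # placeholder data
N = len(measurements)
quantiles = [NormalDist().inv_cdf((i + 0.5) / N) for i in range(N)]

plt.plot(quantiles, measurements, ".")
plt.xlabel("normal quantile (inverse-error-function scale)")
plt.ylabel("per-curve measurement (placeholder)")
plt.show()
```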
>>: And then you said that [inaudible].
>>Daniel J. Bernstein: [inaudible].
>>: The best price performance ratio currently, does this include the cost of powering
the unit to run it?
>>Daniel J. Bernstein: That's a good question, yeah, how much do you pay for the
power of these things. I think given the amount of computation you're getting out, even
if the power ends up doubling your budget, then it's certainly worth it. In fact, depending
where you live, the power can cost quite a bit. So in the typical, say, five-year lifetime of
a machine -- these things don't have five-year warranties -- but if you imagine this going
for five years, then you could easily spend as much on power as you do on actually
buying equipment. If you buy two of these along with a regular CPU and disc and so on
in a case and you end up spending, say, $2,000 on a PC with 4.6 times 10 to the 12th
multiplications per second, then you could easily spend $2,000 on power. It depends on where you are and how much your power bill is. Of course, with solar energy, in principle you should be able to power something like this -- sustainably -- without very much area, because the sun is delivering a huge amount of energy per square meter of the earth's surface. Think green [laughter].
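A rough back-of-the-envelope version of that estimate, where the 700-watt total draw and the 10-cents-per-kilowatt-hour rate are assumed round numbers rather than figures from the talk:

\[
0.7\ \text{kW} \times 24 \times 365 \times 5\ \text{hours} \approx 30{,}700\ \text{kWh},
\qquad
30{,}700\ \text{kWh} \times \$0.10/\text{kWh} \approx \$3{,}000,
\]

which is the same order of magnitude as the roughly \$2{,}000 spent on the hardware itself.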
>>: [inaudible].
>>: Can you come back to the picture again [laughter]?
>>: [inaudible].
>>Daniel J. Bernstein: No. There are some sporadic curves. So actually -- okay,
something we were expecting is that there would be, of course, random variation
between curves. So if you're willing to say, okay, actually I'm going to maliciously
think -- you know, all the numbers I care about factoring, I care about finding, say, 25-bit
primes. I'm going to precompute curves which are really good at finding 25-bit primes,
and so you would expect some random variation, and that's what you see the straight
lines for.
Now, when you see deviations, some of them are -- here there are actually two different families which are separated by this type of experiment. But some of them are curves
which are sporadic curves which are better. And now we know that those curves are
better because the same curves are showing up as much, much better -- well, first of all,
it's kind of implausible that it would be like this, that you would have such a jump, for
instance, for the green curve randomly. But beyond that, the same curves are good for
26-bit primes, 25-bit primes, 24-bit primes and so on. And that's something that can't happen at random.
>>: [inaudible].
>>Daniel J. Bernstein: Well, the right edge -- those are the slow ones. Those are the
ones to throw away. The interesting ones are the fast ones over on the left here. And
these are good families. This is a z mod 6, not so good family, although still better than
anything we can do with 12 or 2 times 8. So there are multiple families which are
stratified somehow.
It's possible that that is something easily explainable from looking at torsion over small
extensions of q, but then the sporadic curves here don't -- we've looked at some of
those and don't have any idea what's making them so fast. And in general the z mod 6,
the red line down at the bottom, no idea what's making it so fast.
>>: [inaudible].
>>Daniel J. Bernstein: Yeah, the sporadic ones are great. Something like this is almost
as good as the worst z mod 6 ones. And this is not a bad curve to use. Just by
randomly looking around and seeing which curves are the best ones, some of them are
surprisingly good.
>>: Any more questions? Any more reluctant questions?
Okay. Well, let's thank Dan again and --

[applause]