>> Kristin Lauter: Okay. So today we're very pleased to have Neal Koblitz from University of Washington visiting and speaking on special versus random curves. Neal Koblitz is the founder of elliptic curve cryptography and the author of several books, including a book on cryptography called Introduction to Number Theory and Cryptography. And he has visited us several times here at Microsoft Research, but not for quite a while. So we're very pleased to have him. Thanks. >> Neal Koblitz: Thank you, Kristin, and thank you for the invitation. I'm very glad to be here. This talk is based on a paper, a much longer paper written jointly with Alfred Menezes and with my wife, Ann Hibner Koblitz. That's posted on the e-Print server, if you're interested. And I want to apologize in advance for clumsiness in my use of PowerPoint. This is only the second PowerPoint talk that I've ever given. The first one was a disaster, so... Okay. So the conventional wisdom in cryptography is that you get greater security if you choose any parameters that are at your disposal as randomly as possible, so that in particular in elliptic and hyperelliptic curve cryptography the safest option is to choose the defining equation to have random coefficients. Now, it doesn't mean you can't use special curves, special choice of parameters often improve efficiency. And if you're only interested in short-term security or if you're not particularly paranoic about security issues, that's fine. But someday that choice might be one that you regret. At least that's the conventional wisdom. Now, so, for example, when I proposed hyperelliptic cryptography in the late '80s, one could justify the choice of hyperelliptic curves over -- as opposed to elliptic curves by thinking in terms of this conventional wisdom. So the idea is that there's a new parameter; namely, the genus of the curve. An elliptic curve has genus 1. We can vary G instead of fixing it to be 1. 
So -- and I thought that the higher the genus the more complicated an object you're working with. The Jacobian group of the hyperelliptic curve is a more complicated object than an elliptic curve. And you might just choose G randomly, like it'd be a random prime number. So here in a paper I wrote a couple years later, I chose a random prime; namely, 191, which has no special properties. And I gave a specific curve that was easy to compute the number of points on because I didn't have any point counting techniques, so I just chose a very simple curve, and its Jacobian group has three times the prime number of points, and I said this might be good for cryptography. At the time I was also thinking that let's take, say, G equals 191 over the field of two elements, which would result in a group of approximately the size you want. We could choose maybe even random coefficients and it's just a lot better, I thought. Let's say that would give you 382 random coefficients rather than two random coefficients that you have with elliptic curves. Well, my fallacies at the time -- well, first of all, a conceptual complexity is not the same thing as computational complexity, because that was the first elementary fallacy. When you read books about -- introductory books about -- introductory books about algebraic geometry and curves, they talk about G as a measure of the complexity of the curve. But it turns out that even a random genus G curve over the field of two elements is not helped from the standpoint of security by all the hundreds of random coefficients you can have. So there were basically two fallacies: one, thinking that having a large number of randomly chosen parameters would help security, and the other was the fallacy of thinking that a more complicated object would help -- a more complex object would mean greater complexity of the computational problem. And both of these were basically rookie mistakes that I made at the time in thinking these things. 
In fact, thinking that the more random parameters you have to play with the greater the security is almost as silly a mistake as the classic beginner's mistake of thinking that if you have a very large key space you're safe. It's on that level. Well, it's well known what happened. In 1994 Adleman, DeMarrais and Huang showed that high genus curves over small fields are insecure, and this was a big disappointment to me and a big shock at the time; although, it didn't mean that hyperelliptic curves were totally useless. >>: If I get it right, low genus curves [INAUDIBLE] are insecure trivially, because that's not enough points. >> Neal Koblitz: Yes. Yes. Yeah. In this -- you're going to want a group of a certain size. The size is essentially fixed, or the bit length of the size of the group is basically fixed, yeah. Okay. So right now it's known that basically the only genus that's a serious contender for being as secure as elliptic curves is genus 2. Even for genus 3 [inaudible] the discrete log problem can be broken. The discrete log problem of course is the fundamental hard problem that's at the heart of all these systems. And if you compare the discrete log problem on a genus G curve for a fixed bit length of the group size, you need larger groups for the same level of security if the genus is 3 or greater. It's only genus 2 that remains as a serious contender. Notice that one of my fallacies came from the fact that the same word can have a common real-world meaning and a special cryptography-type meaning. In that case it was the word complexity, which means something in a conceptual sense, but it means something different in a computational sense. And the same goes for another word that I want to talk about today, which is special as opposed to random. That also is a word that has to be used with care. And let me give a recent example that shows some of the difficulty of this word. 
Now, from a mathematical standpoint, for genus at least 3 the moduli space of genus G curves has dimension 3G minus 3 whereas the subspace or submanifold consisting of the hyperelliptic curves is much smaller, it has codimension G minus 2. That is, if you choose a curve randomly over the field of Q elements, it has a 1 out of Q to the G minus 2 chance of being hyperelliptic. So in terms of the dimension of the space of these curves, the hyperelliptic curves are a very special subset of all curves. So conceptually the hyperelliptic curves are special and the nonhyperelliptic curves are the generic ones, in the sense that a random curve is almost certain to be nonhyperelliptic. Yet, a few years ago Diem and Thomé found an index calculus attack on the discrete log problem in the Jacobian group of a genus 3 nonhyperelliptic curve with running time of order Q, that is, bounded by Q to the 1 plus epsilon. Now compare that with the generic discrete log algorithms, the so-called square root algorithms. If you have a genus 3 curve over the field of Q elements, the Jacobian group will have order about Q cubed, so a square root algorithm takes time Q to the three-halves. And so this is like a cube root attack. Now, in the case of hyperelliptic curves, the best one can do, the fastest attack on the discrete log problem is Q to the four-thirds plus epsilon, which is slower. So in the case of the nonhyperelliptic curves, there's a cube root attack, that is, an attack whose running time is of order the cube root of the group order, whereas in the hyperelliptic case, it's a four-ninths power attack. Now, as I said, once you get up to genus 3, you can do better than the square root attack. That's why genus 3 is not fully competitive with elliptic curves: for a well-chosen elliptic curve, there's nothing better than a square root attack. In the case of a hyperelliptic genus 3 curve, there's a four-ninths power attack on the discrete log problem. 
And in the case of a nonhyperelliptic curve, there's a still better attack, a cube root attack. So the hyperelliptic algorithm is better than the square root algorithm, but the nonhyperelliptic algorithm is still better. Oh, and I should say that there is a conceptual reason for it. What's at the heart of this discrepancy is that a nonhyperelliptic genus 3 curve can be represented as a smooth plane curve of degree 4, whereas a hyperelliptic curve cannot be so represented. And it was the possibility of representing a curve in that particular way that led to the Diem-Thomé algorithm. Diem was able to generalize this to a very large class, so-called sufficiently general nonhyperelliptic curves of any genus, and he found a running time of Q to the power 2 minus 2 divided by G minus 1, plus epsilon, for nonhyperelliptic curves of arbitrary genus at least 3. And this should be compared with the best general algorithm for the hyperelliptic case, which has a slower running time because you have G in the denominator rather than G minus 1. So what happened in genus 3 also occurs in higher genus as well. So what that means is that in terms of the complexity of the attack on the discrete log problem, a G dimensional nonhyperelliptic group has the same complexity as a G minus 1 dimensional hyperelliptic group. So you'd have to go to one higher dimension to get the same level of complexity in your problem if you're using nonhyperelliptic curves. So with hyperelliptic curves, you can achieve the same level of security with one lower genus, meaning one lower dimension of the group, which is a big difference in its size, Q to the G versus Q to the G minus 1. Okay. So the conclusion is that, given the present state-of-the-art algorithms for genus at least 3, a random genus G curve is less secure than a genus G hyperelliptic curve. So okay. So now I want to go to another issue or another part of my talk about the whole question of special versus random. 
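
The running times quoted above can be tabulated with a short sketch. It assumes the spoken formulas read as q^(2 - 2/g) for the hyperelliptic case and q^(2 - 2/(g-1)) for the nonhyperelliptic case, which matches the genus 3 figures in the talk (four-thirds and 1); the epsilon terms are dropped, and the function names are illustrative only.

```python
from fractions import Fraction


def attack_exponents(g):
    """Exponents e such that an attack costs roughly q**e on a genus-g Jacobian.

    Returns (generic, hyperelliptic, nonhyperelliptic) for g >= 3.
    The epsilon terms mentioned in the talk are ignored.
    """
    if g < 3:
        raise ValueError("these index calculus estimates are quoted for genus >= 3")
    generic = Fraction(g, 2)                   # square root of a group of order ~ q**g
    hyperelliptic = 2 - Fraction(2, g)         # q**(4/3) when g = 3
    nonhyperelliptic = 2 - Fraction(2, g - 1)  # q**1 when g = 3
    return generic, hyperelliptic, nonhyperelliptic


def as_power_of_group_order(exponent, g):
    """Rewrite q**exponent as (group order)**r, using group order ~ q**g."""
    return exponent / g


if __name__ == "__main__":
    for g in (3, 4, 5, 6):
        generic, hyper, nonhyper = attack_exponents(g)
        print(g, generic, hyper, nonhyper,
              as_power_of_group_order(hyper, g),
              as_power_of_group_order(nonhyper, g))
```

Under these assumptions the nonhyperelliptic exponent for genus g equals the hyperelliptic exponent for genus g - 1, which is the "one lower dimension" comparison made in the talk.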
So this is sort of interesting from I guess a sociological standpoint because on this issue of preferring special versus random curves, you see a sort of national division between NSA and the German equivalent, BSI. Now, NSA has a long history of supporting the use of special curves in elliptic curve cryptography. In fact, the very first public presentation at a Crypto conference by an NSA person was Jerry Solinas's paper at Crypto '97 on anomalous binary curves, which have their equation defined over the field of two elements. They're basically the ordinary curves over the field of two elements. And his paper was devoted to very efficient computations on those curves that improve the efficiency of crypto systems based on those curves. In addition, NIST, which is essentially NSA, has implicitly endorsed pairing-based cryptography. And they organized a workshop on it in June. At that workshop a company, Voltage, presented a choice of curve for pairing-based cryptography; namely, the curve Y squared equals X cubed plus B over a prime field where 12 divides P plus 1, which among other things means that it's supersingular. And this is a very, very special curve. It has all sorts of special properties. So that's another example. It's not exactly NSA pushing this, but NIST has certainly been very cooperative with pairing-based cryptography. Meanwhile -- yeah. >>: Do you happen to know if Voltage has any patents on speeding up that particular curve? >> Neal Koblitz: They might. I don't know what the patent situation is, though. Now, there's a European consortium called Brainpool that's led essentially by BSI that has made some very different recommendations that I think provide an amusing contrast with the role that NSA and NIST have been playing. 
First of all, according to their draft recommendations, an elliptic curve over a finite field can be lifted to a curve with complex multiplication over a number field, and they insist that that number field must have degree greater than 10 million. Voltage's curve, for example, lifts to a curve over the rational numbers, which has degree 1. So it very much violates this, to say the least. That also precludes the anomalous binary curves, which have complex multiplication by a class number 1 number field. >>: [inaudible] >> Neal Koblitz: Why? Well, presumably to prevent the possibility that someday someone will find a way to use the theory of global elliptic curves over those number fields. They want the number field to be big enough so that computations are not feasible there in case someone finds a way to use computations there in some sort of attack. So it's based on speculation related to possibilities for future attacks. Secondly, they have a requirement on the embedding degree. By the embedding degree in an elliptic curve system one means the smallest degree of an extension of the finite field into which the elliptic curve group can be embedded. So you can embed an elliptic curve group into the multiplicative group of a finite field, but you have to go to a field extension to do that. And if your parameters are random, that extension degree will be astronomically big. And that's what they insist on. In fact, they insist that the embedding degree, that's the degree of the field extension, should be greater than Q minus 1 over 100. >>: [inaudible] >> Neal Koblitz: Pardon? >>: [inaudible] >> Neal Koblitz: Their requirement? Their reason for doing this? Well, certainly avoiding very small embedding degree has a reason connected with the discrete log problem in the finite field. Why it should be that big, I mean, you really have to ask them. 
So in particular, this immediately precludes all pairing-based cryptography. Now, they really want to err on the side of caution because, for example, they're saying that if you use an elliptic curve that embeds in a finite field extension of degree K equal to Q minus 1 over 1,000, that's too risky. So that would mean, if Q presumably has about 160 bits, that a K with 150 bits is too risky. >>: If I remember [inaudible] NSA and NIST [inaudible] and in two years 160 [inaudible]. >> Neal Koblitz: Okay, okay, this is just, you know, just to give an idea. But let's say that the field size has only 160 bits; in that case we're talking about K of 150 bits. And you certainly wouldn't have it any less than that. So this K that's excluded by Brainpool would have at least 150 bits. Now, the fastest algorithm for the discrete log problem in that field has a running time of about 10 to the 400 trillion operations. But still that's not safe enough, according to Brainpool. Well, in practice what they're really doing is they're insisting on random curves. >> Kristin Lauter: So do you think they know any better algorithms for curves with [inaudible] discriminant field? >> Neal Koblitz: I don't think so, but I'm not privy to what they might know. But I don't think so. So in practice what they're really doing is saying we have to use random curves. So they won't allow you to use the CM method, anomalous binary curves, supersingular curves, or pairing-based cryptography. Now, that's an incredible contrast, you know, between the two. And at least I found this surprising when I realized how extreme this was. Now, as one theory, some might be tempted to talk about German versus American national traits. 
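
The embedding degree described above can be computed directly for toy parameters. This is a minimal sketch, not Brainpool's procedure: it assumes the usual formulation that the embedding degree is the smallest k with q^k congruent to 1 modulo the subgroup order n, and the function name and the `max_k` cutoff are illustrative.

```python
import math


def embedding_degree(q, n, max_k=10**6):
    """Smallest k >= 1 with q**k == 1 (mod n), or None if none is found up to max_k.

    q is the field size and n > 1 is the order of the subgroup used for
    cryptography; n must be coprime to q.
    """
    if n <= 1 or math.gcd(q, n) != 1:
        raise ValueError("need n > 1 and gcd(q, n) == 1")
    power = 1
    for k in range(1, max_k + 1):
        power = (power * q) % n
        if power == 1:
            return k
    return None


if __name__ == "__main__":
    # Toy supersingular case: p = 11, group order p + 1 = 12, subgroup of order 3.
    print(embedding_degree(11, 3))
    # Toy case with no special structure: the order of 7 modulo 13.
    print(embedding_degree(7, 13))
```

For a supersingular curve with p + 1 points over a prime field the answer is 2, which is what makes pairings practical; for random parameters k is typically close to n itself, which is the situation Brainpool's bound enforces.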
For example, my friend Johannes Buchmann, who I was visiting many years ago in Germany, told me that in Germany there's a law saying that motorists must carry some rubber gloves in their car. And I couldn't figure out why. He explained why they have this law. It's because there's a Good Samaritan law in Germany which means that if you see an accident, you're required to stop and help the injured parties. But there's always a chance that someone who's bleeding might be HIV positive, and in that case you still have to be a Good Samaritan, but you're required to have these rubber gloves that you can use to handle this. And this to me sort of epitomized a very cautious attitude towards life, to have this requirement, whereas in contrast the American stereotype is that Americans are very happy to indulge in high-stakes, very risky gambling. So that's a tempting explanation, but I think it's a bogus one. It's up in the air why it is that Germany went in one direction and the NSA in another. And in our paper, we're completely agnostic on the question of who's right about this. We're not claiming that NSA is being reckless and risky, or NIST or Voltage, and we're not claiming that Brainpool is being ridiculously overcautious. An argument can be made either way. But, now, the irony of this is that it's not really a simple issue of whether you want to be extra cautious or whether you want to be a little bit reckless, or what Brainpool would consider to be reckless. It's not really a clear-cut issue because one can imagine scenarios where Brainpool's approach might not be the safer one even though they're insisting on random curves. So there are various scenarios in which someone, and I'll call her Alice, who chooses ECC with a special curve might end up better off than someone else, who I'll call Bob, who chooses a random curve. 
So these scenarios are suggested by recent work on isogenies. So let me just quickly summarize that. But some people -- well, Venki [phonetic] I know is an expert on this more than I am, and Kristin's done work on this too, so I shouldn't be the one talking about isogenies, but I'll quickly go over the basics of what an isogeny is between two curves. So we have two elliptic curves defined over the field of Q elements. An isogeny defined over FQ is a nonconstant rational map defined over FQ that takes the point at infinity to the point at infinity. Its degree is its degree as a rational map, which in the case we'll be considering is also the order of the kernel of the isogeny. For any isogeny there's a dual isogeny going the other way, and so being isogenous is an equivalence relation between elliptic curves, and an isogeny class is a larger class than an isomorphism class. Then there's a basic theorem by Tate related to curves over a finite field, that they're isogenous over the finite field if and only if they have the same number of points over that finite field. Now, from a computational standpoint it turns out that low-degree isogenies are easy to construct but high-degree isogenies are usually not, especially if you don't have an explicit form of the isogenous curve; that is, if you're just given a curve and you want to construct an isogeny of some large prime degree, that's a very hard computational problem. Now, I also have to talk about endomorphisms here. So if we have an elliptic curve over the field of Q elements, the trace T is the difference between Q plus 1 and the number of points. An endomorphism is an isogeny from the curve to itself that's defined over the algebraic closure. We'll be considering the nonsupersingular, or ordinary, curves E, which means that T is prime to the characteristic of the field. 
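
Tate's theorem makes isogeny classes easy to see on a toy example: grouping curves by their number of points is the same as grouping them by isogeny class. The sketch below is illustrative only and makes assumptions the talk does not: it uses a small odd prime p > 3 with curves in the form y^2 = x^3 + ax + b (the talk's own examples are in characteristic 2), and it counts points by brute force.

```python
from collections import defaultdict


def count_points(a, b, p):
    """Number of points on y^2 = x^3 + a*x + b over F_p, including infinity."""
    # square_roots[v] = how many y in F_p satisfy y*y == v
    square_roots = defaultdict(int)
    for y in range(p):
        square_roots[y * y % p] += 1
    total = 1  # the point at infinity
    for x in range(p):
        total += square_roots[(x ** 3 + a * x + b) % p]
    return total


def isogeny_classes(p):
    """Group all nonsingular curves y^2 = x^3 + a*x + b over F_p by point count."""
    classes = defaultdict(list)
    for a in range(p):
        for b in range(p):
            if (4 * a ** 3 + 27 * b ** 2) % p == 0:
                continue  # singular, not an elliptic curve
            classes[count_points(a, b, p)].append((a, b))
    return classes


if __name__ == "__main__":
    p = 7
    for n, curves in sorted(isogeny_classes(p).items()):
        trace = p + 1 - n
        print(f"points={n:2d} trace={trace:+d} curves={len(curves)}")
```

Each printed row is one isogeny class over F_7, labeled by its trace; every trace satisfies the Hasse bound, so its square is at most 4p.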
In that case, all endomorphisms defined over the algebraic closure are actually defined over the field of definition. That's the case we'll be considering. So in the case of ordinary elliptic curves, which is the usual case, the endomorphisms are all defined over the field of definition itself. Now, the endomorphisms form a ring that contains the subring of the obvious endomorphisms of scalar multiplication. Now, delta, the discriminant of a curve, is the square of the trace minus 4Q, which is a negative number. And the CM field is a quadratic imaginary field generated by the square root of the discriminant. Now, if D stands for the discriminant of the field, so D is a fundamental discriminant, the discriminant of a quadratic imaginary field, then delta in general is equal to D multiplied by some square, C0 squared. And that square plays a crucial role in classifying the possible endomorphism rings that E could have. Well, it turns out that the endomorphism ring of E is an order in the ring of integers of that quadratic imaginary field. So this is all part of the sort of basic theory of complex multiplication. But it's not necessarily the full ring of integers. In some cases it is, in many cases it is. But in general it will be an order in the ring of integers of this quadratic imaginary field of a certain index C, which is called the conductor of the endomorphism ring, not to be confused with other meanings of the word conductor. So the conductor of the endomorphism ring of an elliptic curve tells you its index in the largest possible endomorphism ring. So if you take all elliptic curves that are isogenous to the given elliptic curve, they can be partitioned according to their endomorphism ring. 
Namely, the endomorphism rings are determined by the conductor C, and the possible conductors are in 1-to-1 correspondence with the divisors of that factor that's being squared in the discriminant. So these are basically all of the facts that we need, and I went through them quickly because, you know, it's part of the basic theory of complex multiplication and endomorphisms, and it would take a lot of time to go into any more detail on where all this comes from. Okay. Now, we ask how many isomorphism classes of elliptic curves are in a given endomorphism class. The answer is that it's the class number of the order, which is related to the class number of the field. It's essentially proportional to the conductor; that is, in the case of conductor 1, it's the class number of the field. But if the conductor is larger, there's a larger number of isomorphism classes with that endomorphism ring. In a sense, the smaller the endomorphism ring, meaning the larger C is, the more different isomorphism classes there are with that particular endomorphism ring. So that's basically what we need to know. And I should have said before that in fact the number of isomorphism classes in the isogeny class of an elliptic curve is of order the square root of Q. Remember, the isogeny classes correspond to the number of points on the curve, and the number of points on the curve falls in the Hasse interval, in which there are roughly 4 times the square root of Q possibilities, so there are roughly 2 times the square root of Q possible isogeny classes, and there are of order Q elliptic curves, and so there are roughly the square root of Q curves in each isogeny class. Now, if delta is square free, then all of the curves in an isogeny class have the same endomorphism ring, of conductor 1. C0 is just one. So that's the simplest case. They're all in the same class. 
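
The bookkeeping above (delta equals a fundamental discriminant D times a square, with the possible conductors being the divisors of the squared factor) can be sketched for small inputs. The function names are illustrative, and trial division is used only because the example values are tiny; real parameters would need a proper factoring routine.

```python
def squarefree_decomposition(m):
    """Return (f, d) with m == f*f*d and d squarefree, for an integer m > 0."""
    f, d = 1, 1
    p = 2
    while p * p <= m:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        f *= p ** (e // 2)
        if e % 2:
            d *= p
        p += 1
    d *= m  # whatever remains is 1 or a single prime
    return f, d


def cm_data(q, t):
    """For an ordinary curve over F_q with trace t, return
    (delta, D, c0, conductors) where delta = t^2 - 4q = D * c0^2,
    D is a fundamental discriminant, and conductors lists the divisors of c0."""
    delta = t * t - 4 * q
    if delta >= 0:
        raise ValueError("the trace must satisfy t^2 < 4q")
    f, d = squarefree_decomposition(-delta)
    d = -d
    if d % 4 == 1:
        D, c0 = d, f
    else:
        # d is 2 or 3 mod 4, so the field discriminant is 4d and f is even
        D, c0 = 4 * d, f // 2
    conductors = [c for c in range(1, c0 + 1) if c0 % c == 0]
    return delta, D, c0, conductors


if __name__ == "__main__":
    print(cm_data(13, 4))  # delta = -36 = (-4) * 3^2
    print(cm_data(7, 2))   # delta = -24 is itself fundamental
```

In the first example the isogeny class splits into two endomorphism classes (conductors 1 and 3); in the second there is only conductor 1, the "simplest case" from the talk.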
Another special case: if [inaudible] large prime, then there are two endomorphism classes. There's the endomorphism class consisting of a small number of curves whose endomorphism ring is the full ring of integers; the number of them is the class number of the quadratic imaginary extension, which will be a quite small number, probably. And then there are the remaining curves, the vast majority of them, which will have endomorphism ring of conductor C0. Okay. Now, concerning the isogenies, let L denote a prime. Now, if there's a degree L isogeny between two curves, then either the two curves have the same endomorphism ring or else the conductors differ in one direction or the other by a factor of L. So if we have two endomorphism classes, by the conductor gap, we mean the largest prime that divides one conductor and not the other. And that determines how easy it is to go from one class to another using isogenies. So what we're going to be talking about is going from one endomorphism class to another one using isogenies, and in order to change the endomorphism class, the gap between the conductors has to be the prime degree of the isogeny. Now, if there's a large conductor gap between two endomorphism classes -- that is, there's a large prime that divides the conductor of one endomorphism class and not the conductor of another endomorphism class -- then one cannot go from a curve of one class to a curve of the other by a string of low-degree isogenies. So remember I said that in constructing isogenies the basic fact is if you're given a curve and you want to construct a degree L isogeny where L is a prime and that's all you're given, you've just got to construct this isogeny, if L is small you can do it, if L is extremely large you can't. And so if there's a large prime that divides one endomorphism class's conductor and not the other one, then in practice you can't go using isogenies from one endomorphism class to another. 
And, conversely, if there is no large gap, then by a result of people who were here at the time, Jao, Miller and Venki, within an endomorphism class or among several classes but with small conductor gaps, one can travel randomly and uniformly through the set of curves by just a string of low-degree isogenies. Now, the thing about isogenies is that they allow one to transport the discrete log problem from one curve to another. So the discrete log problem is random self-reducible within a set of endomorphism classes with small conductor gaps. So what that means -- well, first, let me give the definition. By the L conductor gap class we mean the set of all endomorphism classes in the isogeny class of E that have conductor gaps smaller than L. So what this means basically is the following. Let's say you found an algorithm that solved the discrete log problem in time T1 in a certain proportion of all elliptic curves; that is, there's some criterion such that if an elliptic curve happened to satisfy it you could apply this new algorithm. And so there's a certain proportion of weak curves. And let's suppose that the property of being a weak curve is independent of the isogeny and endomorphism class. Then if you had an L conductor gap class, so you could travel freely around that class, you could solve the discrete log problem on any curve in the class in time T1 plus T2 over epsilon, where T2 is the amount of time it takes you to construct a low-degree isogeny. So low degree means degree less than L; using isogenies of degree less than L you can jump around randomly and uniformly in this class. And epsilon of course is the proportion of weak curves. So it takes you time T2 divided by epsilon to find a weak curve, and then T1 to solve the problem once you get there. And this of course only works if the L conductor gap class contains more than 1 over epsilon curves so that you have a good chance of finding a weak curve. 
So the whole point of this is that it's the possibility of random walks, random strings of low-degree isogenies through a conductor gap class, that under certain circumstances might make a random curve less secure than a special curve. So that's what I want to give some examples of. Now, it's important to note that for a random curve you'd expect that all isogenous curves are in the same conductor gap class because delta has negligible probability of being divisible by the square of a large prime. So there just aren't going to be any large primes around that could divide the conductor of the endomorphism ring. So we'll look at some hypothetical scenarios. And all of this is hypothetical; we're not talking about things that are occurring in the real world at present with random curves. Okay. So I'll have time for a couple of examples. Muller in '98 suggested some curves for elliptic curve cryptography that are slightly more general than the anomalous binary curves, which are defined over the field of two elements. He suggested some curves defined over very small degree extensions of F2. So here's one of his examples. Let's let Q be 2 to the 177th power. So this is 2 to a composite degree. And let gamma -- that's supposed to be a gamma, but unfortunately PowerPoint has terrible gammas, so it looks like a Y, but that's not Y, that's gamma. So let gamma be a generator of the degree 3 extension of F2 satisfying gamma cubed equals gamma squared plus 1, and let EB be the following elliptic curve. This is one of Muller's elliptic curves, very similar to the anomalous binary curves but defined over F8. Its group order over that particular field extension is -- so this is a prime degree extension of F8. 
It's a degree 59 extension of F8, and it turns out that its group order is six times a prime of suitable size for elliptic curve cryptography, and that was one of a handful of examples he suggested. Okay. Now, suppose that Alice -- remember, Alice is the one who's using a special curve -- read Muller's paper and followed his suggestion and chose this E. Now, she figures that solving the discrete log problem by the Pollard method, by the square root attack, will take roughly 2 to the 84 operations. Now, there's a slight speedup whenever you have a curve defined over a smaller field and then you work with it over an extension. There's a speedup, which is really quite small but still has to be taken into account, of the square root of the extension degree because of ways you can group together points that are in the same conjugacy class under the Frobenius map of the extension of finite fields. So you can sort of group together points in sets of 59 points, and instead of applying Pollard rho to the set of points, you can apply it to the set of conjugacy classes and get this little speedup. That reduces your security by about 3 bits. So that's why she has 84 bits rather than -- okay. Yeah, it's a 175 bit prime, so she would normally have 87 bits, but it's reduced to 84 bits because of this speedup. Okay. Now, Bob thinks that Alice is foolish for having chosen a curve with very special properties that not only allow for this one speedup that we know about but who knows what else could result from choosing a special curve. So it could leave her vulnerable to other attacks. 
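
Alice's 84-bit figure can be checked with a rough calculation. The sketch below assumes the simplest model of a square root attack (cost about the square root of the prime subgroup order, ignoring constant factors) and divides by the square root of the Frobenius orbit size; the function name is illustrative.

```python
import math


def rho_security_bits(log2_prime_order, frobenius_orbit=1):
    """Approximate log2 of the cost of a square root attack.

    log2_prime_order: log2 of the prime subgroup order.
    frobenius_orbit:  size of the Frobenius conjugacy classes that can be
                      grouped together (1 if the curve has no such structure).
    """
    return 0.5 * log2_prime_order - 0.5 * math.log2(frobenius_orbit)


if __name__ == "__main__":
    # Alice: group order about 2^177, equal to 6 times a prime; orbits of size 59.
    alice_prime_bits = 177 - math.log2(6)
    print(round(rho_security_bits(alice_prime_bits), 2))       # no speedup
    print(round(rho_security_bits(alice_prime_bits, 59), 2))   # with speedup
    # Bob: group order about 2^177, equal to 2 times a prime; no speedup.
    print(rho_security_bits(176))
```

The speedup costs Alice half of log2(59), which is just under 3 bits, taking her from about 87 bits to about 84, while a random curve over the same field would sit at about 88.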
So Bob says fine, if you want to work over the field of 2 to the 177 elements, let's do that, and he chooses a random curve over the same field. Now, 2 always divides the group order if it's an ordinary curve, but you could get twice a 176 bit prime working over that field with random coefficients. Then he'll get 88 bits of security rather than 84. And he'll also be less vulnerable to special attacks. So that's what he figures. Okay. So here is Bob lecturing Alice with condescension oozing from his voice that she was really quite foolish to choose this very special curve. Well, he'll get more security choosing a random curve, even using the field she wants to use. And until recently Bob's reasoning would have appeared to be correct; that is, that you'd be better off from a security standpoint using random coefficients over this field. But some work in 2006 by Alfred Menezes and Edlyn Teske on Weil descent shows that Bob might not have nearly the security level that he thinks he has; namely, they found that a certain proportion of all elliptic curves over this particular field, the field of 2 to the 177 elements, the same field that was in Muller's paper, with group order congruent to 2 mod 8, which is half of them if you choose randomly, are weak in the sense that the discrete log problem can be transported to the Jacobian of a genus 3 hyperelliptic curve over the field of 2 to the 59th elements. So [inaudible] you take an elliptic curve defined over a composite degree extension of F2 and you transport the discrete log problem to a hyperelliptic curve over a smaller field. And if you're lucky, you'll get a group of the same size. You might not, but in these cases you get a genus 3 curve over the field of 2 to the 59 elements whose group order is also about 2 to the 3 times 59. 
And there, as we saw, you have a four-thirds power, Q to the four-thirds algorithm, which is about 2 to the 79th, is how long it takes to solve the discrete log problem on that curve. And now this weak property is likely to be independent of isogeny class. Now, in Bob's case the discriminant of his curve is almost certainly not divisible by the square of a large prime, and so it will be feasible to use isogenies to transport his discrete log problem along a random walk through the isogeny class. Now, each isogeny in this case, given the current state of these algorithms, takes about 2 to the 17 to construct. So in this case epsilon is 2 to the negative 58, so it will take just time about 2 to the 75th to transport Bob's discrete log problem to a weak curve, if his group order is congruent to 2 mod 8. Their results apply only in that case. So maybe Bob's lucky -- what was lucky -- of course, Bob did this before he knew about the result, so he had no way of knowing that he should avoid group orders congruent to 2 mod 8. So if he was lucky, his group order is congruent to 6 mod 8. But there's a 50 percent chance that his group order is congruent to 2 mod 8, in which case, in time of order 2 to the 75th, his discrete log problem can be transported to a weak curve, so that he has actually 79 bits of security, not the 88 bits that he thought, and not even 84 bits as Alice has. Okay. So basically because of this difference between genus 3 and genus 1, where in genus 3 you have a faster than square root algorithm, the four-thirds power is not that much faster than the three-halves power of Q, which is what a square root algorithm would give you. But it's enough to make a difference of, in this case, 9 bits of security in a worst-case analysis. And of course Alice also has greater efficiency with her special curve. So she sort of gets the last laugh on that.
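The cost figures quoted here can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch using the numbers from the talk (about 2^17 work per isogeny, a weak proportion epsilon of 2^-58, and a Q^(4/3) attack with Q = 2^59). The variable names are illustrative, and the expected walk length is taken to be 1/epsilon steps.

```python
import math

LOG2_ISOGENY_COST = 17   # work to construct one isogeny, from the talk
LOG2_EPSILON = -58       # proportion of weak curves, from the talk
LOG2_Q = 59              # genus 3 curve lives over the field of 2^59 elements

# Expected walk: about 1/epsilon isogenies, each costing about 2^17
walk_bits = LOG2_ISOGENY_COST - LOG2_EPSILON

# Faster-than-square-root attack on the genus 3 Jacobian: Q^(4/3)
attack_bits = 4 * LOG2_Q / 3

# Square root attack on a group of size about Q^3, for comparison
sqrt_bits = 3 * LOG2_Q / 2

# Total work is the sum of the two phases, dominated by the larger one
total_bits = math.log2(2.0 ** walk_bits + 2.0 ** attack_bits)

print(f"walk: 2^{walk_bits}, attack: 2^{attack_bits:.1f}, "
      f"total: 2^{total_bits:.1f}, square root: 2^{sqrt_bits:.1f}")
```

Under these assumptions the isogeny walk (2^75) is cheaper than the genus 3 attack itself (about 2^79), so the attack phase sets the overall security level.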
Now, even if -- it turns out that Alice's group has group order congruent to 6 mod 8, but let's just say for the sake of argument that the Menezes-Teske result didn't have that condition, nevertheless she'd still be safe because she was working with a special curve. That's because her curve's endomorphism ring has conductor 1 and lies in a small conductor gap class. Now, 2 to the 66th, if you choose L, that capital L, to be 2 to the 66th, that's far above the range where you can construct isogenies. For a prime greater than 2 to the 66th, you cannot construct an L isogeny. Now, in this case, for her, her discriminant does have a very large square factor, of course, as special curves always will. And this square factor has a large prime and then an intermediate-sized prime. And so using isogenies, you can go from her curve, which has conductor 1, to curves that have conductor 11,681. That's feasible. It's somewhat time-consuming, but it's certainly feasible to go outside her very small endomorphism class to the endomorphism class with conductor 11,681. You can do that, but the total number of curves, both in her endomorphism class, which is conductor 1, and in the larger endomorphism class of conductor 11,681, is approximately 2 to the 16th. So there's negligible probability that any one of those curves that you can get to using isogenies from her curve will be susceptible to the Menezes-Teske version of Weil descent. So the so-called weak curves, remember, roughly 2 to the negative 58 of all curves are weak in this setting. And because of the particular nature of her special curve and the factorization of its discriminant, there's no way of using isogenies to get from her curve to 2 to the 58 curves. So it's highly unlikely that her discrete log problem can be transported to a weak one by an isogeny walk.
So what saves Alice is precisely the very special nature of her curve, the fact that the endomorphism ring has conductor 1. Okay. So that's an example with a composite degree extension field where these results using Weil descent apply. Now I'll be more hypothetical and imagine algorithms that don't at present exist, which Brainpool has done, so I figure they've opened the way to making maybe outrageous speculations, so I'll do that too. Now, in the next example, we'll take a prime degree extension of the field of two elements. And in fact in almost all practical implementations of elliptic curve cryptography, it's prime degree extension fields that are used in the characteristic 2 case. So this is more realistic in practical terms in that sense. Now, let's suppose, and here's where I'm being very speculative, that some version of Weil descent or another approach someday leads to a faster than square root attack on a small but nonnegligible proportion of curves defined over this prime degree extension of F2. Now, right now any of the really good Weil descent methods require a composite degree extension of F2. But it's conceivable that that could change or that some totally different attack would apply to prime degree field extensions of F2. So let's just suppose that that happens. And let's look at the digital signature standard recommendations for elliptic curves. In 2000 this recommended five specific elliptic curves over prime fields and ten over binary fields. And they had five different binary fields at different security levels, and for each one they suggested one random curve and one anomalous binary curve. Now, the largest case, for greatest security, is the degree 571 extension of F2, which should provide plenty of security to protect a high-security AES private key. So that was the motivation for going up to that high a degree.
Now, the conventional wisdom, in line with what was on the early slide, is that if there is any difference in security level between the two curves that are recommended for that field, the random one and the anomalous binary one, the random one, R571, is a safer choice than the K571 binary curve they recommend. However, that's the conventional wisdom. Let's suppose that a proportion epsilon of all curves over this field could be attacked by this hypothetical algorithm. And let's also suppose that the weak property of being susceptible to this new algorithm is independent of isogeny and endomorphism class. Now, the curve R571 has square free discriminant, as random curves often do. And so the isogeny walks can fan out from that curve throughout the isogeny class, which consists of 2 to the 285 curves approximately. So after approximately 1 over epsilon isogenies, whatever epsilon is, the DLP, the discrete log problem, can be transported to a weak curve. But in contrast, the anomalous binary curve has discriminant -- the square free part is just negative 7 because it's an anomalous binary curve; the discriminant for an anomalous binary curve, whatever the extension field is, always has the form negative 7 times a square. And in this particular case the number squared happens to be the product of a fairly small prime and an extremely large prime. What this means -- the endomorphism ring of the actual anomalous binary curve always has conductor 1, because it has a very large ring of endomorphisms. That's in fact why it's efficient, why it was possible for Jerry Solinas to develop these really nice algorithms for point multiples. Because you have this tremendous ring of endomorphisms to work with. So you have the full ring of integers of Q of root negative 7 as endomorphisms, so it's conductor 1. And so if you take the 2 to the 262 conductor gap class of this curve, there are only about 2 to the 22 curves.
That is, there's the one curve, K571 itself, and then there are also about 2 to the 22 curves whose endomorphism ring has conductor equal to this 22 bit prime factor NC0. So if epsilon is much less than 2 to the negative 22, less than 1 out of 4 million probability that the special attack will work, then the discrete log problem probably cannot be transported to a weak curve by isogenies because there just aren't enough curves in its conductor gap class to move around to. And under these hypothetical assumptions, that special curve is likely to be safer than the random curve. So that's another example. Again, a hypothetical example where the random curve would be more dangerous. And finally, a final sort of setup: let me just suppose that we're worried that a new approach to the discrete log problem might turn out to give a faster than square root attack for a certain proportion, again a small but nonnegligible proportion, of curves defined over a large prime field. In that case, if we're thinking about that, we might want to choose our elliptic curve to be in a very small L conductor gap class where L is large, so that an attacker could not use isogenies to transport the discrete log problem to a weak curve. In that case there's a very, very easy construction which is sort of fun to ask questions about, some nice analytic number theory questions you can ask about this. Just choose B to be a random prime of whatever order you need for security, and A a random even number which has a couple of conditions. You want P equals A squared plus B squared to be prime, and you want N to be prime, where N is either P plus 1 over 2 minus A or P plus 1 over 2 plus A. In that case, the curve with the very special equation Y squared equals X cubed minus alpha X has 2N points, where alpha is a quadratic nonresidue in the prime field, and the quartic residue symbol of alpha depends on the sign that we chose when we defined N.
And the trace then is plus or minus 2A and the discriminant is easily computed to be minus 4B squared, and by construction B is a prime, which is why we chose B to be a prime. Then this particular curve up there has conductor 1. It's easy to see that it has complex multiplication by I, so it has complex multiplication by the full ring of integers, so it has conductor 1. And it's the only isomorphism class in its B conductor gap class. So it can't be moved anywhere using isogenies because you're never going to be able to construct a degree B isogeny. All other isogenous curves have endomorphism ring of conductor B, so for any reasonable K it's not feasible to transport the discrete log problem from E to any other isogenous isomorphism class. Now, this curve is totally against the advice of Brainpool. It's a very concrete, very, very special curve with no randomness in it, or very little randomness in the equation. And it completely goes against the advice of Brainpool. Now, whether it's reckless to do this or wise to do that is just a judgment call. And we're not saying that people should use that curve; it's just that if one's worried, speculatively, about the possibility of these sorts of algorithms that might apply to only epsilon of curves, then it might be reasonable. So the conclusion is not that we should prefer special curves over random ones. I'm not making an argument that, oh, it's bad to use random curves. And I'm not saying that Brainpool is wrong. We're sort of agnostic on that question. Our only real point is that we don't really know and that some humility in dealing with these issues is called for.
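The construction with B prime, A even, P = A^2 + B^2 and the curve Y^2 = X^3 - alpha X can be checked by brute force on a toy example. The Python sketch below uses deliberately tiny parameters (A = 2, B = 3, so P = 13), which are my choice for illustration and far too small for real use. It counts points directly for every quadratic nonresidue alpha and confirms that the trace is plus or minus 2A and the discriminant is -4B^2. All function names are hypothetical.

```python
def is_prime(n):
    """Trial division; fine for toy sizes only."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def legendre(a, p):
    """Legendre symbol of a mod an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def count_points(p, alpha):
    """Number of points on y^2 = x^3 - alpha*x over F_p, including infinity."""
    total = 1  # point at infinity
    for x in range(p):
        total += 1 + legendre(x**3 - alpha * x, p)
    return total

def toy_example(a=2, b=3):
    p = a * a + b * b
    assert is_prime(b) and a % 2 == 0 and is_prime(p)
    results = []
    for alpha in range(2, p):
        if legendre(alpha, p) != -1:
            continue  # keep only quadratic nonresidues
        n_points = count_points(p, alpha)
        trace = p + 1 - n_points
        results.append((alpha, n_points, trace, trace * trace - 4 * p))
    return p, results

p, results = toy_example()
for alpha, n_points, trace, disc in results:
    n = n_points // 2
    print(f"alpha={alpha}: {n_points} points = 2*{n}, "
          f"N prime: {is_prime(n)}, trace={trace}, discriminant={disc}")
```

For these toy values the point count is either 10 or 18 depending on alpha, so N is 5 or 9, and only the first choice of sign gives a prime N. This matches the remark that the quartic residue symbol of alpha has to agree with the sign chosen in defining N.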
And I think one of the purposes of a lot of the joint work that I've done with Alfred Menezes in recent years, especially our papers on provable security, is that there's a little bit of an excessive tendency in the cryptography world to convey to the outside world an impression of self-confidence and mathematical certainty about our recommendations when there is some reason to wonder whether this self-confidence is justified. So the flavor of what we've tried to do with various papers, including this one, is to call for some humility about expressing mathematical certainty about recommendations. So finally I want to put this in a sort of sociological context by talking about narrative inversion, which is a term that applies when the farther the story that one tells and the language that one uses are from reality, the more fervently this narrative is repeated and the more adamantly people insist that it's true. So some examples of narrative inversion are when the U.S. says that it's defending freedom, when macho guys use bravado to hide their insecurities. So here are some examples of narrative inversion. And, you know, football, you see some examples of narrative inversion there. But in the world of cryptography -- well, another example from outside cryptography is that very often people who work on social questions like to use the word science all the time when they talk about social science, political science. And in a sense, I think the reason why they use the word science -- usually if someone really is doing science, they don't go around saying, oh, look at me, I'm doing science. When someone's constantly using the word science in reference to their work, there's a good chance it's a case of narrative inversion.
Similarly, in cryptography, when crypto researchers claim that their systems are provably secure and that the rigorous methodology of provable security -- and this is from Katz and Lindell's new textbook -- has transformed cryptography from an art to a science, again using the word science in expressing this mathematical certainty, something provably secure. You have people reacting very strongly to -- here's Jonathan Katz -- I mentioned in my abstract -- or I guess in my bio -- that I've been making a lot of enemies recently as a result of my Notices article. And here's an example of someone getting extremely upset at a discussion of some of the doubts that arise in the cryptographic world. This is what he accused me of: name-calling, sheer elitism, snobbery at its purest. And he went on to say -- this is all from his letter to Notices, AMS Notices -- that despite my criticisms of a lot of what goes on in provable security, so-called provable security, the definitions, proofs, and formal reasoning have helped cryptography progress from an art to a science. And this constant harping on how cryptography, thanks to the methodology of provable security, has gone from an art to a science reminds me of a line from Shakespeare about protesting too much. And that's really what narrative inversion's about. And I think a lot of what goes on when we attach too much confidence to conventional wisdom and to certain assumptions does veer off in the direction of narrative inversion sometimes. So, anyway, that's what I want to say about this and I welcome questions or comments or disagreements. [applause] >> Kristin Lauter: Questions? >>: [inaudible] of BSI, also a two-headed animal like the NSA [inaudible] U.S. >> Neal Koblitz: I don't know. I'm not familiar with BSI's different -- other differences with NSA other than the one I talked about.
>>: So do you not think that it's not [inaudible] BSI or NSA [inaudible] approach is more secure but that they actually might just [inaudible] to the other one? Because like the method of NSA says it's the best because the way the inside does it [inaudible] they don't want to tell us about? Last time we heard the NSA do this, they said use this special box [inaudible]. >> Neal Koblitz: So you're saying that NSA is proposing -- say this again. >>: The reason that they're not -- they don't like BSI's methods because they know of an attack for the way that BSI said is the safe one. >> Neal Koblitz: Something of the form that I was talking about, something like an attack that works on a certain proportion and they know they can get at a random curve but they can't get at a special curve? I would tend to really doubt that. I mean, theoretically, it could be -- I think there's a little bit of a tendency we have to overestimate what NSA knows and what other secret agencies know. It's now -- I believe it's certain, for example, that none of those agencies had thought of elliptic curve cryptography. It's known that in Britain they did think about something similar to RSA a few years before RSA did, but they messed it up and they didn't appreciate its importance, they put it on the back burner. They sort of didn't do it right. It was only in the academic world where Diffie-Hellman and the RSA people and others understood how to do it right. So I think sometimes, because they're secret, we tend to overestimate what's there. And I'm not saying they don't do good work, but I'm just a little skeptical that they would have some brilliant attack on random curves that has eluded everybody who works in the open. That's just my personal view. And I think the history -- the things that have become known about what these agencies do don't suggest that they're light years ahead of everyone else. But that would just be my opinion.
It's conceivable, of course, that they know this. Some people would turn it around and say, well, maybe, on the contrary, they know how to break anomalous binary curves, and that's why they're recommending them so that people will use them and then they can break them. So people can imagine various scenarios, that they're either recommending something because they know that the alternative is weak or that they're recommending something because they know the alternative is strong. So my guess is neither. >>: A question about that BSI [inaudible]. Do you know how the number behaves over time? Once looked at this in the mid-1990s, it was 100 and two years later it was 400. >> Neal Koblitz: Of what number? >>: Number 10 million that you have for the degree of -- >> Neal Koblitz: Oh, yeah. >>: -- of the class field in this case. It was degree 100 in '95, 400 in '99 [inaudible] now it's sort of 10 million. Do you know anything? >> Neal Koblitz: I don't know. I don't know why degree more than 1 is necessary. So I'm still stuck at 1. So between 400 and 10 million, I can't really see any basis even for going above 1. >> Kristin Lauter: I have a question. So just on the nomenclature we use, we say special versus random. So it seems like this is also an example of kind of what you were saying in the beginning where words can end up being taken to mean something that they don't actually mean. So somehow these words special curve getting assigned to, you know, something that has like, for example, [inaudible] field, a very small discriminant. So if then you take that to mean special, which of course no one in the nonmathematical world thinks that that's what the word special means, but in a lot of these cases, that is actually what's happening, it just happens to be a field of small discriminant.
And then you take and you look at it from a different angle where you actually have this issue of the large conductor gap to take into account, well, if you just use the English language and use the word special to apply to the ones that were on the wrong side of the large conductor gap problem. And, I mean, you could flip the whole thing around. And so I was wondering how intrinsic the words special and random are in your title. I mean, it seems like... >> Neal Koblitz: Well, the story about nonhyperelliptic curve supports what you say, that it's tricky sometimes to say which is special and which isn't special. Although, I think usually people mean if something's chosen randomly in a certain set, you have negligible probability of getting something with a certain property, then that -- then that's distinguished from particularly looking for something with that particular property. So I guess in some sense that's a well-defined distinction. And in the hyperelliptic and nonhyperelliptic curve, I don't think anybody would use the word special for nonhyperelliptic curves and calling -- I don't think anybody would call hyperelliptic curves the generic case and nonhyperelliptic curves special. But if you look at Diem's algorithm, it's almost as if it was that way. So in his algorithm could be viewed as showing a weakness in a special -- in a generic curve as opposed to a special curve. But as you say, we could just sort of reverse the meaning of the word -- we could just reverse the usage of the word special and that would take care of it. But these words do have reasonable accepted meanings, but it's just that the reasonable accepted meaning at least if we use it to be consistent with real-world uses of the word, with dictionary uses of the word, there might be some surprising consequences where something special turns out to be safer than something general or something simple turns out to have more complexity in its discrete log problem than something complicated. 
So we get a sort of discrepancy between the commonly accepted connotation of a word and what actually happens in cryptography. And I think that's also true -- we were talking about this earlier -- about the word provable -- term provable security. That the terms carry a lot of baggage, a lot of connotation that comes from their outside world uses. And sometimes when we use words in cryptography that have this baggage attached to them, people can get confused and can assume that certain things are going to happen, and the reverse might happen. People can expect a certain level of certainty when they hear the word provable. Well, as -- I think it was Lars Knudsen once said if something is provably secure, then it probably isn't. And sometimes that can happen, that a word seems to be implying something but what actually might happen is the reverse. But whether this can be fixed by just changing the usage of the words or just maybe avoiding loaded words entirely or just avoiding putting too much confidence in conventional wisdom sort of being a little bit sensitive to the tendency to allow the terminology to take us farther in interpreting something than we have any right to go with it. You know, that sometimes the terminology develops its own momentum and people conclude certain things, they conclude they can have a lot of confidence because something's provably secure or they conclude that they should make a random choice because it sounds right or the terminology suggests that. So I think we have to be cautious whether we can accomplish anything by changing our word usage, maybe sometimes we can do that too. >> Kristin Lauter: Any questions? >> Neal Koblitz: Yes. >>: [inaudible] that get it is random curve [inaudible] reduction [inaudible] hypothetical attacks, do you have any estimates -- I know obviously it's all hypothetical [inaudible] introductions [inaudible]. >> Neal Koblitz: Well, we have no way of knowing because it is hypothetical. 
A weak curve -- by the term weak curve I meant a curve for which a quicker algorithm is known. Now, in this case, we were talking about genus 3 curves. These are still exponential time algorithms. They're an improvement over square root attacks but not a super dramatic improvement. So that's why it was just 9 bits of improvement. Now, clearly if we had -- but nothing like this is remotely known -- but if we had, say, a polynomial time attack on the discrete log problem for a certain proportion of the curves, then conceivably the time that it would take to solve the problem would be just dominated by the isogeny walk. In other words, once you found a weak curve, you'd be home free because you'd have a very quick algorithm. So you could imagine a situation where the only real obstacle to solving the discrete log problem on a random curve was implementing the isogeny walk to find a weak curve. So depending on what happens -- this is all hypothetical, there's no way of knowing -- a weak curve might mean very weak or just a little bit weak. And a weak curve might occur with very little frequency, in which case it could take a long, long time to find one, but maybe once you found one, the actual algorithm would be very quick. Or as in this case, where we actually do have an algorithm using Weil descent, you only get 9 bits. So it really depends. And since this is hypothetical, there's no way to know. >> Kristin Lauter: Okay. Well, let's thank Neal again for his [inaudible]. [applause]