>> Kristin Lauter: Okay. So today we're very pleased to have Neal Koblitz from
University of Washington visiting and speaking on special versus random curves.
Neal Koblitz is the founder of elliptic curve cryptography and the author of several
books, including a book on cryptography called Introduction to Number Theory and
Cryptography.
And he has visited us several times here at Microsoft Research, but not for quite a while.
So we're very pleased to have him. Thanks.
>> Neal Koblitz: Thank you, Kristin, and thank you for the invitation. I'm very glad to
be here.
This talk is based on a paper, a much longer paper written jointly with Alfred Menezes
and with my wife, Ann Hibner Koblitz. That's posted on the e-Print server, if you're
interested. And I want to apologize in advance for clumsiness in my use of PowerPoint.
This is only the second PowerPoint talk that I've ever given. The first one was a disaster,
so...
Okay. So the conventional wisdom in cryptography is that you get greater security if you
choose any parameters that are at your disposal as randomly as possible, so that in
particular in elliptic and hyperelliptic curve cryptography the safest option is to choose
the defining equation to have random coefficients.
Now, it doesn't mean you can't use special curves; special choices of parameters often
improve efficiency. And if you're only interested in short-term security or if you're not
particularly paranoid about security issues, that's fine. But someday that choice might be
one that you regret. At least that's the conventional wisdom.
Now, so, for example, when I proposed hyperelliptic cryptography in the late '80s, one
could justify the choice of hyperelliptic curves over -- as opposed to elliptic curves by
thinking in terms of this conventional wisdom.
So the idea is that there's a new parameter; namely, the genus of the curve. An elliptic
curve has genus 1. We can vary G instead of fixing it to be 1.
So -- and I thought that the higher the genus the more complicated an object you're
working with. The Jacobian group of the hyperelliptic curve is a more complicated
object than an elliptic curve.
And you might just choose G randomly, like it'd be a random prime number. So here in a
paper I wrote a couple years later, I chose a random prime; namely, 191, which has no
special properties. And I gave a specific curve that was easy to compute the number of
points on because I didn't have any point counting techniques, so I just chose a very
simple curve, and its Jacobian group has three times the prime number of points, and I
said this might be good for cryptography.
At the time I was also thinking that let's take, say, G equals 191 over the field of two
elements, which would result in a group of approximately the size you want. We could
choose maybe even random coefficients and it's just a lot better, I thought. Let's say that
would give you 382 random coefficients rather than two random coefficients that you
have with elliptic curves.
Well, my fallacies at the time -- well, first of all, a conceptual complexity is not the same
thing as computational complexity, because that was the first elementary fallacy. When
you read books about -- introductory books about -- introductory books about algebraic
geometry and curves, they talk about G as a measure of the complexity of the curve.
But it turns out that even a random genus G curve over the field of two elements is not
helped from the standpoint of security by all the hundreds of random coefficients you can
have.
So there were basically two fallacies: one, thinking that having a large number of
randomly chosen parameters would help security, and the other was the fallacy of
thinking that a more complicated object would help -- a more complex object would
mean greater complexity of the computational problem.
And both of these were basically rookie mistakes that I made at the time in thinking these
things. In fact, thinking that the more random parameters you have to play with the
greater the security is almost as silly a mistake as to -- as the classic beginner's mistake of
thinking that if you have a very large key space you're safe. It's on that level.
Well, it's well known what happened. In 1994 Adleman, DeMarrais and Huang
showed that high genus curves over small fields are insecure, and this was a big
disappointment to me and a big shock at the time; although, it didn't mean that
hyperelliptic curves were totally useless.
>>: If I get it right, low genus curves [INAUDIBLE] are insecure trivially, because that's
not enough points.
>> Neal Koblitz: Yes. Yes. Yeah. In this -- you're going to want a group of a certain
size. The size is essentially fixed, or the bit length of the size of the group is basically
fixed, yeah.
Okay. So now right now it's known that basically the only genus that's a serious
contender for being as secure as elliptic curves is genus 2. That even genus 3 [inaudible]
anyway can be -- the discrete log problem can be broken. Discrete log problem of course
is the fundamental hard problem that's at the heart of all these systems. And if you
compare the discrete log problem on a genus G curve for G at least 3 for fixed bit
length of the group size, you need larger groups for the same level of security, if genus is
3 or greater. It's only genus 2 that remains as a serious contender.
Notice that one of my fallacies was that you can have the same word with a common
real-world meaning and then a special cryptography-type meaning. So in that case it was
the word complexity, which means something in a conceptual sense, but it means
something different in a computational sense.
And the same goes for another word that I want to talk about today, which is special as
opposed to random. That also has to be a word that has to be used with care. And let me
give an example, a recent example that shows some of the difficulty of this word.
Now, from a mathematical standpoint, for genus at least 3 the moduli space of genus
G curves has dimension 3G minus 3 whereas the subspace or submanifold consisting of
the hyperelliptic curves is much smaller, it has codimension G minus 2. That is, if you
choose a curve randomly over the field of Q elements, it has a 1 out of Q to the G minus
2 chance of being hyperelliptic.
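The dimension count behind that statement can be written out as follows. This is a sketch: the dimension 2G minus 1 of the hyperelliptic locus is a standard fact that the talk does not state explicitly, and the probability is a heuristic for large Q.

```latex
\dim \mathcal{M}_g = 3g - 3, \qquad
\dim \mathcal{H}_g = 2g - 1, \qquad
\operatorname{codim} \mathcal{H}_g = (3g - 3) - (2g - 1) = g - 2,
\qquad
\Pr[\text{random genus-}g\text{ curve over } \mathbb{F}_q \text{ is hyperelliptic}] \approx q^{-(g-2)} .
```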
So in terms of the dimension of the space of these curves, the hyperelliptic curves are a
very special subset of all curves.
So conceptually the hyperelliptic curves are special and the nonhyperelliptic curves are
the generic ones, in the sense that a random curve is almost certain to be nonhyperelliptic.
Yet, a few years ago Diem and Thomé found an index calculus attack on the discrete log
problem in the Jacobian group of a genus 3 nonhyperelliptic curve with running time of
order Q, that is, bounded by Q to the 1 plus epsilon.
So now compare that with the generic discrete log algorithms, the so-called square root
algorithms. In this case, if you have a genus 3 curve over the field of Q elements, the Jacobian
group will have order about Q cubed. And so this is like a cube root attack.
Now, in the case of hyperelliptic curves, the best one can do, the fastest attack on the
discrete log problem is Q to the four-thirds plus epsilon, which is slower.
So in the case of the nonhyperelliptic curves, there's a cube root attack that is an attack
whose running time is of order the cube root of the group order, whereas in the
hyperelliptic case, it's a four-ninths power attack.
Now, as I said, once you get up to genus 3, you can do better than the square root attack.
That's why genus 3 is not fully competitive with elliptic curves. For a well-chosen elliptic
curve, there's nothing better than a square root attack. In the case of a hyperelliptic genus
3 curve, there's a four-ninths power attack on the discrete log problem. And in the case of
a nonhyperelliptic curve, there's a still better attack, a cube root attack.
So the hyperelliptic algorithm is better than the square root algorithm, but the
nonhyperelliptic algorithm is still better.
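The conversion between "power of Q" and "power of the group order" used above is simple arithmetic, sketched here for genus 3. The exponents are the ones quoted in the talk with the epsilon terms dropped; the dictionary and function names are illustrative only.

```python
from fractions import Fraction

GENUS = 3  # the Jacobian of a genus 3 curve over F_q has order about q^3

# Exponents of q in the running times quoted in the talk (epsilon ignored).
ATTACK_Q_EXPONENT = {
    "generic square root": Fraction(GENUS, 2),      # sqrt(q^3) = q^(3/2)
    "hyperelliptic index calculus": Fraction(4, 3),
    "nonhyperelliptic (Diem-Thome)": Fraction(1),
}

def group_order_exponent(q_exponent, genus=GENUS):
    """Rewrite q^e as (group order)^(e/genus), using group order ~ q^genus."""
    return q_exponent / genus

for name, e in ATTACK_Q_EXPONENT.items():
    print(f"{name}: q^({e}) = (group order)^({group_order_exponent(e)})")
```

The three lines printed correspond to the square root, four-ninths power and cube root attacks mentioned in the talk.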
Oh, and I should say that there is a conceptual reason for it. What's at the heart of this
discrepancy is that a nonhyperelliptic curve of genus 3 can be represented as a smooth
plane curve of degree 4, whereas a hyperelliptic curve
cannot be so represented. And it was the possibility of representing a curve in that
particular way that led to the Diem-Thomé algorithm.
Diem was able to generalize this to a very large class, so-called sufficiently general
nonhyperelliptic curves of any genus, and he found a running time of Q to the power 2 minus
2 divided by G minus 1, plus epsilon, for nonhyperelliptic curves of arbitrary genus at least 3.
And this should be compared with the best general algorithm for the hyperelliptic case,
which has a slower running time because you have G in the denominator rather than G
minus 1.
So what happened in genus 3 also occurs in higher genus as well. So what that means is
that in terms of the complexity of the attack on the discrete log problem, a G dimensional,
nonhyperelliptic group has the same complexity as a G minus 1 dimensional hyperelliptic
group. So you'd have to go to one higher dimension to get the same level of complexity
in your problem if you're using nonhyperelliptic curves.
So with hyperelliptic curves, you can achieve the same level of security with one lower --
one lower genus, meaning one lower dimension of the group, which is a big difference in
its size, Q to the G versus Q to the G minus 1.
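The "one genus lower" comparison can be checked with a few lines of arithmetic. This sketch assumes the running-time exponents are 2 minus 2 over G for the hyperelliptic case and 2 minus 2 over (G minus 1) for the nonhyperelliptic case, which is what the genus 3 figures quoted earlier (four-thirds and 1) suggest; epsilon terms are ignored and the function names are illustrative.

```python
from fractions import Fraction

def hyperelliptic_exponent(g):
    """Assumed exponent of q for hyperelliptic index calculus: 2 - 2/g."""
    return 2 - Fraction(2, g)

def nonhyperelliptic_exponent(g):
    """Assumed exponent of q for the nonhyperelliptic case (g >= 3): 2 - 2/(g - 1)."""
    return 2 - Fraction(2, g - 1)

for g in range(3, 8):
    print(
        f"g={g}: hyperelliptic q^({hyperelliptic_exponent(g)}), "
        f"nonhyperelliptic q^({nonhyperelliptic_exponent(g)}) "
        f"= hyperelliptic genus {g - 1}"
    )
```

Under these assumptions a genus G nonhyperelliptic Jacobian, of size about Q to the G, offers the attack cost of a genus G minus 1 hyperelliptic Jacobian, of size about Q to the G minus 1.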
Okay. So the conclusion is given the present state-of-the-art algorithms for genus at least
3, a random genus G curve is less secure than a genus G hyperelliptic curve.
So okay. So now I want to go to another issue or another part of my talk about the whole
question of special versus random. So and this is sort of interesting from I guess a
sociological standpoint because what one sees on this issue of preferring special versus
random curves, you see a sort of national division here between NSA and the German
equivalent, BSI.
Now, NSA has a long history of supporting the use of special curves in elliptic curve
cryptography. In fact, the very first public presentation at a Crypto conference by an
NSA person was Jerry Solinas's paper at Crypto '97 on anomalous binary curves which
have their equation defined over the field of two elements. They're basically the ordinary
curves over the field of two elements. And he -- his paper was devoted to very efficient
computations on those curves that improve the efficiency of crypto systems based on
those curves.
In addition, NIST, which is essentially NSA, has implicitly endorsed pairing-based
cryptography. And they organized a workshop on it in June. At that workshop a
company, Voltage, presented a choice of curve for pairing-based cryptography; namely,
the supersingular curve Y squared equals X cubed plus B over a prime field where 12
divides P plus 1 which among other things means that it's supersingular. And this is a
very, very special curve. It has all sorts of special properties.
So that's another example. It's not exactly NSA pushing this, but it's certainly -- NIST
has been very cooperative with pairing-based cryptography.
Meanwhile -- yeah.
>>: Do you happen to know if Voltage has any patents on speeding up that particular
curve?
>> Neal Koblitz: They might. I don't know what the patent situation is, though.
Now, there's a European consortium called Brainpool that's led essentially by BSI that
has made some very different recommendations that I think provide amusing contrast
with the role that NSA and NIST have been playing.
First of all, according to their draft recommendations, when you have -- an elliptic curve
over a finite field can be lifted to a curve over a complex multiplication field, and they
insist that number field must have degree greater than 10 million. So if it lifts to a field --
like Voltage's curve lifts to a curve over the rational numbers, which has degree 1. So it
very much violates this, to say the least. That also precludes the anomalous binary
curves, which have complex multiplication by a class number 1 number ring.
>>: [inaudible]
>> Neal Koblitz: Why? Well, to -- presumably to prevent the possibility that someday
someone will find a way to use the theory of global elliptic curves over those number
fields. They want the number field to be big enough so that computations are not feasible
there in case someone finds a way to use computations there to -- in some sort of attack.
So it's based on speculation related to possibilities for future attacks.
Secondly, they inquire that -- they require that the embedding degree -- by the embedding
degree in an elliptic curve system one means the smallest degree of an extension of the
finite field into which the elliptic curve group can be embedded. So you can embed an
elliptic curve group into the multiplicative group of a finite field, but you have to go to a
field extension to do that. And in a randomly chosen -- if your parameters are random,
that extension degree will be astronomically big. And that's what they insist on. In fact,
they insist that the embedding degree, that's the degree of the field extension, should
be greater than Q minus 1 over 100.
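For concreteness, here is a minimal sketch of how an embedding degree can be computed: it is the smallest K such that the group order divides (field size) to the K, minus 1. The function name, the `limit` parameter and the toy numbers are illustrative assumptions, and the group order is assumed to be a prime not dividing the field size.

```python
def embedding_degree(field_size, group_order, limit=10**6):
    """Smallest k >= 1 with field_size**k == 1 (mod group_order).

    Returns None if no such k is found up to `limit`, which is what one
    expects for random parameters of cryptographic size.
    """
    power = field_size % group_order
    for k in range(1, limit + 1):
        if power == 1:
            return k
        power = (power * field_size) % group_order
    return None

# Toy numbers only: a subgroup of order 3 over the field of 11 elements
# embeds in the quadratic extension, since 3 divides 11^2 - 1.
print(embedding_degree(11, 3))
# A group of order 13 over the field of 7 elements needs degree 12.
print(embedding_degree(7, 13))
```

A brute-force loop like this only terminates quickly when the embedding degree is small, which is exactly the special situation that pairing-based cryptography relies on.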
>>: [inaudible]
>> Neal Koblitz: Pardon?
>>: [inaudible]
>> Neal Koblitz: Their requirement? Their reason for doing this? Well, certainly
avoiding very small embedding degree has a reason connected with the discrete log
problem in the finite field. Why it should be that big, I mean, you really have to ask
them.
So in particular, this immediately precludes all pairing based cryptography.
Now, they really want to err on the side of caution because, for example, they're saying
that if you use an elliptic curve that embeds in a finite field extension of degree K equal to Q minus 1
over 1,000, that's too risky. So that would mean if K has -- so that would mean Q is
presumably -- has about 160 bits, so K, if K has 150 bits, that's too risky.
>>: If I remember [inaudible] NSA and NIST [inaudible] and in two years 160
[inaudible].
>> Neal Koblitz: Okay, okay, I'm -- this is just, you know, just to give an idea. But let's
say that K has only 160 -- that the field size has 160 bits, in that case we're talking about
K of 150 bits. And you certainly wouldn't have it any less than that. So K, this excluded
K by Brainpool would have at least 150 bits.
Now, the fastest algorithm for the discrete log problem in that field has running time, this
number, the running time is about 10 to the 400 trillion operations. But still that's not
safe enough, according to Brainpool.
Well, in practice what they're really doing is they're insisting on random curves.
>> Kristin Lauter: So do you think they know any better algorithms for curves with
[inaudible] discriminant field?
>> Neal Koblitz: I don't think so, but I'm not privy to what they might know. But I don't
think so.
So in practice what they're really doing is saying we have to use random curves. So they
won't allow you to use the CM method, anomalous binary curve, supersingular curves, no
pairing-based cryptography.
Now, that's an incredible contrast, you know, between the two. So and at least I found
this surprising when I realized how extreme this was.
Now, one theory, some might be tempted to talk about German versus American national
traits. For example, in Germany, this -- my friend Johannes Buchmann who I
was visiting many years ago in Germany told me the story that in Germany it's -- there's a
law saying that motorists must carry some rubber gloves in their car. And I couldn't
figure out why. He explained why they have this law.
It's because there's a Good Samaritan law in Germany which means that if you see an
accident, you're required to stop and help the injured parties. But there's always a chance
that someone who's bleeding might be HIV positive, and in that case you still have to be
a Good Samaritan, but you're required to have these rubber gloves that you can use to
handle this.
And this to me sort of like epitomized a very cautious attitude towards life to have this
requirement, whereas in contrast the American stereotype is that Americans are very
happy to indulge in high-stakes, very risky gambling. So that's a tempting
explanation, but I think it's a bogus one. It's up in the air why it is that Germany went in
one direction and the NSA in another.
And as far as in our paper, we're completely agnostic on the question of who's right about
this. We're not claiming that NSA is being reckless and risky, or NIST or Voltage, and
we're not claiming that Brainpool is being ridiculously overcautious. An argument can
be made either way.
But, now, the irony of this is that it's not really a simple issue do you want to be extra
cautious or do you want to be a little bit reckless or what some people might consider to
be reckless, but Brainpool would consider to be reckless. It's not really a clear-cut issue
because one can imagine scenarios where Brainpool's approach might not be the safer
one even though they're insisting on random curves.
So there are various scenarios in which someone, and I'll call her Alice, who chooses
ECC with a special curve might end up better off than someone else who I'll call Bob
who chooses a random curve.
So these scenarios are suggested by recent work on isogenies. So let me just quickly
summarize that. But some people -- well, Venki [phonetic] I know is an expert on this
more than I am, so I probably -- and Kristin's done work on this too, so I shouldn't be the
one talking about isogenies, but I'll quickly go over the basic basics of what an isogeny is
between two curves.
So we have two curves defined, two elliptic curves defined over the field of Q elements.
An isogeny is simply a nonconstant rational map defined over FQ -- that is, an isogeny
defined over FQ is a nonconstant rational map defined over FQ that takes the point at
infinity to the point at infinity. Its degree is its degree as a rational map, which also in the
case we'll be considering is the order of the kernel of the isogeny.
For an isogeny, there's a dual isogeny going the other way, and so being isogenous is an
equivalence relation between elliptic curves, so this is a larger class than
isomorphism. Then there's a basic theorem by Tate related to curves over a finite field,
that they're isogenous over the finite field if and only if they have the same number of
points over that finite field.
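Tate's criterion makes isogeny classes easy to enumerate by brute force over a tiny field: count points and group curves with equal counts. The sketch below assumes a small prime field with P at least 5 and short Weierstrass form, which is a simplification of the general setting in the talk; the function names are illustrative.

```python
from collections import defaultdict

def count_points(a, b, p):
    """Number of points on y^2 = x^3 + a*x + b over F_p, including infinity."""
    roots = defaultdict(int)              # value -> number of y with y^2 = value
    for y in range(p):
        roots[y * y % p] += 1
    return 1 + sum(roots[(x**3 + a * x + b) % p] for x in range(p))

def isogeny_classes(p):
    """Group nonsingular curves over F_p (p >= 5) by point count.

    By Tate's theorem, two curves over F_p land in the same bucket
    exactly when they are isogenous over F_p.
    """
    classes = defaultdict(list)
    for a in range(p):
        for b in range(p):
            if (4 * a**3 + 27 * b**2) % p != 0:   # skip singular equations
                classes[count_points(a, b, p)].append((a, b))
    return classes

for order, curves in sorted(isogeny_classes(7).items()):
    print(order, len(curves))
```

Each printed line gives a group order and the number of equations over the field of 7 elements with that order; the buckets list equations, not isomorphism classes.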
Now, from a computational standpoint it turns out that low-degree isogenies are easy to
construct but high-degree isogenies are usually not, especially if you don't have an
explicit form of the isogenous curve; that is, if you're just given a curve and you want to
construct an isogeny of some large prime degree, that's a very hard computational
problem.
Now, I also have to talk about endomorphisms here. So if we have an elliptic curve over
the field of Q elements, the trace is the difference between Q plus 1 and the number of
points.
An endomorphism is an isogeny from the curve to itself that's defined over the algebraic
closure. We'll be considering the nonsupersingular or ordinary curves
E, which means that T is prime to the characteristic of the field. In that case, all
endomorphisms defined over the algebraic closure are actually defined over the field of
definition. That's the case we'll be considering.
So in the ordinary case, the case of ordinary elliptic curves, which is the usual case, the
endomorphisms are all defined over the field of definition itself.
Now, the endomorphisms form a ring that contains the subring of the obvious
endomorphisms of scalar multiplication. Now, the delta -- the discriminant of a curve
is the square of the trace minus 4Q, which is a negative number. And the CM field is a
quadratic imaginary field generated by the square root of the discriminant.
Now, that discriminant, if we write it in the form -- D stands for the discriminant of the
field, so D is a fundamental discriminant, the discriminant of a quadratic imaginary field --
delta in general is equal to the discriminant of the field multiplied by some square. And
that square plays a crucial role in classifying the possible endomorphisms that -- the
possible endomorphism rings that E could have.
Well, it turns out that the endomorphism ring of E is an order in the ring of integers of
that quadratic imaginary field. So this is all part of the sort of basic theory of complex
multiplication. But it's not necessarily the full ring of integers. In some cases it is, in
many cases it is. But in general it will be an order in the ring of integers of this
quadratic imaginary field of a certain index C which is called the conductor of the
endomorphism ring, not to be confused with other meanings of the word conductor.
So the conductor of the endomorphism ring of an elliptic curve tells you something
about -- tells you its index in the maximal possible, the largest possible endomorphism
ring.
So if you take all elliptic curves that are isogenous to the given elliptic curve, they can be
partitioned according to their endomorphism ring. Namely, the endomorphism rings are
determined by the conductors C, which are in 1-to-1 correspondence with the divisors of
that factor that's being squared in the discriminant.
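The bookkeeping just described can be sketched in a few lines: from Q and the trace T compute delta, split it as a fundamental discriminant D times a square C0 squared, and list the divisors of C0 as the possible conductors. The function names and the toy values Q = 7, T = 4 are illustrative assumptions, and trial division is only sensible for small numbers.

```python
def split_discriminant(delta):
    """Write delta < 0 (delta = 0 or 1 mod 4) as D * c0**2, D fundamental."""
    assert delta < 0 and delta % 4 in (0, 1)
    s, f = -delta, 1
    p = 2
    while p * p <= s:                 # pull out square factors by trial division
        while s % (p * p) == 0:
            s //= p * p
            f *= p
        p += 1
    s = -s                            # s is now the squarefree part of delta
    if s % 4 == 1:
        return s, f
    return 4 * s, f // 2              # f is even here because delta = 0 or 1 mod 4

def possible_conductors(c0):
    """Divisors of c0: one endomorphism class per divisor."""
    return [c for c in range(1, c0 + 1) if c0 % c == 0]

q, t = 7, 4                           # toy field size and trace
delta = t * t - 4 * q                 # -12
D, c0 = split_discriminant(delta)     # -3 and 2
print(delta, D, c0, possible_conductors(c0))
```

In this toy case the isogeny class splits into two endomorphism classes, with conductors 1 and 2.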
So these are basically all of the facts that we need, and I went through quickly because,
you know, it's part of the basic theory of complex multiplication and endomorphisms, and
it would take a lot of time to go into any more detail on where all this comes from.
Okay. Now, we ask how many isomorphism classes of elliptic curves are in a given
endomorphism class. The answer is that it's the class number of the order which is
related to the class number of the field. It's essentially proportional to the conductor; that
is, if in the case of conductor 1, it's the class number of the field. But if the conductor is
larger, there's a larger number of isomorphism classes with the endomorphism ring.
In a sense, the smaller the endomorphism ring, meaning the larger C is, the more curves
there are, the more different isomorphism classes there are with that particular
endomorphism ring. So that's basically what we need to know.
So, for example, if -- if the discriminant is square free, then all of the curves in an
isogeny class -- and I should have said before that in fact the number of isomorphism
classes in the isogeny class of an elliptic curve is of order the square root of Q.
Remember, the isogeny classes correspond to the number of points on the curve, and the
number of points of the curve falls in the Hasse interval, in which there are roughly 4
times the square root of Q possibilities, so there are roughly 2 times the square root of Q
possible isogeny classes, and there are of order Q elliptic curves, and so there are roughly
the square root of Q curves in each isogeny class.
Now, if delta is square free, then they all have the same endomorphism ring of
conductor 1. There's no -- C0 is just one. So that's the simplest case. They're all in the
same class.
If -- another special case, if [inaudible] large prime, then there are two endomorphism
classes. There's the endomorphism class consisting of a small number of curves whose
endomorphism ring is the full ring of integers; namely, the class number of the quadratic
imaginary extension. That's how many -- which will be a quite small number, probably.
And the remaining curves, the vast majority of them, will have endomorphism ring
of conductor C0.
Okay. Now, concerning the isogenies, let L denote a prime. Now, if there's a degree L
isogeny between two curves, then either the two curves have the same endomorphism
ring or else the conductors differ by -- in one direction or the other by a factor of L.
So if we have two endomorphism classes, by the conductor gap, we mean the largest
prime that divides one conductor and not the other. And that determines how easy it is to
go from one class to another using isogenies. So what we're going to be talking about is
going from one endomorphism class to another one using isogenies, and in order to
change the endomorphism class, we have to have the -- the gap between the conductors
has to be the prime degree of the isogeny.
Now, if there's a large conductor gap between two endomorphism classes -- that is,
there's a large prime that divides one conductor, the conductor of one endomorphism
class and not the conductor of another endomorphism class -- then one cannot go from a
curve of one class to a curve of the other by a string of low-degree isogenies.
So remember I said that in constructing isogenies the basic fact is if you're given a curve
and you want to construct a degree L isogeny where L is a prime and that's all you're
given, you've just got to construct this isogeny, if L is small you can do it, if L is extremely
large you can't. And so if there's a large prime that divides one endomorphism class's
conductor and not the other one, then in practice you can't go using isogenies from one
endomorphism class to another.
And, conversely, if there is no large gap, then by a result of people who were here at the
time, Jao, Miller and Venki, within an endomorphism class or among several
classes but with small conductor gaps, one can travel randomly and uniformly through
the set of curves by just a string of low-degree isogenies.
Now, the thing about isogenies, they allow one to transport the discrete log problem from
one curve to another. So the discrete log problem is random self-reducible within a set of
endomorphism classes with small conductor gaps. So what that means -- well, first, let
me give the definition. By the L conductor gap class we mean the set of all endomorphism
classes in the isogeny class of E that have conductor gaps smaller than L.
So what this means basically is that if you have -- if you were to find a faster algorithm --
let's say you found an algorithm that solved the discrete log problem in time T1 in a
certain proportion of all elliptic curves, that there's some criterion that if an elliptic curve
happened to satisfy it you could apply this new algorithm.
And so there's a certain proportion of weak curves. And let's suppose that the property
of being a weak curve is independent of the isogeny and endomorphism class, then if you
had an L conductor gap class, so you could travel freely around that class, then you could
solve the discrete log problem on any curve in the class in time T1 plus T2 over epsilon
where T2 is the amount of time it takes you to construct a low-degree isogeny.
So low degree means degree less than L, that you can -- using degree less than L isogenies
you can jump around randomly and uniformly in this class. And epsilon of course is the
proportion of weak curves. So it takes you time -- T2 divided by epsilon to find a weak
curve, and then T1 to solve the problem once you get there. And this of course only
works if the L conductor gap class contains more than 1 over epsilon curves so that you
have a good chance of finding a weak curve.
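The T1 plus T2 over epsilon estimate can be justified with a short calculation, under the simplifying assumption (not stated in the talk in these terms) that each step of the walk lands on a weak curve independently with probability epsilon. Here N is an illustrative name for the number of isogeny steps until the first weak curve is reached.

```latex
\Pr[N = k] = (1 - \epsilon)^{k-1}\,\epsilon
\quad\Longrightarrow\quad
\mathbb{E}[N] = \sum_{k \ge 1} k\,(1 - \epsilon)^{k-1}\,\epsilon = \frac{1}{\epsilon},
\qquad
\text{expected total time} \approx \frac{T_2}{\epsilon} + T_1 .
```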
So the whole point of this is that it's the possibility of random walks, random sort of
strings of low-degree isogenies through a conductor gap class that under certain
circumstances might make a random curve less secure than a special curve. So that's what
I want to give some examples of.
Now, notice, it's important to note that a random curve -- for a random curve you'd expect
that all isogenous curves are in the same conductor gap class because delta has negligible
probability of being divisible by the square of a large prime. So there just aren't going to be
any large primes around that could conduct -- that could divide the conductor of the
endomorphism ring.
So we'll look at some hypothetical scenarios. And all of this is hypothetical, and we're
not talking about algorithms that -- well, we're not talking about things that are occurring
in the real world at present with random curves.
Okay. So here is -- I'll have time for a couple of examples. Muller in '98 suggested some
curves for elliptic curve cryptography, slightly more general cases besides
the anomalous binary curves that are defined over the field of two elements. He
suggested some curves defined over very small degree extensions of F2. So here's an
example, one of his examples. Let's let Q be 2 to the 177th power. So this is 2 to a
composite degree. And let gamma -- that's supposed to be a gamma, but unfortunately
PowerPoint has terrible gammas, so it looks like a Y, but that's not Y, that's gamma.
So let gamma be a generator of the degree 3 extension of F2 satisfying gamma cubed equals
gamma squared plus 1, and let EB be the following elliptic curve. This is one of Muller's
elliptic curves, very similar to the anomalous binary curves but with -- defined over F8.
Its group order over that particular field extension is -- so this is a prime degree
extension of F8. It's a degree 59 extension of F8, and it turns out that its group order is
six times a prime of suitable size for elliptic curve cryptography, and that was one of a
handful of examples he suggested.
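The relation gamma cubed equals gamma squared plus 1 is enough to do arithmetic in F8. The sketch below represents elements as 3-bit integers (bit i is the coefficient of gamma to the i), which is an illustrative encoding, not something taken from Muller's paper, and it does not reproduce the curve equation, which the transcript does not record.

```python
MODULUS = 0b1101   # x^3 + x^2 + 1, encoding gamma^3 = gamma^2 + 1 over F_2
GAMMA = 0b010

def f8_mul(a, b):
    """Multiply two elements of F_8 given as 3-bit polynomials in gamma."""
    result = 0
    while b:
        if b & 1:
            result ^= a        # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1
        if a & 0b1000:         # degree reached 3: reduce using the relation
            a ^= MODULUS
    return result

def powers_of_gamma():
    """Return [gamma^1, gamma^2, ..., gamma^7]."""
    out, x = [], 1
    for _ in range(7):
        x = f8_mul(x, GAMMA)
        out.append(x)
    return out

print([format(x, "03b") for x in powers_of_gamma()])
```

The seven powers run through every nonzero element of F8 before returning to 1, so gamma generates the field as claimed.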
Okay. Now, suppose that Alice -- remember, Alice is the one who's using a special
curve, and she read Muller's paper and followed his suggestion and chose this E. Now,
she figures that solving the discrete log problem by the Pollard method, by the square
root attack, will take roughly 2 to the 84 operations. Now, there's a slight speedup
whenever you have a curve defined over a smaller field and then you work with it over an
extension. There's a speedup which is really quite small but still has to be taken into
account of the square root of the extension degree because of ways you can group
together points that are in the same conjugacy class under the Frobenius map of the
extension of finite field. So you can sort of group together points in sets of 59 points and
apply Pollard rho to those -- instead of to the set of points, you can apply it to the set of
conjugacy classes and get this little speedup.
That's about speeds -- that reduces your security by 3 bits.
So that's why she has 84 bits rather than -- okay. Yeah, normally she would have -- well,
also, okay, so it's 175 bit prime, so she would normally have 87 bits, but it's reduced to 84
bits because of this speedup.
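Alice's numbers can be reproduced with a rough estimate. This sketch takes the group order to be about 2 to the 177, divides by the cofactor 6, and treats the cost of the square root attack as the square root of the number of Frobenius classes; constant factors and any further equivalences (such as point negation) are ignored, and the function name is illustrative.

```python
import math

def rho_security_bits(log2_prime_order, orbit_size=1):
    """log2 of sqrt(prime_order / orbit_size), a rough square root attack cost."""
    return (log2_prime_order - math.log2(orbit_size)) / 2

LOG2_PRIME_ORDER = 177 - math.log2(6)     # group order ~ 2^177 = 6 * prime

plain = rho_security_bits(LOG2_PRIME_ORDER)
with_frobenius = rho_security_bits(LOG2_PRIME_ORDER, orbit_size=59)

print(f"without speedup: about {plain:.1f} bits")
print(f"with classes of 59 conjugate points: about {with_frobenius:.1f} bits")
print(f"loss: about {plain - with_frobenius:.1f} bits")
```

The loss is half of log base 2 of 59, just under 3 bits, which matches the drop from 87 to 84 described in the talk.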
Okay. Now, Bob thinks that Alice is foolish, first of all for having chosen a curve with
very special properties that not only allow for this one speedup that we know about but
who knows what else could result from choosing a special curve. So it could leave her
vulnerable to other attacks.
So Bob figures that he'll say fine, let's use -- if you want to work over the field of 2 to the 177
elements, let's do that, and he chooses a random curve over the same field with group
order -- 2 always divides the group order if it's an ordinary curve, but you could get twice
a 176-bit prime working over that field with random coefficients. Then he'll get 88 bits of
security rather than 84. And he'll also be less vulnerable to special attacks. So that's
what he figures. Okay.
So here is Bob lecturing Alice with condescension oozing from his voice that she was
really quite foolish to choose this very special curve. Well, he'll get more security
choosing a random curve, even using the field she wants to use.
And until recently Bob's reasoning would have appeared to be correct; that is, that you'd
be better off from a security standpoint using random coefficients over this field.
But some work in 2006 by Alfred Menezes and Edlyn Teske on Weil descent
shows that Bob might not have nearly the security level that he thinks he has; namely,
they found that a certain proportion of all elliptic curves over this particular field, the
field of 2 to the 177 elements, the same field that was in Muller's paper, that a certain proportion
of all elliptic curves with group order congruent to 2 mod 8, which is half of them, if
you choose randomly, are weak in the sense that the discrete log problem can be
transported to the Jacobian of a genus 3 hyperelliptic curve over the field of 2 to the 59
elements.
So [inaudible] you take a -- you take a curve -- an elliptic curve defined over a composite
degree extension of F2 and you transport the discrete log problem to a hyperelliptic curve
over a smaller field.
And if you're lucky, the genus of the hyperelliptic curve will be equal -- you'll get a group
of the same size. You might not, but in these cases you get a genus 3 curve over the field
of 2 to the 59 elements whose group order is also about 2 to the 3 times 59.
And there, as we saw, you have a four-thirds power, Q to the four-thirds algorithm, which
is about 2 to the 79th, is how long it takes to solve the discrete log problem on that curve.
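As a quick check of the exponent, here is a minimal sketch that works in log2 terms with q = 2^59. It compares the q^(4/3) running time quoted above with the q^(3/2) cost that a generic square root attack would have on a group of size about q^3. The function name is an illustrative assumption, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction


def attack_exponent_bits(field_bits, power):
    """log2 of q**power for q = 2**field_bits, as an exact fraction."""
    return Fraction(field_bits) * Fraction(power)


# q^(4/3): the faster-than-square-root algorithm on the genus 3 Jacobian
index_calculus = attack_exponent_bits(59, Fraction(4, 3))
# q^(3/2): square root of a group of size about q^3
square_root = attack_exponent_bits(59, Fraction(3, 2))

print(f"q^(4/3) is about 2^{float(index_calculus):.1f}")
print(f"q^(3/2) is about 2^{float(square_root):.1f}")
```

The first exponent is 236/3, a little under 79, and the second is 88.5, so the gap between the two attacks is on the order of 9 to 10 bits.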
And now this weak property is likely to be independent of isogeny class.
Now, in Bob's case the discriminant of his curve is almost certainly not divisible by the
square of a large prime, and so it will be feasible to use isogenies to transport his discrete
log problem along a random walk through the isogeny class.
Now, each isogeny in this case, given the current state of these algorithms, takes about 2 to
the 17 to construct. So in this case epsilon is 2 to the negative 58, so it will take just time
about 2 to the 75th to transport Bob's discrete log problem to a weak curve, if his group
order is congruent to 2 mod 8. Their results apply only in that case. So maybe Bob's
lucky -- what was lucky -- of course, Bob did this before he knew about the result, so he
had no way of knowing that he should avoid group orders congruent to 2 mod 8.
So if he was lucky, his group order is congruent to 6 mod 8. But there's a 50 percent
chance that his group order is congruent to 2 mod 8, in which case in time of order 2 to
the 75th, his discrete log problem can be transported to a weak curve so that he has
actually 79 bits of security, not the 88 bits that he thought, and not even 84 bits as Alice
has. Okay.
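The arithmetic behind the 2 to the 75th and 79 bit figures can be sketched as follows. The sketch assumes that the walk visits about 1/epsilon curves before reaching a weak one, that each step costs one isogeny construction, and that the total work is simply the walk plus the discrete log on the weak curve. The function and parameter names are hypothetical.

```python
import math


def transport_and_solve_bits(isogeny_bits, weak_fraction_bits, weak_solve_bits):
    """Return (walk_bits, total_bits), both as log2 of operation counts.

    isogeny_bits:       log2 cost of constructing one isogeny
    weak_fraction_bits: log2 of 1/epsilon, the expected number of curves
                        visited before a weak one is found (an assumption)
    weak_solve_bits:    log2 cost of the discrete log on the weak curve
    """
    walk_bits = isogeny_bits + weak_fraction_bits
    total_bits = math.log2(2.0 ** walk_bits + 2.0 ** weak_solve_bits)
    return walk_bits, total_bits


# 2^17 per isogeny, epsilon = 2^-58, and q^(4/3) with q = 2^59 on the weak curve
walk, total = transport_and_solve_bits(17, 58, 59 * 4 / 3)
print(f"isogeny walk: about 2^{walk}")
print(f"walk plus solving on the weak curve: about 2^{total:.1f}")
```

With these inputs the walk is cheaper than the final discrete log computation, so the overall security is set by the q^(4/3) step, just under 79 bits.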
So basically because of this difference between genus 3 and genus 1 where genus 3 you
have a faster than square root algorithm, the four-thirds power is not that much faster than
the three-halves power of Q, which is what a square root algorithm would give you. But
it's enough to make a difference of in this case 9 bits of security and put them in a
worst-case analysis. And of course Alice also has greater efficiency with her special
curve. So she sort of gets the last laugh on that.
Now, even if -- it turns out that Alice's group has group order congruent to 6 mod 8, but
let's just say for the sake of argument that the Menezes-Teske result applied to -- didn't
have that condition, nevertheless she'd still be safe because she was working with a
special curve. That's because her curve's endomorphism ring has conductor 1 and lies in
a small conductor gap class. Now, if you choose L, that capital L, to be 2 to the
66th, that's far above the range where you can construct isogenies. For a prime greater
than 2 to the 66th, you cannot construct an L isogeny.
Now, in this case, for her, her discriminant does have a very large square factor, of
course, as special curves always will. And this square factor has a large prime and then
an intermediate-sized prime. And so using isogenies, you can go from her curve, which
has conductor 1, to curves that have conductor 11,681. That's feasible. It's somewhat
time-consuming, but it's certainly feasible to go -- to go outside her very small -- her very
small endomorphism class to the endomorphism class with conductor 11,681.
You can do that, but the total number of curves both in her endomorphism class and in
the endomorphism class of conductor -- her endomorphism class is conductor 1 and the
larger endomorphism class of conductor 11,681, the total number of curves is
approximately 2 to the 16th.
So there's negligible probability that any one of those curves that you can get to using
isogenies from her curve will be susceptible to the Menezes-Teske version of Weil
descent.
So the so-called weak curves, remember, roughly 2 to the negative 58 of all
curves are weak in this setting. And because of the particular nature of her special curve
and the factorization of its discriminant, there's no way of using isogenies to get from
her curve to 2 to the 58 curves.
So it's highly unlikely that her discrete log problem can be transported to a weak one by
an isogeny walk. So what saves Alice is precisely the very special nature of her curve,
the fact that it has -- the endomorphism ring has conductor 1.
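To put a number on "highly unlikely", here is a minimal sketch that treats each of the roughly 2^16 reachable curves as weak independently with probability 2^-58. The independence is an assumption made for illustration, in line with the earlier remark that the weak property is likely to be independent of isogeny class, and the function name is hypothetical.

```python
import math


def prob_some_weak(num_curves, weak_fraction):
    """Probability that at least one of `num_curves` curves is weak, if each
    is weak independently with probability `weak_fraction`.
    Computes 1 - (1 - eps)^N via log1p/expm1 so tiny values stay accurate."""
    return -math.expm1(num_curves * math.log1p(-weak_fraction))


p = prob_some_weak(2 ** 16, 2.0 ** -58)
print(f"chance that a reachable curve is weak: about 2^{math.log2(p):.1f}")
```

Under this model the probability is essentially the product of the two numbers, about 2^-42.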
Okay. So that's an example with a composite degree extension field where these results
using Weil descent apply. Now I'll be more hypothetical and imagine algorithms that don't
at present exist, which Brainpool has done, so I figure they've opened the way to making
maybe outrageous speculations, so I'll do that too.
Now, in the next example, we'll take a prime degree extension of the field of two
elements. And in fact in almost all practical implementations of elliptic curve
cryptography, it's prime degree extension fields that are used in the characteristic 2 case.
So this is more realistic in practical terms in that sense.
Now, let's suppose, and here's where I'm being very speculative, that some version of Weil
descent or another approach someday leads to a faster than square root attack on a small
but nonnegligible proportion of curves defining over F -- defined over this prime degree
extension of F2.
Now, right now any of the really good Weil descent methods require a composite degree
extension of F2. But it's conceivable that that could change or that some totally
different attack would apply to prime degree field extensions of F2. So let's just suppose
that that happens. And let's look at the digital signature standard recommendations for
elliptic curves. In 2000 this recommended five specific elliptic curves over prime
fields and ten over binary fields. And they had five different binary fields at different
security levels, and for each one they suggested one random curve and one anomalous
binary curve.
Now, the largest case for greatest security is the degree 571 extension of F2 which should
provide plenty of security to protect a high-security AES private key. So that was
motivation for going up to that high of a degree.
Now, the conventional wisdom in line with what was on the early slide is that if anything,
if there is any difference in security level between the two curves that are recommended
for that field, the random one and the anomalous binary one, the random one, R571, is the
safer choice than K571, the anomalous binary curve they recommend.
However, that's the conventional wisdom. Let's suppose that a proportion epsilon of all
curves over this field could be attacked by this hypothetical algorithm. And let's also
suppose that the weak property of being susceptible to this new algorithm is independent
of isogeny and endomorphism class.
Now, the curve R571 has square free discriminant as random curves often do. And so the
isogeny walks can fan out from that curve throughout the isogeny class, which consists
of 2 to the 285 curves approximately.
So after approximately 1 over epsilon isogenies, whatever epsilon is, the DLP, the
discrete log problem can be transported to a weak curve.
But in contrast, the anomalous binary curve has discriminant -- the square free part is just
negative 7 because it's an anomalous binary curve. The discriminant for
an anomalous binary curve, whatever the extension field is, always has the form negative
7 times a square. And in this particular case the number being squared happens to be the product
of a fairly small prime and an extremely large prime.
What this means -- the endomorphism ring of the actual anomalous binary curve
always has conductor 1, because it has a very large ring of endomorphisms. That's in
fact why it's efficient, why it was possible for Jerry Solinas to develop these really nice
algorithms for point multiples. Because you have this tremendous ring of
endomorphisms to work with. So you have the full ring of integers of Q root negative 7
as endomorphisms, so it's conductor 1.
And the -- so if you take the 2 to the 262 conductor gap class of this curve, there are
only about 2 to the 22 curves. That is, there's the one curve, K571 itself, and then
there are also about 2 to the 22 curves that have -- whose endomorphism ring has
conductor equal to this 22 bit prime factor NC0. So if epsilon is much less than 2 to the
negative 22, less than 1 out of 4 million probability that the special attack will work, then
the discrete log problem probably cannot be transported to a weak curve by isogenies
because there just aren't enough curves in its conductor gap class to move around to.
And under these hypothetical assumptions, that special curve is likely to be safer than the
random curve. So that's -- so that's another example. Again, hypothetical example where
the random curve would be more dangerous.
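The comparison between the two curves comes down to whether the set of curves reachable by isogenies is large enough to be expected to contain a weak one. The sketch below works purely with exponents and uses the figures from the talk: about 2^22 reachable curves for K571 and about 2^285 for R571. The function name and the sample values of epsilon are assumptions chosen for illustration.

```python
def reachable_weak_curve_likely(class_size_bits, weak_fraction_bits):
    """True if a set of about 2**class_size_bits curves is expected to
    contain at least one weak curve when a fraction 2**-weak_fraction_bits
    of all curves is weak."""
    return class_size_bits >= weak_fraction_bits


for eps_bits in (20, 40, 60):
    k571 = reachable_weak_curve_likely(22, eps_bits)    # about 2^22 reachable curves
    r571 = reachable_weak_curve_likely(285, eps_bits)   # about 2^285 curves in the isogeny class
    print(f"epsilon = 2^-{eps_bits}: weak curve reachable from K571: {k571}, "
          f"from R571: {r571} (walk of about 2^{eps_bits} isogenies)")
```

Whether the walk from R571 is actually feasible still depends on the cost of each isogeny and on how small epsilon is, and the sketch does not model that.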
And finally, a final sort of setup: let me just suppose that we're worried that a new
approach to the discrete log problem might turn out to give a faster than square root attack
for a certain proportion, again a small but nonnegligible proportion, of curves defined
over a large prime field.
In that case, if we're thinking about that, we might want to choose our elliptic curve to be
in a very small L conductor gap class where L is large, so that an attacker could not use
isogenies to transport the problem to -- the discrete log problem to a weak curve. In that
case there's a very, very easy construction which is sort of fun to ask questions about,
some nice number -- analytic number theory question you can ask about this. Just choose
B to be a random prime of whatever order you need for security, and A a random even
number which satisfies a couple of conditions. You want P equals A squared plus B squared to be prime,
and you want N, which is either P plus 1 over 2 minus A or P plus 1 over 2 plus
A, to be prime.
In that case, the curve with the very special equation Y squared equals X cubed minus
alpha X has 2N points, where alpha is a quadratic nonresidue in the prime field, and
the quartic residue symbol of alpha depends on the sign that we chose when we defined
N.
And the trace then is plus or minus 2A and the discriminant is easily computed to be
minus 4B squared, and by construction B is a prime, which is why we did that, why we
chose B to be a prime.
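The construction can be tried out at toy sizes. The sketch below reads the recipe as: B an odd prime, A even, p = A^2 + B^2 prime, and n equal to (p+1)/2 - A or (p+1)/2 + A prime, with the curve y^2 = x^3 - alpha*x having 2n points for a suitable quadratic nonresidue alpha. It uses trial division and brute-force point counting, so it only works for very small p, and all function names are hypothetical. It is meant as a sanity check of the statements about the point count, trace and discriminant, not as a way to generate real parameters.

```python
def is_prime(n):
    """Trial division; fine for the toy sizes used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def count_points(alpha, p):
    """Brute-force count of points on y^2 = x^3 - alpha*x over F_p,
    including the point at infinity."""
    roots = {}
    for y in range(p):
        roots[y * y % p] = roots.get(y * y % p, 0) + 1
    total = 1  # point at infinity
    for x in range(p):
        total += roots.get((x * x * x - alpha * x) % p, 0)
    return total


def toy_special_curve(B, max_A=1000):
    """Search even A so that p = A^2 + B^2 and n = (p+1)/2 -+ A are prime,
    then find a quadratic nonresidue alpha whose curve has 2n points."""
    for A in range(2, max_A, 2):
        p = A * A + B * B
        if not is_prime(p):
            continue
        for n in ((p + 1) // 2 - A, (p + 1) // 2 + A):
            if not is_prime(n):
                continue
            for alpha in range(2, p):
                nonresidue = pow(alpha, (p - 1) // 2, p) == p - 1
                if nonresidue and count_points(alpha, p) == 2 * n:
                    return {"A": A, "B": B, "p": p, "n": n, "alpha": alpha}
    return None


curve = toy_special_curve(3)
trace = curve["p"] + 1 - 2 * curve["n"]
print(curve)
print("trace:", trace, "discriminant:", trace * trace - 4 * curve["p"])
```

For the parameters it finds, the trace comes out as plus or minus 2A and the discriminant as minus 4 times B squared, matching the description in the talk.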
Then this particular curve up there has conductor 1. It's easy to see that it has complex
multiplication by I, so it has complex multiplication by the full ring of integers, so it has
conductor 1. And it's the only isomorphism class in its B conductor gap class. So it can't
be moved anywhere using isogenies because you're never going to be able to construct a
degree B isogeny.
So all other isogenous curves have endomorphism ring of conductor B for any reasonable
K, it's not feasible to transport the discrete log problem from E to any other
isomorphism -- any isogenous isomorphism class.
Now, this curve is totally against the advice of Brainpool. It's a very concrete, very, very
special curve with no randomness in it, or hardly -- very little randomness in the
equation. And it completely goes against the advice of Brainpool.
Now, whether it's reckless to do this or wise to do that is just a judgment call. And we're
not saying that people should use that curve, it's just that if one's worried, speculatively, about the
possibility that this sort of algorithm might apply to only a proportion epsilon of curves, then
it might be reasonable.
So the conclusion is not that we should prefer special curves over random ones, I'm not
making an argument that, oh, it's bad to use random curves. And I'm not saying that
Brainpool is wrong. We're sort of agnostic on that question. Our only real point is that
we don't really know and that some humility in dealing with these issues is called for.
And I think one of the purposes of a lot of the joint work that I've done with Alfred
Menezes in recent years, especially our papers on provable security, is that there's a little
bit of an excessive tendency in the cryptography world to convey to the outside world an
impression of self-confidence and mathematical certainty about our recommendations
when there is some reason to wonder whether this self-confidence is justified.
So a lot of -- so the flavor of what we've tried to do with various papers including this one
is to call for some humility about expressing mathematical certainty about
recommendations.
So finally I want to put this in a sort of sociological context by talking about narrative
inversion, which is a term that applies when the farther the story that one tells and the
language that one uses are from reality the more fervently this narrative is repeated and
the more adamantly people insist that it's true.
So some examples of narrative inversion are when the U.S. says that it's defending freedom,
when macho guys use bravado to hide their insecurities. So here are some examples of
narrative inversion. And, you know, football, you see some examples of narrative
inversion there.
But in the world of cryptography -- well, another example from outside cryptography is
that very often people who work on social questions like to use the word science all the
time when they talk about social science, political science.
And in a sense, I think the reason why they use the word science -- usually if someone
really is doing science, they don't go around saying, oh, look at me, I'm doing science.
When someone's constantly using the word science in reference to their work, there's a
good chance it's a case of narrative inversion.
Similarly, in cryptography, when crypto researchers claim that their systems are provably
secure and that the rigorous methodology of provable security -- and this is from a new
book by Katz and Lindell's new textbook -- has transformed cryptography from an art to
a science, again using the word science in expressing this mathematical certainty,
something provably secure.
You have people reacting very strongly to -- here's Jonathan Katz -- I mentioned in my
abstract that I -- that I'm -- or I guess in my bio that I've been making a lot of enemies
recently as a result of my Notices article. And here's an example of someone getting
extremely upset at a discussion of some of the doubts that arise in the cryptographic
world.
This is what he accused me of: name-calling, sheer elitism, snobbery at its purest. And
he went on to say -- this is all from his letter to Notices, AMS Notices -- that despite my
criticisms of a lot of what goes on in provable security, so-called provable security, the
definitions, proofs and formal reasoning have helped cryptography progress from an art to a
science.
And this constant harping on how cryptography thanks to the methodology of provable
security has gone from an art to a science reminds me of a line from Shakespeare about
protesting too much. And that's really what narrative inversion's about. And I think a lot
of what goes on when we attach too much confidence to conventional wisdom and to
certain assumptions does veer off in the direction of narrative inversion sometimes.
So, anyway, that's what I want to say about this and I welcome questions or comments or
disagreements.
[applause]
>> Kristin Lauter: Questions?
>>: [inaudible] of BSI, also a two-headed animal like the NSA [inaudible] U.S.
>> Neal Koblitz: I don't know. I'm not familiar with BSI's different -- other differences
with NSA other than the one I talked about.
>>: So do you not think that it's not [inaudible] BSI or NSA [inaudible] approach is more
secure but that they actually might just [inaudible] to the other one? Because like the
method of NSA says it's the best because the way the inside does it [inaudible] they don't
want to tell us about? Last time we heard the NSA do this, they said use this special box
[inaudible].
>> Neal Koblitz: So you're saying that NSA is propose -- say this again.
>>: The reason that they're not -- they don't like BSI's methods because they know of an
attack for the way that BSI said is the safe one.
>> Neal Koblitz: Something of the form that I was talking about, something like an
attack that works on a certain proportion and they know they can get at a random curve
but they can't get at a special curve? I would tend to really doubt that. I mean,
theoretically, it could be -- I think there's a little bit of a tendency we have to
overestimate what NSA knows and what other secret agencies know. It's now -- I believe
it's certain, for example, that none of those agencies had thought of elliptic curve
cryptography.
It's known that in Britain they did think about something similar to RSA a few years
before RSA did, but they messed it up and they didn't appreciate its importance, they put
it on the back burner. They sort of didn't do it right. It was only in the academic world
where Diffie-Hellman and the RSA people and others understood how to do it right.
So I think sometimes we tend to -- because they're secret we tend to overestimate what's
there. And I'm not saying they don't do good work, but I'm just a little skeptical that they
would have some brilliant attack on random curves that has eluded everybody who works
in the open. That's just my personal view. And I think the history -- the things that have
become known about what these agencies do don't suggest that they're light years ahead
of everyone else.
But that would just be my opinion. It's conceivable, of course, that they know this. Some
people would turn it around and say, well, maybe, on the contrary, they know how to
break anomalous binary curves, and that's why they're recommending them so that people
will use them and then they can break them.
So people can imagine various scenarios, that they're either recommending something
because they know that the alternative is weak or that they're recommending something
because they know the alternative is strong. So my guess is neither.
>>: A question about that BSI [inaudible]. Do you know how the number behaves over
time? Once I looked at this in the mid-1990s, it was 100 and two years later it was 400.
>> Neal Koblitz: Of what number?
>>: Number 10 million that you have for the degree of --
>> Neal Koblitz: Oh, yeah.
>>: -- of the class field in this case. It was degree 100 in '95, 400 in '99 [inaudible] now
it's sort of 10 million. Do you know anything?
>> Neal Koblitz: I don't know. I don't know why degree more than 1 is necessary. So
I'm still stuck at 1. So between 400 and 10 million, I can't really see any basis even for
going above 1.
>> Kristin Lauter: I have a question. So just on the nomenclature we use, we say special
versus random. So it seems like this is also an example of kind of what you were saying
in the beginning where words can end up being taken to mean something that they don't
actually mean. So somehow these words special curve getting assigned to, you know,
something that has like, for example, [inaudible] field, a very small discriminant. So if
then you take that to mean special, which of course no one in the nonmathematical world
thinks that that's what the word special means, but in a lot of these cases, that is actually
what's happening, it just happens to be a field of small discriminant.
And then you take and you look at it from a different angle where you actually have this
issue of the large conductor gap to take into account, well, if you just use the English
language and use the word special to apply to the ones that were on the wrong side of the
large conductor gap problem. And, I mean, you could flip the whole thing around. And
so I was wondering how intrinsic the words special and random are in your title. I mean,
it seems like...
>> Neal Koblitz: Well, the story about nonhyperelliptic curves supports what you say,
that it's tricky sometimes to say which is special and which isn't special. Although, I
think usually people mean if something's chosen randomly in a certain set, you have
negligible probability of getting something with a certain property, then that -- then that's
distinguished from particularly looking for something with that particular property. So I
guess in some sense that's a well-defined distinction.
And in the hyperelliptic and nonhyperelliptic curve, I don't think anybody would use the
word special for nonhyperelliptic curves and calling -- I don't think anybody would call
hyperelliptic curves the generic case and nonhyperelliptic curves special. But if you look
at Diem's algorithm, it's almost as if it was that way. So his algorithm could be viewed
as showing a weakness in a special -- in a generic curve as opposed to a special curve.
But as you say, we could just sort of reverse the meaning of the word -- we could just
reverse the usage of the word special and that would take care of it.
But these words do have reasonable accepted meanings, but it's just that the reasonable
accepted meaning at least if we use it to be consistent with real-world uses of the word,
with dictionary uses of the word, there might be some surprising consequences where
something special turns out to be safer than something general or something simple turns
out to have more complexity in its discrete log problem than something complicated. So
we get a sort of discrepancy between the commonly accepted connotation of a word and
what actually happens in cryptography.
And I think that's also true -- we were talking about this earlier -- about the word
provable -- term provable security. That the terms carry a lot of baggage, a lot of
connotation that comes from their outside world uses. And sometimes when we use
words in cryptography that have this baggage attached to them, people can get confused
and can assume that certain things are going to happen, and the reverse might happen.
People can expect a certain level of certainty when they hear the word provable. Well,
as -- I think it was Lars Knudsen once said if something is provably secure, then it
probably isn't. And sometimes that can happen, that a word seems to be implying
something but what actually might happen is the reverse.
But whether this can be fixed by just changing the usage of the words or just maybe
avoiding loaded words entirely or just avoiding putting too much confidence in
conventional wisdom sort of being a little bit sensitive to the tendency to allow the
terminology to take us farther in interpreting something than we have any right to go with
it. You know, that sometimes the terminology develops its own momentum and people
conclude certain things, they conclude they can have a lot of confidence because
something's provably secure or they conclude that they should make a random choice
because it sounds right or the terminology suggests that. So I think we have to be
cautious whether we can accomplish anything by changing our word usage, maybe
sometimes we can do that too.
>> Kristin Lauter: Any questions?
>> Neal Koblitz: Yes.
>>: [inaudible] that get it is random curve [inaudible] reduction [inaudible] hypothetical
attacks, do you have any estimates -- I know obviously it's all hypothetical [inaudible]
introductions [inaudible].
>> Neal Koblitz: Well, we have no way of knowing because it is hypothetical. A weak
curve -- by the term weak curve I meant a curve for which a quicker algorithm is known.
Now, in this case, we were talking about -- we were talking about genus 3 curves. These
are still exponential time algorithms. They're an improvement over square root attacks
but not a super dramatic improvement. So that's why it was just 9 bits improvement.
Now, clearly if we had -- but nothing like this is remotely known, but if we had say a
polynomial time attack on the discrete log problem for a certain proportion of the curves,
then conceivably the time that it would take to solve the problem would be just
dominated by the isogeny walk. In other words, once you found a weak curve, you'd be
home free because you'd have a very quick algorithm.
So you could imagine a situation where the only real obstacle to solving the discrete log
problem on a random curve was implementing the isogeny walk to find a weak curve.
So depending on what happens -- this is all hypothetical, there's no way of knowing, a
weak curve might mean very weak or just a little bit weak. And a weak curve might
occur with very little frequency, in which case it could take a long, long time to find one,
but maybe once you found one, the actual algorithm would be very quick. Or as in this
case, where we actually do have an algorithm using Weil descent, you only get 9 bits.
So it really depends. And since this is hypothetical, no way to know.
>> Kristin Lauter: Okay. Well, let's thank Neal again for his [inaudible].
[applause]