>> Justin: Thank you, again, everybody for coming. It's bittersweet because while I'm seeing
less people in the audience in person, it makes me happy that I know I'll be able to afford all of
the gifts that I'm going to be buying for attending all six. There is a sign-up sheet in the back.
Even if you haven't been to all six it would be great to put you on that. We won't over spam you
with e-mail, but what I would like to do is at least give you summaries of kind of where you can
go for additional crypto stuff, any other advertisements that Josh has, links to the Resnet
broadcasts so that you can share amongst your team or go watch again because, I don't know
about you guys, but I don't feel very smart now that I'm in session four. I know there's a lot
smarter people than me. I think they're sitting down and not standing up talking, so I always
want to introduce you as like the guy that we should depend on for crypto information, but I am,
again, pleased that you are here to do this and I think that we all have a lot to thank you for with
the information that you're giving us. I hope to get a little bit to take away for myself even
though some of it is over my head now, but hopefully not for the rest of you guys.
>> Josh Benaloh: Hopefully, this will be motivating.
>> Justin: Yeah, I'm, hopefully like some that are like me. We still keep coming and we still
learn from some of the areas that we do understand. Let's give a round of applause for Josh.
[applause].
>> Josh Benaloh: All right. Thank you Justin. And I don't know. If you're really nice to him
you might convince him that 5 out of 6 is enough. I don't know. It's not up to me; it's up to him,
right? Okay. And a quick warning as I get started, we'll have some time for questions afterward,
but I have to get to SeaTac. My flight was originally scheduled for four o'clock which meant
that I would have to run out, but now it's 4:30, so I don't have to run but still, I, yes. Anyway,
and actually I'll be -- on another trip I get back just before the session in two weeks, but that I
have a little bit more leeway. Okay. Where are we? Last time we talked about the basics of
asymmetric cryptography. We talked about the Diffie Hellman protocol. We talked about the
Rivest, Shamir, Adleman protocol, best known as RSA and the digital signature algorithm or
DSA. What we're going to do today is talk about special forms, non-integer forms of
asymmetric crypto. In particular, elliptic curves, elliptic curve systems and lattices and
lattice-based systems. Most of the time we will be on elliptic. But before we do that, I want to start
with something that probably logically belongs with the last session, but timing wise it's good to
squeeze it in here. All of those three things that we talked about last time began with the
following in some form or another. Pick a large prime or a couple of large primes and then. It
was pointed out to me by a few people afterwards that I never really mentioned how you get
large primes, so I'm going to spend just a few minutes on that to finish covering that. And in
fact, everything we do today is also going to require being able to pick large primes, so it does fit
in here as well. The question we have to answer is, how do we do this? How do we find big
primes? The basic technique is pretty simple. You run through a loop something like this. Pick
a large random number, see if it's prime. If not, repeat. Okay. So how long is this going to take,
first of all? There is a very important theorem in number theory called the prime number theorem that
says how many primes there are. One out of every .7n n-bit integers is prime. That .7 actually
approaches natural log of 2. That's where that comes from. But if you're looking for 100-bit
primes, one out of every 70 100-bit integers is prime. If you are looking for thousand-bit primes, 1 out of
700, so if you just follow this simple loop, you have to go through it about 700 times to find each
of the thousand-bit primes that you need to select to set up RSA, for instance. Okay. I have
begged the question though of how do you check to see if it's prime. Now that we know the
prime number theorem you know how many primes there are. How do you do this check? If
you remember from last time, we talked about Fermat’s Little Theorem. This nice little theorem
that says if you have a prime number p then if you take any x, x to the Pth power is congruent to x mod
p. If you reduce it mod p you get the same thing as x mod p. In particular, if you take
something smaller than p, raise it to the Pth power, you should get back to where you started
from. This is true for all primes. If you take an x and raise it to the Pth power and you get
something else back, you know you've got something that's not prime. We got a really good way
of finding not primes, which isn't exactly what we want here, but it turns out that this, while it's
true for whenever you have a prime, it's almost never true in a practical sense if p is not prime.
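The condition just stated can be tried directly. This is a minimal sketch, assuming Python's built-in three-argument `pow` for modular exponentiation; the name `fermat_check` and the sample values 101 and 15 are illustrative choices, not from the talk.

```python
def fermat_check(n: int, x: int) -> bool:
    """Return True if x**n is congruent to x mod n (the Fermat condition)."""
    return pow(x, n, n) == x % n

# For a prime such as 101 the condition holds for every x.
print(all(fermat_check(101, x) for x in range(2, 100)))  # True

# For a composite such as 15 it typically fails: 2**15 mod 15 is 8, not 2.
print(fermat_check(15, 2))  # False
```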
There are a few very screwy minor exceptions and such, but in practice, if you pick a big random
number and then you pick something smaller than it, it's going to have to be bigger than one, but
pick something random in the range of 2 to p-1 or p-2 probably, but anything random is going to
be somewhere in the middle. Raise it to the Pth power. If you get x back, it's prime. This is
how we check for primality in our code and I think we've done it millions of times and I would
wager that we have never randomly picked something that was not prime and had this check
come back even once. In practice, we run it several times to be sure, but it's almost never the
case. We can actually do a little bit better than this. I just want to say quickly, that we can speed
things up by this trick. Instead of just picking a large random number, let's pick a large random
number that we think has some good chance of being prime and test only those instead of just
testing every random number out there. So check if that's prime and go back. It's a slightly
smarter version of the prime generation protocol. And the way of being slightly smarter is to
introduce a sieve. The idea of the sieve is we pick a random starting point and we figure out, we
have this array. Maybe it's a thousand bits long; it's just a bit array, maybe a thousand long. And
we find the first value in this array; just do a quick calculation that's divisible by two. So all this
is is take n mod 2 and figure out okay, that's where it is, and then we say okay. Once we have
this first multiple of two, that's also a multiple of two, that's also a multiple of two. We just go
through the array and set all of those values to one. And then we figure out where the first
multiple of three is. We set that one and then we go through every third set. We don't even test
what's already there. If it's zero or one, it doesn't matter we set it to one, set it to one set it to one,
go through there. Go through five, set it to one, set it to one, set it to one. Go through small
primes. In practice, maybe we keep a list of the thousand smallest primes and go through
this. But you see after not very long, you have only a couple of reasonable candidates left and
we will do the primality testing just on the remaining candidates. This is a good way of thinning
the herd very quickly so you don't waste a lot of time doing this large exponentiation when you
could just quickly find the likely primes here. Now this does, we have to be a little bit careful
and introduce some skewing. We're not getting random primes anymore. If you want really
random primes, you shouldn't use a sieve like this because, imagine these are both prime here.
If what I do is take the first prime I run into, I checked that and that's prime. I stop. Then I am
much less likely to pick that as my prime than that. I could say maybe I will pick the second one
first. Well, you are going to get into problems. If you've got just one prime in the range of your
sieve versus two primes in the range of your sieve, the range that has just one prime, that prime
is far more likely to be picked than one of the primes in the range that has two primes, no matter
how I do it. Yep, question?
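The sieve-then-test loop described in this part of the talk might look roughly like the sketch below. It is not the production code the speaker refers to: the short `SMALL_PRIMES` list stands in for the "thousand smallest primes", and the function names, the window length, and the number of test rounds are all assumptions made for illustration. Because it returns the first candidate in the window that passes, it has exactly the skew discussed above.

```python
import random

# Stand-in for a longer table of small primes (assumption: the talk suggests ~1000).
SMALL_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

def fermat_check(n, rounds=5):
    """Fermat test with a few random bases; True means 'very probably prime'."""
    for _ in range(rounds):
        x = random.randrange(2, n - 1)
        if pow(x, n, n) != x:
            return False
    return True

def sieve_candidates(start, length=1000):
    """Return the numbers in [start, start + length) with no small prime factor."""
    marked = [0] * length                 # 1 means "divisible by a small prime"
    for q in SMALL_PRIMES:
        first = (-start) % q              # offset of the first multiple of q
        for i in range(first, length, q): # then every q-th entry, no testing
            marked[i] = 1
    return [start + i for i in range(length) if not marked[i]]

def find_prime(bits):
    """Pick a random window, sieve it, and Fermat-test only the survivors."""
    while True:
        start = random.getrandbits(bits) | (1 << (bits - 1))  # force the top bit
        for n in sieve_candidates(start):
            if n.bit_length() == bits and fermat_check(n):
                return n

print(sieve_candidates(1000, 20))  # [1009, 1013, 1019]
```

Calling `find_prime(1000)` would do the expensive exponentiation only on the few candidates the sieve leaves in each window.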
>>: How much space would say the first thousand bit primes [indiscernible]? Can you store
them all once, store them once and pick one at random?
>> Josh Benaloh: No. Definitely not, because first of all, there are huge numbers of thousand
bit primes. Basically, if you do the calculation, divide two to the thousand by natural log of two,
I'm sorry. Natural log of two to the one thousand, so a thousand times the natural log of two. So
you've got more than you could possibly store total. If you've got an amount that you can store
and somebody knows these are the ones you've stored, then they can just go through and look
and find all those. You want it to be really random each time. You don't want to store them or
keep them someplace where somebody else might run into them.
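The size of that count can be estimated with the calculation the speaker describes, two to the thousand divided by a thousand times the natural log of two. This is only a back-of-the-envelope sketch working in base-2 logarithms; the variable names are illustrative.

```python
import math

bits = 1000
# Prime number theorem estimate: about 2**bits / (bits * ln 2) primes below 2**bits.
# Work with log2 of the count so nothing overflows.
log2_count = bits - math.log2(bits * math.log(2))
print(f"roughly 2**{log2_count:.1f} primes of up to {bits} bits")
```

The exponent comes out a little above 990, far beyond anything that could be stored.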
>>: [indiscernible] and just pick one at random from the pack?
>> Josh Benaloh: Do you mean really the whole set of thousand bit primes?
>>: Yes.
>> Josh Benaloh: You can't possibly store that. There are not enough atoms in the universe to
store all thousand bit primes. There are something like 2 to the 250 particles in the universe and
there are something around 2 to the 900 something thousand bit primes, high 900s, in fact, for
whatever that matters. We introduced a little bit of skewing when we do this. It doesn't really
matter. We bound the skewing. We don't skew arbitrarily. Also, let me just quickly mention a
little story here because it's kind of fun. One of the first things I did when I came to Microsoft,
literally almost 20 years ago, not quite, about 19, was look at the code that was doing this and
looked at, it was doing exactly this. You start out, you sieve out twos, and I said wait a minute.
Do we really have to sieve out even numbers? We kind of know. There's a sophisticated theorem
in mathematics that says there are no large even primes, right? [laughter]. There's a small one,
but if you are looking for a big… So the step way back here of sieving out multiples of two, all
we had to do was just compress the array, take the array basically half-size and then every third
odd integer is divisible by three and every fifth odd integer, and effectively just use the same
table, exactly the same process. I literally changed four lines of code because it was just
interpreting where the starting position is a little bit differently and then it's the nth odd integer
afterwards instead of the nth integer afterwards, so the ith integer that you find. Almost exactly,
four, literally touched four lines of code. I got a Ship It award for NT4. [applause]. That might
be a record. But people came in to me the next day and said, hey it's 30 percent faster at finding
primes. What did you do? I applied this sophisticated theorem. [laughter]. Okay.
>>: That's a lot.
>> Josh Benaloh: Yeah, I was surprised it was that much faster, but it's spending a lot of time at
the beginning just going through. Anyway, that's it on primes. Any prime questions before we
go on? Okay. Now we can move into elliptic curves. The first question if you want to deal with
elliptic curve cryptosystems is just what are elliptic curves? Just a note, don't expect to see
ellipses here. These have practically nothing to do with -- the connection with ellipses is conic
sections in a very bizarre way, so don't think about ellipses here. Elliptic curves are something
different. By the way, there are a few ringers in the audience who are going to catch me on
anything I say, so we'll see if -- anyway. So what's an elliptic curve? So we go to the source of
all knowledge [laughter] and get a definition. In mathematics, an elliptic curve is a smooth
projective algebraic curve of genus one, with this specified point O with, in fact, abelian variety
with multiplication defined algebraically with respect to what is necessarily commutative
group et cetera et cetera. Okay, good. We know what an elliptic curve is [laughter], right?
Maybe we can do this a little more easily. An elliptic curve is something that looks like that. That's
an elliptic curve. And you could actually have an x squared term here or non 1 coefficient. You
could have a y, a linear term of y there, but it turns out that all of those things, you can remove
by simple change of variables and these sort of translations of things, so this is considered the
general form for an elliptic curve. Any elliptic curve can be shifted to look like this, so there are
just two constants to worry about and that describes the curve. Okay. So what does one of these
things look like if we try to graph it? They are really weird, right? There's x cubed -4x +.67.
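Written out, the general form with its two constants and the example being graphed are as follows. The slide itself is not in the transcript; the names A and B for the constants follow the speaker's later usage.

```latex
y^2 = x^3 + Ax + B, \qquad \text{for example}\quad y^2 = x^3 - 4x + 0.67 \quad (A = -4,\ B = 0.67)
```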
Those are the two constants there, -4 and .67. You get something that looks like this or
something that can look like that. Here are two. They look very different, but the only
difference is +1. Another, another really strange looking one, another, there. Why does it look
that way? Where'd you get that strange shape? To understand this, let's start off by eliminating
that square and just start looking at this. This is a curve that you probably all graphed in high
school. It's a simple cubic polynomial. It looks kind of like this. In particular, since the
coefficient here is one, then it starts off, there's a negative infinity down here on the left and
positive infinity down here on the right, up on the right. It looks something like this. It's got a
couple humps. In some cases those humps can move together and merge, but generally not. The
general case looks kind of like that. What happens when we put that square back? The first
thing that happens is now we only care about values that are positive. We want to take
effectively the square root of this curve. All of the places where this wound up negative go
away. We sort of flatten this out because we are taking the square root. I'm not going to show it
because it looks kind of the same. It's kind of flattened here. And then notice that if positive y is
a solution, then negative y is a solution to the same thing, so we have symmetry across the x-axis.
That's why we get something that looks kind of like that. The various forms come from
where these humps are in the cubic. If we start here and do the same thing, cut off the negatives
and reflect, we just get a simple curve over here, no extra parts. If we start there with both of
these, the local max and the local min above the x-axis, then when we cut things off we get
something that looks like that. These are the reasons for these different odd shapes, but they all
come from basically the same thing. Now, things do get kind of weird if you get either the min
or the max exactly touching the x-axis and not crossing it, so we just want to eliminate those
cases from our consideration there and to do that we just eliminate that possibility. We just rule
out that for the constants A and B. Now you know what an elliptic curve is. I have to spend a
couple of minutes telling you about some math, telling you about mathematical groups. If you
ever took a discrete math course or basic college algebra course, you may have seen
mathematical groups. A group is a set of objects together with an operator, a single operator.
I'm writing it as multiplication here. You could write it somewhat differently, and it satisfies
four properties. The first property is that one of the elements in this group is an identity, such
that if I apply that operator and the identity to some element in that group, any element in that
group, I get that element back, whether I apply on the left or on the right. That's the identity
property. There are inverses in groups always. Every element has an inverse such that if I apply
this operator to the inverse in the element, I get the identity, both ways. The third property is
associativity, which means I can group things in either way. Basically, if I just say A times B
times C, whether I do it as (A times B) times C or A times (B times C), I get the same thing. If you
think back, this is why Diffie Hellman works. In Diffie Hellman you are taking G to the A to the B
or G to the B to the A. You're still doing A times B Gs, but you're grouping them differently and
you need associativity to work. Whenever you have a group you have associativity and Diffie
Hellman will work. The final property is closure, which just says if I apply the operator to two
things in a group I get something in the group. Okay? So just to get a quick understanding of
these things, I'm going to give you some examples of some groups and some not groups. If I
take the integers, whole numbers, positive and negative including 0, 0 is the identity if I have
addition as my operator, right? I add 0, I get back to where I started, no problem. And all the
other properties, inverses exist. The inverse of 3 is -3 and et cetera. The integers with
subtraction, multiplication or division, none of those are groups. Maybe subtraction is a
little subtle. Is it clear why subtraction doesn't work? The property it loses on is that
associativity property. 1-1-1: (1-1)-1 is different from 1-(1-1). Associativity doesn't work there.
Multiplication, you don't have the inverse of 2 even. 1/2, that's not a whole number, that's not an
integer so you don't have inverses there. Division just messes up on all sorts of things because
associativity doesn't work. You don't have inverses. Division is not closed. Okay. Rational
numbers, fractions, things in the form A over B. Again, with the addition 0 as the identity still
works there. Again, subtraction, multiplication, division don't work, basically for the same
reasons except note, we get kind of close with multiplication. We have an inverse for two, one
half, that's in there, so we have it. The problem is we don't have an inverse of 0. If we take that
out, the nonzero rationals with multiplication, 1 is the identity, we do get a group. Okay? A
couple of other examples. The integers mod n, the finite set 0 through n-1, we do our addition
mod n as our operation. Zero is the identity. That's the group. The inverse of 1 is n-1. You add
them together and you get 0. The inverse of 2 is n-2. Okay? One other group I'll mention is the
integers with multiplication mod p and no 0, 1 through p-1, if p is prime. If p is prime then that
turns out to be a group. If p is not prime, it won't be a group. You won't get inverses of some of
the elements in there, but if p is prime it will always be a group. I'm not telling you how to
compute inverses, but you can always find the inverse of 2 is going to be p +1 divided by 2 and
since p is odd for most primes, p +1 over 2 -- p +1 is even divided by 2 and you will find
something, whatever. You can generalize that. It would take some time to show you how to do
division in there, but it's all doable. It works. Now we can get to elliptic groups. And the way
we're going to get there is we are going to look at what happens when you take an elliptic curve,
here's our generic elliptic curve, that form and intersect it with just a straight line. Here's a
typical straight line, any straight line that's not vertical can be written this way. Let's see what
happens. If we take the elliptic curve and that non-vertical straight line, we've just got these two
equations here. Substitute this in here for y and you get (ax + b) squared equals this. If I just
move the x’s around, I don't really need to do the calculation. This is cubic in x. X cubed plus
something x squared plus something x plus something equals 0 if I just moved things around
here. How many solutions are there to this? This is our friendly cubic equation again, or cubic
polynomial again. The solutions are wherever this crosses the x-axis, the zeros of the
polynomial. On a typical case there, we're going to have three solutions, but in general, if the
curve is up here, there's only going to be one solution. If it comes down to right where it just
touches and goes back up, the tangent case, very narrow case, you get two solutions. Here's
another common case. You get three solutions as it goes down to there. There are two solutions
again, but that is just a very narrow case. And then down here there's one solution. There's
always at least one. You've got either one intersection point or three intersection points most of
the time, between that curve and the straight line, but you can in this tangent case, these
exceptional cases, you can get two intersection points. Just want to bring in vertical lines also.
A vertical line is x equals C is a vertical line. How does that intersect this elliptic curve? You
have something very similar. We can substitute in here x equals C and you just get y squared
equals some constant. If that constant is positive, there are two solutions, y squared equals 4 has
plus and minus 2 as solutions. If the constant is negative, you get no solutions and if that
constant is exactly zero, you're going to get one solution. You've got those three cases, just one
fewer intersection points effectively. Zero and 2 are common; 1 is uncommon. Why am I telling
you all of this? Why should you care in the slightest? What I'm going to do, and I'm not
claiming that this is anything -- I just learned this from others. I'm going to take these relations
and make a group. The way I'm going to do it is I'm going to take two points on the curve, any
two points and the operator that I'm going to form is to say what happens if I take those two
points and I draw a line through them. If I've got two separate points, then I've already got two
intersections with a line, so the typical case is going to be a third intersection. Here's the place
where the third intersection is. Two points gives a unique third point, but just to make things a
little weird, I'm not going to take that point as the result. I'm going to take that point and take its
negation, flip it over the x-axis; that point is going to be the result. The group operation is going
to say take these two points, draw a line through them, hit the third intersection point, flip it and
that is going to be your result. Okay. Weird thing to do. It turns out though this gives you a
group. You go through all of the associativity and inverse stuff, you'll get a group by doing this.
How do you add a point to itself or multiply a point by itself depending on how you label the
operation? Here is where I'll use that tangent thing. It's sort of getting arbitrarily close to two
points, getting closer and closer together. The line going through those two points as those two
points actually merge becomes the tangent here that goes right along the curve. That tangent
case hits at exactly one other point, great, because I want a unique result. Take that one other
point, flip that and I've got the result. This handles almost everything. There are a few things
that are left and I just have to describe what to do in those few cases. Here's the point, sorry.
Here's a point and its negative, its inverse. There is the vertical line case when I draw a line
through that, that doesn't fit anything else. What I'm going to do for that case, or what is done
for that case, is create one more point and attach it to this elliptic curve to create my elliptic
group. That point is an artificial point. We call it I. Sometimes it's called the point at infinity.
It's going to handle that vertical line case. This special point also serves as the identity of the
group. Let's just see what happens if I take this point off at infinity and map it through this point
here that kind of goes down here because it was infinitely high, infinitely far away, it comes
straight down here, hits the opposite and then when I flip it like I've done with all the others, I
get back to where I started. It serves nicely as the identity, just intuitively. Here are all of the
operations on this curve to create an elliptic group. Once we have done that, you can go back to
high school for a while and do some geometry. You take two points here, x and the y-values
compute the third point. The main case is when x and y are different, two different points.
These are the equations you get. You can work it out for yourself. Tenth grade students should
be able to work this out. I am not going to do that here; I promise. There are a few other cases
when x1 and x2 are the same and y1 and y2 are the same and non-zero; this is the tangent case.
This is the case of adding a point or multiplying a point by itself. That, you get these equations.
Okay. We get similar equations. The final equations are you get the identity if the things are
negatives of each other and the identity composed with any other point is that point. The identity
composed with itself is the identity. These are all the rules for an elliptic group. Hooray. At this
point forget about all those curves; forget about all that geometry. These are just equations now.
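The slide's equations are not reproduced in the transcript. For a curve of the form y squared equals x cubed plus Ax plus B, the standard formulas for the rules just listed are the following, writing the operation multiplicatively as the speaker does and using λ (a symbol introduced here) for the slope of the line.

```latex
\begin{aligned}
\text{Distinct points } (x_1 \neq x_2):\quad & \lambda = \frac{y_2 - y_1}{x_2 - x_1} \\
\text{Same point } (x_1 = x_2,\ y_1 = y_2 \neq 0):\quad & \lambda = \frac{3x_1^2 + A}{2y_1} \\
\text{In both cases:}\quad & x_3 = \lambda^2 - x_1 - x_2, \qquad y_3 = \lambda (x_1 - x_3) - y_1 \\
\text{Negatives } (x_1 = x_2,\ y_1 = -y_2):\quad & (x_1, y_1) \times (x_2, y_2) = I \\
\text{Identity:}\quad & I \times (x, y) = (x, y) \times I = (x, y), \qquad I \times I = I
\end{aligned}
```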
You've got some equations. They form a group. I'm not proving that to you. I'm just asserting
that, but if you use those equations on points in the group you will get another point in the group
and everything will work well. Now you can do computation in elliptic groups. For any two
points, you can now compute their composition. You can compute u times v. For any point and
any integer, you can compute x to the rth power; just multiply it by itself r times. I want to be a
little bit careful here. I'm using the multiplicative notation here. I'm describing the group
operator as multiplication and saying repeating it is exponentiation. I think it works better for
cryptography in the things I'm going to show you to represent it multiplicatively. Most
mathematicians like to represent elliptic groups additively, so they'll talk about the operation as
addition and repeating the operation many times is just multiplication. It would be scalar
multiplication. Either way, it's exactly the same thing. It looks very different, but it's exactly the
same. We can do large exponentiations now if we want to. We just do the repeated squaring
trick that we saw in the first session or the second session. Early on this repeated squaring trick
gets us to a large power very quickly, so if we want to compute x to the 360th power where x is a
point on an elliptic curve we just sort of square things up and take the side multiplies of the
things that we wanted to get to x to the 360 without doing 360 of these elliptic curve operations.
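The repeated squaring trick works for any group operation, so it can be sketched generically. In the sketch below the function `power`, its `compose` and `identity` parameters, and the stand-in group (multiplication mod the prime 1009) are all illustrative assumptions; an elliptic group operation could be passed in the same way. The operation counter is there only to show the saving.

```python
def power(x, r, compose, identity):
    """Compute x to the r-th power with repeated squaring.

    Returns the result together with the number of group operations used.
    """
    result = identity
    square = x            # holds x^(2^k) as the loop advances
    ops = 0
    while r > 0:
        if r & 1:                          # this bit of r is set: multiply it in
            result = compose(result, square)
            ops += 1
        r >>= 1
        if r:                              # more bits remain: square again
            square = compose(square, square)
            ops += 1
    return result, ops

# Stand-in group for the demo: multiplication mod a small prime.
mul_mod_1009 = lambda a, b: (a * b) % 1009
value, ops = power(3, 360, mul_mod_1009, 1)
print(value == pow(3, 360, 1009), ops)  # True 12
```

Reaching the 360th power takes 12 operations here rather than 360.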
It's reasonably quick. One more thing to say before we show how to use this in crypto is we in
computer science like things to be finite. I mentioned that earlier. These elliptic groups are
typically very large or potentially very large. They could be infinite. What I described before
would be infinite. Picking a random number from an infinite set is very hard. Pick a random
integer. If I don't give you a bound that random integer is infinitely large because there are
comparatively few integers that are not very, very large. We want things to be finite. What we are
going to do is we've got a set of equations. Let me just finish this and I'll get to it. We're going
to do all of those calculations mod some prime, keep things finite, just the way we were doing
with RSA and Diffie Hellman and whatnot before. We'll do those operations mod a prime and
for some technical reasons we want that prime to be bigger than three because if you look at those
equations before, if you've got twos and threes in there you start dividing by 0 and things get
ugly. Pick a prime bigger than three, typically a large prime and we'll do exactly those algebraic
computations that are inspired by geometry but they are algebraic computations, mod some
prime. Question.
>>: Are you dealing with integers or floating points here?
>> Josh Benaloh: Integers. Well, I'm dealing with integers effectively. Dealing with integers
mod p, so these are integers and I will, whenever I divide I do a mod division which gives
another integer, but it's another integer always smaller than p.
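One standard way to carry out such a mod division is the extended Euclidean algorithm. The sketch below is an illustrative implementation, not the speaker's code; the names `inverse_mod` and `divide_mod` are made up for this example.

```python
def inverse_mod(a, p):
    """Return the inverse of a mod p using the extended Euclidean algorithm."""
    old_r, r = a % p, p
    old_s, s = 1, 0          # invariant: old_s * a is congruent to old_r mod p
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("a has no inverse mod p")
    return old_s % p

def divide_mod(a, b, p):
    """Return the integer c below p such that b * c is congruent to a mod p."""
    return (a * inverse_mod(b, p)) % p

print(inverse_mod(2, 7))    # 4, which is (7 + 1) / 2
print(divide_mod(3, 2, 7))  # 5, since 2 * 5 = 10 and 10 mod 7 is 3
```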
>>: The solutions, the x on the elliptic curve are floating-point numbers, right?
>> Josh Benaloh: Nope. Before I take that mod p, they are rationals, so they would be
represented as floating-point. We don't want to go there because that gets really ugly.
>>: [indiscernible]
>> Josh Benaloh: Instead of doing these equations over the reals or over the rationals and
getting sort of arbitrarily messy things with precision issues that we would deal with, we will
instead every time we pick a number it's going to have, a point is going to be xy where x and y
are both integers smaller than p. And every time we do these computations, we are going to do
those computations mod p and we're going to get results which are another point, which is two
integers less than p.
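As a rough sketch of what that looks like in code, the group operation can be written with all arithmetic mod p. The toy curve (y squared equals x cubed plus 2x plus 2, mod 17), the sample point (5, 1), and the names `compose`, `on_curve` and `IDENTITY` are assumptions chosen for illustration; real systems use primes hundreds of bits long. The division uses Python's `pow(x, -1, p)`, which needs Python 3.8 or later.

```python
# Toy parameters, assumed for illustration only.
P_MOD, A, B = 17, 2, 2
IDENTITY = None  # the extra point I

# Rule out the degenerate curves where the cubic just touches the x-axis.
assert (4 * A**3 + 27 * B**2) % P_MOD != 0

def on_curve(pt):
    """Check y^2 = x^3 + Ax + B mod P_MOD (the identity counts as on the curve)."""
    if pt is IDENTITY:
        return True
    x, y = pt
    return (y * y - (x * x * x + A * x + B)) % P_MOD == 0

def compose(u, v):
    """Group operation on two points; every step is reduced mod P_MOD."""
    if u is IDENTITY:
        return v
    if v is IDENTITY:
        return u
    x1, y1 = u
    x2, y2 = v
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return IDENTITY                       # a point and its negative
    if u == v:
        # Tangent case: slope (3*x1^2 + A) / (2*y1), division done mod P_MOD.
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        # Two different points: slope (y2 - y1) / (x2 - x1).
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    slope %= P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

g = (5, 1)
print(compose(g, g))               # (6, 3)
print(compose(g, compose(g, g)))   # (10, 6)
```

Every result is again a pair of integers below 17 that satisfies the curve equation mod 17.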
>>: [indiscernible] if x and y are points on the elliptic curve, you probably have a choice to pick
only one integer. You pick x advantage here, then y is not another integer.
>> Josh Benaloh: Y is not an integer, but if you do -- let me go all the way back to just a picture
of the equation. This equation here, if I take this mod p, so I take x cube it, take some big integer
x, cube it, add ax, add b. a and b are integers. I can take a square root, mod p. I'll also get an
integer mod p. I could spend some time, and I would recommend if you really like this stuff,
play around with sort of small things, mod 11 mod 5 and whatnot. You'll find things like --
what's a good example? Two cubed mod 7 is 1. Two cubed is 8. So 1 has three cube roots.
It's got 1; it's got 2 and it's got -2, which is 5. 5 cubed is 125. If that’s right mod 7 should be 1. I
think I got that right. Something seems wrong there, but anyway. 4, or 4, yes, because negative
it's not going to be, because other, yes. Thank you. See I told you I had ringers in the crowd. 5
cubed is -1.
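The small example worked through in this exchange is easy to check by brute force. This is a minimal sketch using Python's built-in modular `pow`; the variable names are illustrative.

```python
p = 7
# Every nonzero x below 7 whose cube is 1 mod 7.
cube_roots_of_one = [x for x in range(1, p) if pow(x, 3, p) == 1]
print(cube_roots_of_one)       # [1, 2, 4]

# 5 cubed is 125, which is 6 mod 7, and 6 is the same as -1 mod 7.
print(pow(5, 3, p), -1 % p)    # 6 6
```

So the three cube roots of 1 mod 7 are 1, 2 and 4, and 5 cubed is -1, as the correction from the audience says.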
>>: [indiscernible] operation for these curves and we call it a group, but now we are calling it a
finite field. Don't we need another operation to call it a finite field?
>> Josh Benaloh: I haven't called it a finite field. I've carefully avoided the term field here, or
finite field. We're doing one operation and only one. A field has two operations. Do I have
field anywhere here? If I do I didn't mean to.
>>: On the next one?
>> Josh Benaloh: Oh yeah, finite field. Okay, sorry. Yes. I didn't say what a field is, so
basically you have two operations in a field. Think of the real numbers with addition and
multiplication and that gets you a field, but I don't want to go into fields. Basically, all the things
we have to do here are just mod p with a one elliptic group operation on top and the arithmetic at
the base is all mod p. Think of it that way. What it gets you is division works and back to this
equation, see, over here there is a division. Here there is a division and divisions there. You
need to be able to do division, you know, mod 7 what's three divided by two? Three divided by
two is the thing that you can multiply two by to get three. It turns out to be five. 2x5 is 10; mod
7 is 3. How I computed that, that is mostly trial and error, but there is a way of computing that
with big numbers. I could show you. We could spend some time, but it's called the extended
Euclidean algorithm. It's not that hard, but I don't want to spend the time. We're doing these
operations mod p. Everything is mod p. This is just algebra now, no geometry, doing all this
mod p. We're good. Once we have that we have this notation, E sub p of A, B refers to the
elliptic group that you get when you take this curve and you do the operations mod p. That's our
notation. Now we get back to crypto. Remember Diffie Hellman. Long time back, Diffie
Hellman is the process of pick a prime p and some starting point g. These are agreed to public
values. Alice over here takes a random A; g to the A is her public key. Bob picks a random
private key B; g to the B is his public key and they exchange the values and they apply their
private keys to what they received and they get back a common key. Diffie Hellman. Can we do
Diffie Hellman over elliptic groups instead of over the integers? What has to change? Here we
are starting with a public point on elliptic curve of that form. G is that. Then we're doing the
same exponentiation repeating the group operation and this is the group that we are working in.
We're not doing multiplication over the integers anymore or multiplication mod p anymore.
We're doing these multiplications in this elliptic group according to the equation that I showed
you earlier. Down here there is another exponentiation, so we'll do that in the group. And then
this comes out exactly the same. We still get a common key that way. Diffie Hellman works
really the same way here or in any group, so why do we care what group we are doing it in? We
care because of how hard it is to break Diffie Hellman. What an attacker sees is this starting point
g and the two public keys that are exchanged by each of Alice and Bob. That is supposed to be g
to the A,B. I don't know what happened there. G to the A, B. Take a marker and okay. Sorry
about that.
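The elliptic-group Diffie Hellman exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the curve has the usual form y^2 = x^3 + Ax + B mod p, the toy values p = 17, A = 2, B = 2, the starting point G = (5, 1) and the two private keys are made up for the example, and the helper names (`ext_gcd`, `div_mod`, `ec_add`, `ec_mul`) are hypothetical. Division mod p is done with the extended Euclidean algorithm mentioned in the talk.

```python
# Toy elliptic-curve Diffie-Hellman over E_p(A, B): y^2 = x^3 + A*x + B (mod p).
# Every parameter here is an illustrative assumption and far too small for real use.

def ext_gcd(a, b):
    """Extended Euclidean algorithm: return (g, x, y) with a*x + b*y == g."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def div_mod(num, den, p):
    """Division mod p: the value you multiply den by to get num."""
    g, inv, _ = ext_gcd(den % p, p)
    if g != 1:
        raise ZeroDivisionError("denominator has no inverse mod p")
    return (num * inv) % p

def ec_add(P, Q, A, p):
    """The single group operation; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P and Q are inverses of each other
    if P == Q:
        slope = div_mod(3 * x1 * x1 + A, 2 * y1, p)  # tangent line
    else:
        slope = div_mod(y2 - y1, x2 - x1, p)         # line through P and Q
    x3 = (slope * slope - x1 - x2) % p
    y3 = (slope * (x1 - x3) - y1) % p
    return (x3, y3)

def ec_mul(k, P, A, p):
    """Repeat the group operation k times (the talk's 'g to the k'), by double-and-add."""
    result = None
    while k > 0:
        if k & 1:
            result = ec_add(result, P, A, p)
        P = ec_add(P, P, A, p)
        k >>= 1
    return result

# Public values agreed ahead of time: the curve and the starting point.
p, A, B = 17, 2, 2
G = (5, 1)

# Private keys (fixed here for reproducibility; a real exchange picks them at random).
alice_secret, bob_secret = 3, 10

alice_public = ec_mul(alice_secret, G, A, p)
bob_public = ec_mul(bob_secret, G, A, p)

# Each side applies its own private key to what it received.
alice_shared = ec_mul(alice_secret, bob_public, A, p)
bob_shared = ec_mul(bob_secret, alice_public, A, p)
assert alice_shared == bob_shared

# The small division example from the talk: three divided by two, mod 7.
assert div_mod(3, 2, 7) == 5
```

Note that the curve constants A and B in the code are the public curve parameters, unrelated to Alice's and Bob's private keys, which is the same naming clash the slide ran into.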
>>: On the previous slide I was confused by A and B. You have capital A and capital B in two
places. I can't see why they would be the same.
>> Josh Benaloh: Capital A…
>>: The one in the elliptic curve definition.
>> Josh Benaloh: Oh, yeah.
>>: Those are different A’s?
>> Josh Benaloh: Those are different A’s and B’s. I'm sorry.
>>: Those A’s and B’s are preset public?
>> Josh Benaloh: Yes. My bad, my bad. You know, I had a previous version of that where I
changed the variables to u and v and I changed them back here because I wanted to be consistent
with the Diffie Hellman I did earlier. My bad, so okay. This A and B has nothing to do with this
A and B. They were completely different, right? [laughter].
>>: Okay. So the entire elliptic curve, including A and B, is public and shared ahead of time?
>> Josh Benaloh: Yes. Initial points g, p, A, B here these are all public values. Once you've
done that please forget about that A and B and start with a new A and B down here. I'm sorry.
>>: Is there one elliptic curve that everybody uses?
>> Josh Benaloh: Generally there is a small set of elliptic curves that have been well vetted. I'll
talk about that in a little bit. In fact, we have a little project going on to find really good elliptic
curves. But generally, yes, you could generate a new elliptic curve every time, but we all
typically use one of a few. There are some that have come through NIST from the U.S.
government that people don't seem to want to use as much anymore as they did six months ago,
but they are still the most common ones. There are a bunch of curves around, common curves, and we
support the common curves. Back to this. The most effective attack on Diffie Hellman is
basically to compute discrete logs over the integers or over elliptic curves. If I could get one of
little a or little b then that's enough for me to compute. If I can get little a I have g to the b so I
can raise that to the little a power. So I have to compute one discrete log and that's the best
known attack of any kind. Over the integers there are some ways of doing discrete logs better
than effectively exhaustive search. It's something called the index calculus. It's a
subexponential algorithm. It's very slow. You get over a thousand bits and it becomes wildly
impractical, but you can do a thousand bits and you can't do an exhaustive search through a
space of 2 to the thousand. You can get some improvement. It's a real improvement. There is
no similar subexponential algorithm known for discrete logs in elliptic groups. Therefore, we
can get away with smaller primes, smaller sizes of things and still feel secure, at least secure
against best currently known attacks. Why do we want to use elliptic curves? Really, it’s
efficiency. The elliptic curves are a hot thing. You've heard a lot of people say yeah, that's the
new thing. We should use them. The big benefit is efficiency. Here are just some numbers, 160
bit elliptic curve takes roughly the same amount of time to compute discrete logs on as a 1024 bit
integer. It's roughly equivalent to 1024 bit Diffie Hellman or 1024 bit RSA which we feel is no
longer very secure. We want to go up a little bit higher. 256 bit elliptic curves are what we
typically use and we feel comfortable with that. That has roughly the strength and when we use
RSA now or Diffie Hellman now over the integers, we're typically using 2048. If we use 256 bit
elliptic curves, then we have shorter keys, shorter ciphertext. Everything is smaller than our
2048 integer algorithm. Okay? So there's a real opportunity for improvement there, especially if
you are on small devices. Why not? Elliptic curves have been studied, I'm saying, far less. Over
recent years they've gotten a lot more study, but still less. Integers have been, integer
factorization has been studied for centuries since the time of Gauss, probably even before that.
Not so much for elliptic curves, but they are seeming more and more robust, so we're feeling
more and more confident about them, but I would say still not quite as confident as with integers.
There's no fundamental reason why there could not be a subexponential elliptic curve discrete
log process found similar to integers. Ringers in the back, challenge me if I'm saying
something wrong there. We don't know of any. The trick is really sort of a notion of smallness.
There are small integers. We can look for small integers and take advantage of small integers
and do good things. There is no notion of a small point on an elliptic curve. Therefore, all of our
methods for integers which are targeted at finding small values and taking advantage of small
values don't seem to make sense there. But somebody might come up with some notion of
smallness which has equivalent properties, maybe somebody will find something. That's
possible. If that were the case then elliptic curves wouldn't suddenly become insecure for
cryptography, but 256 bit elliptic curves would suddenly become insecure and we would have to
go up to a larger size and we would lose benefits. Because elliptic curve operations are more
cumbersome, we only get a benefit if we have much smaller key sizes.
If we have, if things become subexponential, then all bets are off. Getting good performance often
requires use of special curves and there have been a lot of special curves in the past that have
been proposed and determined to not be so secure, so we have gone through a litany of special
curves that, we should use this because it's very fast. Yeah, it's really fast for attackers too, not
so good. In answer to what was said before, elliptic curve crypto requires the use of let’s say
sophisticated processes that generate really good curves that we can agree on and we've got, here
are the ones that have been agreed upon by the National Institute of Standards and Technology and
recommended for everybody and maybe not; maybe we should these for those. We don't expect
individual people to just pick your own favorite elliptic curve. With RSA, pick two primes,
multiply them together and you are good to go. We don't want people trying to do that with
elliptic curves. They are a little bit more delicate. There are some trade-offs. Okay. That said, I
want to say a little bit more about other crypto algorithms and elliptic curves. The digital
signature algorithm that we talked about at the end of last time, just like Diffie Hellman, goes
through very nicely if we are doing things over elliptic curves and elliptic groups instead of over
integers. RSA also works over elliptic groups, but it's insecure. The trick of only I know the
factorization and therefore only I can do the conversion, doesn't apply with elliptic curves, at
least any way that anybody has found. RSA as it's known doesn't work there, so that's why we
use DSA for signatures; RSA signatures don't work there. I want to take a few minutes unless
there are any questions on elliptic curves? Yep.
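The discrete-log attack described earlier in this answer can be illustrated with a small sketch over the integers mod p. It is an assumption-laden toy: the prime 23, the starting point 5 and the private keys are made up, `discrete_log` is a hypothetical helper that does plain exhaustive search (not index calculus), and the whole thing only works because the numbers are tiny.

```python
# Toy version of the attack: recover one private exponent by exhaustive search,
# then raise the other party's public key to it.

def discrete_log(g, target, p):
    """Exhaustive search for x with g**x % p == target."""
    value = 1
    for x in range(p - 1):
        if value == target:
            return x
        value = (value * g) % p
    raise ValueError("target is not a power of g")

# Public values (illustrative assumptions).
p, g = 23, 5

# Private keys, known only to Alice and Bob respectively.
a, b = 6, 15

# What the attacker sees on the wire.
g_a = pow(g, a, p)
g_b = pow(g, b, p)

# One discrete log is enough: with little a in hand, raise g^b to the a.
recovered_a = discrete_log(g, g_a, p)
attacker_key = pow(g_b, recovered_a, p)

# The attacker now holds the same common key that Alice and Bob computed.
assert attacker_key == pow(g_a, b, p)
```

The loop runs up to p - 1 times, so its cost grows exponentially in the bit length of p, which is why exhaustive search is hopeless at real key sizes and why the existence or absence of a subexponential method matters so much.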
>>: What exactly makes elliptic curves so fragile to using your own curves?
>> Josh Benaloh: There are a couple of things. As I said, we have a project that I'm not directly
involved in, so I can look at the people in back and see if you want to say anything. It's mostly a
matter of trying to balance performance very carefully against security. If you are willing to go
with larger curves, not a special form, then you can do pretty well in most cases. It's not a big
problem, but if you want to get sort of the optimal performance and squeeze everything out, then
if you take things that are too good in some ways you get into problems. As long as you don't try
to push the margins too much you don't get too much fragility, but if you try to squeeze out every
little bit of performance, that's where things get a little risky. Yep.
>>: So you mentioned [indiscernible] encryption with elliptic curves?
>> Josh Benaloh: The way it's typically done is you agree on a key and then you encrypt with
AES or symmetric cipher using that key. There is, you could do what's called ElGamal
encryption which is implicitly the thing you agreed on is your key immediately and you use it as
a one-time pad and just send over ciphertext with that as a one-time pad so there isn't a separate
step. Effectively, you generally do some sort of a symmetric step with the agreed-upon key,
which is what you do with integer Diffie Hellman as well.
>>: Channel encryption between two parties. With RSA you could do public-key encryption of
a key which is used to encrypt your text at rest, right?
>> Josh Benaloh: You can, although…
>>: Is there a way to do that with elliptic curve?
>> Josh Benaloh: You wouldn't typically use RSA in that form because the idea is you have a
public key. I can use the public key to encrypt, to get a key or to encrypt data and send it to you.
If I'm just encrypting locally for my own purposes…
>>: I'm encrypting it. You are my intended audience. I would use your public key to encrypt
the key which I could then leave at rest without communicating with you. And then I give you
the drive at some point and you can read the encrypted data.
>> Josh Benaloh: Right. We talked a little bit about different versions of Diffie Hellman and if I
have a static public-key, you can do the same thing with Diffie Hellman. You have seen my
Diffie Hellman g to the a; I'm Alice here. You've seen my g to the a. Now if I have the static
key you pick a b. You send me g to the b and you encrypt with g to the ab and you can do
effectively the same thing. Okay. Let me spend a few minutes and I'm really not going to spend
a lot of time talking about lattices, but they are interesting to know about. I have to answer the
same question that I answered before about elliptic curves. What is a lattice? Of course, once
again, we turn to the source of all knowledge and we get another beautiful definition. A lattice is a
discrete subgroup of R^n which spans the real vector space. It's generated from a basis, from
linear combinations et cetera et cetera, or let's make it a little easier. A lattice is something that
looks kind of like that. It's a set of points in a regular pattern and I'll say a little bit more about
what that regular pattern is. If we want a two-dimensional lattice, the nicest regular pattern is
just a square lattice like that, but they don't have to be square. This is rectangular. It starts to
look like an optical illusion if I do this. This is still rectangular but it's not square anymore. You
can have other nice tiling patterns effectively. They don't have to be triangular, things like that.
Where it comes from is a basis. Basically, a lattice is formed from a basis which is a set of
vectors and vectors, think of it as just points, from point zero to some point in the plane. What
we do is we take these vectors and take any linear combination of those with integer coefficients
and the points we get to are the lattice. A simple basis for a lattice, the square lattice would be
here is vector one. Here is vector two and then I can put it here and I can easily count how I get
from one point to another. To get from here to here it's two v1’s plus three v2’s. It's really easy
to see most of the time how to get from one point to another point. You have a nice simple basis.
But you can have a more complex basis for exactly the same lattice. Here's the same lattice. It's
generated by this basis, but now how do you go from this point to a neighboring point? You can
figure it out. You can work it out but it's not going to be quite so obvious anymore. You have to
add some of these and then subtract that. Sort of the difference between this and that is a
Knight’s move if you think of it. Over one, up two, now how do I get from a Knight’s move? If
I go to there, I've got a Knight’s move, maybe down to there and I can get back to here. You can
work it out, but now imagine this with not just two dimensions but a thousand dimensions which is
the kind of thing we do for crypto and it can get really ugly. Here's another case. Here's a
simple basis for this lattice. This is a pretty clean lattice and this is a pretty clean basis. And
here is an uglier basis for the same lattice. You generate exactly the same set of points by integer
combinations of these things. It's just that some are easy to see and some are harder to see.
There is something called the closest vector problem in a lattice. With a good basis, finding
nearby lattice points, if you are somewhere in the plane say, or somewhere in space, is easy. With
a skewed basis, more elongated ones, finding nearby lattice points can be very difficult. We
can use that for crypto systems in a couple of ways. Lattice-based cryptosystems typically look
like you generate a key by picking a nice clean basis for your lattice, things that are almost
rectilinear, things that have nice big angles between your vectors and you might have literally a
thousand different vectors. It might be a thousand dimensional lattice. You then transform your
basis into something that's really skewed and ugly and that's what you give to other people to
work with. They can't manipulate this lattice very easily. Once you have that you can do a
couple of things. Encryption could just be I give you the skewed basis. You use the skewed
basis to pick some lattice point and you perturb it by a little bit and your perturbation is
actually your message. It might be that it's a little down and right of a point or a little up and left
of a point and that might be a bit or there might be a few bits of sort of what direction you go
from that point. You decrypt by using the good basis to figure out where the nearby lattice point
really is. A more common way to do this is actually a little bit more complicated. You use
your message; you get a higher data rate if you do something like this. You use your message to
take the skewed basis and the linear combination becomes your message, so let's say I put my
message as zeros and ones. 0 times the first basis vector plus 1 times the second basis vector plus 1
times the third basis vector et cetera and sort of combine them and I get some point in the lattice
and then I perturb that a little bit. And I give that to you as the encryption. Knowing the good
basis and knowing how I transformed from the good basis to the skewed basis, I can go back and
figure out exactly which vectors you combined and how and that's the trick. So that's the basis
here. Once again, we have finite lattices. We don't want infinite things, the usual problem, so
we do our computation mod some large prime. The same questions. Why should we use lattices?
It's nice to have different algorithms that have very different designs. It turns out that discrete
log and factorization of integers are closely related problems and if somebody figures out how
to factor, Diffie Hellman would likely fall apart. In particular, lattice methods seem to be a lot
more resistant against quantum attacks, so I'm not planning on talking about quantum computers
now. If you want to hear more about quantum computers bother me another time when I'm not
about to run off to the airport. If quantum computers develop, lattices would be a very good thing
to have as a cryptographic method. Why not? They are unwieldy. The public keys tend to be
enormous. A thousand vectors, each vector tends to have -- this much in the first dimension and
this much in the second dimension and there are a thousand dimensions all from a thousand-bit
prime and now you have a thousand of those vectors. That's your public key. Sheesh.
Similarly, the encrypted data tends to be enormous compared to the amount of data that you are
actually transmitting, so they are kind of unwieldy. We don't really use them in practice a lot,
but it's nice to have them in our back pockets just in case.
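The good-basis versus skewed-basis idea from this part of the talk can be sketched in two dimensions. Everything concrete here is an assumption for illustration: the bases, the message bits and the perturbation are made up, the helper names (`combo`, `coords`, `nearest_point`) are hypothetical, the "find a nearby lattice point" step is done by simple coordinate rounding, and the arithmetic is over plain integers rather than mod a large prime. A real scheme would use hundreds of dimensions or more.

```python
from fractions import Fraction

def combo(coeffs, basis):
    """Integer combination coeffs[0]*basis[0] + coeffs[1]*basis[1]."""
    return tuple(sum(c * v[i] for c, v in zip(coeffs, basis)) for i in range(2))

def coords(point, basis):
    """Exact coordinates x with x[0]*basis[0] + x[1]*basis[1] == point (Cramer's rule)."""
    (a, b), (c, d) = basis
    det = a * d - b * c
    px, py = point
    return (Fraction(px * d - py * c, det), Fraction(py * a - px * b, det))

def nearest_point(target, basis):
    """Round the coordinates in the given basis and map back to a lattice point."""
    rounded = [round(x) for x in coords(target, basis)]
    return combo(rounded, basis)

# Private key: a clean, nearly rectilinear basis (assumed values).
good = [(7, 0), (0, 7)]

# Public key: a skewed basis for the same lattice, obtained from the good one
# by an integer transformation with determinant 1.
skewed = [combo((2, 3), good), combo((3, 5), good)]   # (14, 21) and (21, 35)

# Encrypt: the message bits are the coefficients on the skewed basis vectors,
# and the resulting lattice point is perturbed a little.
message = (1, 1)
lattice_point = combo(message, skewed)
perturbation = (1, -2)
ciphertext = tuple(x + e for x, e in zip(lattice_point, perturbation))

# Decrypt: the good basis finds the nearby lattice point, and the coordinates
# of that point in the skewed basis are the message.
recovered_point = nearest_point(ciphertext, good)
recovered = tuple(int(x) for x in coords(recovered_point, skewed))
assert recovered == message

# The same rounding done with only the skewed basis lands on the wrong point.
assert nearest_point(ciphertext, skewed) != lattice_point
```

Even in this tiny case the skewed basis rounds to a different lattice point, which is the effect the cryptosystem relies on; the size problem from the talk also shows up, since the public key is a full list of basis vectors rather than a single number.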
>>: [indiscernible] the method is unwieldy so like a thousand vectors, how big is [indiscernible]
>> Josh Benaloh: It depends. I've sort of sloughed over it because there are different
cryptosystems using lattices, but it could be that it's something like a thousand bits that you
transmit with a basis point that you've gotten in a thousand dimensions. So it's, what, a factor of
a thousand. Do you know roughly a factor of a thousand and something like [indiscernible], for
instance? The difference between a payload and the size of the encryption? Michael, do you
know?
>>: [indiscernible]
>> Josh Benaloh: Okay. Something like a thousand, factor of a thousand. Yeah, it's hard to
work with. Okay. Next session, a lot less math but we're not getting completely rid of the math.
Sort of vulnerabilities, attacks, practical considerations, these actually tie together because of a
lot of practical tricks that we use to make things more efficient that lead to our vulnerabilities.
[laughter]. And lead to attacks and whatnot, so it's important. We want to know about them.
We want to do them but we want to do them carefully because that has been our bane many
times. The final session, what I'm planning on is talking about some applications of some of these
things. If I'm permitted I'll squeeze in election protocols. I am flying out right now to Austin to
talk to them about, to work with them on their new election system design, so something I would
like to talk about if there's time. But if there are other things that people want to do we can do
that in addition, or instead, just let me know. Okay. Any questions? Good. Then I can get off
and make my flight. Okay. Thank you. [applause]