>> Justin: Thank you, again, everybody for coming. It's bittersweet because while I'm seeing fewer people in the audience in person, it makes me happy that I know I'll be able to afford all of the gifts that I'm going to be buying for attending all six. There is a sign-up sheet in the back. Even if you haven't been to all six it would be great to put you on that. We won't over spam you with e-mail, but what I would like to do is at least give you summaries of kind of where you can go for additional crypto stuff, any other advertisements that Josh has, links to the Resnet broadcasts so that you can share amongst your team or go watch again because, I don't know about you guys, but I don't feel very smart now that I'm in session four. I know there's a lot smarter people than me. I think they're sitting down and not standing up talking, so I always want to introduce you as like the guy that we should depend on for crypto information, but I am, again, pleased that you are here to do this and I think that we all have a lot to thank you for with the information that you're giving us. I hope to get a little bit to take away for myself even though some of it is over my head now, but hopefully not for the rest of you guys. >> Josh Benaloh: Hopefully, this will be motivating. >> Justin: Yeah, I'm, hopefully like some that are like me. We still keep coming and we still learn from some of the areas that we do understand. Let's give a round of applause for Josh. [applause]. >> Josh Benaloh: All right. Thank you Justin. And I don't know. If you're really nice to him you might convince him that 5 out of 6 is enough. I don't know. It's not up to me; it's up to him, right? Okay. And a quick warning as I get started, we'll have some time for questions afterward, but I have to get to SeaTac. My flight was originally scheduled for four o'clock which meant that I would have to run out, but now it's 4:30, so I don't have to run but still, I, yes.
Anyway, and actually I'll be -- on another trip I get back just before the session in two weeks, but there I have a little bit more leeway. Okay. Where are we? Last time we talked about the basics of asymmetric cryptography. We talked about the Diffie Hellman protocol. We talked about the Rivest, Shamir, Adleman protocol, best known as RSA, and the digital signature algorithm or DSA. What we're going to do today is talk about special forms, non-integer forms of asymmetric crypto. In particular, elliptic curves, elliptic curve systems and lattices and lattice-based systems. Most of the time we will be on elliptic. But before we do that, I want to start with something that probably logically belongs with the last session, but timing wise it's good to squeeze it in here. All of those three things that we talked about last time began with the following in some form or another. Pick a large prime or a couple of large primes and then... It was pointed out to me by a few people afterwards that I never really mentioned how you get large primes, so I'm going to spend just a few minutes on that to finish covering that. And in fact, everything we do today is also going to require being able to pick large primes, so it does fit in here as well. The question we have to answer is, how do we do this? How do we find big primes? The basic technique is pretty simple. You run through a loop something like this. Pick a large random number, see if it's prime. If not, repeat. Okay. So how long is this going to take, first of all? There is a very important theorem in number theory called the prime number theorem that says how many primes there are. One out of every .7n n-bit integers is prime. That .7 actually approaches natural log of 2. That's where that comes from. But if you're looking for 100-bit primes, one out of every 70 100-bit integers is prime.
If you are looking for thousand-bit primes, it's 1 out of 700, so if you just follow this simple loop, you have to go through it about 700 times to find each of the thousand-bit primes that you need to select to set up RSA, for instance. Okay. I have begged the question though of how do you check to see if it's prime. Now that we know the prime number theorem you know how many primes there are. How do you do this check? If you remember from last time, we talked about Fermat’s Little Theorem. This nice little theorem says that if you have a prime number p, then if you take any x, x to the pth power is congruent to x mod p. If you reduce it mod p you get the same thing as x, or x mod p. In particular, if you take something smaller than p, raise it to the pth power, you should get back to where you started from. This is true for all primes. If you take an x and raise it to the pth power and you get something else back, you know you've got something that's not prime. We've got a really good way of finding not primes, which isn't exactly what we want here, but it turns out that this, while it's true whenever you have a prime, it's almost never true in a practical sense if p is not prime. There are a few very screwy minor exceptions and such, but in practice, if you pick a big random number and then you pick something smaller than it, it's going to have to be bigger than one, but pick something random in the range of 2 to p-1 or p-2 probably, but anything random is going to be somewhere in the middle. Raise it to the pth power. If you get x back, it's prime. This is how we check for primality in our code and I think we've done it millions of times and I would wager that we have never randomly picked something that was not prime and had this check come back positive even once. In practice, we run it several times to be sure, but it's almost never the case. We can actually do a little bit better than this. I just want to say quickly, that we can speed things up by this trick.
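A minimal Python sketch of the Fermat check just described; it is not the production code mentioned in the talk. The names `passes_fermat` and `fermat_probably_prime` and the default of five rounds are assumptions made for illustration.

```python
import random


def passes_fermat(p: int, x: int) -> bool:
    """True if x**p is congruent to x mod p, as Fermat's Little Theorem
    guarantees whenever p is prime."""
    return pow(x, p, p) == x % p


def fermat_probably_prime(p: int, rounds: int = 5) -> bool:
    """Run the Fermat check with several random bases.

    A prime always passes. A composite almost always fails, but the rare
    exceptions (Carmichael numbers such as 561) pass for every base, so a
    True result means "very likely prime", not a proof.
    """
    if p < 4:
        return p in (2, 3)
    for _ in range(rounds):
        x = random.randrange(2, p - 1)  # random base in 2 .. p-2
        if not passes_fermat(p, x):
            return False  # definitely not prime
    return True


print(passes_fermat(101, 7))   # 101 is prime, so the check passes
print(passes_fermat(91, 2))    # 91 = 7 * 13, and base 2 exposes it
print(passes_fermat(561, 2))   # 561 = 3 * 11 * 17 is one of the exceptions
```

The last line shows one of the "screwy" exceptions: 561 is composite yet passes for every base, which is one reason to treat a passing result as strong evidence rather than certainty.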
Instead of just picking a large random number, let's pick a large random number that we think has some good chance of being prime and test only those instead of just testing every random number out there. So check if that's prime and go back. It's a slightly smarter version of the prime generation protocol. And the way of being slightly smarter is to introduce a sieve. The idea of the sieve is we pick a random starting point and we figure out, we have this array. Maybe it's a thousand bits long; it's just a bit array, maybe a thousand long. And we find the first value in this array, just do a quick calculation, that's divisible by two. So all this is is take n mod 2 and figure out okay, that's where it is, and then we say okay. Once we have this first multiple of two, that's also a multiple of two, that's also a multiple of two. We just go through the array and set all of those values to one. And then we figure out where the first multiple of three is. We set that one and then we go through and set every third one. We don't even test what's already there. If it's zero or one, it doesn't matter, we set it to one, set it to one, set it to one, go through there. Go through five, set it to one, set it to one, set it to one. Go through small primes. In practice, maybe we keep a list of the thousand smallest primes and go through this. But you see after not very long, you have only a couple of reasonable candidates left and we will do the primality testing just on the remaining candidates. This is a good way of thinning the herd very quickly so you don't waste a lot of time doing this large exponentiation when you could just quickly find the likely primes here. Now this does, we have to be a little bit careful, introduce some skewing. We're not getting random primes anymore. If you want really random primes, you shouldn't use a sieve like this because, imagine these are both prime here. If what I do is take the first prime I run into, I checked that and that's prime.
I stop. Then I am much less likely to pick that as my prime than that. I could say maybe I will pick the second one first. Well, you are going to get into problems. If you've got just one prime in the range of your sieve versus two primes in the range of your sieve, the range that has just one prime, that prime is far more likely to be picked than one of the primes in the range that has two primes, no matter how I do it. Yep, question? >>: How much space would, say, the first thousand-bit primes [indiscernible]? Can you store them all once, store them once and pick one at random? >> Josh Benaloh: No. Definitely not, because first of all, there are huge numbers of thousand-bit primes. Basically, if you do the calculation, divide two to the thousand by natural log of two, I'm sorry. Natural log of two to the one thousand, so a thousand times the natural log of two. So you've got more than you could possibly store total. If you've got an amount that you can store and somebody knows these are the ones you've stored, then they can just go through and look and find all those. You want it to be really random each time. You don't want to store them or keep them someplace where somebody else might run into them. >>: [indiscernible] and just pick one at random from the pack? >> Josh Benaloh: Do you mean really the whole set of thousand-bit primes? >>: Yes. >> Josh Benaloh: You can't possibly store that. There are not enough atoms in the universe to store all thousand-bit primes. There are something like 2 to the 250 particles in the universe and there are something around 2 to the 900-something thousand-bit primes, high 900s, in fact, for whatever that matters. We introduce a little bit of skewing when we do this. It doesn't really matter. We bound the skewing. We don't skew arbitrarily. Also, let me just quickly mention a little story here because it's kind of fun.
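The sieve-then-test loop described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code discussed in the talk: the function names, the window width of 1000, and the fixed test bases 2, 3, 5 and 7 are all choices made here, and `bits` is assumed to be at least 16 so that the window lies well above the small primes.

```python
import random


def small_primes(limit: int) -> list[int]:
    """All primes below `limit` (limit >= 2), by a plain Sieve of Eratosthenes."""
    flags = [True] * limit
    flags[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if flags[i]:
            for j in range(i * i, limit, i):
                flags[j] = False
    return [i for i, is_prime in enumerate(flags) if is_prime]


def sieve_window(start: int, width: int, primes: list[int]) -> list[int]:
    """Numbers in [start, start + width) with no divisor among `primes`.

    For each small prime q, find the first multiple at or after `start`,
    then mark every q-th entry without looking at what is already there.
    """
    crossed = [False] * width
    for q in primes:
        first = (-start) % q  # offset of the first multiple of q
        for i in range(first, width, q):
            crossed[i] = True
    return [start + i for i in range(width) if not crossed[i]]


def random_prime(bits: int, width: int = 1000) -> int:
    """Pick a random window, sieve it, and Fermat-test the survivors."""
    primes = small_primes(7920)  # the thousand smallest primes
    while True:
        start = random.getrandbits(bits) | (1 << (bits - 1))  # force top bit
        for n in sieve_window(start, width, primes):
            if n.bit_length() == bits and all(
                pow(x, n, n) == x for x in (2, 3, 5, 7)
            ):
                return n


print(sieve_window(100, 20, [2, 3, 5, 7]))
print(random_prime(256))
```

Returning the first survivor that passes is exactly the choice that introduces the skew mentioned above: a prime that follows a long gap is more likely to be picked than one that closely follows another prime.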
One of the first things I did when I came to Microsoft, literally almost 20 years ago, not quite, about 19, was look at the code that was doing this and looked at, it was doing exactly this. You start out, you sieve out twos, and I said wait a minute. Do we really have to sieve out even numbers? We kind of know. There's a sophisticated theorem in mathematics that says there are no large even primes, right? [laughter]. There's a small one, but if you are looking for a big… So the step way back here of sieving out multiples of two, all we had to do was just compress the array, take the array basically half-size and then every third odd integer is divisible by three and every fifth odd integer, and effectively just use the same table, exactly the same process. I literally changed four lines of code because it was just interpreting where the starting position is a little bit differently and then it's the nth odd integer afterwards instead of the nth integer afterwards, so the ith integer that you find. Almost exactly, four, literally touched four lines of code. I got a Ship It award for NT4. [applause]. That might be a record. But people came in to me the next day and said, hey it's 30 percent faster at finding primes. What did you do? I applied this sophisticated theorem. [laughter]. Okay. >>: That's a lot. >> Josh Benaloh: Yeah, I was surprised it was that much faster, but it's spending a lot of time at the beginning just going through. Anyway, that's it on primes. Any prime questions before we go on? Okay. Now we can move into elliptic curves. The first question if you want to deal with elliptic curve cryptosystems is just what are elliptic curves? Just a note, don't expect to see ellipses here. These have practically nothing to do with -- the connection with ellipses is conic sections in a very bizarre way, so don't think about ellipses here. Elliptic curves are something different.
By the way, there are a few ringers in the audience who are going to catch me on anything I say, so we'll see if -- anyway. So what's an elliptic curve? So we go to the source of all knowledge [laughter] and get a definition. In mathematics, an elliptic curve is a smooth projective algebraic curve of genus one, with a specified point O, in fact an abelian variety, with multiplication defined algebraically, with respect to which it is a necessarily commutative group, et cetera, et cetera. Okay, good. We know what an elliptic curve is [laughter], right? Maybe we can do this a little more easily. An elliptic curve is something that looks like that. That's an elliptic curve. And you could actually have an x squared term here or a non-1 coefficient. You could have a y, a linear term of y there, but it turns out that all of those things, you can remove by a simple change of variables and these sorts of translations of things, so this is considered the general form for an elliptic curve. Any elliptic curve can be shifted to look like this, so there are just two constants to worry about and that describes the curve. Okay. So what does one of these things look like if we try to graph it? They are really weird, right? There's x cubed - 4x + .67. Those are the two constants there, -4 and .67. You get something that looks like this or something that can look like that. Here are two. They look very different, but the only difference is +1. Another, another really strange looking one, another, there. Why does it look that way? Where'd you get that strange shape? To understand this, let's start off by eliminating that square and just start looking at this. This is a curve that you probably all graphed in high school. It's a simple cubic polynomial. It looks kind of like this. In particular, since the coefficient here is one, then it starts off, there's a negative infinity down here on the left and positive infinity down here on the right, up on the right. It looks something like this.
It's got a couple humps. In some cases those humps can move together and merge, but generally not. The general case looks kind of like that. What happens when we put that square back? The first thing that happens is now we only care about values that are positive. We want to take effectively the square root of this curve. All of the places where this wound up negative go away. We sort of flatten this out because we are taking the square root. I'm not going to show it because it looks kind of the same. It's kind of flattened here. And then notice that if positive y is a solution, then negative y is a solution to the same thing, so we have symmetry across the x-axis. That's why we get something that looks kind of like that. The various forms come from where these humps are in the cubic. If we start here and do the same thing, cut off the negatives and reflect, we just get a simple curve over here, no extra parts. If we start there with both of these, the local max and the local min above the x-axis, then when we cut things off we get something that looks like that. These are the reasons for these different odd shapes, but they all come from basically the same thing. Now, things do get kind of weird if you get either the min or the max exactly touching the x-axis and not crossing it, so we just want to eliminate those cases from our consideration there and to do that we just eliminate that possibility. We just rule that out for the constants A and B. Now you know what an elliptic curve is. I have to spend a couple of minutes telling you about some math, telling you about mathematical groups. If you ever took a discrete math course or basic college algebra course, you may have seen mathematical groups. A group is a set of objects together with an operator, a single operator. I'm writing it as multiplication here. You could write it somewhat differently, and it satisfies four properties.
The first property is that one of the elements in this group is an identity, such that if I apply that operator and the identity to some element in that group, any element in that group, I get that element back, whether I apply on the left or on the right. That's the identity property. There are inverses in groups always. Every element has an inverse such that if I apply this operator to the inverse and the element, I get the identity, both ways. The third property is associativity, which means I can group things in either way. Basically, if I just say A times B times C, whether I do it as (A times B) times C or A times (B times C), I get the same thing. If you think back, this is why Diffie Hellman works. In Diffie Hellman you are taking G to the A to the B or G to the B to the A. You're still doing A times B Gs, but you're grouping them differently and you need associativity for it to work. Whenever you have a group you have associativity and Diffie Hellman will work. The final property is closure, which just says if I apply the operator to two things in a group I get something in the group. Okay? So just to get a quick understanding of these things, I'm going to give you some examples of some groups and some not groups. If I take the integers, whole numbers, positive and negative including 0, 0 is the identity if I have addition as my operator, right? I add 0, I get back to where I started, no problem. And all the other properties, inverses exist. The inverse of 3 is -3, et cetera. The integers with subtraction, multiplication or division: none of those are groups. Maybe subtraction is a little subtle. Is it clear why subtraction doesn't work? The property it loses on is that associativity property. 1-1-1. (1-1)-1 is different from 1-(1-1). Associativity doesn't work there. Multiplication, you don't have the inverse of 2 even. 1/2, that's not a whole number, that's not an integer so you don't have inverses there.
Division just messes up on all sorts of things because associativity doesn't work. You don't have inverses. Division is not closed. Okay. Rational numbers, fractions, things in the form A over B. Again, with addition, 0 as the identity still works there. Again, subtraction, multiplication, division don't work, basically for the same reasons except note, we get kind of close with multiplication. We have an inverse for two, one half, that's in there, so we have it. The problem is we don't have an inverse of 0. If we take that out, the nonzero rationals with multiplication, 1 is the identity, we do get a group. Okay? A couple of other examples. The integers mod n, the finite set 0 through n-1, we do our addition mod n as our operation. Zero is the identity. That's a group. The inverse of 1 is n-1. You add them together and you get 0. The inverse of 2 is n-2. Okay? One other group I'll mention is the integers with multiplication mod p and no 0, 1 through p-1, if p is prime. If p is prime then that turns out to be a group. If p is not prime, it won't be a group. You won't get inverses of some of the elements in there, but if p is prime it will always be a group. I'm not telling you how to compute inverses, but you can always find them. The inverse of 2 is going to be p+1 divided by 2, and since p is odd for most primes, p+1 is even, you divide by 2 and you will find something, whatever. You can generalize that. It would take some time to show you how to do division in there, but it's all doable. It works. Now we can get to elliptic groups. And the way we're going to get there is we are going to look at what happens when you take an elliptic curve, here's our generic elliptic curve, that form, and intersect it with just a straight line. Here's a typical straight line, any straight line that's not vertical can be written this way. Let's see what happens. If we take the elliptic curve and that non-vertical straight line, we've just got these two equations here.
Substitute this in here for y and you get (ax + b) squared equals this. If I just move the x’s around, I don't really need to do the calculation. This is cubic in x. X cubed plus something x squared plus something x plus something equals 0 if I just move things around here. How many solutions are there to this? This is our friendly cubic equation again, or cubic polynomial again. The solutions are wherever this crosses the x-axis, the zeros of the polynomial. In a typical case there, we're going to have three solutions, but in general, if the curve is up here, there's only going to be one solution. If it comes down to right where it just touches and goes back up, the tangent case, very narrow case, you get two solutions. Here's another common case. You get three solutions as it goes down to there. There are two solutions again, but that is just a very narrow case. And then down here there's one solution. There's always at least one. You've got either one intersection point or three intersection points most of the time, between that curve and the straight line, but you can in this tangent case, these exceptional cases, you can get two intersection points. Just want to bring in vertical lines also. A vertical line is x equals C. How does that intersect this elliptic curve? You have something very similar. We can substitute in here x equals C and you just get y squared equals some constant. If that constant is positive, there are two solutions, y squared equals 4 has plus and minus 2 as solutions. If the constant is negative, you get no solutions and if that constant is exactly zero, you're going to get one solution. You've got those three cases, just one fewer intersection point effectively. Zero and 2 are common; 1 is uncommon. Why am I telling you all of this? Why should you care in the slightest? What I'm going to do, and I'm not claiming that this is anything -- I just learned this from others.
I'm going to take these relations and make a group. The way I'm going to do it is I'm going to take two points on the curve, any two points, and the operator that I'm going to form is to say what happens if I take those two points and I draw a line through them. If I've got two separate points, then I've already got two intersections with a line, so the typical case is going to be a third intersection. Here's the place where the third intersection is. Two points give a unique third point, but just to make things a little weird, I'm not going to take that point as the result. I'm going to take that point and take its negation, flip it over the x-axis; that point is going to be the result. The group operation is going to say take these two points, draw a line through them, hit the third intersection point, flip it and that is going to be your result. Okay. Weird thing to do. It turns out though this gives you a group. You go through all of the associativity and inverse stuff, you'll get a group by doing this. How do you add a point to itself or multiply a point by itself depending on how you label the operation? Here is where I'll use that tangent thing. It's sort of getting arbitrarily close to two points, getting closer and closer together. The line going through those two points as those two points actually merge becomes the tangent here that goes right along the curve. That tangent case hits at exactly one other point, great, because I want a unique result. Take that one other point, flip that and I've got the result. This handles almost everything. There are a few things that are left and I just have to describe what to do in those few cases. Here's the point, sorry. Here's a point and its negative, its inverse. There is the vertical line case when I draw a line through that, that doesn't fit anything else.
What I'm going to do for that case, or what is done for that case, is create one more point and attach it to this elliptic curve to create my elliptic group. That point is an artificial point. We call it I. Sometimes it's called the point at infinity. It's going to handle that vertical line case. This special point also serves as the identity of the group. Let's just see what happens if I take this point off at infinity and map it through this point here. That kind of goes down here, because it was infinitely high, infinitely far away, it comes straight down here, hits the opposite and then when I flip it like I've done with all the others, I get back to where I started. It serves nicely as the identity, just intuitively. Here are all of the operations on this curve to create an elliptic group. Once we have done that, you can go back to high school for a while and do some geometry. You take two points here, with their x- and y-values, and compute the third point. The main case is when x1 and x2 are different, two different points. These are the equations you get. You can work it out for yourself. Tenth grade students should be able to work this out. I am not going to do that here; I promise. There are a few other cases when x1 and x2 are the same and y1 and y2 are the same and non-zero; this is the tangent case. This is the case of adding a point or multiplying a point by itself. There, you get these equations. Okay. We get similar equations. The final equations are you get the identity if the things are negatives of each other and the identity composed with any other point is that point. The identity composed with itself is the identity. These are all the rules for an elliptic group. Hooray. At this point forget about all those curves; forget about all that geometry. These are just equations now. You've got some equations. They form a group. I'm not proving that to you.
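The slide equations are not captured in this transcript. For reference, the standard textbook form of these rules for a curve y^2 = x^3 + Ax + B is given below, writing the two input points as (x1, y1) and (x2, y2) and the result as (x3, y3); the symbol lambda for the slope of the line is a naming choice made here and may differ from the slides.

```latex
% Main case: two points with x_1 \neq x_2 (slope of the chord)
\lambda = \frac{y_2 - y_1}{x_2 - x_1}

% Tangent case: x_1 = x_2,\ y_1 = y_2 \neq 0 (slope of the tangent)
\lambda = \frac{3x_1^2 + A}{2y_1}

% In both cases the result is the flipped third intersection point
x_3 = \lambda^2 - x_1 - x_2, \qquad y_3 = \lambda\,(x_1 - x_3) - y_1

% Remaining cases
(x_1, y_1) \cdot (x_1, -y_1) = I, \qquad I \cdot P = P \cdot I = P, \qquad I \cdot I = I
```

The divisions by 2y1 and the factor 3 in the tangent slope are where twos and threes enter the formulas.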
I'm just asserting that, but if you use those equations on points in the group you will get another point in the group and everything will work well. Now you can do computation in elliptic groups. For any two points, you can now compute their composition. You can compute u times v. For any point and any integer, you can compute x to the rth power; just multiply it by itself r times. I want to be a little bit careful here. I'm using the multiplicative notation here. I'm describing the group operator as multiplication and saying repeating it is exponentiation. I think it works better for cryptography in the things I'm going to show you to represent it multiplicatively. Most mathematicians like to represent elliptic groups additively, so they'll talk about the operation as addition and repeating the operation many times is just multiplication. It would be scalar multiplication. Either way, it's exactly the same thing. It looks very different, but it's exactly the same. We can do large exponentiations now if we want to. We just do the repeated squaring trick that we saw in the first session or the second session. Early on, this repeated squaring trick gets us to a large power very quickly, so if we want to compute x to the 360th power where x is a point on an elliptic curve we just sort of square things up and take the side multiplies of the things that we want, to get to x to the 360th without doing 360 of these elliptic curve operations. It's reasonably quick. One more thing to say before we show how to use this in crypto is we in computer science like things to be finite. I mentioned that earlier. These elliptic groups are typically very large or potentially very large. They could be infinite. What I described before would be infinite. Picking a random number from an infinite set is very hard. Pick a random integer. If I don't give you a bound, that random integer is infinitely large, because there are comparatively very few integers that are not very, very large.
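The repeated-squaring trick mentioned above for x to the 360th power can be sketched generically in Python. The function name `power` and its `op`/`identity` parameters are assumptions made for this illustration; any group operation can be passed in, and ordinary multiplication mod 1009 stands in for the elliptic group operation here.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def power(x: T, r: int, op: Callable[[T, T], T], identity: T) -> T:
    """Compose x with itself r times using repeated squaring.

    Walks the bits of r from least to most significant: `square` holds
    x^(2^k), and it is multiplied into the result whenever bit k is set.
    """
    result = identity
    square = x
    while r > 0:
        if r & 1:
            result = op(result, square)   # a "side multiply"
        square = op(square, square)       # a squaring
        r >>= 1
    return result


# Count how many group operations x^360 takes with this routine.
calls = []


def mul_mod_1009(a: int, b: int) -> int:
    calls.append(1)
    return a * b % 1009


print(power(3, 360, mul_mod_1009, 1) == pow(3, 360, 1009))
print(len(calls))  # 9 squarings + 4 side multiplies = 13 operations
```

Since 360 is 101101000 in binary, this simple version uses 13 operations instead of 359 repeated multiplications; a tighter implementation could skip the first multiply and the last squaring.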
We want things to be finite. What we are going to do is we've got a set of equations. Let me just finish this and I'll get to it. We're going to do all of those calculations mod some prime, keep things finite, just the way we were doing with RSA and Diffie Hellman and whatnot before. We'll do those operations mod a prime and for some technical reasons we want that prime to be bigger than three because if you look at those equations before, if you've got twos and threes in there you start dividing by 0 and things get ugly. Pick a prime bigger than three, typically a large prime, and we'll do exactly those algebraic computations that are inspired by geometry, but they are algebraic computations, mod some prime. Question. >>: Are you dealing with integers or floating points here? >> Josh Benaloh: Integers. Well, I'm dealing with integers effectively. Dealing with integers mod p, so these are integers and I will, whenever I divide I do a mod division which gives another integer, but it's another integer always smaller than p. >>: The solutions, the x on the elliptic curve are floating-point numbers, right? >> Josh Benaloh: Nope. Before I take that mod p, they are rationals, so they would be represented as floating-point. We don't want to go there because that gets really ugly. >>: [indiscernible] >> Josh Benaloh: Instead of doing these equations over the reals or over the rationals and getting sort of arbitrarily messy things with precision issues that we would deal with, we will instead, every time we pick a number it's going to have, a point is going to be xy where x and y are both integers smaller than p. And every time we do these computations, we are going to do those computations mod p and we're going to get results which are another point, which is two integers less than p. >>: [indiscernible] if x and y are points on the elliptic curve, you probably have a choice to pick only one integer. You pick x in advance here, then y is not another integer.
>> Josh Benaloh: Y is not an integer, but if you do -- let me go all the way back to just a picture of the equation. This equation here, if I take this mod p, so I take x, cube it, take some big integer x, cube it, add a times x, add b. a and b are integers. I can take a square root, mod p. I'll also get an integer mod p. I could spend some time, and I would recommend if you really like this stuff, play around with sort of small things, mod 11, mod 5 and whatnot. You'll find things like -- what's a good example? Two cubed mod 7 is 1. Two cubed is 8. So 1 has three cube roots. It's got 1; it's got 2 and it's got -2, which is 5. 5 cubed is 125. If that’s right mod 7 should be 1. I think I got that right. Something seems wrong there, but anyway. 4, or 4, yes, because negative it's not going to be, because other, yes. Thank you. See I told you I had ringers in the crowd. 5 cubed is -1. >>: [indiscernible] operation for these curves and we call it a group, but now we are calling it a finite field. Don't we need another operation to call it a finite field? >> Josh Benaloh: I haven't called it a finite field. I've carefully avoided the term field here, or finite field. We're doing one operation and only one. A field has two operations. Do I have field anywhere here? If I do I didn't mean to. >>: On the next one? >> Josh Benaloh: Oh yeah, finite field. Okay, sorry. Yes. I didn't say what a field is, so basically you have two operations in a field. Think of the real numbers with addition and multiplication and that gets you a field, but I don't want to go into fields. Basically, all the things we have to do here are just mod p with one elliptic group operation on top and the arithmetic at the base is all mod p. Think of it that way. What it gets you is division works and back to this equation, see, over here there is a division. Here there is a division and divisions there. You need to be able to do division. You know, mod 7, what's three divided by two?
Three divided by two is the thing that you can multiply two by to get three. It turns out to be five. 2 times 5 is 10; mod 7 that's 3. How I computed that, that is mostly trial and error, but there is a way of computing that with big numbers. I could show you. We could spend some time, but it's called the extended Euclidean algorithm. It's not that hard, but I don't want to spend the time. We're doing these operations mod p. Everything is mod p. This is just algebra now, no geometry, doing all this mod p. We're good. Once we have that we have this notation, E sub p of A, B refers to the elliptic group that you get when you take this curve and you do the operations mod p. That's our notation. Now we get back to crypto. Remember Diffie Hellman. Long time back, Diffie Hellman is the process of pick a prime p and some starting point g. These are agreed-to public values. Alice over here takes a random A; g to the A is her public key. Bob picks a random private key B; g to the B is his public key and they exchange the values and they apply their private keys to what they received and they get back a common key. Diffie Hellman. Can we do Diffie Hellman over elliptic groups instead of over the integers? What has to change? Here we are starting with a public point on an elliptic curve of that form. G is that. Then we're doing the same exponentiation, repeating the group operation, and this is the group that we are working in. We're not doing multiplication over the integers anymore or multiplication mod p anymore. We're doing these multiplications in this elliptic group according to the equations that I showed you earlier. Down here there is another exponentiation, so we'll do that in the group. And then this comes out exactly the same. We still get a common key that way. Diffie Hellman works really the same way here or in any group, so why do we care what group we are doing it in? We care because of how hard it is to break Diffie Hellman.
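To make the elliptic version of Diffie Hellman concrete, here is a toy Python sketch. Everything specific in it is an assumption chosen for illustration: the tiny textbook curve y^2 = x^3 + 2x + 2 mod 17 with starting point (5, 1), the names `compose` and `power`, and the use of Python's built-in `pow(v, -1, p)` (Python 3.8+) for division mod p in place of a hand-written extended Euclidean algorithm. The curve constants are named `CURVE_A` and `CURVE_B` to keep them apart from the private keys a and b. A group this small offers no security at all.

```python
import random
from typing import Optional, Tuple

Point = Optional[Tuple[int, int]]  # None stands for the identity I

P_MOD = 17                # the prime p (toy size)
CURVE_A, CURVE_B = 2, 2   # curve constants: y^2 = x^3 + 2x + 2 mod 17
G: Point = (5, 1)         # public starting point; generates 19 elements


def compose(u: Point, v: Point) -> Point:
    """The elliptic group operation, with all arithmetic done mod p."""
    if u is None:
        return v
    if v is None:
        return u
    (x1, y1), (x2, y2) = u, v
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # vertical line: the points are inverses
    if u == v:  # tangent case
        lam = (3 * x1 * x1 + CURVE_A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:       # main case: line through two different points
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def power(x: Point, r: int) -> Point:
    """x to the rth power in the group, by repeated squaring."""
    result, square = None, x
    while r > 0:
        if r & 1:
            result = compose(result, square)
        square = compose(square, square)
        r >>= 1
    return result


# Division mod 7 from the talk: 3 divided by 2 is 5.
print(3 * pow(2, -1, 7) % 7)

a = random.randrange(1, 19)              # Alice's private key
b = random.randrange(1, 19)              # Bob's private key
alice_public, bob_public = power(G, a), power(G, b)
print(power(bob_public, a) == power(alice_public, b))  # same shared key
```

The only change from integer Diffie Hellman is the group operation: `compose` replaces multiplication mod p, and the exchange itself is untouched.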
What an attacker sees is this starting point g and the two public keys that are exchanged by each of Alice and Bob. That is supposed to be g to the A,B. I don't know what happened there. G to the A, B. Take a marker and okay. Sorry about that. >>: On the previous slide I was confused by A and B. You have capital A and capital B in two places. I can't see why they would be the same. >> Josh Benaloh: Capital A… >>: The one in the elliptic curve definition. >> Josh Benaloh: Oh, yeah. >>: Those are different A’s? >> Josh Benaloh: Those are different A’s and B’s. I'm sorry. >>: Those A’s and B’s are preset public? >> Josh Benaloh: Yes. My bad, my bad. You know, I had a previous version of that where I changed the variables to u and v and I changed them back here because I wanted to be consistent with the Diffie Hellman I did earlier. My bad, so okay. This A and B has nothing to do with this A and B. They are completely different, right? [laughter]. >>: Okay. So the entire elliptic curve, including A and B, is public and shared ahead of time? >> Josh Benaloh: Yes. The initial point g, p, A, B here, these are all public values. Once you've done that please forget about that A and B and start with a new A and B down here. I'm sorry. >>: Does that change the one elliptic curve that everybody uses? >> Josh Benaloh: Generally there is a small set of elliptic curves that have been well vetted. I'll talk about that in a little bit. In fact, we have a little project going on to find really good elliptic curves. But generally, yes, you could generate a new elliptic curve every time, but we all typically use one of a few. There are some that have come through NIST from the U.S. government that people don't seem to want to use as much anymore as they did six months ago, but they are still the most common ones. There are a bunch of curves around, common curves, and we support the common curves. Back to this.
The most effective attack on Diffie Hellman is basically to compute discrete logs, over the integers or over elliptic curves. If I could get one of little a or little b then that's enough for me to compute. If I can get little a, I have g to the b so I can raise that to the little a power. So I have to compute one discrete log and that's the best known attack of any kind. Over the integers there are some ways of doing discrete logs better than effectively exhaustive search. It's something called the index calculus. It's a sub exponential algorithm. It's very slow. You get over a thousand bits and it becomes wildly impractical, but you can do a thousand bits and you can't do an exhaustive search through a space of 2 to the thousand. You can get some improvement. It's a real improvement. There is no similar sub exponential algorithm known for discrete logs in elliptic groups. Therefore, we can get away with smaller primes, smaller sizes of things and still feel secure, at least secure against the best currently known attacks. Why do we want to use elliptic curves? Really, it’s efficiency. The elliptic curves are a hot thing. You've heard a lot of people say yeah, that's the new thing. We should use them. The big benefit is efficiency. Here are just some numbers: a 160-bit elliptic curve takes roughly the same amount of time to compute discrete logs on as a 1024-bit integer. It's roughly equivalent to 1024-bit Diffie Hellman or 1024-bit RSA, which we feel is no longer very secure. We want to go up a little bit higher. 256-bit elliptic curves are what we typically use and we feel comfortable with that. That has roughly the strength, and when we use RSA now or Diffie Hellman now over the integers, we're typically using 2048. If we use 256-bit elliptic curves, then we have shorter keys, shorter ciphertext. Everything is smaller than with our 2048-bit integer algorithms. Okay? So there's a real opportunity for improvement there, especially if you are on small devices. Why not?
Elliptic curves have been studied, I'm saying, far less. Over recent years they've gotten a lot more study, but still less. Integer factorization has been studied for centuries, since the time of Gauss, probably even before that. Not so much for elliptic curves, but they are seeming more and more robust, so we're feeling more and more confident about them, but I would say still not quite as confident as with integers. There's no fundamental reason why there could not be a sub exponential elliptic curve discrete log process found, similar to integers. Ringers in the back, challenge me if I'm saying something wrong there. We don't know of any. The trick is really sort of a notion of smallness. There are small integers. We can look for small integers and take advantage of small integers and do good things. There is no notion of a small point on an elliptic curve. Therefore, all of our methods for integers, which are targeted at finding small values and taking advantage of small values, don't seem to make sense there. But somebody might come up with some notion of smallness which has equivalent properties; maybe somebody will find something. That's possible. If that were the case then elliptic curves wouldn't suddenly become insecure for cryptography, but 256-bit elliptic curves would suddenly become insecure and we would have to go up to a larger size and we would lose benefits. Because elliptic curve operations are more cumbersome, we only get a benefit if we have much smaller key sizes. If things become sub exponential, then all bets are off. Getting good performance often requires use of special curves, and there have been a lot of special curves in the past that have been proposed and determined to not be so secure, so we have gone through a litany of special curves that, we should use this because it's very fast. Yeah, it's really fast for attackers too, not so good.
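The attack described earlier, recovering one private exponent by computing a single discrete log and then raising the other party's public value to it, can be sketched on a tiny integer example. This Python sketch is not from the talk. It uses baby-step giant-step, a generic square-root-time method that works in any group, and not the index calculus mentioned above; the prime 23, the base 5, and the secrets 4 and 3 are assumptions chosen so that the numbers are easy to check by hand.

```python
from math import isqrt

def discrete_log(g, h, p):
    """Baby-step giant-step: find x with g**x == h (mod p), or None.

    Takes roughly sqrt(p) steps and sqrt(p) memory, so it is only
    feasible for small p. Requires Python 3.8+ for pow(g, -m, p).
    """
    m = isqrt(p - 1) + 1
    baby = {}
    cur = 1
    for j in range(m):                 # baby steps: store g**j -> j
        baby.setdefault(cur, j)
        cur = cur * g % p
    step = pow(g, -m, p)               # g**(-m) mod p
    gamma = h % p
    for i in range(m):                 # giant steps: h * g**(-i*m)
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * step % p
    return None

# Toy Diffie Hellman over the integers mod p (all values assumed).
p, g = 23, 5
alice_secret, bob_secret = 4, 3
alice_public = pow(g, alice_secret, p)
bob_public = pow(g, bob_secret, p)

# The attacker sees only p, g and the two public values.
recovered = discrete_log(g, alice_public, p)
attacker_key = pow(bob_public, recovered, p)
real_key = pow(alice_public, bob_secret, p)
```

Here `attacker_key` equals `real_key`. Because generic methods like this need on the order of the square root of the group size, a group with no better attack known needs a size of only about twice the desired security level in bits, which is consistent with the small elliptic curve sizes quoted above.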
In answer to what was said before, elliptic curve crypto requires the use of, let’s say, sophisticated processes that generate really good curves that we can agree on, and we've got, here are the ones that have been agreed upon by the National Institute of Standards and Technology and recommended for everybody, and maybe not; maybe we should swap these for those. We don't expect individual people to just pick their own favorite elliptic curve. With RSA, pick two primes, multiply them together and you are good to go. We don't want people trying to do that with elliptic curves. They are a little bit more delicate. There are some trade-offs. Okay. That said, I want to say a little bit more about other crypto algorithms and elliptic curves. The digital signature algorithm that we talked about at the end of last time, just like Diffie Hellman, goes through very nicely if we are doing things over elliptic curves and elliptic groups instead of over integers. RSA also works over elliptic groups, but it's insecure. The trick of only I know the factorization and therefore only I can do the conversion doesn't apply with elliptic curves, at least in any way that anybody has found. RSA as it's known doesn't work there, so that's why we use DSA for signatures; RSA signatures don't work there. I want to take a few minutes, unless there are any questions on elliptic curves? Yep. >>: What exactly makes elliptic curves so fragile to using your own curves? >> Josh Benaloh: There are a couple of things. As I said, we have a project that I'm not directly involved in, so I can look at the people in back and see if you want to say anything. It's mostly a matter of trying to balance performance very carefully against security. If you are willing to go with larger curves, not a special form, then you can do pretty well in most cases.
It's not a big problem, but if you want to get sort of the optimal performance and squeeze everything out, then if you take things that are too good in some ways you get into problems. As long as you don't try to push the margins too much you don't get too much fragility, but if you try to squeeze out every little bit of performance, that's where things get a little risky. Yep. >>: So you mentioned [indiscernible] encryption with elliptic curves? >> Josh Benaloh: The way it's typically done is you agree on a key and then you encrypt with AES or a symmetric cipher using that key. You could do what's called ElGamal encryption, which is implicitly, the thing you agreed on is your key immediately, and you use it as a one-time pad and just send over ciphertext with that as a one-time pad, so there isn't a separate step. Effectively, you generally do some sort of a symmetric step with the agreed-upon key, which is what you do with integer Diffie Hellman as well. >>: Channel encryption between two parties. With RSA you could do public-key encryption of a key which is used to encrypt your text at rest, right? >> Josh Benaloh: You can, although… >>: Is there a way to do that with elliptic curve? >> Josh Benaloh: You wouldn't typically use RSA in that form because the idea is you have a public key. I can use the public key to encrypt, to get a key or to encrypt data and send it to you. If I'm just encrypting locally for my own purposes… >>: I'm encrypting it. You are my intended audience. I would use your public key to encrypt the key, which I could then leave at rest without communicating with you. And then I give you the drive at some point and you can read the encrypted data. >> Josh Benaloh: Right. We talked a little bit about different versions of Diffie Hellman, and if I have a static public key, you can do the same thing with Diffie Hellman. You have seen my Diffie Hellman g to the a; I'm Alice here. You've seen my g to the a.
Now if I have the static key, you pick a b. You send me g to the b and you encrypt with g to the ab and you can do effectively the same thing. Okay. Let me spend a few minutes, and I'm really not going to spend a lot of time talking about lattices, but they are interesting to know about. I have to answer the same question that I answered before about elliptic curves. What is a lattice? Of course, once again, we turn to the source of all knowledge and we get another beautiful definition. A lattice is a discrete subgroup of R^n which spans the real vector space. It's generated from a basis, from linear combinations, et cetera, et cetera, or let's make it a little easier. A lattice is something that looks kind of like that. It's a set of points in a regular pattern, and I'll say a little bit more about what that regular pattern is. If we want a two-dimensional lattice, the nicest regular pattern is just a square lattice like that, but they don't have to be square. This is rectangular. It starts to look like an optical illusion if I do this. This is still rectangular but it's not square anymore. You can have other nice tiling patterns effectively. They don't have to be triangular, things like that. Where it comes from is a basis. Basically, a lattice is formed from a basis, which is a set of vectors, and vectors, think of them as just points, from point zero to some point in the plane. What we do is we take these vectors and take any linear combination of those with integer coefficients, and the points we get to are the lattice. A simple basis for a lattice, the square lattice, would be: here is vector one. Here is vector two, and then I can put it here and I can easily count how I get from one point to another. To get from here to here it's two v1’s plus three v2’s. It's really easy to see most of the time how to get from one point to another point. You have a nice simple basis. But you can have a more complex basis for exactly the same lattice. Here's the same lattice.
It's generated by this basis, but now how do you go from this point to a neighboring point? You can figure it out. You can work it out but it's not going to be quite so obvious anymore. You have to add some of these and then subtract that. Sort of the difference between this and that is a knight’s move if you think of it. Over one, up two, now how do I get from a knight’s move? If I go to there, I've got a knight’s move, maybe down to there and I can get back to here. You can work it out, but now imagine this with not just two dimensions but a thousand dimensions, which is the kind of thing we do for crypto, and it can get really ugly. Here's another case. Here's a simple basis for this lattice. This is a pretty clean lattice and this is a pretty clean basis. And here is an uglier basis for the same lattice. You generate exactly the same set of points by integer combinations of these things. It's just that some are easy to see and some are harder to see. There is something called the closest vector problem in a lattice. With a good basis, finding nearby lattice points, if you are somewhere in the plane say, or somewhere in space, is easy. With a skewed basis, more elongated ones, finding nearby lattice points can be very difficult. We can use that for cryptosystems in a couple of ways. Lattice-based cryptosystems typically look like this: you generate a key by picking a nice clean basis for your lattice, things that are almost rectilinear, things that have nice big angles between your vectors, and you might have literally a thousand different vectors. It might be a thousand-dimensional lattice. You then transform your basis into something that's really skewed and ugly and that's what you give to other people to work with. They can't manipulate this lattice very easily. Once you have that you can do a couple of things. Encryption could just be I give you the skewed basis.
You use the skewed basis to pick some lattice point and you perturb it by a little bit, and your perturbation is actually your message. It might be that it's a little down and right of a point or a little up and left of a point, and that might be a bit, or there might be a few bits of sort of what direction you go from that point. You decrypt by using the good basis to figure out where the nearby lattice point really is. A more common way to do this is actually a little bit more complicated; you get a higher data rate if you do something like this. You use your message to take the skewed basis, and a linear combination becomes your message, so let's say I put my message as zeros and ones. 0 times the first basis vector plus 1 times the second basis vector plus 1 times the third basis vector, et cetera, and sort of combine them, and I get some point in the lattice, and then I perturb that a little bit. And I give that to you as the encryption. Knowing the good basis, and knowing how I transformed from the good basis to the skewed basis, I can go back and figure out exactly which vectors you combined and how, and that's the trick. So that's the basis here. Once again, we have finite lattices. We don't want infinite things, the usual problem, so we do our computation mod some large prime. The same questions. Why should we use lattices? It's nice to have different algorithms that have very different designs. It turns out that discrete log and factorization over the integers are closely related problems, and if somebody figures out how to factor, Diffie Hellman would likely fall apart. In particular, lattice methods seem to be a lot more resistant against quantum attacks, so I'm not planning on talking about quantum computers now. If you want to hear more about quantum computers, bother me another time when I'm not about to run off to the airport. If quantum computers develop, lattices would be a very good thing to have as a cryptographic method. Why not?
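The second encryption method described above (message bits as coefficients on the skewed basis, a small perturbation, then decryption by rounding with the good basis) can be sketched in two dimensions. This Python sketch is not from the talk: the two bases, the message (1, 1), the perturbation (1, -2), and the function names are all assumptions, and it skips the reduction mod a large prime. In two dimensions a skewed basis is easy to repair, so this shows only the mechanics, not any security.

```python
from fractions import Fraction

def coefficients(basis, target):
    """Solve x*b1 + y*b2 == target exactly for a 2-D basis [b1, b2]."""
    (a, b), (c, d) = basis
    det = a * d - b * c
    x = Fraction(target[0] * d - target[1] * c, det)
    y = Fraction(a * target[1] - b * target[0], det)
    return x, y

def combine(basis, coeffs):
    """Return the point x*b1 + y*b2."""
    (a, b), (c, d) = basis
    x, y = coeffs
    return (x * a + y * c, x * b + y * d)

def round_to_lattice(basis, target):
    """Write target in the given basis, round each coefficient, map back."""
    x, y = coefficients(basis, target)
    return combine(basis, (round(x), round(y)))

# Private key: a clean, nearly rectilinear basis (here exactly square).
good_basis = [(5, 0), (0, 5)]
# Public key: the same lattice, but the two vectors are nearly parallel.
# Built as 7*g1 + 5*g2 and 4*g1 + 3*g2 (determinant of the mixing is 1).
skewed_basis = [(35, 25), (20, 15)]

def encrypt(message_bits, noise):
    """Combine skewed basis vectors by the message bits, then perturb."""
    px, py = combine(skewed_basis, message_bits)
    return (px + noise[0], py + noise[1])

def decrypt(ciphertext):
    """Round with the good basis, then read off skewed-basis coefficients."""
    point = round_to_lattice(good_basis, ciphertext)
    x, y = coefficients(skewed_basis, point)
    return (int(x), int(y))

ciphertext = encrypt((1, 1), noise=(1, -2))
recovered = decrypt(ciphertext)

# Someone holding only the skewed basis who tries the same rounding trick:
attacker_point = round_to_lattice(skewed_basis, ciphertext)
attacker_guess = coefficients(skewed_basis, attacker_point)
```

With these assumed values `recovered` is the original message, while rounding in the skewed basis lands on a different, farther lattice point and yields the wrong coefficients.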
They are unwieldy. The public keys tend to be enormous. A thousand vectors, and each vector tends to have -- this much in the first dimension and this much in the second dimension, and there are a thousand dimensions, all from a thousand-bit prime, and now you have a thousand of those vectors. That's your public key. Sheesh. Similarly, the encrypted data tends to be enormous compared to the amount of data that you are actually transmitting, so they are kind of unwieldy. We don't really use them in practice a lot, but it's nice to have them in our back pockets just in case. >>: [indiscernible] the method is unwieldy so like a thousand vectors, how big is [indiscernible] >> Josh Benaloh: It depends. I've sort of sloughed over it because there are different cryptosystems using lattices, but it could be that it's something like a thousand bits that you transmit with a basis point that you've gotten in a thousand dimensions. So it's, what, a factor of a thousand. Do you know, roughly a factor of a thousand in something like [indiscernible], for instance? The difference between a payload and the size of the encryption? Michael, do you know? >>: [indiscernible] >> Josh Benaloh: Okay. Something like a thousand, factor of a thousand. Yeah, it's hard to work with. Okay. Next session, a lot less math, but we're not getting completely rid of the math. Sort of vulnerabilities, attacks, practical considerations; these are actually tied together because a lot of the practical tricks that we use to make things more efficient lead to our vulnerabilities. [laughter]. And lead to attacks and whatnot, so it's important. We want to know about them. We want to do them but we want to do them carefully because that has been our bane many times. The final session, what I'm planning on is talking about some applications of some of these things. If I'm permitted I'll squeeze in election protocols.
I am flying out right now to Austin to talk to them about, to work with them on their new election system design, so something I would like to talk about if there's time. But if there are other things that people want to do we can do that in addition, or instead, just let me know. Okay. Any questions? Good. Then I can get off and make my flight. Okay. Thank you. [applause]