>>: Okay. Welcome back, everybody. Please be seated. You can sit in the back this time, since we have a normal talk again with big characters, [inaudible] as you may see already. The next speaker is from Turkey: Huseyin Hisil. He'll speak on faster formulas for elliptic curves. So a good title, I'd say.

>> Huseyin Hisil: Thanks, Peter. Thanks to the organizing committee for the invitation. The title of this talk was fixed quite some time ago, but the slides were finished just an hour ago.

So now I will change the title to this one, a road map for formula hunters, because I will mostly be speaking about a way of deriving a lot of formulas and selecting between them, rather than just displaying formulas, which would not be so meaningful.

And since everything here is done on computers, using computer algebra, let's be honest, this will be a better title, I believe. For an hour-long talk I have a very simple outline. In the first half of the talk I will be talking about how we can make life easier when deriving addition laws on elliptic curves, any kind of elliptic curve. And in the second part we'll be looking at some timing results, operation counts, that sort of thing, implementation stuff.

And I'll start with explaining the motivation of this research. I think it was back in 2003 or something, when I was a freshman. I first learned about elliptic curve cryptography, and I asked the instructor why we have to use Weierstrass curves in elliptic curve cryptography. The answer was that it's a standard choice.

The answer really didn't satisfy me, so I went and found these classic papers. I'm sure you're very familiar with these papers. What all three papers have in common is that the results always motivate you towards using the Weierstrass form of elliptic curves, because at that time, with the formulas we knew, the Weierstrass curves were really the fastest ones once you used Jacobian coordinates.

But after I saw these three papers I said, they're the fastest. I'm not going to go in that direction. I don't want to look at the addition law anymore.

But later on, around 2003, I saw this paper, and this paper was actually written for a different topic. It is about preventing side channel attacks. But in some portion of the paper the authors claim that the formulas for the Jacobi form can be faster than the formulas for the Weierstrass form.

And still I wasn't that concerned about the topic. But it was the start of the motivation. And since that date, and after 2007, other papers have appeared. And finally the spark came with the introduction of Edwards curves to the community. And Edwards curves solidly broke the speed limit of Jacobian coordinates at that time. So I said, okay, this is the motivation for me.

Probably there are more things out there, so I will try to concentrate on this topic.

Over three years, I looked at these five famous curve models and for each of them I was lucky enough to come up with concrete results.

And along the way I also checked many more elliptic curve models for efficient group laws as well.

I even tried intersecting hypersurfaces until I got an elliptic curve, and looked at group laws on such objects. But these five are still the fastest ones, if you believe me.

And it would be possible to give examples for all of them, but it's only going to complicate the talk.

So I will be giving examples using only the Jacobi quartic curves.

So at the end I will also be mentioning the outcome for the other four. I will start by giving some basic properties of Jacobi quartic curves, just as a reminder.

So this curve is called a Jacobi quartic curve, and this affine curve is non-singular provided that delta is non-zero. The usual projective closure is given by this homogeneous projective equation, and, if Z is not zero, a triple corresponds to an affine point in the usual sense. I give this to show the weights of the coordinates.

And there's only one point at infinity. It is this point. And this point is singular. And if you resolve the singularity here, we end up with two more points. I wrote them down as omega 1 and omega 2. And the minimal field of definition for these two points is a field which contains the square root of the curve constant d.

Of course, this extension can be K itself. So if you select a field containing the square root, the points can be defined in this fashion. Here I will really abuse the notation.

I hope algebraic [inaudible] can forgive me. I will add these two points, omega 1 and omega 2, to my point set and call this the set of L-rational points on the curve.

This curve is birationally equivalent to the Weierstrass curve given below, with the maps given below here. And we will have a very quick look at the properties of these two maps.

To verify the birational equivalence, we can trivially check that the composition of the maps gives the identity map on the relevant curve. And if we look at the problematic points, we can easily see for the first map that if we have x equal to 0, we have a problem.

We say that at the point (0, 1) on the Jacobi quartic curve our map is not defined. It is not regular there. And it is regular at all other points.

There's another point, which is (0, -1), and at first glance it might seem that the map is also not regular at (0, -1). But actually it is, because these maps are actually elements of function fields, so we can always find alternative representations for these rational expressions. And here is an alternative map.

And using this map, I can always map (0, -1) to the point (0, 0) on the Weierstrass curve. Let's look at the other map, that is psi. For psi we have a problem again whenever we have v equal to 0.

There are three such points. And before stating at which ones the map is not defined, let's see that (0, 0) can be sent to the previous curve, the Jacobi quartic curve, to the point (0, -1) with this alternative map here, and the remaining two points with v coordinate equal to 0 actually correspond to the points at infinity, omega 1 and omega 2. And although technically even quartic curves are not elliptic curves, I'll treat them as elliptic curves by adding omega 1 and omega 2, and I'm not going to mention it again.

An interesting property is that psi becomes a morphism whenever d is a non-square in K. That is, you can take all points to the other side, because the points at infinity are only defined when d is a square in K. If it's a non-square then we don't have points at infinity at all. Conversely, I can say that this Weierstrass curve is birationally equivalent to this Jacobi quartic curve, the extended Jacobi quartic curve, and we know very well that this Weierstrass form covers all elliptic curves having at least one point of order 2. Therefore, since they're birationally equivalent, we conclude that Jacobi quartic curves cover all curves of even order. And it is well known that approximately two-thirds of equivalence classes fall into this category.
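The round trip between the two curves can be checked numerically. The sketch below is only an illustration: the slides are not part of the transcript, so the curve equations and the maps phi and psi are assumptions taken from the published literature on extended Jacobi quartics, and the prime and curve constants are hypothetical toy values.

```python
# Assumed shapes (not visible in the transcript):
#   M: y^2 = d*x^4 + 2*a*x^2 + 1
#   W: v^2 = u*(u^2 - 4*a*u + 4*a^2 - 4*d)
p = 101          # hypothetical small prime, arithmetic is in F_p
a, d = 3, 5      # hypothetical constants with d*(a*a - d) != 0 mod p

def inv(z):
    """Inverse in F_p; raises ValueError when z is 0 mod p."""
    return pow(z % p, -1, p)

def on_quartic(x, y):
    return (y * y - (d * x**4 + 2 * a * x * x + 1)) % p == 0

def on_weierstrass(u, v):
    return (v * v - u * (u * u - 4 * a * u + 4 * a * a - 4 * d)) % p == 0

def phi(x, y):
    """M -> W. As written, this representation needs x != 0."""
    u = (2 * (y + 1) * inv(x * x) + 2 * a) % p
    v = (4 * (y + 1) * inv(x**3) + 4 * a * inv(x)) % p
    return u, v

def psi(u, v):
    """W -> M. As written, this representation needs v != 0."""
    x = 2 * u * inv(v) % p
    y = ((2 * u - 4 * a) * u * u * inv(v * v) - 1) % p
    return x, y

# All affine points of M with x != 0, found by brute force.
quartic_points = [(x, y) for x in range(1, p) for y in range(p)
                  if on_quartic(x, y)]
round_trips = [psi(*phi(x, y)) == (x, y) for (x, y) in quartic_points]
print(len(quartic_points), "points checked, all round trips ok:", all(round_trips))
```

The points with x = 0 on M and v = 0 on W are left out on purpose: they are exactly the ones that need the alternative representations or the points at infinity discussed above.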

So all is done. The next question now is: what is the addition law on Jacobi quartics? The answer is not new, but I will explain it in an alternative way, the way I did it in my research.

So I used a lot of computer algebra to automate the derivation of the group law. And my motivation was to find minimal degree expressions for the addition law, and to prove the minimality of the degree whenever possible.

And also look at other slightly higher degree formulas as well in case they might be useful.

And along the way I also checked the formulas with computer software, and I also applied techniques to find alternative ones. So this is the backbone of what I used very frequently. Suppose you have two curves, W and M, defined over K. We assume that these two curves are birationally equivalent and that we have explicit maps phi and psi. Once we resolve the singularities, if there are any, we have the distinguished point, which will later act as the identity element of the group. I let plus W, a map defined from W cross W to W, be the addition map on W.

And I seek an answer for the addition law on M. That is given by the composition of these maps here. And this map plus M is regular at all but finitely many pairs of points on M.

This is rather an observation. So here is a numeric example, again using Jacobi quartic curves.

So we had these maps defined already. So these are the curve constants and their relation to the Weierstrauss curve. This is Jacobi quartic curve and this is the previous map I displayed and this is the inverse map.

We know the addition formulas for the Weierstrass curve well. This one. And what remains is just to make the composition. Once we do the composition, yeah, we get a map defined from M cross M to M, which describes the group law on M, and this map is regular at all but finitely many pairs.
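The composition can be sketched in a few lines. This is a toy illustration, not the speaker's computer algebra setup: the maps and curve shapes are assumed from the literature on extended Jacobi quartics, the prime and constants are hypothetical, and only the generic chord case of the Weierstrass addition is implemented, so exceptional pairs simply raise an error.

```python
p = 101                      # hypothetical small prime
a, d = 3, 5                  # hypothetical curve constants
A2 = -4 * a                  # W: v^2 = u^3 + A2*u^2 + (4*a^2 - 4*d)*u

def inv(z):
    """Inverse in F_p; raises ValueError when z is 0 mod p."""
    return pow(z % p, -1, p)

def on_m(P):
    x, y = P
    return (y * y - (d * x**4 + 2 * a * x * x + 1)) % p == 0

def phi(P):
    """M -> W (assumed map), needs x != 0."""
    x, y = P
    u = (2 * (y + 1) * inv(x * x) + 2 * a) % p
    return u, 2 * u * inv(x) % p

def psi(R):
    """W -> M (assumed map), needs v != 0."""
    u, v = R
    return 2 * u * inv(v) % p, ((2 * u - 4 * a) * u * u * inv(v * v) - 1) % p

def add_w(R1, R2):
    """Chord addition on W for u1 != u2 (no doubling, no point at infinity)."""
    (u1, v1), (u2, v2) = R1, R2
    lam = (v2 - v1) * inv(u2 - u1) % p
    u3 = (lam * lam - A2 - u1 - u2) % p
    return u3, (lam * (u1 - u3) - v1) % p

def add_m(P, Q):
    """Addition on M as the composition psi(phi(P) +_W phi(Q))."""
    return psi(add_w(phi(P), phi(Q)))

points = [(x, y) for x in range(1, p) for y in range(p) if on_m((x, y))]
results = {}
for P in points:
    for Q in points:
        try:
            results[P, Q] = add_m(P, Q)
        except ValueError:   # exceptional pair: some denominator vanished
            pass
print(len(results), "of", len(points) ** 2, "pairs handled by the composed map")
```

The pairs that raise an error are the finitely many pairs at which this particular representation of the composed map is not regular.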

Okay. The map is not that cool, I know. We'll come to that point. I was actually lucky to fit everything on one page. There are really worse ones. Okay. The next step: we at least have something in hand that works, and that is something. But the next thing is to simplify these algebraic expressions.

And I was lucky that there was a solution out there, a more general solution than I need, which perfectly satisfies my needs here. The algorithm is called the minimal total degree algorithm of Monagan and Pearce. It is from 2006, so it's not from too long ago.

And the idea goes like this. Let me explain it in a very brief way. So what we see here is the composition of maps. And I didn't expand the algebraic expressions. Otherwise it's going to be some 100 pages long. But just assume that you do that computation and get a numerator and denominator as polynomial expressions.

And you write them down here as N and D. N is the numerator, D is the denominator. So what this algorithm does is actually substitute lower degree polynomials, eta and theta.

And tries to solve the resulting system of equations. And if a solution is not found, that means that you don't have any formulas at that degree. So you have to look at higher degrees, eventually. And once you find the solution, that is the minimum degree solution for you. The algorithm goes this way.

But we shouldn't forget that this minimality is about the sum. I mean, I have the total degree for the denominator. I have the total degree for the numerator.

I'm summing them up and I'm trying to minimize this number. And this is why I wrote the lazy bit in the title: the algorithm is already implemented, and an open source implementation by Pearce is available. If you run the algorithm you immediately get these reduced algebraic expressions, which are much nicer than what I showed moments ago. And the credit goes to Chudnovsky and Chudnovsky, back to their paper from 1986. These algebraic expressions were actually given by them, with the slight difference that they use the weighted projective model, which is non-singular, and they place the identity at the point at infinity. That's the only difference. But basically the algebraic expressions are just the same.

The only difference is things like this numerator goes to the denominator and the denominator goes up, that sort of shuffling, nothing else.

So this is the minimal degree addition formula for Jacobi quartic curves. There is no other formula having lower degree than this. The algorithm actually proves this. But we can still have other low degree formulas, including formulas whose degree is equal to this one.

Here is one of them. I found this formula along the way. For computational purposes, these are actually affine formulas, but later we'll look at the projective versions. This formula will be more useful for us, because that term is reused here, squared, and there's a multiplication here.

Here on the first formula it takes one multiplication more than what we need for the second one.

So how did I obtain this second one? Here are the core ideas. If you regard the map I gave as addition formulas, we immediately see that this distinguished algebraic expression by itself doesn't describe the whole group law, because there are special cases that we have to take care of. So a solution might be finding other low degree formulas which are defined for those exceptional cases.

So let's take the numerator from the previous X coordinate here. This X coordinate. And let's take the denominator. If you think of them as elements of this ring, and the ring is defined over this field here, then since their GCD is 1, the fraction doesn't simplify any further in this function field.

Now, assume that N over D is a function on M cross M. So the curve equations, the relations that we obtain from curve equations will now come into play so that we can find alternative algebraic expressions for this.

And this is the main idea. If we compute this colon ideal, D plus K divided by N, and look at the reduced Gröbner basis of this ideal, we will always see a minimal total degree denominator.

This is guaranteed to contain a minimal total degree denominator, but it is not guaranteed to give us a minimal total degree formula itself. A fraction is a fraction. But the denominator will be of the minimal degree.

And most of the time, because we are using a graded monomial order when computing this Gröbner basis, the degrees tend to get lower and lower in the generators of your ideal.

And if you compute this Gröbner basis, you can find these elements inside as generators. And each one of them is a good alternative for being a new denominator. All you need to do is to look at the corresponding numerator for that algebraic expression.

So let's pick this one. There are not so symmetric ones, but they can be fast. I really didn't check this one. So this one. So what we need to do is compute this. This is not a plain polynomial operation. It's not a straightforward operation. It is a multivariate division algorithm. The first implementation also goes to this [inaudible] as far as I know. And if you compute the corresponding numerator for this X coordinate formula, we get the alternative formula as this, with credit to Euler, who found this formula some 100 years ago.

Okay. We can have a quick comparison between the two formulas. They look different but they do exactly the same thing. And if you're after obtaining more and more formulas, you can simply change the lexicographic ordering of the variables in your Gröbner basis computation.

And that was the X coordinate. And here are some more alternatives for the Y coordinate I found using such kind of ideas. And this is really a very short list. This list can go pages long. Slightly increasing the total degree.

And one of them was actually spotted by Chudnovsky and Chudnovsky, and what is interesting about it is that it doesn't depend on the curve constants.

But for maximum speed computations this formula is also not so helpful for us. I said that these maps, considered as addition formulas, have problems. Let's look at those problems. So if I have two affine points, (x1, y1) and (x2, y2), and I add these two points using these algebraic expressions, I say that if the result is a point at infinity, then the denominator is always zero, but the converse is not true. That is, once I substitute these coordinates, if I obtain a zero, it doesn't mean that adding these two points gives you a point at infinity. So we have to be careful here.

So I set up the parameters here as usual. Nothing really surprising, and the answer is just simple root solving here. I fix a point (x1, y1). I exclude the special case where x1 is 0. I will mention why later.

And I have the second point (x2, y2). If that point is inside this set here (I forgot to put the set notation, so let's say sequence), that is, one of the elements in the sequence, then this denominator is 0. And these are all four cases. There are no other cases.

And I really don't know yet which of these lead to a point at infinity and which lead to an affine result.

One of them is really interesting for cryptographers. That is this one. So it says that you can't use this denominator, the first formula, the minimal degree formula, for point doubling.

You can't add a point to itself using this formula, because you'll immediately get 0 in the denominator, and you can't divide by 0, so you have a problem. And we need a solution. But we have a lot of those formulas. So look at this alternative formula; remember, this was Euler's formula.

This auxiliary part is due to Billet and Joye, from 2003.

And for this formula, the story's pretty much the same. I'm not going to tell it again. This should have been 1 minus d x1 squared x2 squared. That's a mistake.

In the same setup here, if you write them down, you'll see that the sequence for this denominator is a different one. And whenever this denominator fails, for these two cases, I can always use the previous formula to cover those cases.

And whenever the first formula, the minimal degree formula, fails, for these two cases, I can use the alternative formula to cover those.

But I always have a problem with these two. Both formulas fail at these points. And this is actually good news, because just by counting I can say these sums are points at infinity: I have two points at infinity and nothing else is left. It has to be.
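The counting argument can be reproduced by brute force on a toy curve. The sketch below assumes the curve shape y^2 = d*x^4 + 2*a*x^2 + 1 and the dedicated and unified affine formulas as published in the literature on extended Jacobi quartics (the slides are not in the transcript); the prime and constants are hypothetical, with d chosen to be a square so that both points at infinity are rational.

```python
p = 101                      # hypothetical small prime
a, d = 3, 5                  # hypothetical constants; 5 is a square mod 101

def inv(z):
    """Inverse in F_p; raises ValueError when z is 0 mod p."""
    return pow(z % p, -1, p)

def on_curve(P):
    x, y = P
    return (y * y - (d * x**4 + 2 * a * x * x + 1)) % p == 0

def add_dedicated(P, Q):
    """Minimal degree formulas; the denominator vanishes when P == Q."""
    (x1, y1), (x2, y2) = P, Q
    t = inv(x1 * y2 - y1 * x2)
    x3 = (x1 * x1 - x2 * x2) * t % p
    y3 = ((x1 * x1 + x2 * x2) * (y1 * y2 - 2 * a * x1 * x2)
          - 2 * x1 * x2 * (1 + d * x1 * x1 * x2 * x2)) * t * t % p
    return x3, y3

def add_unified(P, Q):
    """Formulas with denominator 1 - d*x1^2*x2^2; these also double."""
    (x1, y1), (x2, y2) = P, Q
    t = inv(1 - d * x1 * x1 * x2 * x2)
    x3 = (x1 * y2 + y1 * x2) * t % p
    y3 = ((y1 * y2 + 2 * a * x1 * x2) * (1 + d * x1 * x1 * x2 * x2)
          + 2 * d * x1 * x2 * (x1 * x1 + x2 * x2)) * t * t % p
    return x3, y3

def add(P, Q):
    """Try the dedicated formula, fall back to the unified one.
    None means both denominators vanish (the sum is a point at infinity)."""
    for formula in (add_dedicated, add_unified):
        try:
            return formula(P, Q)
        except ValueError:
            continue
    return None

def agree(P, Q):
    """True unless both formulas are defined at (P, Q) and disagree."""
    try:
        return add_dedicated(P, Q) == add_unified(P, Q)
    except ValueError:
        return True

points = [(x, y) for x in range(p) for y in range(p) if on_curve((x, y))]
both_fail = {P: [Q for Q in points if add(P, Q) is None] for P in points}
print(sorted({len(qs) for qs in both_fail.values()}))
```

On this toy curve every affine P with x1 != 0 has exactly two partners Q for which both denominators vanish, which matches the count of two points at infinity.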

So the story is pretty much clear now. With some more work we will just put everything together into an algorithm here.

At the start of the talk I said I would treat Jacobi quartic curves as elliptic curves. That's why I added omega 1 and omega 2 to the point set, and I enforce the existence of omega 1 and omega 2 by selecting the parameter d to have a square root in the underlying field.

So the algorithm goes like this. It starts with a lot of if-then-else bits, which basically correspond to the two lemmas I just showed. The exception handling part is here. As you can see, if the first point is, let's say, omega 1, and the second one is omega 1 as well, the output is (0, 1), and it goes on in that fashion.

And in some cases the output is not a point at infinity. So we have simplified formulas for that, coming from these lemmas. And these are only a very limited number of cases.

I mean, there is a finite number of cases that we have to care about. The rest is covered by these two formulas here. And what is nice about the second formula is that if d is a non-square in K, we don't have points at infinity.

And all of these branches just disappear. It is enough to have only these two lines. This is analogous to what we know for Edwards curves, for which the complete laws were proved by Bernstein and Lange, and this is just the analogous result.

And also what is nice about this formula is that, I'm not going to give the [inaudible] of the proof here, but if the subgroup that you're using only has points of odd order, the denominator also never becomes 0. That is, it's again enough to have these two lines to at least have your arithmetic computed without any exceptions.
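The completeness claim for a non-square d is easy to test exhaustively on a toy example. The sketch below assumes the curve shape y^2 = d*x^4 + 2*a*x^2 + 1 and the unified formulas from the literature on extended Jacobi quartics; the prime and constants are hypothetical, with d = 2 chosen because 2 is a non-square modulo 101.

```python
p = 101                      # hypothetical small prime
a, d = 3, 2                  # hypothetical constants; 2 is a non-square mod 101

def is_square(z):
    """Euler's criterion for a nonzero z in F_p."""
    return pow(z, (p - 1) // 2, p) == 1

def on_curve(P):
    x, y = P
    return (y * y - (d * x**4 + 2 * a * x * x + 1)) % p == 0

def add_unified(P, Q):
    """Unified addition; pow(..., -1, p) raises ValueError on a zero denominator."""
    (x1, y1), (x2, y2) = P, Q
    t = pow((1 - d * x1 * x1 * x2 * x2) % p, -1, p)
    x3 = (x1 * y2 + y1 * x2) * t % p
    y3 = ((y1 * y2 + 2 * a * x1 * x2) * (1 + d * x1 * x1 * x2 * x2)
          + 2 * d * x1 * x2 * (x1 * x1 + x2 * x2)) * t * t % p
    return x3, y3

points = [(x, y) for x in range(p) for y in range(p) if on_curve((x, y))]
# Every pair is added, doublings included; no exception handling is needed.
table = {(P, Q): add_unified(P, Q) for P in points for Q in points}
print(is_square(d), len(points), len(table))
```

The reason no denominator vanishes is visible in the formula itself: 1 - d*x1^2*x2^2 = 0 would make d equal to the square of 1/(x1*x2).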

And what is nice about the first one is that again if you have -- if you're computing in an odd order subgroup, again, the only exceptional case that you will hit is going to be this one, the point doubling, the obvious case.

You don't have to worry about other cases. So as long as you have special doubling formulas, you can use just two of them. For doubling you use that special formula. It is not here. And for all other cases you can use these minimal degree algebraic expressions.

The total degree is if you count here for the numerator, it is two. For the denominator, it is two as well. So the sum is four.

But for the second one, the sum is six. That's why we call this one the minimal degree formulas.
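For reference, the degree count can be read off the x-coordinate expressions. The slides are not part of the transcript, so the two formulas below are assumptions taken from the published literature on the extended Jacobi quartic y^2 = d x^4 + 2a x^2 + 1.

```latex
\begin{align*}
  % dedicated (minimal degree): degree 2 over degree 2, total 4
  x_3 &= \frac{x_1^2 - x_2^2}{x_1 y_2 - y_1 x_2}, \\
  % unified: degree 2 over degree 4, total 6
  x_3 &= \frac{x_1 y_2 + y_1 x_2}{1 - d\,x_1^2 x_2^2}.
\end{align*}
```

The two fractions agree on the curve, because substituting the curve equation gives (x_1 y_2)^2 - (y_1 x_2)^2 = (x_1^2 - x_2^2)(1 - d x_1^2 x_2^2).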

So this was the second part of the talk. Now we will have a very quick look at inversion-free point addition, because for most implementations we don't like inversions. Inversions are really expensive. I'm a computer engineer myself, and I like implementation a lot.

And for most implementations, if you write your code in C, most of the time inversion is not that expensive in comparison to multiplication. But if you optimize everything to its limits, if you do things like selecting special fields with low Hamming weight, or if you do assembly optimizations, the I/M ratio just goes up. I mean, inversion is really expensive.

The best ratio I could hit in my implementations was 121. So 121 multiplications cost as much as just one inversion. So in the rest of the talk I'll assume that inversion is really expensive and that we always want to get rid of even one inversion.

So it is typical to use some sort of projective coordinates in cryptographic applications, because they naturally eliminate the need for inversion throughout your scalar multiplication operation. In the end, if you need the result in affine coordinates, you can do the final inversion if you want.
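A minimal sketch of the idea follows. It is not one of the optimized coordinate systems from the talk: it is a plain homogenization of the assumed unified affine formulas in weighted coordinates (X : Y : Z) with x = X/Z and y = Y/Z^2, on a hypothetical toy curve y^2 = d*x^4 + 2*a*x^2 + 1 with d a non-square so that no exceptions occur. The addition uses no inversion; a single inversion appears only in the final conversion.

```python
p = 101                      # hypothetical small prime
a, d = 3, 2                  # hypothetical constants; 2 is a non-square mod 101

def on_curve(P):
    x, y = P
    return (y * y - (d * x**4 + 2 * a * x * x + 1)) % p == 0

def add_affine(P, Q):
    """Unified affine addition: one field inversion per call."""
    (x1, y1), (x2, y2) = P, Q
    t = pow((1 - d * x1 * x1 * x2 * x2) % p, -1, p)
    x3 = (x1 * y2 + y1 * x2) * t % p
    y3 = ((y1 * y2 + 2 * a * x1 * x2) * (1 + d * x1 * x1 * x2 * x2)
          + 2 * d * x1 * x2 * (x1 * x1 + x2 * x2)) * t * t % p
    return x3, y3

def add_projective(P, Q):
    """Same addition with x = X/Z, y = Y/Z^2: multiplications only."""
    (X1, Y1, Z1), (X2, Y2, Z2) = P, Q
    xx, zz = X1 * X2 % p, Z1 * Z2 % p
    xz = xx * zz % p
    X3 = (X1 * Z1 * Y2 + Y1 * X2 * Z2) % p
    Z3 = (zz * zz - d * xx * xx) % p
    Y3 = ((Y1 * Y2 + 2 * a * xz) * (zz * zz + d * xx * xx)
          + 2 * d * xz * (X1 * X1 * Z2 * Z2 + X2 * X2 * Z1 * Z1)) % p
    return X3, Y3, Z3

def lift(P, z):
    """Represent the affine point P with an arbitrary nonzero Z = z."""
    x, y = P
    return x * z % p, y * z * z % p, z % p

def to_affine(P):
    """The single inversion, deferred to the very end."""
    X, Y, Z = P
    zi = pow(Z, -1, p)
    return X * zi % p, Y * zi * zi % p

points = [(x, y) for x in range(p) for y in range(p) if on_curve((x, y))]
P, Q = points[3], points[7]
print(to_affine(add_projective(lift(P, 7), lift(Q, 13))) == add_affine(P, Q))
```

In a scalar multiplication the intermediate points stay in the (X, Y, Z) form, so the cost of the inversion is paid once rather than once per addition.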

So for the Jacobi quartic curve that we just saw in this talk, I tried to count operations when we embed it into some sort of projective space. Of course, there are infinitely many of them, but the meaningful ones, according to me, are these.

And if you do your operations in affine coordinates, then you always have to perform this inversion operation here. So I eliminate the first line directly. There are other cases, some of which appeared in the literature, some not.

For example, this one is operation count in projective space. This is my operation count here.

And if you sort them, we see that the best alternative appears when we embed our curve in P3. For general addition, we only need seven multiplications, three squarings and one multiplication by a curve constant for this curve when a is equal to minus 1 over 2.

So this was the addition. Let's look at the doubling. I am not going to write down the doubling formulas or the projective addition formulas. We'll just have a look at these operation counts.

What is surprising here is that the best doubling that we can get appears in the case of homogeneous projective coordinates. So we have a kind of problem here.

Because the best doubling is in homogeneous projective coordinates, but the best addition is in something completely different from that.

A nice observation here is that if I have a point which is a quadruple here, (X, Y, T, Z), then as soon as I discard the T coordinate, what remains, (X, Y, Z), actually satisfies the homogeneous projective equation.

So therefore, for practical reasons, I will think of the coordinate T as an external coordinate, redundant data that I have to carry along.

In this fashion, I will be calling that sort of coordinate system extended coordinates, and these ones projective coordinates.

So if you have implemented scalar multiplication before, that is, computing k times P where P is your point, k is your scalar and both are variable, then you will have seen that the algorithm goes with multiple doublings. After multiple doublings you do just one addition, and you continue with multiple doublings.

That is, if you implement your algorithm in a windowed fashion. So these are technical bits, but this is the story. It is the same for all curve models.
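The doubling-heavy structure is already visible in the simplest scalar multiplication loop. The sketch below is a plain left-to-right binary double-and-add, not the windowed method mentioned in the talk, and it uses the assumed unified affine formulas on a hypothetical toy curve y^2 = d*x^4 + 2*a*x^2 + 1 with d a non-square, so one function serves for both doubling and addition.

```python
p = 101                      # hypothetical small prime
a, d = 3, 2                  # hypothetical constants; 2 is a non-square mod 101
IDENTITY = (0, 1)

def on_curve(P):
    x, y = P
    return (y * y - (d * x**4 + 2 * a * x * x + 1)) % p == 0

def add(P, Q):
    """Unified affine addition, exception-free here because d is a non-square."""
    (x1, y1), (x2, y2) = P, Q
    t = pow((1 - d * x1 * x1 * x2 * x2) % p, -1, p)
    x3 = (x1 * y2 + y1 * x2) * t % p
    y3 = ((y1 * y2 + 2 * a * x1 * x2) * (1 + d * x1 * x1 * x2 * x2)
          + 2 * d * x1 * x2 * (x1 * x1 + x2 * x2)) * t * t % p
    return x3, y3

def scalar_mul(k, P):
    """Return (k*P, number of doublings, number of additions)."""
    R = IDENTITY
    doublings = additions = 0
    for bit in bin(k)[2:]:          # most significant bit first
        R = add(R, R)               # one doubling per scalar bit
        doublings += 1
        if bit == "1":              # one addition per nonzero bit
            R = add(R, P)
            additions += 1
    return R, doublings, additions

points = [(x, y) for x in range(p) for y in range(p) if on_curve((x, y))]
P = next(pt for pt in points if pt[0] != 0)
print(scalar_mul(0b1000001, P)[1:])   # many doublings, only two additions
```

With a windowed method the additions become rarer still, which is why the cost of doubling dominates and why it pays to switch to the faster doubling coordinates between additions.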

So what we do is for all those repeated doublings, except the last one, we perform the doubling here, using the fastest formulas we have. As soon as I need addition, I will generate the necessary coordinate, that extra coordinate on the fly. So once I do that, I will need another operation. Multiplication or squaring, doesn't matter. But actually it is possible to do better than that.

That is, we can finish that operation, that is, doing the doubling plus generating the extra coordinate, in just eight squarings. But, still, this one is the fastest. So for repeated doublings I use this formula. But for the last one I use this one. Now I'm ready to use the addition formula here, because I have all four coordinates in my hand.

I use this formula. What is nice here is that I really don't need to calculate the extended coordinate this time, because I'm going back to repeated doublings again. So although the addition here produces four coordinates for you, you only need three of them. Discard the T coordinate.

So you don't really have to do 7M + 3S + 1D. It's actually 6M + 3S + 1D. And I just shifted that extra squaring from here up to here, just to write the costs in a simpler way.

So it is here at the last line. I'll be explaining this one. These are basically the same as on the previous slide. The difference is that here I added the extended coordinates, the mixed coordinates.

So what I did in the previous slide was mixing homogeneous projective coordinates with extended projective coordinates. I was jumping from one to the other, just to have my arithmetic faster. And I denote it by Q to x. Q means quartic, Jacobi quartic. I didn't use J because it might have been mixed up with Jacobian coordinates for Weierstrass curves. That's the reason.

So the figure here tells me that for projective coordinates, I can do doubling in 3 and 4A or alternatively 2 and 5 and 7 A. Depends on your implementation, the one which is faster.

I have two sets of formulas. Unified means the second set of formulas I showed, with the denominator 1 minus d times x1 squared times x2 squared. That is called a unified formula, because it can be used for point doubling. And in the jargon the first set of formulas is called dedicated addition formulas. Dedicated addition formulas are usually slightly faster than the unified ones. At least this has been the story for all [inaudible] that I've studied so far.

And for Jacobi quartic curves the best coordinate system is the mixed coordinates here. So you only need this many operations for doubling, and effectively this many operations for addition.

So Jacobi quartic curves, although I gave examples over them, are not the fastest ones in most cases. We have this one. So this is a summary of literature results for Edwards form and [inaudible] form. So E means homogeneous projective coordinates. These are the operation counts by Bernstein and Lange from 2007, and this I here means inverted coordinates, which was a surprise, removing a multiplication from the addition here.

And I considered other cases here. Again, the notation is similar. E stands for Edwards, actually twisted Edwards, and this e stands for extended coordinates.

And for this coordinate system, the doubling is slow. That's a problem. Addition is faster than the previous ones. That is good news. And what I do is I apply the same solution.

I mix the coordinates. I jump from one to the other. I use the faster operation every time. So that is denoted by x, as with the extended coordinates.

What I also found out is that if you use the curve constant a equal to minus 1 instead of 1 or some other value, we can further decrease the timings for addition. We need only seven multiplications and one multiplication by a curve constant.

That is if you use the unified formulas for addition. And still we can benefit from fast doublings, as you can see here. And in the last bit, I found dedicated formulas, minimal degree formulas, and these formulas can eliminate another multiplication by a curve constant here. So in total you only need eight multiplications for a general point addition.

And further, there is what most publications call mixed addition. That is, if you have the Z2 coordinate, or either of these coordinates, equal to 1, then it is enough to do just seven multiplications.

And so far the Edwards form has been the fastest for implementations, as far as I know. I have analogous results for the Jacobi intersection form. That is representing an elliptic curve as an intersection of two quadric surfaces.

And for these ones I have results here. Most of them are not documented in the literature, but you can find them in my thesis, which is available through my website.

And it is quite interesting to remove a lot of multiplications here, as you can see. And doubling is as fast as doubling for Jacobi quartics and for Edwards. But still it can't beat Edwards.

And the cubic form, the Hessian form, was the most stubborn one for speedups. I really did a lot of crazy things, but I really couldn't manage a lot. The classical algorithm for doubling was 6M + 3S. I was able to find a 7M + 1S alternative and a 3M + 6S alternative.

Similarly, the conventional algorithm for addition requires 12 multiplications. But I was able to trade one multiplication for a lot of additions here.

And I have the extended version. I have two versions of this: one of them uses six coordinates to represent a point and the other uses nine coordinates to represent a point. You may say it's going to take a lot of space in your registers, but it is really nothing. It is just a few bytes that you require.

So with modern architectures, with desktop processors, it's really nothing to have nine coordinates to represent your point, unless you have a very restricted device. And another stubborn model was the short Weierstrass form. I really couldn't find anything faster, but I found unified addition formulas which are faster than the previous ones. And unified addition formulas had not been tried for Jacobian or Chudnovsky Jacobian coordinates. I've also developed those formulas, which are faster than the ones we have here for projective coordinates.

And for a scalar multiplication where everything is variable, I can say that if we fix these curve constants, then when I started this research you needed this many multiplications per bit when doing the scalar multiplication. For example, for 256 bits, for each bit you need to do this many multiplications on average.

And with the new formulas, they are shifted to these newer ones. What is nice here is that we were bounded to this efficiency back in 2005. And earlier. And now we hit these values, which are very, maybe not surprising, but I'm happy to see these values.

And I believe that maybe in the future it will be possible to make them even faster. Here is an implementation. The source code is open for this implementation. You can download it from my website again. These are the cycle counts I measured on my PC at home. And these are the curve models. The parameters are selected to make them faster. And using those newer formulas, you need this many operations.

This is just an implementation to compare the speed of different forms. It doesn't really aim to set a new [inaudible], but this one was at that time very close to the [inaudible], or maybe it was the best. I can't remember.

But what is important here is that by simply changing, if possible, your curve representation from Weierstrass to twisted Edwards, you can speed up your operations and decrease your cycle counts a lot.

Thanks.

[applause]

>>: Okay. Any questions?

>>: So have you considered the Montgomery ladder to reduce the number of coordinates and speed up the [inaudible]?

>> Huseyin Hisil: I tried to work on that but I couldn't have any success. Whatever is out there is the best. An alternative version came up in 2007 or something like that, using Kummer surfaces. You can have a look at that. There's the Explicit-Formulas Database, which is the easiest way to find those formulas.

I really couldn't make them faster. But if you ask whether they might be faster than these ones, I say it depends on the implementation. For some kinds of implementations you can make really nice assumptions or restrictions about the way you do arithmetic there, which can be competitive with these results. But I truly believe that the fastest formulas for Edwards can beat Montgomery-type ladders, unless you fix some constant to 1.

>>: What are the rationality conditions?

>> Huseyin Hisil: Can't hear.

>>: An elliptic curve you can put in Weierstrass form, but what are the rationality conditions, like having 4-torsion, 3-torsion?

>> Huseyin Hisil: Okay. All right. So to keep it simple I didn't really mention that. If you have a point of order 2, you can represent your curve in extended Jacobi quartic form. If you have three points of order 2 you can put it in a simpler form of Jacobi quartic. I think it was the same story for Jacobi intersection form.

If you have a point of order 4 you can write it in Edwards form. What else? Weierstrass covers all cases anyway. Yeah. I really didn't prepare those cases here, because the way I'm looking at it is which one of those things is the fastest. Having small torsion, I really don't care about it; as long as I have a big prime order subgroup which can resist all known attacks, the most important thing is the speed for me.

I hope that's an answer.

>>: With no further questions, then, we'll reconvene at 4:45 and we'll thank the speaker again.

[applause]
