>> Josh Benaloh: Welcome everybody it is a pleasure to have Olivier Pereira.
I botched your name. I thought I could do it right, Pereira, right? Thank
you, my apologies. Olivier is from the Universite Catholique de Louvain in
Belgium, where he is a professor.
Olivier has done a lot of great work on election systems, especially the
Helios system where he has done a great deal of development and was principal
in getting it used in an election with I think 25,000 eligible voters,
something like 5,000 people actually voted for the university president
there, so this took an extraordinary amount of work to make it happen, but
this is the largest ever real use of an end-to-end verifiable voting system.
And we are all very pleased about this. Olivier has also done a lot of work
in related systems and making privacy better and I think that is what we are
going to hear about today. So, welcome.
>> Olivier Pereira: Thank you very much for this very kind introduction. I
will indeed talk about privacy voting. Please feel free to interrupt me at
any point if you find it useful. So this is joint work with David Bernhard,
Véronique Cortier, Edouard Cuvelier, Thomas Peters and Bogdan Warinschi.
So I would like to talk about privacy in general. It was introduced in most
countries in the 19th century as a requirement for public elections, in
order to avoid the kind of situation with public voting where you have a lot
of people strongly suggesting to voters to vote in one sense or another. And
since the vote was public they could just actually check what the voter was
doing.
So in many places in the 19th century people decided to introduce secret
ballot elections in order to prevent coercion and bribery. It has also some
other benefits, but I will comment on that maybe at some later point. So we
want to have secret ballots, but what does it mean?
The first thing with voting is that the secrecy of the ballot is not
something that is completely absolute. If all the voters vote in the
same way, in most of the voting systems that are currently used we will know
that all voters supported the same candidate just from the outcome of the
votes. And that's not seen as a breach of privacy by the voting system.
So it's not something like an absolute privacy definition like we have for,
say, the secrecy of an encryption scheme.
We want something that's weaker. Another approach to defining the privacy of
the votes would be inspired by the work on secure function evaluation. The
idea would be to say something like: the voting system should not leak
anything about the votes besides the outcome of the election as it's written
in the bylaws. That's certainly a good approach for defining the secrecy
of cryptographic voting, but we would like to say more about privacy, in the
sense that we would like to see how much a voting system actually leaks when
it's used in a normal way. We would like to know how much the outcome of the
election actually leaks in terms of information about the votes.
Another approach for defining vote privacy was proposed by [indiscernible]
last year. It measures the privacy of a voting system as the maximum
probability that an adversary has to win a game where he has to
decide whether a voter voted in one way or in another way. That looks like
[indiscernible] definition and [indiscernible] for encryption.
We don't require a negligible winning probability in the game there, for that
reason essentially. But it seems that this definition is too strong in many
cases. For instance if you have approval voting on a complex ballot with, I
don't know, 50 choices, that makes a lot of different possible ballots. You
will not have enough voters to make sure that all of the possible ways of
filling the ballot are represented. And if you use that definition in that
context it will just tell you that you have no privacy at all, which is not
quite what we want.
We want to say, “Okay, we have this set of ballots and we still don’t know
which of those ballots was submitted by that voter”. So the metric should
say that there is some amount of privacy left.
So what is it that we really want to measure --?
Yeah?
>>: Before you get into privacy, what is the election outcome? Is it that so
and so won by a lot, the majority, and you stopped counting? If there are
1,000 votes and after 700 votes and you have got 650 can you stop counting
there and then? Can you make a decision? And then less information leaks.
>> Olivier Pereira: It could be. It depends on how you decide to have the
regulations. Typically the election bylaws are required to say how many
votes each of the candidates receives.
>>: What I don't understand is why [indiscernible]. They only get counted if
they can change the outcome.
>> Olivier Pereira: Okay, that might be part of the definition.
>>: [indiscernible].
>> Olivier Pereira: Certainly.
I will have examples on that line later.
>>: But as long as we are stopping, I can't resist mentioning one of my
favorite examples, which is an election that's almost unanimous except for
one vote. Because in that case that one person knows exactly how everybody
voted, but nobody else is quite sure. And that's deduced just from the
outcome.
>> Olivier Pereira: Yeah, that's true. Okay, so we want to measure privacy
for any voting system. So what are the kinds of things that we would like to
know? One first question we may want to ask is, "With what probability
can my adversary guess my vote?" That's the kind of thing we can measure
with min-min-entropy. That might be a natural approach for that.
The second question is, "In how many ways can I pretend that I have voted?"
So in case somebody wants to force me to vote in one way or another, I would
like to know how much freedom I have in claiming that I voted in one
way or another. That's the kind of thing that's measured if you use
Hartley-entropy. So we seem to have at least two natural questions here and
two answers that look like entropies, so entropy seems to be an interesting
direction.
So we explored that direction. I will just introduce a few notations before
coming to our definition. We suppose that we have a distribution D for
the honest votes. We may know the distribution or we may not know it. It
will depend on the context, but if we know it we can capture it. Then we
have a target, which is what the adversary would like to learn about the
honest votes. So the target is a function that starts from the support of the
distribution of the honest votes and goes to something, whatever. It's what
the adversary will try to guess.
So one example would be: okay, the target of the adversary is to guess the
vote of the [indiscernible] voter. Another target of the adversary would be
to say, "I would like to know whether those two guys voted in the same way or
not".
Another target would be, "I would like to know if the population in that
specific area voted more in one way than in any other way". So that's
something extremely general.
And then we have this function rho; that is any function that takes all the
votes and provides the outcome. So that might be "each of the candidates
received that many votes", or it can be "this guy is the winner". It can be
a lot of things.
And then eventually we have the view of the adversary, which is a random
variable. Once we fix one specific adversary, one distribution on the honest
votes and a specific protocol that we play for the election, that determines
the view of the adversary here.
And from that we propose a family of measures of privacy which are as
follows: we have here a measure M with a subscript X, which can be a variant
of that measure; we will come to that later. We want to measure the amount of
privacy on this specific target, for this specific distribution of the honest
votes, in that specific voting system. And we define it as the minimum, taken
over all possible adversaries, of the amount of entropy that is left on the
target when the adversary has this view, which includes the outcome of the
election.
So this FX is something that is left open for the moment, but [indiscernible]
is how much information I have about the target given the view of the
adversary. We take the worst possible adversary.
So the question now is, "How do we define this FX?" The general idea would
be to say FX will be some sort of conditional Rényi entropy of A given B.
That's very fuzzy, so let's go to some more concrete examples.
So here are possible choices: if you go for [indiscernible] entropy that
doesn’t seem to lead to any practical interpretation; and if we go for
traditional [indiscernible] entropy that doesn’t seem to provide anything
useful either.
One notion that seems to be informative is average min-entropy. That's a
notion that comes from a paper by [indiscernible]. Essentially that gives
you, expressed in bits, the probability that the adversary guesses the
target. You just have the traditional min-entropy here and you take the
expectation over all possible views that can happen on that side. Okay,
so that's one measure.
Another one that seems to lead to a practical interpretation is what we call
min-min-entropy. We still have min-entropy, but here we consider the
worst possible view that is given to the adversary. So that captures the
case where we would like to know the amount of privacy that the voters keep,
even if all honest voters make the exact worst possible choice in terms of
privacy.
So if all the voters vote for the same candidate we would like to know how
much entropy is left on the target. So what's the probability that the
adversary guesses the target in the worst possible case?
A third possible notion is what we call min-Hartley-entropy. So that's again
a worst case. We take the minimum over all possible choices by the voters,
but now we simply count the number of possible values that the target can
take in this worst case.
So an interesting thing about this measure is that it involves no
probabilities at all. It just takes into account the support of the
distributions, so the possible votes essentially. We do not need to assign
any probability to any specific vote. So when the distribution is not known,
that's a convenient measure to use.
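The three measures just described can be sketched directly in code. This is a hypothetical illustration, not code from the paper: each measure is computed from a joint distribution over (target, view) pairs, given as a plain dictionary, and the helper name `_conditionals` is my own.

```python
import math
from collections import defaultdict

def _conditionals(joint):
    """joint: dict mapping (target, view) -> probability.
    Returns, per view, its probability and the conditional target distribution."""
    by_view = defaultdict(dict)
    for (t, v), p in joint.items():
        by_view[v][t] = by_view[v].get(t, 0.0) + p
    return [(sum(d.values()), {t: p / sum(d.values()) for t, p in d.items()})
            for d in by_view.values()]

def average_min_entropy(joint):
    # -log2 of the adversary's expected best guessing probability over views.
    exp_guess = sum(pv * max(dist.values()) for pv, dist in _conditionals(joint))
    return -math.log2(exp_guess)

def min_min_entropy(joint):
    # Worst view: the one that gives the adversary the best guessing probability.
    worst = max(max(dist.values()) for _, dist in _conditionals(joint))
    return -math.log2(worst)

def min_hartley_entropy(joint):
    # Worst view again, but only counting how many target values remain possible.
    fewest = min(len(dist) for _, dist in _conditionals(joint))
    return math.log2(fewest)
```

When the view is independent of the target all three measures report full entropy; when the view determines the target they all report zero, matching the intuition above.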
Okay, so now let's see some simple examples of the different feelings we get
from those three measures. We will take something extremely simple:
approval voting with one single question. So I answer either yes or no, a
zero or a one. Three voters, and I assume that they vote uniformly at random.
And the target for the adversary is the vote of the first voter. So we
would like to know how the first guy voted.
And we consider four different tallying rules. In the first case the
tally is just a constant. It's something that is independent of the choices
of the voters. And in that case, whatever measure we take, as expected we get
one bit of entropy. The voter chooses uniformly between
two different values, so one bit of entropy.
The second rule is just saying who the winner is. So we tell whether there
are more one votes than zero votes, which is essentially what you proposed.
Then we see a few changes.
If we start from here, which is the easiest solution, we see that in terms of
Hartley-entropy we still have one bit. Because if I know that there are more
votes for one than votes for zero there is still a possibility that the first
voter voted either one or zero. Both options remain, so we still have one
bit of min-Hartley-entropy.
If we go for the min-min-entropy then we see that we have a substantial
decrease. This is essentially because if I see that outcome I have a fairly
good probability of guessing how the first voter voted. What happens is that
if I know that there are more ones than zeros, I can see that there
are essentially three possible ways of getting more ones than zeros that give
a one vote for the first voter and a single one that gives a zero vote for
the first voter. So the .4 is essentially my [indiscernible] of three out of
four.
And if you compute the average case you obtain the same numbers by a
different computation. So in the average case I have one case where, no,
actually this is just symmetric. We know who the winner is and we do not
know much more, and since this is completely symmetric we have the same
numbers on both sides.
So the third tallying function is: how many yes votes happened and how many
no votes happened? This time the min-Hartley-entropy becomes zero because
there is the case where everybody votes one or everybody votes zero. So
since this is worst case we have no entropy left on this side. And in terms
of min-min-entropy we have zero as well, for the same reason: there is a
worst case in which I can guess with certainty.
But now in the average case I still have some entropy because there are cases
where I cannot guess. I can guess, but I can't be sure. So I keep some
amount of entropy in the average case. And eventually the last tallying
function is the one that gives all the votes of everyone, and I have no
entropy left anywhere. So these seem to correspond to our intuition at least.
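The numbers in this three-voter example can be reproduced by brute-force enumeration. The sketch below is illustrative only; the function and tally names are mine, not from the talk's slides.

```python
import math
from itertools import product
from collections import defaultdict

# Three voters answering one yes/no question, uniformly at random;
# the adversary's target is voter 1's vote.

def worst_case_min_entropy(tally):
    """min-min-entropy of voter 1's vote, over the worst tally outcome."""
    by_outcome = defaultdict(lambda: defaultdict(float))
    for votes in product([0, 1], repeat=3):
        by_outcome[tally(votes)][votes[0]] += 1 / 8
    worst = max(max(d.values()) / sum(d.values()) for d in by_outcome.values())
    return -math.log2(worst)

def worst_case_hartley_entropy(tally):
    """min-Hartley-entropy: target values still possible in the worst outcome."""
    by_outcome = defaultdict(set)
    for votes in product([0, 1], repeat=3):
        by_outcome[tally(votes)].add(votes[0])
    return math.log2(min(len(s) for s in by_outcome.values()))

constant = lambda v: 0                 # tally reveals nothing
winner = lambda v: int(sum(v) >= 2)    # only who the winner is
counts = lambda v: sum(v)              # how many ones and zeros
full = lambda v: tuple(v)              # every individual vote
```

Running the measures over the four tallies reproduces the pattern from the talk: one bit for the constant tally; for the winner-only tally, about 0.415 bits of min-min-entropy (a 3/4 guessing probability) but still one bit of Hartley entropy; and zero for the counts and full tallies.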
So we can measure quite a lot of things. We can compare different tallying
rules using this kind of metric and see how they behave. And we actually
decided to go for one practical example. We considered the Scantegrity
voting system. Scantegrity has been used in municipal elections in Takoma
Park, Maryland, in 2009 and 2011. It's an end-to-end verifiable voting
system, and it's a paper-based voting system.
So the ballots look like traditional paper ballots. You have bubbles in
which you indicate your choices. The only difference is that when you want
to pick a candidate you use a special pen that reveals some invisible
ink. So when you select a candidate, some letters will appear here that make
a code. And the idea is that this code is something that
you should write down on the take-home receipt, and you will be able to use
those codes to verify that your ballot is properly recorded. And there will
be an audit trail that will allow you to verify that the outcome is correct.
So in those elections what was required by the bylaws is that the official
outcome is the number of votes that each of the candidates received. So
that's one level of privacy that we may want to measure.
But actually if you look at the effect of using Scantegrity we see that
more information is revealed. One thing that is revealed is
each individual ballot, in the clear. These are shuffled, so
you do not know which ballot comes from whom. But you see the
ballots essentially.
A second thing that you can see is that you have those take-home receipts.
These are supposed to be something that you can show to everyone; the
system is supposed to be receipt-free. So if you see someone with a certain
number of codes on their receipt you know how many bubbles were filled by
the person.
So that gives you, again, a bit more information.
>>: In Scantegrity isn't it the case that the take-home receipt is filled out
by hand, so they could add extra codes or wrong codes?
>> Olivier Pereira: They could, they certainly could. I am not sure that
adding extra codes would be extremely effective because there are, if you
checked on the bulletin board, you will see which code happened and did not
happen. But --.
>>: It certainly eliminates some codes.
>> Olivier Pereira: That would be, yeah. It's not part of the instructions,
but you could. Okay. So we did some analysis. We took the Takoma Park 2009
election data. I just have some numbers gathered here, to give us some
sense. There were six different wards. I have the first, fifth and sixth
represented here. I just took three of them to show that the number of
voters differs quite a lot between the different wards.
So if you take the first ward, 470 votes had been collected, and in the
fifth ward only 85 votes had been collected. There were
two questions for each ward, so question zero and question one. And then we
can check the official outcome, and we see that actually in all the wards
all candidates received at least one vote.
So we have the maximum amount of entropy everywhere: 6 bits of entropy
corresponds to three possible choices, 3.17 bits of entropy corresponds to
two possible choices. You can rank the candidates, so that's why you have
more possibilities than just for approval voting. Then you can wonder what's
happening if you take a look at the receipts.
And then we see that the amount of entropy decreases quite a lot. In
particular we see that in ward five, for question zero, there is no entropy
left. This actually means that for at least one of the voters, if that
person shows her take-home receipt we know how that person voted, because
essentially that person is the only one to have filled a certain number of
bubbles on her ballot.
Given the number of possible ballots we would not expect this to happen, but
in practice people really do not vote uniformly. So there are some specific
ballots that happen all the time and very few people vote differently from
the others, in those elections at least.
So I think it's an interesting thing to maybe do some routine evaluation of
those entropy metrics on election data, to see how much the privacy might be
impacted by some maybe unexpected effect due to non-uniform voting or things
like that.
Okay. So that was essentially the first part of my talk. Now for the second
part I would like to talk about cryptographic voting. The metric so far is
just purely information theoretic.
>>: Before we go into that would you like to take some questions?
>> Olivier Pereira: Sure.
>>: I have a few things that I am trying to ask about. Is there
a way of accounting for partial information? If I know how some of the
voters voted, but not others, how does that figure into the election? Like
the example I gave: I know the one voter who voted yes and I know everybody
else voted no in that case, but if I have partial information --.
>> Olivier Pereira: If you have that, essentially you will include this in
the D.
>>: Okay, I see. So that's the distribution. That's the view. Isn't that the
view of the adversary or --? Is it the view or is it D?
>> Olivier Pereira: What you know in advance --. Well actually, if I
understand your question properly, you know that some guy voted in one
specific way. That's not given by the system. That's something you know in
advance or that is revealed outside of the system. So that's something that
we would typically include in the distribution of the honest votes.
>>: Okay. Another question I have is about another way that this could be
done; I am just trying to see how the two fit together. If you take an
election and you take two voters who voted differently in some way, and you
reveal their cleartext votes, and you ask whether an adversary has a
chance substantially greater than 50 percent, say, of determining which
voter cast which of those votes?
>> Olivier Pereira: Okay.
>>: Does that sort of capture --? You don't need to worry about the
information given by the outcome. You are just looking at what the
system reveals.
>> Olivier Pereira: It will tell you something --. It will not capture the
cases where, for instance, you have 2^40 ways of filling the ballots and only
a few voters, so you would like to say, "Okay, I do not have 40 bits of
entropy". So that part you will not have captured. But something inspired by
your suggestion will actually come a bit later.
>>: Okay, great.
>> Olivier Pereira: Any other questions? Okay. Okay, so now I would like to
move to cryptographic voting. Yeah?
>>: One thing, everything here applies to privacy of individual voters and
only individual voters.
>> Olivier Pereira: The target could be more than one individual voter.
>>: Sorry?
>> Olivier Pereira: The target could be anything.
>>: [indiscernible] where the threats are in groups of voters. I recall a
single [indiscernible].
>> Olivier Pereira: That makes sense. Well it could be captured --. Yeah, I
heard about stories in Sicily as well.
>>: [indiscernible]. It was very popular and they pulled out of Africa.
>> Olivier Pereira: Okay, so cryptographic voting protocols. I would like to
consider a class of voting systems, probably the most common class of
cryptographic voting schemes, which includes Helios for instance, and which
follows this general pattern. It is totally simplified, but that's the
general idea. We have a set of trustees; they generate a public key/private
key pair. They keep the private key for themselves and they broadcast the
public key. Then each of the voters takes the public key and, using some
encryption mechanism in a very broad sense, something that is essentially a
ballot preparation protocol, they take the vote, they take the public key
and they produce encrypted ballots, which might contain some zero-knowledge
proofs and all that stuff. And they send that to the trustees. And then the
trustees will run some tally protocol. That can also be a lot of things, and
they will publish the outcome of the election. So that's just the general
idea of many schemes.
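As a rough illustration of this pattern, here is a toy sketch of the encrypt-then-tally flow using exponential ElGamal with homomorphic aggregation, one common instantiation of the pattern (used in Helios-style systems). The parameters are deliberately tiny and insecure, and all function names are mine, not from any specific scheme.

```python
import random

# Toy parameters (a 23-element field); a real system uses 2048-bit+ primes.
p, q, g = 23, 11, 4          # p = 2q + 1, g generates the order-q subgroup

def keygen():
    x = random.randrange(1, q)           # trustees' private key
    return x, pow(g, x, p)               # (private, public)

def encrypt(y, m):
    """Exponential ElGamal ballot for a vote m in {0, 1}."""
    r = random.randrange(1, q)
    return (pow(g, r, p), (pow(g, m, p) * pow(y, r, p)) % p)

def combine(c1, c2):
    """Homomorphic aggregation: the product of ciphertexts encrypts the sum of votes."""
    return ((c1[0] * c2[0]) % p, (c1[1] * c2[1]) % p)

def decrypt_tally(x, c, max_votes):
    gm = (c[1] * pow(c[0], q - x, p)) % p   # c2 / c1^x, inverse via exponent
    for m in range(max_votes + 1):          # brute-force dlog: fine for small tallies
        if pow(g, m, p) == gm:
            return m
    raise ValueError("tally out of range")
```

Submitting the four ballots 1, 0, 1, 1 and decrypting the aggregate yields the tally 3 without ever decrypting an individual ballot, which is the point of this class of schemes.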
>>: You can probably attribute that to [indiscernible] in 81 when
[indiscernible].
>> Olivier Pereira: Okay, yeah that makes sense. Okay, I should add him on
top of the list.
So the problem if we want to apply our entropy measures is that we will just
not get anywhere, because if you take any information-theoretic measure, we
have an adversary who sees a public key and then an encryption of the vote
under that public key. There is no entropy left on the vote from that: the
vote is just completely determined by the ciphertext. So that's not quite
useful.
So for that reason we decided to move to some computational notion of
entropy. There are many ways of defining this, as set up in
[indiscernible]. We took one way; essentially we say that, for some kind of
measure of entropy F, we have an amount of computational entropy of at least
r on the value A given B if, and only if, there is some random variable B
prime that is computationally indistinguishable from B and we have at least
r bits of entropy using the corresponding information-theoretic entropy
metric, but with the distribution B prime given to the adversary.
So in particular, if you want to apply this definition to that example, we
can say that there are at least r bits of entropy on the votes here given
this ciphertext if there are at least r bits of entropy when the view of the
adversary contains an encryption of a constant instead of an encryption of
the votes, because traditional encryption schemes will make those two views
indistinguishable.
So we can then have a computational analog of the [indiscernible] metric.
This is exactly the same as before, except that we have a computational
metric here and a C over there. So that's just a definition. The problem is
that it seems much more complicated to evaluate a metric like this,
essentially because we need to find or to define somehow this B prime. And
that's something that you cannot just sample maybe; it's more complicated.
So we would like to have some convenient way of moving from computational
entropy to traditional information-theoretic entropy and at the same time
maybe removing all the stuff that is part of the protocol. So we would like
to get rid of the cryptography and say our protocol is good if we have an
amount of computational entropy that matches the level of
information-theoretic entropy that is obtained when we just see the tally.
So for that purpose we decided to go for a classical cryptographic secrecy
game, which is something that looks maybe a bit more complicated, but it's
much easier to use in cryptographic security proofs. So the game is as
follows: we have an adversary, and we have a challenger or an oracle. This
oracle chooses a key pair, just like in the [indiscernible] voting protocol,
and it chooses a bit beta. The goal of the adversary will be to guess beta.
So the first thing that the oracle does is send the public key to the
adversary, like in a regular election. And then the adversary can do two
different types of queries. The first type of query is a vote query. These
are essentially modeling the votes of the honest voters. Those vote queries
contain two different votes, which is kind of in the direction of your
intuition before.
Now, what the oracle will do with those two vote intentions is run the
normal ballot preparation procedure from the protocol. So he will produce
two encrypted ballots, and he will have two bulletin boards: a bulletin
board zero and a bulletin board one. And he will place those two ballots on
the two corresponding bulletin boards, but depending on the value of beta
the adversary will just see one of those two bulletin boards. And so that is
where the adversary sees something that is related to beta, and he needs to
guess, somehow, which of the two bulletin boards he sees. So that's one type
of query.
The second type of query is a ballot query. Here, that somehow accounts for
the adversary's votes. The adversary can just produce whatever he wants,
maybe not following the ballot preparation rules, and send those encrypted
ballots. They might just mean nothing at all, or could be computed from
things that the adversary sees on the bulletin board. He has a lot of
freedom for that. And now the oracle will just apply the protocol to check
whether this is a valid encrypted ballot, if the protocol describes some
verification procedure. And if it is valid he will post this ballot B on the
two bulletin boards. Again the adversary just sees the bulletin board beta
updated.
>>: In the first case the votes are prepared by the adversary?
>> Olivier Pereira: Here? No, by the oracle. These are "I vote for this
candidate", these are clear votes V0 and V1. And the oracle performs the
encryption, computes the proofs and all those things. Whereas here this is a
real ciphertext with the proofs in there. So essentially in the first case
the oracle knows what will be displayed on the bulletin board.
In that case that might not be so clear. So the adversary can place those
two queries many times in whichever order he wants, and then eventually when
he is done he says, "Okay, I would like to see the tally". And now the
oracle will just show the tally of bulletin board zero, always the same one.
That's essentially to prevent trivial distinctions that could happen. We
might be willing to say, "I want to see the tally of board beta", but if we
did that, with just one vote query containing two different votes, from the
tally the adversary would be able to learn beta. So he sees one outcome.
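A minimal sketch of the game mechanics just described, with the cryptography abstracted behind caller-supplied encrypt/valid/tally functions. All names here are hypothetical, and this is a simplification of the game, not the paper's formal definition: two shadow bulletin boards, the adversary sees board beta, and the tally always comes from board zero.

```python
import random

class PrivacyGameOracle:
    """Ballot-privacy game: two shadow bulletin boards; the adversary
    observes board beta, but the tally is always taken from board 0."""

    def __init__(self, encrypt, valid, tally):
        self.encrypt, self.valid, self.tally = encrypt, valid, tally
        self.beta = random.randrange(2)
        self.boards = ([], [])

    def vote_query(self, v0, v1):
        # Honest voter: encrypt v0 onto board 0 and v1 onto board 1;
        # return the new entry on the visible board.
        self.boards[0].append(self.encrypt(v0))
        self.boards[1].append(self.encrypt(v1))
        return self.boards[self.beta][-1]

    def ballot_query(self, b):
        # Adversarial ballot: posted on both boards if it passes validation.
        if self.valid(b, self.boards[self.beta]):
            self.boards[0].append(b)
            self.boards[1].append(b)
            return True
        return False

    def tally_query(self):
        return self.tally(self.boards[0])   # always board 0
```

Instantiating the oracle with a trivially insecure identity "encryption" shows why real encryption is needed: a single vote query on (0, 1) hands the adversary beta directly, and replaying the observed ballot makes even the tally reveal beta, which is exactly the attack discussed below.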
And we say that the voting system is secure for privacy if the probability
that the adversary guesses beta is essentially one-half plus something
negligible. So that looks like a traditional cryptographic game, more
convenient to prove using cryptographic tools. And one can actually show
that if a cryptographic protocol satisfies that definition then we have what
we expected.
So what we expected, more formally, was something like this: consider a
tallying function rho. So that's the outcome that's expected to be computed
from the votes. And from this we have an ideal voting protocol in which
voters submit their votes perfectly privately to the trustee, and the
trustee reveals rho of all the votes. So that's the ideal voting system.
And what we can prove is that if you have a voting protocol that satisfies
the cryptographic game from before, then for any target and any distribution
the metric that we have on it is equal to the metric if we evaluate just the
outcome of the election, so to the ideal protocol. So no cryptography on
that side, which is what we wanted to achieve.
We started with something that is computational here. In most practical
cases the computational metric will just be equal to the [indiscernible]. We
keep it because of the tallying function: we did not say anything about it.
It might be something that in itself contains crazy cryptographic stuff, and
in that case we still have something that is computational. But for most
practical tallying functions it will just be the information-theoretic
version.
Okay. So it's 2:10, yeah, maybe I can --. Now the next natural question
would be to say, "Okay, I have this cryptographic game; we would like to
know whether a voting system that follows these lines satisfies the
cryptographic game". So what do we need essentially in terms of encryption
at this step so that no adversary will be able to win here?
Okay. So the first intuition might be: okay, we just have two votes sent,
and the adversary needs to guess which one. This looks like IND-CPA security
or something like that, so maybe semantic security would be good enough.
Actually it's not the case. Here is the kind of thing that could happen.
First the adversary makes a vote query for a zero and a one. So what the
adversary sees on the bulletin board is an encryption of either zero or one,
depending on the value of beta.
The adversary says, "Okay, I will just take that ciphertext, I don't know
what it encrypts, but I will just use it in a ballot query". And then I will
ask for the tally. And we can see that the tally will be equal to beta. So
that's certainly something that we want to prevent. So how do we prevent
this?
One way would be to say that the [indiscernible] is to just reject all
duplicate ciphertexts. We do not want people to take a ciphertext and
resubmit it. The problem is that if we use just a traditional IND-CPA
encryption scheme it might very well be homomorphic, in which case people
can take a ciphertext and do some re-encryption procedure and transform one
ciphertext into another encryption of the same thing. And that would not be
detectable.
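This re-encryption attack can be demonstrated with a toy ElGamal sketch: multiplying in a fresh encryption of 1 re-randomizes a ciphertext into a different-looking encryption of the same vote, so an exact-duplicate check never fires. Tiny, insecure parameters, and all names are mine.

```python
import random

p, q, g = 23, 11, 4                    # toy group: p = 2q + 1
x = random.randrange(1, q)             # private key
y = pow(g, x, p)                       # public key

def encrypt(m):
    r = random.randrange(1, q)
    return (pow(g, r, p), (m * pow(y, r, p)) % p)

def rerandomize(c):
    """Multiply in an encryption of 1: same plaintext, fresh ciphertext."""
    s = random.randrange(1, q)
    return ((c[0] * pow(g, s, p)) % p, (c[1] * pow(y, s, p)) % p)

def decrypt(c):
    return (c[1] * pow(c[0], q - x, p)) % p   # c2 / c1^x, inverse via exponent

c = encrypt(3)          # 3 is an arbitrary group element standing in for a vote
c2 = rerandomize(c)     # passes an exact-duplicate check; same underlying vote
```

Here `c2` decrypts to the same value as `c` but its components differ, so a bulletin board that only rejects byte-identical ciphertexts would accept the replay.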
So we want something stronger from the encryption. We want to make sure that
people cannot transform a ciphertext that they see. And if we are sure that
they cannot transform the ciphertexts that they see, we can apply a rule
that says, "We reject the duplicates", because they are the only thing that
is left.
So what we want is some form of non-malleability from the encryption scheme.
Other people have looked at this before, and in at least two recent papers
the conclusion was that if you use IND-CCA2 encryption, which is a fairly
strong form of non-malleability, we are good. That's okay, that's good news.
At least we have a notion that we know how to satisfy with an encryption
scheme, and we have the security proof and all that.
The difficulty is that if you look at most of the actually deployed
protocols, or most of the protocols that have been proposed for doing voting
along the lines of the approach I presented before, people use an IND-CPA
encryption scheme, ElGamal or Paillier encryption, and then they use a sigma
proof of knowledge of the plaintext, usually something that is made
non-interactive using the Fiat-Shamir [indiscernible].
The problem if you use that is that we do not know how to prove this yields
IND-CCA2 encryption. This is well explained in a paper by [indiscernible].
There are some kinds of proofs, unusual I would say, but we do not know how
to prove that this provides [indiscernible].
So there are two approaches to that: either we say, "Okay, let's change all
those schemes and move to a real IND-CCA2 scheme", or let's check whether
this is actually necessary. Can't we have some weaker form of
non-malleability?
And what we proved is that actually we do not need IND-CCA2 encryption:
NM-CPA encryption, which is a much weaker form of non-malleability, is
actually good enough. And we also proved that if the strategy of the
authorities is to reject exact copies of ballots, then that form of
non-malleability is actually necessary as well.
So NM-CPA is essentially a characterization of what is needed to submit ballots in a voting system. So we have a weaker notion of non-malleability, and the good news is that it’s much easier now to show that if we have an IND-CPA encryption scheme, [indiscernible] or whatever, and we add those traditional sigma proofs [indiscernible], then we have NM-CPA security.
The conclusion of that, or a corollary of that for a practical voting scheme, if you take Helios for instance, is that if we do the Fiat-Shamir transform slightly differently from the way it’s done there, a very minimal change, then the ballot submission procedure used in Helios satisfies the cryptographic game that I presented before.
So, okay, that’s what I wanted to say about cryptographic privacy.
>>: Did you mention what happens if the transform is not done?
>> Olivier Pereira: Yeah, actually, in the way it’s done, not all the expected inputs are part of the input of the hash that is used in the Fiat-Shamir transform. And an impact of that is that the [indiscernible] proofs that are used there are not sound, and that essentially breaks [indiscernible] the scheme.
So it’s really an important thing to make that change in Helios. It’s not that we cannot prove the security of the scheme; it’s just that the scheme there is insecure, and we need to change it to have that proof working.
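The fix discussed here, hashing all expected inputs, is usually called the “strong” Fiat-Shamir transform. A minimal sketch with a Schnorr-style sigma protocol follows; the toy parameters and function names are mine, not from the talk:

```python
import hashlib
import secrets

# Toy group parameters (far too small for real use).
p, q, g = 2039, 1019, 4

def H(*vals):
    """Hash-to-challenge over everything the verifier will see."""
    data = b"|".join(str(v).encode() for v in vals)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def prove(x):                  # prove knowledge of x such that y = g^x
    y = pow(g, x, p)
    k = secrets.randbelow(q)
    a = pow(g, k, p)           # prover's commitment
    c = H(p, g, y, a)          # strong FS: the statement (p, g, y) is
                               # hashed too, not just the commitment a
    s = (k + c * x) % q
    return y, (a, s)

def verify(y, proof):
    a, s = proof
    c = H(p, g, y, a)          # recompute the challenge the same way
    return pow(g, s, p) == a * pow(y, c, p) % p

x = secrets.randbelow(q)
y, proof = prove(x)
assert verify(y, proof)
```

The “weak” variant the talk warns about would compute `c = H(a)` only; the proof is then no longer bound to the statement being proved.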
>>: Is that going to come up in another spec soon?
>> Olivier Pereira: Exactly, it’s on the way.
>>: Okay.
>> Olivier Pereira: Thank you.
>>: That’s what Helios has planned?
>>: And it reveals a vulnerability in previous uses of Helios --.
>>: Absolutely, yeah.
>> Olivier Pereira: Okay. One last thing I wanted to discuss is the balance between privacy and verifiability. So let’s just say that I have been involved in a fair amount of universally verifiable elections and discussed it with a lot of people who were considering adopting end-to-end verifiable voting systems.
And one of the big concerns that essentially made some people not go for universal verifiability was the presence of the bulletin board. So the first idea that people have when they want to go for an internet voting system, for instance, is: okay, we will just encrypt all the votes, we will send our votes to a server, and the server will announce the tally.
And now you would like to make that verifiable. Okay, that’s good. And now you also would like to display all the ciphertexts on a web page. Why would you want to do that? Because that’s what provides verifiability; from those ciphertexts we can verify that the outcome matches all the encrypted ballots that have been submitted by the people. Oh no, we don’t want that, because at some point your cryptography will be broken, or the machines will become too powerful, and everyone will be able to decrypt everything.
And we do not want the voting system to provide an attack vector that makes it easy for everyone to just download all the ciphertexts and wait until some cryptographic break happens that allows them to decrypt everything, or wait until there is enough computational power to just do some kind of brute force attack.
So one question that came up there was, “Okay, can we now offer a universally verifiable voting system without impacting the privacy of the original thing that people have in mind? Can we just add verifiability as a new feature that doesn’t impact the other, previous features of the scheme?”
So one idea for doing that would be as follows. We consider first a private bulletin board. You can view that private bulletin board as essentially the voting server. It’s where people submit their encrypted ballots, and that’s essentially the view that people have of the non-verifiable voting system. We have computational privacy with respect to these selected authorities who control the voting server.
Now we would like to add this public bulletin board. That’s what will be used for universal verifiability, but now there does not need to be any way of actually computing the outcome of the election from it. We just need something that allows verifying the outcome of the election.
So we could have information theoretic privacy on that public bulletin board.
And that way the public bulletin board just contains something that’s
statistically independent of the actual votes.
So in practice, if we have information theoretic privacy on the bulletin board, that means that we will only have computational verifiability of this board. But that seems much less dangerous, because it means that if the cryptography is broken in 20 years, people will be able to forge proofs or claim that the public bulletin board indicates another winner, but nobody will believe that, because the cryptographic proofs are known not to be sound anymore after 20 years.
So that seems much less damaging than being able to say, “Okay, you voted for that guy”, and that guy happened to have... I don’t know, make up your own story; it could happen. So for that purpose we proposed the use of a new cryptographic primitive, which we called Commitment Consistent Encryption.
So a Commitment Consistent Encryption scheme is essentially a traditional threshold encryption scheme that comes with two extra functions. The first function, ExtractC, takes a ciphertext, an encryption of the vote, and extracts from that ciphertext a commitment on that exact vote. Typically the commitment would be something perfectly hiding.
The second feature of the encryption scheme is that we have an ExtractE function. That also takes a ciphertext and gives you an opening of the commitment that you got there, an opening that is consistent with the initial voter intent, so the message that the voter encrypted.
So, just to give an idea of how to build something like that, one way would be to say, “Okay, I will take an encryption scheme and a commitment scheme. I will generate two key pairs with the public key encryption scheme. I will generate one key for the commitment scheme. And now when I want to encrypt a message, or a vote, I will use the commitment scheme to compute a commitment and the [indiscernible] value that is needed for opening the commitment. And then I will use my two encryption keys to encrypt the actual message and to encrypt the [indiscernible] value that is needed for opening.” And so the ciphertext is now made of the commitment and the two encryptions.
So that’s a simple way of building this. Not a very efficient way of building this, but it actually works.
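This generic construction can be sketched in a few lines. The sketch below is a toy, insecure illustration under my own naming (a Pedersen-style commitment plus two exponential-ElGamal encryptions), not the exact scheme from the talk:

```python
import secrets

# Toy parameters: order-q subgroup of Z_p*, p = 2q + 1 (demo only).
p, q, g = 2039, 1019, 4
h = pow(g, 77, p)   # second base; in practice nobody may know log_g(h)

def keygen():
    sk = secrets.randbelow(q)
    return sk, pow(g, sk, p)

def enc(pk, m):                       # exponential ElGamal: encrypts g^m
    t = secrets.randbelow(q)
    return pow(g, t, p), pow(g, m, p) * pow(pk, t, p) % p

def dec(sk, c):                       # recover m by exhaustive search
    a, b = c
    gm = b * pow(a, q - sk, p) % p
    return next(m for m in range(q) if pow(g, m, p) == gm)

sk1, pk1 = keygen()                   # two independent key pairs,
sk2, pk2 = keygen()                   # as in the generic construction

def cc_encrypt(v):
    r = secrets.randbelow(q)
    com = pow(g, v, p) * pow(h, r, p) % p     # perfectly hiding commitment
    return com, enc(pk1, v), enc(pk2, r)

def extract_c(ct):                    # public part for the bulletin board
    return ct[0]

def extract_e(ct):                    # opening, consistent with the vote
    _, c_v, c_r = ct
    return dec(sk1, c_v), dec(sk2, c_r)

ct = cc_encrypt(1)
v, r = extract_e(ct)
assert extract_c(ct) == pow(g, v, p) * pow(h, r, p) % p
```

ExtractC keeps only the perfectly hiding commitment, while ExtractE (run by the key holders) recovers an opening that matches the committed vote.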
>>: Do we know that the two encryption schemes have to be different, or is it
just necessary for the proof? Is it the same public encryption key
[indiscernible]?
>> Olivier Pereira: It is certainly easier for the proof if they are different. I did not check. It might work with the same key. Yeah, one should check.
So if we have this Commitment Consistent Encryption scheme then we can vote
as follows: we take the general idea of the voting scheme I presented before.
We still have the same first step, generation of the public key for this
Commitment Consistent Encryption. The voters now encrypt their votes using
that Commitment Consistent Encryption scheme. And now for verifiability the
authorities will not publish the whole ciphertext, they will extract the
commitment and publish the commitment part.
And now I will consider two different verifiable ways of computing the tally, the two most common ways. One is homomorphic tallying. The idea is to take all the ciphertexts and compute the product of all the ciphertexts. That provides an encryption of the outcome of the election, and we decrypt that outcome. So we can do exactly that here. If the commitment scheme is homomorphic, we will just let the auditors compute the product of all the commitments. And when we decrypt the product of all those ciphertexts, we will get an opening for the product of those commitments.
So if we take the traditional approach, what the authorities need to do is compute a zero knowledge proof that they actually decrypted the product of the ciphertexts. Here we have something that is simpler. We just need to reveal the opening of the commitment, no proof needed. And now the security follows from the binding property of the commitment scheme.
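The homomorphic-tallying idea on the commitment side can be sketched with Pedersen-style commitments (toy parameters, names mine): the product of the published commitments is itself a commitment to the sum of the votes under the sum of the randomness, so revealing that aggregate opening suffices.

```python
import secrets

# Toy parameters (demo only): order-q subgroup of Z_p*, p = 2q + 1.
p, q, g = 2039, 1019, 4
h = pow(g, 77, p)   # second base; nobody should know log_g(h)

def commit(v, r):
    return pow(g, v, p) * pow(h, r, p) % p

votes = [1, 0, 1, 1, 0]
rands = [secrets.randbelow(q) for _ in votes]
coms  = [commit(v, r) for v, r in zip(votes, rands)]

# Auditors multiply the published commitments themselves...
product = 1
for c in coms:
    product = product * c % p

# ...and the authorities reveal only the aggregate opening (tally, R).
tally, R = sum(votes), sum(rands) % q
assert product == commit(tally, R)   # binding => the announced tally holds
```

No decryption proof is needed for this step: if the commitment scheme is binding, the authorities cannot open the product to any other tally.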
Another approach is mixnet-based tallying. Here the idea is that the authorities take all those ciphertexts, do a verifiable shuffle of those ciphertexts, and decrypt the outcome of the shuffle. So we can do exactly that here. We include verifiable shuffle proofs for the commitments. We use statistical zero knowledge proofs of the correctness of the shuffle; I would say most of the classical verifiable shuffles come with statistical zero knowledge proofs, so that’s not an [indiscernible]. And now again we open the shuffled commitments, and the verifiability follows from the binding property of the commitments.
So how do we build more efficient Commitment Consistent Encryption schemes? One extremely useful tool that we found is what people call Structure Preserving Commitment schemes. You might have seen this paper over there [indiscernible] this year. These are nice commitment schemes with properties that make life much easier and computation much more efficient.
These are commitment schemes that are based on elliptic curve groups that support pairings. In our case we go for symmetric pairings, which are the simplest ones. So we have two groups, G1 and G2, and a [indiscernible] map from G1 and G2 to GT. And now here is an example of a commitment scheme that works along those lines.
For the public key we take one generator from the first group and one generator from the second group. And when we want to commit on a vote, the commitment part is actually computed just like a [indiscernible] commitment. The big difference is that now the opening is not made of R; the opening is made of G to the R. And the convenient thing is that the opening now consists of group elements. That’s much simpler to mix with encryption, to mix with [indiscernible] protocols. So all the zero knowledge proofs become very straightforward when we use this.
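Since the exact equations are not spelled out in the talk, here is a hedged reconstruction of what a commitment of this flavor could look like, with assumed notation throughout (generators $g_1, h_1 \in \mathbb{G}_1$, $g_2 \in \mathbb{G}_2$, bilinear map $e : \mathbb{G}_1 \times \mathbb{G}_2 \to \mathbb{G}_T$):

```latex
% Assumed notation; a sketch, not the scheme as published.
\begin{align*}
  \mathsf{Commit}(v; r) &= c = g_1^{\,v}\, h_1^{\,r}
      && \text{(Pedersen-style commitment)} \\
  \text{opening}        &= D = g_2^{\,r}
      && \text{(a group element, not the exponent } r\text{)} \\
  \text{verify: } e(c, g_2) &\stackrel{?}{=} e(g_1, g_2)^{\,v} \cdot e(h_1, D)
      && \text{(bilinearity: } e(h_1, g_2^{\,r}) = e(h_1, g_2)^{\,r}\text{)}
\end{align*}
```

This illustrates the point in the talk: the committer computes only exponentiations; pairings appear only in the audit-time verification equation.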
So when committing we do not need to compute any pairings. So the voters do not need to compute any pairings. It’s only as part of the audit, if you want to check that the outcome is correct, that you need to compute some pairings to check that this is working fine.
In terms of security, the binding property of this scheme is implied by the [indiscernible] assumption in the two groups. Actually weaker computational assumptions suffice, but since we will mix this with ElGamal later, everything holds if we have DDH.
So here is the first of our two schemes. This one is designed for homomorphic tallying. That means that we need something that is [indiscernible] homomorphic, and also that we can tolerate a decryption procedure that is not efficient, because what we will need to decrypt is the outcome of the election, and that’s not something that is too big.
We can do an exhaustive search, for instance, to find it, which is what we have to do with [indiscernible] ElGamal anyway. So we have two generators of the two groups, selected randomly, and the private key is actually the [indiscernible] of G1 in the base G. And now, very simply, the commitment is exactly what I presented on the previous slide. The opening was G to the R, and so we use this [indiscernible] key to encrypt G to the R, so traditional [indiscernible] encryption.
So the commitment is this part; it is what is published. And for the decryption, like I said, we need to extract some [indiscernible] in GT. That can be done efficiently by an exhaustive search.
Okay, so we have all the properties that we expect. This is IND-CPA, so that means that we can add the sigma proof to make it NM-CPA and have all the proofs I presented before working well. It is compatible with traditional [indiscernible] protocols.
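Putting the pieces together, the first scheme as described might be summarized as follows; this is a reconstruction under assumed notation, not the published equations:

```latex
% Assumed notation; a sketch reconstructed from the spoken description.
\begin{align*}
  \text{ciphertext} &= \Big(\underbrace{g_1^{\,v}\, h_1^{\,r}}_{\text{commitment, published}},\;
      \underbrace{\mathsf{ElGamal.Enc}\!\left(g_2^{\,r}\right)}_{\text{encrypted opening}}\Big) \\
  \text{tally: } \prod_i c_i &= g_1^{\sum_i v_i}\, h_1^{\sum_i r_i}, \qquad
      \mathsf{Dec}\Big(\prod_i \mathsf{Enc}\!\left(g_2^{\,r_i}\right)\Big) = g_2^{\sum_i r_i},
\end{align*}
```

after which the election outcome $\sum_i v_i$ is small enough to be recovered by exhaustive search, exactly as with exponential ElGamal.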
Okay. So maybe I will go faster on this one. We have a second encryption scheme that is designed more for mixnet-based tallying. The difference now is that we have a large message space. We want to be able to encrypt whatever we want, but we do not need an additive homomorphism now; we just need to be able to do re-encryptions. And so by tweaking the commitment scheme we have something that is still quite similar, a mix of [indiscernible] and commitments, with the same properties as before.
So we considered the efficiency of those solutions. One previous solution, used in a different context, also for voting but slightly different, was based on [indiscernible]. You can see that if you take our two encryption schemes, compared to the solution there we have roughly a factor of 100 in terms of efficiency benefits.
And I think that’s probably not the most important benefit. I think the most important benefit of those solutions compared to that one is that the threshold key generation procedure for [indiscernible] is quite complex. Here we do everything in prime order groups, so we have a much simpler key generation procedure. From a practical point of view that’s important.
So to conclude, we have these [indiscernible] information theoretic and computational definitions of privacy for voting. We showed a cryptographic game that can be used to bridge the two notions fairly easily. We identified NM-CPA security as the necessary and sufficient notion that an encryption scheme must satisfy for submitting ballots. We showed that the traditional approach of using an IND-CPA homomorphic encryption scheme with a sigma protocol using [indiscernible] offers NM-CPA security. And then eventually we proposed new encryption schemes that provide, we think, an interesting compromise between information theoretic and computational privacy.
Are there any more questions? Yes? Okay, one second.
>>: [indiscernible].
>> Olivier Pereira: Yeah, in JavaScript; in JavaScript not Java.
>>: I’m sorry?
>> Olivier Pereira: In JavaScript.
>>: Oh, JavaScript, okay. So you expect it to be 50 milliseconds if you use that [indiscernible].
>> Olivier Pereira: Yeah, [indiscernible] is implementing this, and this is just by using standard [indiscernible], and if we just try to implement it more efficiently we can have it much faster.
Thank you.
>>: You managed to weave together --.