Vinod Vaikuntanathan: Okay. So we're very, very happy to have Bryan here to talk
about noninteractive verifiable computation. Bryan was a graduate student at CMU, but
his advisor was Adrian Perrig. And he now is one of our own. He works in the security
and privacy group of [inaudible], all the way across the big chasm from our building.
But welcome.
>> Bryan Parno: Thanks, Vinod. So like Vinod said, today I'm going to be talking about
noninteractive verifiable computing. The first part of the talk is based on work that went
into my thesis and also into a CRYPTO paper.
And it may be familiar to some of you. And then later in the talk I'm going to get into
work that I've been doing here with Seny and Vinod and David Molnar to try and extend
the work a little bit further. You'll probably notice the transition where we go from lots of
pictures to a little bit rougher material, stuff that's still being worked on. I'm very interested
in ideas or suggestions you may have.
So to dive in, the general problem that we're trying to address is that there's some
person out there, say Alice, who is a scientist. She has a lot of work she needs to do but
she doesn't have a lot of funding. So what does she do? She outsources her
computation to the cloud. So she has some problems she wants to solve and data that
goes along with it, and she hopes to get an answer back. And based on that data she's
going to choose some new data and submit a new query. And she's going to get an
answer back.
And unfortunately from Alice's perspective, it's very hard to tell is the cloud doing what
she's asked it to do, or is it doing something a little bit cheaper and simpler from their
perspective.
So obviously this is a hot topic right now. There's all kinds of people that want to
outsource computation. There's sort of the distributed systems that have been around
for a long time, SETI@home, BOINC, there's our own personal offerings here at
Microsoft, Azure and also Amazon's, and even the other element which you might not
immediately think of is mobile computing. So you have this sort of weak client device,
you'd like to outsource your computation to something stronger.
But the question is can the results you get back be trusted. There's no point in doing this
outsourcing if you can't rely on the results you're getting back. We'd like to provide some
higher level of assurance about them.
And so to be a little bit more precise about the goal, we'd like Alice to be able to specify
some function to a third party who is in this case untrusted and supply an input for that
function and get back the result of applying the function.
And then she should be able to adaptively choose new inputs based on that output and
get back additional outputs, hopefully polynomially many. And the key requirement here is
we would want to have integrity. We want to know the results coming back are correct.
An additional requirement that you might have is secrecy either for the inputs, the
outputs or both. Some applications like SETI@home don't care about secrecy.
They're doing scientific computing, it's an open process. Secrecy is not important.
If you're doing medical data or outsourcing rendering of your latest film, then secrecy is
very important. Of course, the key constraint that makes this interesting is that you have
to do less work to prepare and verify the information than computing the function
yourself, otherwise why bother.
So there's been some previous work in this area, and it falls into two main categories.
The first is verifying specific functions.
So, for example, early I guess 2000s, there was some work on verifying the inversion of
a one-way function. And so for that specific class it's a little bit easier because you put in
some answers you know and ask for the inversions on the one you don't know and do
some comparisons when it comes back.
Obviously that doesn't work for all functions. Similarly, anything in NP presumably is
easier to check than to do yourself.
But there's lots of other interesting functions that don't naturally lend themselves to this
kind of approach. Then there's work on general functions. They have largely come from
the PCP family. So the prover generates a large PCP proof, commits to it in some
fashion, through a random oracle, for example, and selectively reveals bits of it so the
guy outsourcing the computation can verify the proof actually exists and is correct.
There's been some work on reducing how much you rely on the PCP. Anytime you get
PCP involved, everything gets more complicated and much larger and harder to deploy
in practice. It would be nice to avoid this machinery if we could.
The other interesting point is none of these consider data privacy. All this previous work
was just about the integrity property, not about secrecy. For some of them it's somewhat
complicated to think about how you might go about adding secrecy on top.
So in contrast, the protocol that we developed within this framework is generic, works for
any function. It avoids all this complexity of PCP, CS proofs, and it's asymptotically
optimal in terms of the amount of CPU and bandwidth utilization.
It's noninteractive. So I hand off the problem and eventually the answer comes back. I
don't have to help you along the way. And it has this nice property of preserving input
and output secrecy. So looks pretty good. We'll get into later in the talk why there's
some drawbacks to this approach though.
In general, I'll go into the general protocol we developed and explain how the
construction works, what the advantages and disadvantages are. And then in the
second half of the talk I will talk about some of our more recent efforts about making it
even more practical and trying to get it somewhere where you might actually envision
running it on a computer.
So high level, one of the changes that we made in this problem area is to change the
model a little bit. And we did that by introducing an off-line phase in which you perform
some amount of key generation. And so the idea is that you're going to do this one time
for, say, for a function, and then amortize that over many inputs, which you want to
evaluate that function.
And so when you're in the online phase, you're going to choose some new input X.
You're going to generate a problem instance from that using your secret key.
You're going to give a portion of that to the worker and keep some secret information
potentially to yourself. And the worker is going to execute a compute function on that
input using the public key that you supplied and eventually produce some alleged output.
You have a verification function that's going to tell what the actual answer is or that the
worker has tried to cheat you somehow. By doing a whole lot of these instances for a
given function, we hope to amortize the initial work we have to do for the setup phase.
>>: You don't even decode the answer?
>> Bryan Parno: So the decoding is part of the verification. So verification spits out the
decoded output or some bottom.
>>: [inaudible].
>> Bryan Parno: Yeah. So one of the key insights here was that if you look at Yao's
garbled circuit computation, make a few tweaks, we can turn it into what we call a
one-time verifiable computation. So you can verify a single input computation. At a high
level what we do here, we choose the function we want to compute. We convert it into a
circuit. We apply Yao's garbled circuit technique, which I'll go into in a few more slides.
And we send the circuit to the worker. We then choose input, garble the input, following
Yao's techniques, send that to the worker. The worker then uses Yao's techniques to
apply the circuit. But it's important here we're not doing any oblivious transfer. In
traditional Yao we have two parties, they both have secret inputs. Here we only have
Alice supplying inputs.
Second -- and so of course the worker is not supplying any inputs, just using the ones
that Alice gives him. When he gets the response, traditionally in Yao you hand off a
decoding table so the worker can tell what the answer was. But here we don't care
about the worker except as far as he does work for us. So we're going to send the
encoded output back to Alice and then she's going to use the decoding table to check
the answer and make sure it's a legitimate answer for this function. So that's the high
level view.
Let's go into a little bit more detail and to refresh your memory about how Yao's
construction works. The first step you have to take is to convert the function you want to
compute into a circuit.
>>: Is it a one-time computation you didn't really say the computation, you could have
computed --
>> Bryan Parno: Yes, exactly. One-time computation is not very useful. But we have
sort of a generic technique for transforming a one-time computation into multiple.
>>: Following [inaudible] somebody chest, somebody need Yao, it was a million to one,
are you in the same ballpark?
>> Bryan Parno: For using Yao.
>>: For using Yao. Standard Yao, not modification.
>> Bryan Parno: Yes, quite possible. We're using Yao fairly generically. There's
certainly been some work on how you can do Yao more efficiently. Benny Pinkas's
group and whatnot. But, yeah, we still have all the slow down you would get from
converting to Boolean circuit and computing that way on computers that are not
optimized for that.
So you do have to convert to this Boolean circuit. You crank it through your favorite
compiler, produce that. And the next step is to do this garbling. So this is what Yao was
using for two-party computation back in the '80s. And the idea is that for each gate and
for each wire in the circuit you're going to choose two wire labels.
So for wire A we choose A0 and A1, for wire B, B0 and B1, and for wire Z, Z0 and Z1.
Each of these values is chosen from some large space dictated by the security parameter.
Then we're going to write down the truth table for the gate. So this is an AND gate. You
can check, make sure I did that right.
And then we're going to replace all the bit values with the corresponding labels that we
chose in the first step. So then to actually compute the garbled version of the gate,
we're going to encrypt each value of Z with the corresponding values of A and B as keys.
And so for the first row in the table we're going to encrypt Z0 using A0 and B0 as keys, and
so on throughout the table. We'll say that the garbled representation of this gate is these
four ciphertexts.
>>: So [inaudible].
>> Bryan Parno: What's that.
>>: Also randomly permuted.
>> Bryan Parno: Yes, for Yao it's important you randomly permute this. For our
purposes it's not as important. Because we're just doing the integrity part.
So then to garble the input it's very simple. You pick the bits. You convert them to their
corresponding labels. So A1 and B0 in this case, and you can do that for all of your
label values. And so Alice is doing this whole process for the entire circuit, the entire
input, sends this whole big collection of cipher text over to the adversary along with the
input and then he has to do the work.
How does he do the work? He takes the first garbled input he's been given and tries to
decrypt all four ciphertexts. So for this example it will work on the
last two. Then he takes the second one, tries to decrypt all four, and it only works on the
third one. As long as you use a decryption function that makes it evident whether
decryption has succeeded or failed, it's very easy for the worker to say this is the output
of this particular gate. And clearly you can propagate this through, because the Z0
becomes the input value for the next one and you use that to decrypt the next set of
ciphertexts.
The worker can do it all the way through the circuit and wind up with some
representation corresponding to a series of wire labels for the output. In this case just
Z0. So then Alice has to check to make sure he did the work she asked him to do. The
way she does that, she compares the value she got back against the output labels. If it's Z0
she concludes the answer was 0. If it's Z1 it's a 1. If it's none of the above, then she
concludes the worker was trying to cheat her.
Simple security analysis. This is to say, if you don't want to get rejected as a worker,
you have to produce either Z1 or Z0. These are large random numbers, so the chances of
guessing one are small. The only information you have about the Z1 value is the
ciphertexts for which you don't have the proper keys. As long as you choose a good encryption
system, then your information about it is computationally negligible. Good?
So but we do have this problem. Somebody mentioned that it's insecure to reuse these
circuits. And it's easy to see if Alice chooses a new input, say 1-1. Computes the
garbling, whatever it happens to be and sends it over to the worker. The worker can
simply ignore whatever he was given and send back the same thing he returned last
time.
And so in this case it was Z0. Alice will look at it. It's a legitimate value for the output
wire label. It's one of the two Z values but in this case it's the wrong Z value. So Alice is
tricked into accepting a 0 when it should have been a 1.
And so the interesting observation here is that the only reason he's able to cheat is
because he's recycling this old knowledge. It's because we gave him some bit of
information he didn't have in the first round that he's able to cheat in the later round.
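The replay problem can be seen with only the output wire in view. The sketch below is an assumption-laden toy: it models just the two output labels and Alice's check, with made-up variable names, and leaves out the garbled table entirely.

```python
import secrets

# Output-wire labels that Alice fixed when she garbled the circuit once.
Z0, Z1 = secrets.token_bytes(16), secrets.token_bytes(16)

def verify(returned: bytes):
    """Alice's check: accept any valid output label, reject everything else."""
    if returned == Z0:
        return 0
    if returned == Z1:
        return 1
    return None

# Round 1: input (1, 0) to an AND gate; the honest worker ends up holding Z0.
seen_in_round_1 = Z0
round_1_result = verify(seen_in_round_1)    # correct, since 1 AND 0 = 0

# Round 2: Alice sends input (1, 1), whose correct output label is Z1.
# The worker ignores the new input and replays the label from round 1.
replayed = seen_in_round_1
round_2_result = verify(replayed)           # accepted, but it is the wrong bit
```

The check passes in round 2 because `Z0` is still a legitimate label for the wire. Nothing in the check ties the label to the input that was actually sent.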
So what we said was, all right, well, let's add another level of encryption to get rid of this
information we gave up in the first round.
Oh. And sorry, the first point is that you can't just simply throw away the circuit and
compute a new one because obviously that's as expensive as doing the computation
yourself. So what we really need is some manner of recycling this circuit so we can use
it over and over again.
And, in particular, the way we're going to do that is using fully homomorphic encryption.
So fully homomorphic encryption means if we have two cipher texts, we can take an
arbitrary function, evaluate it over those cipher texts, and get an encryption of the
function on the underlying plain text. In particular, an interesting function you might want
to apply is the decrypt function for the Yao encryptions. So what this gives us is an
encryption with a homomorphic encryption system with the decryption of B with A as a
key.
And, of course, it's important to note that fully homomorphic encryption on its own
doesn't give you integrity, because if I give you a bunch of homomorphically encrypted inputs, you
can compute whatever function you want over them, I will get a legitimate set of cipher
text on the return.
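To make the homomorphic property and the integrity caveat concrete, here is a toy symmetric scheme over the integers, in the style of the textbook "ciphertext = key multiple + small even noise + bit" construction. It is not the scheme discussed in the talk, it is only somewhat homomorphic (a few operations before the noise overflows), and the parameter sizes are assumptions chosen for readability, not security.

```python
import secrets

def keygen(bits: int = 64) -> int:
    # Secret key: a random odd integer with the top bit set.
    return secrets.randbits(bits) | (1 << (bits - 1)) | 1

def encrypt(p: int, bit: int) -> int:
    q = secrets.randbits(128)   # random multiple of the key hides the message
    r = secrets.randbits(8)     # small noise term
    return p * q + 2 * r + bit

def decrypt(p: int, c: int) -> int:
    # Remove the key multiple, then the even noise.
    return (c % p) % 2

def hom_xor(c1: int, c2: int) -> int:
    return c1 + c2              # adding ciphertexts XORs the plaintext bits

def hom_and(c1: int, c2: int) -> int:
    return c1 * c2              # multiplying ciphertexts ANDs the plaintext bits

p = keygen()
ca, cb = encrypt(p, 1), encrypt(p, 1)

honest = hom_and(ca, cb)        # the function Alice asked for
lazy = hom_xor(ca, cb)          # a different function the worker chose instead

honest_bit = decrypt(p, honest)
lazy_bit = decrypt(p, lazy)
```

Both `honest` and `lazy` decrypt to a well-formed bit, so decryption alone gives Alice no way to tell that the second worker evaluated the wrong function. That gap is what the garbled-circuit labels are there to close.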
>>: Do you have any estimate how slow that would be?
>> Bryan Parno: Yes. Fully homomorphic encryption is very slow. So --
>>: You built it on top of Yao.
>> Bryan Parno: Yes, so current -- I think Craig and Shai have an implementation of
Craig's fully homomorphic encryption scheme, and it runs anywhere from I think a
minute -- minutes to lots of minutes to do a bit of this function F. And then so for each
gate -- for each F gate that you want to do, it takes on the order of minutes. And so for
our purposes if we want to do, say, decrypt, which AES has maybe 10,000 gates without
a whole lot of optimization, then you're looking at days to weeks per verifiable gate.
>>: But the costly work is only the evaluation, right, the encryption and decryption are
fairly simple and the evaluation is only done by the server anyway.
>> Bryan Parno: Yes so --
>>: It's still not practical because even for a server it's a lot of work. But --
>> Bryan Parno: Especially for the person doing the outsourcing, the trade-offs are
good. So Alice doesn't have to do a lot of work compared to the amount of work the
worker does. On the other hand, Alice may be paying Amazon for each compute cycle
so expanding the work factor by weeks is probably not optimal.
But, yeah, it's a good point. So how do we apply this? Well, we do the same thing we
did before. We compute the garbled circuit, give it to the worker. Compute the same
garbled input. This time, rather than giving the input in the clear, we choose a public
key for the homomorphic encryption system, encrypt the input under it, and send that to
the worker. The worker uses the key to homomorphically encrypt the circuit and apply it
to the input that we've given, and that
will result in an encrypted version of the garbled output.
And it's important to note if you sort of naively apply Yao in this case you get something
that's very inefficient. In a few slides I can show you how you can do this a little bit more
intelligently. On the order of magnitude we're talking about, it probably doesn't matter
that much. But it's a nice optimization.
But the worker can't make anything out of this. All he has is a cipher text. So he has to
return it to Alice who has the key and hence can decrypt this result she got and do the
checks that she did before.
So now, of course, when she wants to outsource a new input, say W, all she has to do is
garble it and choose a new public key for the homomorphic encryption system, provide
the public key, the worker has to convert the circuit into an encrypted form again and
apply it to this new input. Once he's applied it, he's going to get a new encrypted output,
return it to Alice.
Now the nice thing about this system is we can repeat it a polynomial number of times by
choosing a new public key each time. If the worker ever tries to recycle something
he saw before it's very obvious he's cheated because it's going to decrypt badly with our
newly chosen key.
So before I go into sort of the Yao optimization, this last step -- question?
>>: Question, so what property -- why do you need to public key [inaudible].
>> Bryan Parno: You need a new public key every time because if we recycled the blue
key, for example, then we could give him the blue version of W, and then he's just going
to give us back the blue version of Y again. So we run -- you'd run into the same
recycling property.
By changing this key, we guarantee that if he ever tries to give us back an old output, we
know, because it's going to be encrypted with a different key than it was before.
>>: So what property do you need for the encryption scheme to do this?
>> Bryan Parno: From the verifiable homomorphic scheme?
>>: [inaudible] it's semantic secure. It's nonmalleable. It's different keys, right? If I give
you encryption of a message under one key, you can't construct encryption [inaudible] a
different key. That's what semantic security guarantees, right? That's the thing you need to
do. Semantic security does --
>> Bryan Parno: IND-CPA gets us through, which is important because fully homomorphic
encryption is never going to be CCA secure, at least not in the traditional sense.
>>: I see.
>> Bryan Parno: So this transformation that we've applied here actually is a generic
transform. So we can take any scheme that's one-time verifiable, apply this
transformation, and wind up with a generally verifiable computation scheme.
>>: [inaudible].
>> Bryan Parno: Sure.
>>: So another thing if you encrypt under the same public key then whatever time you
can carry on if I give you the [inaudible] without encrypting, you can also apply the same
attack inside the encryption.
>>: Yeah, that's true.
>>: Okay. Now the [inaudible] cannot be taken care of, can it be taken care of even if
you change?
>> Bryan Parno: What do you mean by an attack?
>>: It's the same public key you can get a setting. If you change the public key between
multiple executions, I assume there's --
>>: [inaudible].
>>: So it's like what we all said before. Just think of a case if you could do anything like
this then you would just -- even if you had just one public key you could generate
another public key for your homomorphic, if you could do anything else between the two
keys you can also do it with a second key that you generate by yourself. So semantic
security itself guarantees that --
>>: No, no. So here the goal would be to force incorrect output. The goal would be to
force correct output. I never go the same public key, I could force the incorrect output.
>>: Yes. But the attack is to correlate what you learned in the first execution. Either in
the clear or in the [inaudible] text to do something about it in the second. What I'm
saying is what we're trying to say is you use two different public keys, then you get
non-malleability. What you learn in the first case, in the clear or inside the ciphertext, doesn't
help you in the second case.
>>: I guess you cannot get it under the new public key.
>>: Under the new public key you can't, but that's what -- if you know it, then --
>> Bryan Parno: It might help -- so there's actually -- we can look at a second example
where we go from a one-time verifiable scheme to a fully verifiable scheme. So this was
from a scheme from CRYPTO that was -- if you're familiar with the reCAPTCHA system, where
you enter two words on the Internet, they know the answer to one, and the other is used
to help with OCR, it's very similar to that.
So the idea is so we're going to build it up from a very simple scheme. So Alice wants to
verify the computation on a single input so she chooses some random input herself, and
precomputes the function on that input.
And then when she's given a random input X, she sends both X and R to the worker and
the worker's expected to compute F on both of those and they're randomly permuted,
when it comes back Alice just compares the results she computed and makes sure that
that one's correct and then checks, and accepts that the other one's correct.
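The check described above can be simulated in a few lines. This is a sketch under assumptions: `f` is a hypothetical stand-in for the expensive function, both workers are made up for the example, and the cheater always corrupts the first slot it receives.

```python
import random

def f(x: int) -> int:
    # Hypothetical stand-in for the expensive function being outsourced.
    return pow(x, 65537, 1_000_003)

def honest_worker(queries):
    return [f(q) for q in queries]

def cheating_worker(queries):
    # Returns a wrong answer in the first slot, hoping it is not the checked one.
    answers = [f(q) for q in queries]
    answers[0] += 1
    return answers

def outsource(x, r, f_r, worker, rng):
    """Alice sends x and r in random order and checks the answer she already knows."""
    r_first = rng.random() < 0.5
    queries = [r, x] if r_first else [x, r]
    answers = worker(queries)
    r_pos = 0 if r_first else 1
    if answers[r_pos] != f_r:
        return None                 # worker caught cheating
    return answers[1 - r_pos]       # accepted as f(x)

rng = random.Random()
r = rng.randrange(1_000_003)
f_r = f(r)                          # precomputed once by Alice
x = rng.randrange(1_000_003)        # a random input, as in this first version

caught = sum(outsource(x, r, f_r, cheating_worker, rng) is None for _ in range(1000))
print(f"cheater caught in {caught} of 1000 runs")   # expect roughly half
```

The cheater survives exactly when the slot it corrupted happens to hold X rather than R, which is a coin flip. In those runs Alice accepts a wrong value for F of X.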
So that's soundness one-half, and you could repeat this a whole lot of times. All right. But we
don't want to compute on arbitrary, on random Xs, we want to compute on specific Xs.
So the way you get rid of that is you again pick R at random. You precompute F of R.
But this time we homomorphically encrypt, this is not the transform, this is layer one of
homomorphic encryption. The input R and homomorphically apply the function F. So
this time we can pick an arbitrary X and we're going to send an encrypted version of X
and encrypted version of R to the worker.
Essentially we've gone back to the system we had: what appears to the worker is two
randomly chosen inputs. Except that now we can pick X arbitrarily, even an X the
worker might anticipate. So now the worker has to return encrypted
versions of these two and we compare this homomorphically encrypted version with the
one that came back.
There's an interesting catch that you might think you could precompute just F of R and
sort of decrypt these two that come back and do the comparison but that actually leaves
you open to an attack. That's a bad idea. So it's important to do the check this way.
>>: I think you mean compared as far as what you get, like --
>> Bryan Parno: Compute the homomorphically encrypted F of R.
>>: Doing as much work as a worker and paying for all the homomorphic stuff, too.
>> Bryan Parno: She's not winning yet. This is building up a one-time verifiable system.
This gives us soundness one-half. How do we get up to better soundness? You do it by
sticking I here. You have some K R values and K X values, you intersperse them, and based
on how many you choose, you get soundness on the order of 1 over 2 to the K. Now we've
got soundness. This is one time verifiable, because these Rs look the same. And so if I
send the same Rs each time, the worker will know which ones he has to do the real
computations for and can cheat on the other ones.
So this is where the -- Melissa?
>>: [inaudible].
>> Bryan Parno: No. So this is so you can do multiple -- oh, yes, sorry. I'm sorry. Yes.
Same X value. So you check that not only are all the R values correct but you also
check that all the X values match.
>>: [inaudible].
>> Bryan Parno: Yeah. The trouble with this fuzzy notation here.
So how do we convert this to a reusable scheme? Well, we apply a layer of fully
homomorphic encryption on top of it. Basically we generate a new public key. Encrypt
all of this stuff fully homomorphically and he does all the stuff that he was going to do
before fully homomorphically. And this way each time we choose a new random public
key we can recycle these R values and the F of R values that we calculated here.
So if you thought that ours was slow, you can try adding yet another layer of fully
homomorphic encryption on it and this part is going to be age of the universe
computation.
On the plus side, this scheme lets you get away from the very large public key that we
had before. So with the protocol I described, we have to transmit the entire garbled
circuit to the worker and he has to do the encryption. Here, I just have to transmit the
inputs. Of course, they're doubly fully homomorphically encrypted, so they're large.
Theoretically it's a nice property.
>>: Two approaches are really, can you think of them as really the same, except that
you replace Yao by fully homomorphic encryption compared to [inaudible] so you both
have fully homomorphic encryption on the top layer. Instead of Yao you use --
>> Bryan Parno: Yeah, so I guess theirs is a combination of this pick-and-choose, plus
fully homomorphic encryption, whereas Yao is a slightly different property, I think.
But it's two different ways of arriving at a one-time verifiable computation and the same
mechanism for doing the recycling. If you're doing the Yao approach sort of a naive way
would be to say take whatever gigantic circuit you would use to ordinarily evaluate Yao.
So whatever that big program is, evaluate it fully homomorphically. That's expensive
because the eval step is the size of that circuit.
So you might want to do it simpler and do it in pieces. So, for example, you might
encrypt each one of these cipher texts with a fully homomorphic encryption key, and
then restrict yourself to only evaluating the decrypt function.
So that's naturally going to be much smaller than the larger function that calls decrypt,
looks at the outputs, does some work and so on.
And so that seems nice, because we're just mimicking what Yao would have done
before and it's just happening inside of the fully homomorphic encryption. So we're
decrypting with the second value. And we wind up like this.
What's the potential problem here?
>>: Can't check --
>> Bryan Parno: Exactly. In the old Yao we could look at the old value and say aha this
looks different from the other ones. But now we can't check directly. So we are forced
to do the check homomorphically.
But the advantage is that we can do that via simple addition, and the
decrypt function can be very small, and so overall this gets you something more
practically efficient than doing the entire Yao process inside of the homomorphic encryption.
Of course at the cost of fully homomorphic encryption schemes, it may not matter a
whole lot. But we'll take what optimizations we can get here.
So in terms of proof sketch, what we --
>>: That sigma is concatenation? On this slide.
>> Bryan Parno: Actual sum. So we just assume that the decrypt function either returns
the plain text, if it's the correct key, or zero.
>>: Or zero. All right.
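The summation trick from this exchange can be sketched in the clear. The code below assumes a decrypt function that yields the label on the right keys and zero otherwise, then recovers the gate output by adding up all four attempts. The hash-based pad, the 16-byte labels, and all names are assumptions for illustration. In the actual protocol this arithmetic would run under the homomorphic encryption rather than on plaintext values.

```python
import hashlib
import secrets

N = 16  # label length in bytes (assumed value)

def _pad(ka: bytes, kb: bytes) -> bytes:
    return hashlib.sha256(ka + kb).digest()     # 32 bytes = label + zero tag

def enc(ka: bytes, kb: bytes, label: bytes) -> bytes:
    body = label + bytes(N)
    return bytes(x ^ y for x, y in zip(body, _pad(ka, kb)))

def dec_or_zero(ka: bytes, kb: bytes, ct: bytes) -> int:
    """Return the label as an integer if the keys are right, else 0."""
    body = bytes(x ^ y for x, y in zip(ct, _pad(ka, kb)))
    ok = body[N:] == bytes(N)
    # Multiply by the success bit instead of branching on it.
    return int.from_bytes(body[:N], "big") * int(ok)

# Garble one AND gate: two labels per wire, four ciphertexts.
labels = {w: (secrets.token_bytes(N), secrets.token_bytes(N)) for w in "ABZ"}
table = [enc(labels["A"][a], labels["B"][b], labels["Z"][a & b])
         for a in (0, 1) for b in (0, 1)]

def evaluate_by_sum(table, ka: bytes, kb: bytes) -> bytes:
    # Three of the four terms are zero, so the sum is exactly the output label.
    total = sum(dec_or_zero(ka, kb, ct) for ct in table)
    return total.to_bytes(N, "big")

out = evaluate_by_sum(table, labels["A"][1], labels["B"][1])
```

Because the wrong rows contribute zero, the evaluator never has to inspect which decryption succeeded, which is the step that is unavailable once everything is encrypted.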
>> Bryan Parno: So the proof basically goes in two stages. First we show that Yao is in
fact one-time verifiable, and you can basically reduce that to the security of Yao as a
multi-party computation.
And that's because you can take all of these encryption values that we had before
representing a gate, and slowly replace all of the inner illegitimate values with the
legitimate values.
So at the end you wind up with a circuit where only legitimate output values are
available, so the adversary can't possibly produce the cheating output. As long as the
encryption scheme is IND-CPA secure, you can't tell the difference between the two worlds.
Once you know that Yao is one-time verifiable you have to take the second step and
that's based upon the semantic security of the fully homomorphic encryption system. So
there we can slowly replace these input values with randomly chosen values. So the
worker is going to proceed through the circuit just as he did before with these random
values. All the decryptions are going to fail. He'll wind up with some encryption of zero
but because of the IND-CPA property of the encryption scheme, he can't distinguish that.
So he isn't going to learn anything helpful.
So just to summarize, we have these nice properties that we have a generic
construction, because any F that you can turn into a circuit will work. It's not interactive.
It preserves input and output secrecy just as a side effect of using the fully
homomorphic encryption. And has nice asymptotic performance.
So from Alice's perspective she does this one-time computation of garbling the circuit,
which is linear in the number of gates. Then for each work instance that she wants to
outsource she needs to garble and encrypt the input which is linear in the number of bits
of the input.
The worker then homomorphically applies the circuit, which technically ignoring security
parameters is linear in the number of gates in the circuit. And finally we do the
decryption, which again is just a single operation per output bit. And so at the end we
wind up with something that's linear in the size of the input plus output from Alice's
perspective and it's linear in the size of the circuit from the worker's perspective which is
nice when you're paying for these computations. Theoretically it's a pretty nice system
but there are some drawbacks. One we're only achieving efficiency via amortization. If
you're only going to evaluate the function a small number of times, then this is not the way to
do it.
>>: That's true also for the other solution, the [inaudible].
>> Bryan Parno: Yes, because they also do the precomputation to determine what the
correct output should be.
>>: Is there any solution that doesn't require amortization?
>> Bryan Parno: I believe the PCP ones.
>>: The proofs.
>> Bryan Parno: Somebody just shows up to your door and says hey I've got this proof
and they give you this short little hash. Then there's this big elephant that we're using,
fully homomorphic encryption, which we've already talked about how slow that can be.
And so interesting -- then there's this additional interesting question of how do you
respond when you catch a worker cheating? So, say, the worker does send something
back that doesn't match what you were expecting, you might think, okay, you're going to
send them a nasty letter, you're going to deduct some money from his account, what
have you.
The trouble is with our current construction, we can't make the proof go through when
we provide this extra feedback to the worker.
And the way you can think about it is that the worker can speculate as to what the
correct output label is. And using the fully homomorphic encryption properties he can
basically toggle one of the bits in the output label. And he can submit that to us. And
we're either going to say yes it's correct, in which case he knows what that bit was, or
we're going to say, no, you're cheating in which case he learns a bit.
And so you could say anytime you catch somebody cheating you regarble the circuit and
maybe it's worth paying that extra cost for having caught somebody. But it seems
undesirable.
The other thing you could do is, say, in the morning SETI@home could send out
all their work units, then collect all the results and late in the evening run the check
over a million people and catch all the cheaters that way.
And so --
>>: Naively, if I catch somebody cheating I don't do any more business with him but
you're using the same encryption for a bunch of workers, is that the problem?
>> Bryan Parno: Ideally you'd like to be able to use the same encryption over the same
number of workers so you can't tell which ones are colluding. But you're right if you're
using it per worker then you wouldn't have this problem.
And then you could always throw lawyers at the problems then.
>>: Of course the lawyers. Just throw the switch.
>> Bryan Parno: Sure. And so some of the work we've been doing here at MSR is
looking at how can we make the system more practical. So sort of taking some of the
ideas but something that is not going to take on the order of weeks to compute.
And so the high level idea is to say that there seems to be an interesting connection
between verifiable computing and various proxy schemes. So proxy signature schemes,
proxy reencryption schemes.
And so just as a reminder, proxy resignatures were a primitive introduced by Blaze,
Bleumer and Strauss, and then later sort of more rigorously formalized by
Ateniese and Hohenberger. The idea is you take the standard
definition for a signature and you add two additional functions.
So one is a rekey operation, which takes in a public key and a secret key from two
different users and produces what's called a proxy key or a rekey.
And the idea is that you can take this rekey and feed it into the second function, called
resign. And what this lets you do is, if you have a proxy key from A to B and you have a
signature from A on some message, you can convert it into a signature from B.
And so essentially by giving up your secret key here, you're basically saying, anytime
Alice signs something, it's okay to say that it came from Bob.
And so there's a number of properties you might want from this. Ateniese and
Hohenberger defined a lot of properties. External security means that if you're not a
participant you should have the same protection against forgery as with any other signature
scheme.
If you are participating, you want what they call internal security or limited proxy.
So the idea is that if I give a proxy my key, or rather a proxy key, he should be able to
produce signatures that were legitimately signed by the person who is the target of the
proxy and that I -- that if the proxy holds the key he shouldn't be able to forge my
signature on messages that weren't signed by the person that I delegated to. And
similarly the person that I delegated to shouldn't be in danger just because I created one
of these keys.
In terms of functionality, there's a whole lot of different nice properties you might have.
So you might want to say unidirectional. So that means that I can proxy -- I can give out
a key that proxies from me to Vinod and it doesn't automatically imply you can proxy from
Vinod back to me. It maps nicely to real world relationships.
>>: That's security property, right? If I give you the proxy key from you to me, you can't
forge I guess your signatures.
>> Bryan Parno: You can think -- it depends. If you say you need this functionality, then
it changes what your definition of limited proxy is.
Then there's multi use. It might be the case that a proxy key can only transform a
signature once and nobody else can transform that signature again. Or you might be
able to say that signatures can be proxied an arbitrary number of times.
You might want a private proxy. You might not want it to be the case that you can
distinguish signatures that are the result of proxying from signatures that were sort of
generated straight from the source.
Transparent is very similar. Nontransitive means if you have a proxy key from A to B
and B to C you shouldn't be able to combine them into a proxy key from A to C.
And some other properties that probably aren't too important for our discussion. So
there's been a handful of existing implementations. The first one from BBS is largely
broken -- it just doesn't stand up to hardly any of these properties that you might
actually desire for the scheme.
The more recent papers proposed one scheme that's multi-use, which is nice. You can
proxy an arbitrary number of times, but it's bidirectional. If I give out a key for me and
Vinod, it's going to go back and forth. They also have one that's single use but it's
unidirectional or it's unidirectional but only single use.
More recently, in this paper, Libert and Vergnaud came up with one that's
multi-use and unidirectional. That's nice, but it has a problem that your signature grows
every time it gets proxied and grows linearly. And you can think of this scheme in some
fashion as essentially giving out almost a certificate chain.
So I sign something that says it's okay to take from me to Vinod. Vinod signs something
that says it's okay to go from him to Seny, so to proxy twice you sort of staple these
certificates together.
That's why you get this linear growth. There's a little bit more subtlety to it so they can
have the transparency property so they can hide where the signatures came from but
that's basically it.
>>: The assumptions there in these works?
>> Bryan Parno: Let's see. I think these are both based on pairings. So various elliptic
curve --
>>: Paired things or --
>> Bryan Parno: Fairly standard things. This one is based on a slightly non-standard
pairing assumption. It's like triple DDDH or something like that. Looked plausible.
So one of our ideas in this direction has been to say what if we take this notion of proxy
resignatures and generalize it a little bit into something called threshold proxy
signatures.
So we're going to have two algorithms that are very similar to these but instead we're
going to have something called threshold proxy key. And that's going to take in two
public keys and a secret key. And generate a threshold key.
And the property of this threshold key is you can have this other algorithm that says if
you're given the threshold key and a signature from A and a signature from B on the
same message, you can convert that into a signature from C.
So very similar to before, but we had this property that you need two signatures to
advance to the third one. And so if you look at this -- you can actually see that once you
have these you don't really need these, you can build this from something like this.
And in fact for our purposes, we can take this set of properties and we don't actually care
about all this for circuit construction.
So all we care about is unidirectional. If you're going through the circuit it's important
that you can't get part way and sort of reverse your way back up to learn things that you
weren't supposed to know. We need multi-use because of course circuits can have
many hops and key optimal deals with how big your proxy keys grew.
And so the way you would construct this is you would say we're going to choose proxy
keys or choose signature keys for each one of these wire values, just like we did with
Yao, using this key gen function. And then for each one of the logic evaluations in that
gate we're going to generate a threshold key. So A0 and B0 lead to some value of C0,
Z, so we're going to set up a threshold key for that. And so then you're going to wind up
with four threshold keys that represent this gate functionality. So to use it, when you get
some input X, choose a new random message for each bit and encrypt the message
using the corresponding signature key and you give this to the worker who then applies
these threshold proxy keys to calculate a way through the circuit.
>>: Keep the message on the signature key, what's that mean?
>> Bryan Parno: So each one of these is a signing, is a signing key. So say the input's
01. She's going to use A0 to encrypt the message and she's going to use B1 -- sorry.
Sign the message. Sorry.
>>: And your message is going to be what?
>> Bryan Parno: Randomly chosen message from some reasonably large space.
The intuition is you can use the threshold key to proxy to a signature from C0 or C1 but not
both because you don't have the corresponding inputs. At the end of the day I can use the
verification key to check that you have a legitimate signature from either Z0 or Z1 on the
message I gave you.
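The wiring of such a gate can be sketched in a purely symbolic model. Nothing below is cryptographic: a "signature" is just a pair of wire label and message, and the table returned by the hypothetical `make_gate` plays the role of the four threshold keys. The names `make_gate`, `sign`, `threshold_resign` and `verify` are illustrative assumptions, not part of any scheme from the talk.

```python
from itertools import product


def make_gate(gate_fn):
    """Four 'threshold keys': one per input combination, naming the output label."""
    return {
        (("A", a), ("B", b)): ("Z", gate_fn(a, b))
        for a, b in product((0, 1), repeat=2)
    }


def sign(wire_label, message):
    # Symbolic signature (assumption): records who signed what, with no security.
    return (wire_label, message)


def threshold_resign(gate, sig_a, sig_b):
    """Turn signatures from one A label and one B label into a Z-label signature."""
    (label_a, msg_a), (label_b, msg_b) = sig_a, sig_b
    if msg_a != msg_b:
        raise ValueError("both signatures must be on the same message")
    # Only the threshold key matching the two labels actually held applies.
    return (gate[(label_a, label_b)], msg_a)


def verify(signature, message):
    """Client-side check: a Z0 or Z1 signature on the message that was sent out."""
    (wire, bit), signed = signature
    if wire != "Z" or signed != message:
        raise ValueError("not a valid output signature on this message")
    return bit
```

For an AND gate, `threshold_resign(make_gate(lambda a, b: a & b), sign(("A", 0), m), sign(("B", 1), m))` yields the symbolic signature from Z0 on `m`, and the worker holding only those two input signatures has no table entry that leads to Z1.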
So this is a nice property that the proof of security basically reduces exactly to the
threshold proxy signature scheme, because the threshold proxy, you basically have to
say that security depends on that I never gave you a signature with some combination of
keys that you already have threshold keys that get you the output.
>>: Even though it seems like you're doing -- you can do an AND but not an OR, or is that
something that can easily be solved?
>> Bryan Parno: No. So the threshold that you get "and.."
>>: In this --
>>: You just switched.
>>: Oh, you just have a cross -- okay. Good. Good.
>> Bryan Parno: And then we also have this nice property because we're using
signatures, we get around the adaptivity property, because a signature scheme is
defined if I give you a signature on one message, it shouldn't help you generate a
signature with another message.
So this seems like a promising approach. The problem is if we look back at these
existing constructions of proxy resignature schemes, well, then one's broken, we're not
going to use that.
This one's bidirectional, no good. This one is unidirectional, but doesn't have this
threshold property. It's not obvious how to combine the two. And this one definitely
doesn't. So they're out.
And this one definitely doesn't have the thresholding property because you basically
have these two long certificate chains and you can't easily just merge the certificate
chains that are coming along. Okay. So can't use prior works. So we need to come up
with something on our own.
And so one way in which we've attempted to go about this is to define a slightly weaker
notion, which is additively key homomorphic key proxy schemes. So we still have sort of
the standard proxy definition, but we also have this additional property we want. That is
if you take messages that are signed by two different people or take the same message
signed by two different people and perform some operation on them, you can get a
signature on the message with the sum of the two keys. So a little bit of a strange
property, but not unimaginable.
And certainly weaker than the full threshold property. So one way you can think about
instantiating this is with RSA or something like RSA. So standard key gen and signing
procedure for naive RSA up here, then to rekey you just give out the division of the two
keys. For resigning, you take the signature you're given and raise it to the rekey, and so
sort of check the math, you'll wind up with the message raised to the new exponent, and
of course this has this nice key homomorphic property here.
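To check the math with small numbers, here is a toy sketch of the naive RSA version. The primes, the function names and the use of the private exponent directly as the signing key are all illustrative assumptions, and the parameters are far too small to be secure. Note that computing the rekey needs the group order, which only the party generating the keys knows.

```python
from math import gcd

# Toy parameters (assumption): real RSA moduli are thousands of bits long.
P, Q = 1009, 1013
N = P * Q
PHI = (P - 1) * (Q - 1)  # group order, known only to whoever generates the keys


def sign(message, key):
    """Naive RSA signature: the message raised to the signing exponent."""
    return pow(message, key, N)


def rekey(key_from, key_to):
    """Proxy key: the 'division' key_to / key_from, computed mod the group order."""
    assert gcd(key_from, PHI) == 1, "key_from must be invertible mod PHI"
    return (key_to * pow(key_from, -1, PHI)) % PHI


def resign(signature, proxy_key):
    """Raise the signature to the rekey to get the message under the new exponent."""
    return pow(signature, proxy_key, N)


def combine(sig_a, sig_b):
    """Additive key homomorphism: m^a * m^b = m^(a+b) mod N."""
    return (sig_a * sig_b) % N
```

With exponents `a = 5`, `b = 13` and `z = 17`, `resign(sign(m, a), rekey(a, z))` equals `sign(m, z)`, and `combine(sign(m, a), sign(m, b))` equals `sign(m, a + b)`. The modular inverse uses the three-argument `pow` with a negative exponent, which needs Python 3.8 or later.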
Everybody good? Okay. And so RSA construction, you would pick all of your wire
labels to be RSA exponents or you choose the A, B and Z0s to be RSA exponents and
you choose the other Z value to be the sum of the two.
And so the way you could do this is for each gate you're going to give out three or two
proxy keys. One that takes you from A0 to Z0 and one that takes you from B0 to Z0. So
that sort of gets you the or property and the sum will get you the and property. So I think
that speaks a little bit to your question from earlier.
So in this fashion we're sort of taking advantage of the fact that or shortcuts, if you have
A0 it doesn't matter what you have for B, you automatically get Z0. And, similarly, for if
you're given B0 you should automatically get Z0, regardless of A. Unfortunately, there's
some problems.
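The gate just described can be sketched with the same toy RSA numbers. Everything here is an assumption made for illustration: the tiny primes, the specific exponents, and the helper names `evaluate` and `run`. The gate gives out two proxy keys, A0 to Z0 and B0 to Z0, and sets the other output label to the sum of the remaining two input labels. The worker has to know which label it holds on each wire, so this sketch hides nothing about the inputs.

```python
# Toy parameters (assumption): far too small to be secure.
P, Q = 1009, 1013
N = P * Q
PHI = (P - 1) * (Q - 1)

# Wire labels as RSA exponents; the second Z label is the sum of A1 and B1.
A = {0: 5, 1: 19}
B = {0: 13, 1: 29}
Z = {0: 17, 1: A[1] + B[1]}

# The two proxy keys handed to the worker for this gate.
RK_A0_Z0 = (Z[0] * pow(A[0], -1, PHI)) % PHI
RK_B0_Z0 = (Z[0] * pow(B[0], -1, PHI)) % PHI


def evaluate(bit_a, sig_a, bit_b, sig_b):
    """Worker side: derive a signature under the output label."""
    if bit_a == 0:
        return pow(sig_a, RK_A0_Z0, N)  # shortcut: A0 alone gives Z0
    if bit_b == 0:
        return pow(sig_b, RK_B0_Z0, N)  # shortcut: B0 alone gives Z0
    return (sig_a * sig_b) % N          # m^A1 * m^B1 = m^(A1+B1) = m^Z1


def run(bit_a, bit_b, message):
    """Client side: sign the inputs, let the worker evaluate, check the result."""
    sig_a = pow(message, A[bit_a], N)
    sig_b = pow(message, B[bit_b], N)
    result = evaluate(bit_a, sig_a, bit_b, sig_b)
    for bit in (0, 1):
        if result == pow(message, Z[bit], N):
            return bit
    raise ValueError("worker returned an invalid signature")
```

Calling `run` on all four input combinations returns 0 whenever either input carries its zero label and 1 only when both carry their one labels.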
So there's some good properties and bad properties. Good properties are this is clearly
very efficient. It's just exponentiation per gate evaluation. It's easy to compute to do the
preparatory work and do the verification.
>>: You mentioned efficient. Now after hearing about Gentry, it's efficient, but as the
security minimum level, how much slower is it than --
>> Bryan Parno: It's still much more efficient compared to previously, probably not in the
realm of you'd want to start a business selling this as a service.
>>: Okay.
>> Bryan Parno: But I would say it's a significant step forward in terms of efficiency.
And you might even be able to get this adaptive security property because we're using
something that looks like an RSA signature so maybe you get that property as well.
The problem is we don't yet have a proof that this is secure. And in fact most of our
attempts at proving it run into this problem that you're giving out this rekey value, which
is derived from the keys, in the clear.
And so one way in which you might go about embedding an RSA challenge is to set one
of the Z0 values to be the challenge value on which you're trying to produce a signature.
So let's say we know he's going to cheat by producing A0. So we're going to embed the
challenge exponent here. So then we need to work our way backwards and make sure
we can produce signatures on the inputs that we're given.
And we can do that based on our knowledge that Z0's not a legitimate value. So if Z0 is
not legitimate that means we're never going to give out A0 and B0 because they're not
legitimate either. So we can take advantage of that in our construction of the circuit that
we're going to give to the simulator. The problem is that the way in which you compute
those reveals information about the path that you took through that construction.
And so an adversary can look at the distribution of those K values and determine they're
not uniform. There's something going on with them. And therefore he can refuse to play
the game with you.
And that comes largely from the fact that we're giving out those division values sort of in
the clear, and from the fact that it's hard to randomize them. So one way in which you
traditionally randomize RSA exponents is to multiply them by some random number mod
the order of the group. But the point of using RSA here is that we don't know the order of the group,
because if we did it would no longer be unidirectional.
So we don't actually have an attack on the system, but proving security seems to be
tricky in this case.
>>: Is it crucial to use RSA. Can you use other -- because of the simulation?
>> Bryan Parno: The reason why we picked RSA in the first place it gives you a nice
unidirectional property that you can go forward but not backwards, because RSA even
given the exponent you can exponentiate by it but you can't divide. Something like
standard Diffie-Hellman or some other group-based thing, if I give you Z0 over A0 you
can invert that and work your way backwards.
However, one area in which that's not the case is pairings. So we turn to pairings because it
has this nice unidirectional approach. And again you can define sort of standard basic
signature scheme, you define the rekey, to look very similar, except now we can hide it
up in the exponent. And then the resign operation just does a pairing.
And we still have this nice additive homomorphic property.
>>: That's exactly what I was going to say because for the security proof, you typically make
these interactive assumptions, right? And in pairings it's usually easier to make the
assumptions. For example, if I was to ask you [inaudible] plus A, I could ask you for any
random A and then for any different A you cannot give me [inaudible] specific proof for X
and it's okay to make such assumptions but it's hard to make assumptions for RSA. So
it might still be [inaudible] if you're there then there.
>> Bryan Parno: Well, the pairings get us our -- we're still efficient in the sense we
discussed earlier. We get the unidirectionality from the pairing. Secure and quite
possibly adaptively secure based on signatures. But it has this big problem that we're
using pairings. Right? So we don't get the multi-use one. Once you do the pairing,
you're stuck in the target group, and we can't get the unidirectional property again.
And so that sort of implies we can evaluate circuits that are Boolean formulas, because
Boolean formulas only require sort of one level of wire splits at the top. And once you've
done that, everything is sort of, you have no wire splits in the rest of the circuit, and so
we can use the pairing once at the beginning and then let everything run from there.
>>: Is there a [inaudible] homomorphic signatures? How does it permit this?
>> Bryan Parno: The recent paper from Dan Boneh and I forget the student.
>>: [inaudible].
>> Bryan Parno: Yeah.
>>: [inaudible].
>> Bryan Parno: So with their paper, they don't get the amortization property. So,
for each computation you want to do you need this tag value that they came up with. So
the worker has to come up with this tag value for each new computation, and coming
up with that tag value essentially takes as much work as doing the computation
themselves.
So he can provide these tag values that anybody can use to do computations but he sort
of has to do an amount of precomputation equal to the amount of work that somebody else
does later.
So it has the advantage you can do a whole bunch of precomputation and later when the
results come in you can do very fast verified computing. But the total amount of work
you do to prepare is exactly equal to the amount of work the worker does, essentially.
Does that make sense?
>>: So what's the idea, I'm going to sign my input, give it to you, you can compute a
function, you can compute a signature on the result and give it back to verify? But then
the second time I'm going to give you a signature on sort of a different input, and you should be
able to like mix and match these signatures, yes? So we have to sort of sign the
first one using the first tag and the second one using a different tag. To do that, I need
to -- that kills the amortization, turns out. There's no need -- no reason why that should
be the case, but...
>>: In a sense it's a limitation that you have to provide as many signatures as there are
computations you want [inaudible] so if I wanted to do K definitions I have to give you
signatures for at least K text [inaudible].
>>: Yes.
>>: So you don't get to amortize it the way these other schemes have been doing.
>>: More problems than that. That's why they can only do limited number of -- they can
do linear functions, they can do constant degree polynomials, and even for that -- I
mean, the property --
>> Bryan Parno: So it has some nice properties but it's unfortunately not sufficient for
this.
>>: Low arbitrary number of times.
>> Bryan Parno: We can do arbitrary number of times but you're stuck with Boolean
formulas and Boolean formulas are sort of unappealing because the amount of work you
would save with this is a constant fraction of the work to do the preparation.
So but there is this interesting connection I think Vinod and his intern were investigating
this summer looking at this looks a lot like sort of the boiled down part of an ABE
scheme. And so you could think about how ABE schemes are also essentially largely
limited to Boolean formulas unless you do things like blow up the leaves to handle other
functions.
And so there may be some interesting connections as to why they're stuck with Boolean
formulas and why Boolean formulas seem to work well here.
And so one other approach you might take is instead of doing additively key
homomorphic and key proxying, you might have additively message homomorphic and
message proxying. This is a little weirder primitive, but if you imagine there's some
master key generation, standard key signature algorithm, and instead of doing rekeying
and resigning of keys, we say that we use this master key to say that you get a proxy
key from two different messages. And so for any signature on MA, with any key, you
can convert it to a signature with that same key on message B.
So, again, kind of a little bit strange. But you would also want this additive property,
which is less strange: if you have signatures under the same key on two messages
you can add them together and get a signature on the sum.
And so if you wanted to build this, then looks very similar to before but this time we're
going to associate random message value with each wire, and we're going to give out all
these rekeys, and then to do the computation, Alice is going to generate one new
signature key for the entire input and she's going to sign all the input messages using
that key.
The worker can then proxy through the circuit, going from two input messages, proxy it
to a single output message, and at the end we're going to wind up with a signature on
one of the output wire value messages using that original key that Alice chose.
So this time we're going to be reusing the messages instead of the keys. And so we've
been working on an instantiation of this based on lattices, in particular based on a
signature scheme from Vinod and some folks at IBM, and it has the advantage that it does
have this proxy property. So you can proxy for messages, and it has this additively
homomorphic property. Unfortunately there's a couple of limitations. So you can't do
full-sized circuits.
You can't reuse it a polynomial number of times. You have to reuse it, what did we
decide, logarithmic, polylogarithmic. So don't get as much reuse, but you get some
reuse. Then there's this much bigger problem that this particular signature scheme has
the property that if you give out signatures on two linearly related messages using the
same key, then the adversary has a good chance of recovering the secret key
corresponding to your signing key.
It's important you only sign non-related messages. And of course when we're
constructing these circuits we're going to wind up with lots of linearly related messages
because we're taking advantage of this additive property. And so until you get rid of this
property, this construction is not going to work.
>>: So what [inaudible] the construction. The bonsai [phonetic] construction of
[inaudible], et al. --
>>: We don't even know if it's additively homomorphic. So our construction is additively
homomorphic.
>>: I thought that was the problem, I guess.
>>: If you don't even have [inaudible] then what's the point?
>>: If the messages are shorter than 128 bits and you generate each at most at random,
you need two to the 64 messages before you get linear relation.
>> Bryan Parno: That would be if you generate all messages at random, but for this
construction, we're taking advantage of this additive thing so that, for example, the inputs
might be A and B and we're going to define the output message to be that. That's what
eventually gets you into trouble.
So if you could pick them, if you had sort of the full thresholding that we talked about
before, then you wouldn't need to pick them in this fashion. And then you'd be okay.
So this is the lattice instantiation. So just to wrap up here, since we're running out of time,
this verifiable computation system of sort of doing a little bit of precomputation work and
amortizing it over lots of inputs seems to fit nicely with what everybody wants to do these
days, pushing stuff out to the cloud. We found that combining Yao with fully
homomorphic encryption yields this theoretically very nice protocol for doing this, and
that this method of applying fully homomorphic encryption does seem to be generic
given any sort of one-time verifiable computation.
We can get a lot more practical if we turn to these proxy-based schemes, but
unfortunately we sort of have high level approaches based on these proxy schemes but
we don't have very good concrete instantiations. So that's what we've been spending
most of our time on lately is to come up with concrete instantiations that match these
definitions better.
So if you have ideas or other possible approaches in this direction, we'd be very
interested in talking more about that. And that's the end. Thank you. [applause].
>>: Have you looked at the relapse problem that people may have crafted which is the
running of double agents in wartime, double-cross system.
>> Bryan Parno: So trying to check answers against --
>>: Yeah, you think your spooks are working for you but maybe they are not.
>> Bryan Parno: So I guess the other one-time verifiable computation scheme that I
showed does take some flavor of that. It gives you some unknowns and some knowns
and then checks based on the knowns.
And that works well for one time -- in general, the problem with applying that notion is that
you could get rid of all this crypto stuff and just give out the same computational problem
many times, in fact that's the approach that SETI@home largely takes. But you're never
safe from collusion in that case. There's always the possibility you're giving it out to a
thousand people you think are different and it's one guy sitting in his basement. But
that's certainly a popular alternative.
>>: Does this work with [inaudible] on clouding clouds, assuming we're doing clouds.
[inaudible].
>> Bryan Parno: Yeah, so if you make assumptions about the workers not colluding, I
think you can certainly make more --
>>: Crosscheck.
>> Bryan Parno: The other interesting thing to look at is all of this stuff has been
building on the Yao construction, basically garbled circuits and just basic circuits. But
Yao is just sort of a very early secure multi-party computation protocol and there's been a ton of work
done on other ways of doing secure multi-party computation. Might be interesting to
look and say can we take some of those techniques as well and adapt them to this
setting.
>>: Even if you just care about [inaudible] and not care about verifiability, then you still --
then there's no hope of getting rid of the fully homomorphic encryption, right, because
again even if you give up verifiability completely you still get fully homomorphic
encryption. So it seems like there's a limit to -- like there's a limit to how efficient this
thing can get without also improving the efficiency of fully homomorphic encryption.
>>: What if you give up privacy and you just want verifiability?
>>: No, I'm saying if you want privacy, if you want privacy.
>>: If you want privacy -- if you want privacy for general functionality, then that's exactly
what fully homomorphic is giving you.
>>: It assumes a lot of [inaudible].
>>: Sorry.
>>: Like compactness, right?
>>: You do need compactness, because if you --
>>: Exactly, otherwise your communication complexity cannot relate to the size of the
computation, otherwise it's exactly like what you're trying to avoid, because then really
the answer is going to be --
>>: So maybe like if you want efficiency then you need to like explicitly focus on
schemes that do not have, that do not give privacy.
>> Bryan Parno: In fact, these proxy schemes, at least some of the ones we've been
looking at don't give you privacy, because you're going to use a different key depending
on which inputs are coming in. It does seem intuitive if you give up privacy, that seems
like a big give, you should be getting some --
>>: The point is you have to give it up, it seems, if you don't want to --
>>: Right.
>>: Again, an old problem, between the war, there was a lot of communication bureaus,
battalions of people with hand-cranked adding machines, they must have made
mistakes and they must have had procedures to handle mistakes. And of course the
people who remember are dying off. And it may be useful to pick some brains on this
before they all die.
>> Bryan Parno: Sure. And though I think one of the differences in that situation is at
least assuming your computers have not been suborned, you're largely looking for
random mistakes.
>>: And laziness, of course.
>> Bryan Parno: And laziness. But I think if you're looking for sort of the --
>>: That's harder.
>> Bryan Parno: Yeah. Any other questions?
Vinod Vaikuntanathan: Let's thank Bryan.
[applause]