>> Melissa Chase: So we're very happy to have Mike
Rosulek visiting us this week. He's worked in a
variety of different areas, but particularly on
secure computation, both theory and practice, and
today he's going to tell us about one of his
[indiscernible].
>> Mike Rosulek: Thanks. So thanks, Melissa and
Sonny, for letting me come up here and invite myself
and hosting a talk. So today I'm going to be talking
about some joint work with Arash Afshar and Payman
Mohassel from Calgary, sort of -- Payman is in
transition -- and Zhangxiang Hu is a Ph.D. student of
mine.
So this is a talk about two-party secure computation,
and I realized, like, half an hour ago, that I didn't
really have a slide saying what secure computation
is. I don't know if that's going to be a problem in
this audience. Okay. Awesome.
So this is a slide that -- or a visual that David
Evans likes to use where he estimated the cost of
doing human genome comparison or human genome
sequencing, so the blue line is the cost of
sequencing the human genome without any security,
just the raw computational cost historically. And
you see that's gone down. And then the other lines
are the estimated costs of what it would -- how much
it would cost to sequence a genome inside of a secure
computation achieving semi-honest security or
malicious security. And these are just estimates,
but I think it's impressive the amount of progress
that we've got so far. We've gone from secure
computation in the '80s being just feasibility
results to in the last decade it's actually practical
for some domains. Maybe not practical yet for genome
comparisons, but we've come a long way. And so I
just want to talk about the state-of-the-art -- yeah,
Josh.
>>: I was thinking, semi-honest [inaudible]
semi-honest now.
>> Mike Rosulek: I thought that, too, and I was hoping no one
would notice, because this is not -- this is just
something I shamelessly stole. And in fact, I
couldn't find the one that he used when he was here a
couple months ago for the MPC workshop. The graph, I
think it falls off the end. It falls down to zero at
some point because circuit garbling is so cheap now
and OT extension is so cheap, and so semi-honest
computation is like basically free.
I think there are other -- probably this is leveling
off due to just some economic forces and not due to
some limitations of technology. That would be my
guess. But yeah. But that's -- what a great win for
cryptography. It's cheaper than doing it out in the
clear. So come to us. Do things within a secure
computation. It's cheaper than without. Yeah. I
don't know how seriously we're meant to take this
graph, but qualitatively it's -- we've come a long
way.
The state-of-the-art in 2PC protocols, for general
purpose, 2PC protocols, they typically will start
out -- the papers in the area will start out like
this. So if you want to compute -- if you want to do
a secure computation of some F, the first thing you do
is express F as a boolean circuit, and then you'll do
something with that circuit.
That's kind of where we are now. And for certain
things like AES, that's not so bad because AES, in
any case, is designed to have, you know, nice-looking
circuits because people want to implement it in
hardware. This is the AES S-box. I don't remember
exactly how many gates it is, but it's small enough
to fit on a slide. This is just one S-box. You need
208 of these to do AES 256. So it's not a terribly
large circuit. It's certainly not millions of gates.
Like tens of thousands of gates maybe. But for other
things like --
>>: [inaudible] 6,000?
>> Mike Rosulek: Yeah, it's --
>>: Like the whole AES?
>> Mike Rosulek: Yeah, it's --
>>: Small.
>> Mike Rosulek: Yeah, so actually it's 208 of these
plus the key schedule, so it's between 10,000 and
100,000. Yeah. I don't know the number. If you
think of a more realistic problem, like the stable
marriage problem, it's very iterative, memory
intensive. If you wanted to write that as a Boolean
circuit, you would not have a good day at the office.
So most of the computations that we care about in
practice are not really well suited to the circuit
model. Like we don't design algorithms for circuits.
Another thing that's as simple as binary search,
right? The binary search is -- well, when you write
something as a circuit, the circuit has to be as
large as the input, right? So you don't get any
sublinear time algorithms, like binary search is the
classical sublinear time algorithm, right? And
that's because for security in general, the secure
computation has to touch every bit of the input,
right? If you run a protocol and you notice that you
never touch this input, then that could potentially
leak information.
So we don't have sublinear time 2PC protocols for
that reason. So you might wonder why do we use these
circuits anyway? Well, there's a reason that we use
circuits. It's because we have this garbled circuit
technique that goes all the way back to the beginning
from when I was first working in MPC. So probably
most of the people in the room know this, so I'll go
over it pretty fast. The idea is to pick two labels
for each wire, just random strings, cryptographic
keys, if you like. Randomly assign one of those
labels to be true and one to be false, as I've done
here. And then think of the truth table of the gate
and say here is an and gate and here is the truth
table of that gate, and I'm going to replace trues
and false with the corresponding labels, and then I'm
going to use the input wire labels as keys to encrypt
the output wire label. So these will become cipher
text, right? So I encrypt C0 using A1 and B0 as the
key. That's one entry in the encrypted truth
table. That's the basic idea behind garbled
circuits.
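To make this concrete, here is a minimal Python sketch (not from the talk) of garbling one AND gate. SHA-256 stands in for the encryption, the usual random permutation of the four ciphertexts is omitted, and the `candidate in C` check is an illustrative stand-in for a real output-validity mechanism that doesn't reveal C to the evaluator:

```python
import hashlib
import secrets

def H(key_a: bytes, key_b: bytes) -> bytes:
    """Derive a one-time pad from a pair of input wire labels."""
    return hashlib.sha256(key_a + key_b).digest()[:16]

def xor(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))

# Two random labels per wire: index 0 plays "false", index 1 plays "true".
A = [secrets.token_bytes(16), secrets.token_bytes(16)]  # input wire a
B = [secrets.token_bytes(16), secrets.token_bytes(16)]  # input wire b
C = [secrets.token_bytes(16), secrets.token_bytes(16)]  # output wire c

# Garble an AND gate: encrypt C[a AND b] under the input labels A[a], B[b].
table = [xor(H(A[a], B[b]), C[a & b]) for a in (0, 1) for b in (0, 1)]

def evaluate(label_a: bytes, label_b: bytes) -> bytes:
    """Given exactly one label per input wire, recover one output label."""
    pad = H(label_a, label_b)
    for ct in table:
        candidate = xor(ct, pad)
        if candidate in C:  # stand-in for a proper validity check
            return candidate

assert evaluate(A[1], B[1]) == C[1]   # true AND true  -> true label
assert evaluate(A[1], B[0]) == C[0]   # true AND false -> false label
```

With only one label per wire, the evaluator can decrypt exactly one table entry, which is the privacy and authenticity argument the talk relies on.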
The reason this works is because, first of all, the
association between true and false is random, so the
wire labels, themselves, don't leak whether -- I
can't look at a wire label and just decide that it's
the true wire label or the false wire label. And
also if I give you just one wire label for each wire,
then you can only open exactly one of the cipher
texts and you can only learn exactly one of the
labels on the output. So, kind of by induction, you
only learn one label on each wire, and at the end we
can reveal the true/false association.
So for our purposes, it's important that if I look
at -- if I just know one of the wire labels, I don't
know whether it's the true or the false one. I have
no idea what just happened. Okay. I'm back. I hope
that doesn't keep happening.
If I just have one wire label, I can't tell whether
it was the true or false one, and I can't guess the
other wire label. Those are the two properties that
will be important. I'm foreshadowing right now, so
keep that in mind.
So that's the garbled circuit technique. We'll keep
that, but we want to move beyond circuits as the
basis for 2PC protocols because of the inherent
limitations of circuits.
So let's talk about a different model called the
random access machine model, and it really is a
realistic model of modern computer architecture. So
you imagine there's a CPU with some small internal
state. Think of those as the registers, the small
constant number of registers, a very large memory and
in between the CPU and the memory there's a memory
bus. And what happens along the memory bus? Well,
the CPU will ask to read some data and the memory
says okay, here it is. The CPU might ask to write
some data and the memory says yeah, okay, I'll do
that.
This is the basis of a model computation called the
RAM model. Usually in cryptography we care about
something called the oblivious RAM. So just looking
ahead, we'll be trying to simulate the execution of a
RAM program, and these messages along the memory bus
will be kind of out in the clear. So we need a way
to protect those.
There's a way to compile any RAM program into what's
called an oblivious RAM where these read/write
locations and the data on the memory bus reveal
nothing about the secret input to the computation.
Those constructions are very interesting. They're
not really the focus of this talk, so in this talk
I'll just assume that I have an oblivious RAM. I
won't describe the constructions that get you that.
There's some good practical constructions starting
from the work of Elaine Shi and others
that are actually relatively practical compared to
the earliest constructions, and they only blow things
up by a polylog factor in the size of the memory.
Another thing that I'll gloss over: to do oblivious
RAM, you set up some data structures in memory with
some secret keys in the CPU state. To
set up those data structures requires an
initialization phase where you have to touch all the
bits of memory. That's kind of inherent to ORAM, and
I won't talk too much about that. Most of the work
we'll imagine there's like a preprocessing phase
where you do that.
So if you want to think of where RAM programs can
really do much better than circuits, it's when you have
like a repeated queries on a big database, so you
take a long time, you set up the ORAM structures in
the database, and then you can repeatedly use that
large memory and run even sublinear time queries on
it.
And also looking ahead, we will use other -- so along
this memory bus, you have whether it's a read or a
write, you have memory addresses, and you have data
going back and forth. We're going to use some other
mechanism to protect the data. We only need the ORAM
construction, itself, to protect the read/write and
the address. I'll call it the metadata
obliviousness. So we'll use a different mechanism to
protect the raw data, itself.
Okay.
So I'm going to assume that we have one of
these ORAM constructions that computes the function
that we want, and we want to use this as the basis of
the 2PC protocol. So in the semi-honest model, this
has been done. This was work by Gordon, et al. This
appeared at CCS in 2012.
This is kind of our starting point. So they showed
how to emulate a RAM program in a 2PC protocol with
semi-honest security. So I'll show you this and then
the rest of the talk will be about how to achieve
malicious security while doing this.
Okay. So again, I'll totally gloss over the ORAM
initialization phase. Somehow Bob gets -- so ORAM
initialization initializes the memory of the RAM
program and also initializes the state. The state
has some, like, secret key that unlocks, you know,
decodes the data structures in the ORAM memory. So
we'll start out with some initialized memory and
internal state of the CPU, and we'll secret share it
between the two parties. And in their protocol, Bob
is the one who, like, takes care of the memory. He's
the keeper of the big
memory. But they'll both jointly keep the internal
state.
Then they're going to use a circuit 2PC to repeatedly
evaluate the CPU circuit, okay? And it's a slightly
augmented CPU circuit, so they both put in their
shares of the state, the shares get reassembled, they
go into the CPU. If there's data to be read by the
CPU in the step, Bob feeds that in. He's the keeper
of the memory. The CPU executes. It outputs a new
state and it gets secret shared among the two
parties, and let's say Bob learns the memory access
for the next step. This is the message that goes
across the memory bus, like read or write. So the
large box in white is something that they execute
using like a garbled circuit 2PC. Okay?
And if the CPU is the CPU circuit for an ORAM, then
it will be safe to have Bob handle all these memory
accesses, right? Because the memory accesses don't
leak anything about the inputs of the computation.
Okay? So just as an example, if the CPU says that it
wants to read something, then in the next time step
Bob will fetch that thing from memory and give it as
his input to the CPU. We're in the semi-honest model,
so Bob promises to do this. There are no problems. This
is a full-fledged ORAM, so the data in the memory has
kind of been encrypted by the ORAM, so it's safe for
Bob to know the raw contents of memory. The CPU
wants to write, Bob will just do the writing. He's
the keeper of the memory.
If the previous instruction was a write, then the
data coming in is not used in the next step. So they
just repeatedly execute this circuit over and over
and over again. It spits out commands to talk to
memory.
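A schematic of that loop in Python (illustrative, not Gordon et al.'s actual interface; `secure_eval_cpu` stands in for the circuit 2PC of the augmented CPU circuit):

```python
# Semi-honest RAM-2PC emulation: evaluate the CPU circuit once per time
# step inside a circuit 2PC; Bob, the keeper of the memory, executes the
# ORAM-obfuscated memory accesses in the clear.

def run_ram_2pc(secure_eval_cpu, alice_share, bob_share, memory, steps):
    data_in = None                      # data fetched in the previous step
    for _ in range(steps):
        # Both parties feed in shares of the state; Bob also feeds in the
        # value fetched from memory if the previous instruction was a read.
        alice_share, bob_share, access = secure_eval_cpu(
            alice_share, bob_share, data_in)
        op, addr, data_out = access     # memory access revealed to Bob
        if op == "read":
            data_in = memory[addr]
        elif op == "write":
            memory[addr] = data_out
            data_in = None              # a write feeds nothing back in
        else:                           # "halt"
            break
    return alice_share, bob_share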
So in the semi-honest model, this is nice.
The rest of the talk I want to just discuss how to
bring this from the semi-honest model to malicious
model, and I guess the way I approach it is in the
end the techniques that we use are not, you know,
that -- they're not like difficult techniques. We'll
just find the best techniques from what we already
know about circuit-based MPC. But there's a lot of
ways to do it wrong, so I want to focus on, like,
there are easy ways to do it wrong and there are easy
ways to do it right but that have really high
overhead. So I just want to find the right selection
of techniques from circuit-based 2PC that are most
appropriate in this setting.
So in the malicious model, we're still going to
evaluate the CPU circuit over and over again. We
have to do that inside a malicious 2PC now,
obviously. That's kind of the main thing. So that
will protect what's inside this white box, if the
white box represents malicious secure 2PC, but that
doesn't protect what goes into the box. The parties
are still malicious, they might send different stuff
into the box, right? So if we're secret sharing the
state, then a party might, you know, lie about what
his share of the state is. Bob might lie about
what's in memory when the CPU is reading from memory.
Okay. So this is kind of the major thing we have to
deal with.
One way to fix it is to use a MAC. So we can say that
the CPU circuit outputs some stuff and it all gets
MACed with some secret keys, and then when you put
stuff back into the CPU, you verify that it -- you
verify the MAC to make sure that it hasn't, you know,
it hasn't been changed from last time. You can
definitely do this, and this will lead to a malicious
secure protocol. But I don't quite like this because
now I'm doing a malicious -- you know, this white box
got bigger on the page and it got bigger as a garbled
circuit also, because now I have extra circuitry for
all these MACs and all these verifications, right?
If the CPU is kind of like noncryptographic, I don't
want to like add extra crypto on top of it, right?
It's already expensive enough to do a malicious
secure 2PC of the crypto algorithm. Even a one-time
MAC is not really a crypto algorithm. You should
think of like a one-time MAC as like a
multiplication. Still not a small circuit, by any
means. And in any case, the solution that we
eventually come up with has us adding nothing to the
circuit. So adding anything to the circuit is kind
of not fun, yeah.
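For scale, the kind of one-time MAC meant here can be as simple as a*x + b over a large prime field (my illustrative choice of field and sizes). It's information-theoretically secure for a single use, yet that one field multiplication already costs thousands of AND gates inside a garbled circuit:

```python
import secrets

P = (1 << 127) - 1          # a Mersenne prime; any large prime field works

def keygen():
    # a must be nonzero so that distinct messages get distinct tags
    return 1 + secrets.randbelow(P - 1), secrets.randbelow(P)

def mac(key, x):
    a, b = key
    # One field multiplication: trivial in software, but thousands of
    # AND gates when expressed as a Boolean circuit inside the 2PC.
    return (a * x + b) % P

def verify(key, x, tag):
    return mac(key, x) == tag

key = keygen()
t = mac(key, 42)
assert verify(key, 42, t)
assert not verify(key, 43, t)   # forging requires guessing a
```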
>>: [inaudible] crypto through the ORAM [inaudible]
encrypting data [inaudible].
>> Mike Rosulek: So if it's a full ORAM, then it
does encrypt. It's like the contents of memory. The
way an ORAM protects the addresses, though, is purely
combinatorial, and so we're actually going to -- we
actually require an ORAM that only protects the
addresses and doesn't need to even protect the
contents of memory. So actually you're not already
having -- you may not already have an encryption
circuit inside of the ORAM.
>>:
Is that true for the previous [inaudible]?
>> Mike Rosulek: The previous one, they had to do
some extra tricks to get the -- they're in a
semi-honest model, so they could factor out some of
the encryption, like let Alice be in charge of the
encryption somehow. So with a specific ORAM
construction, they were able to move the encryption
circuit out of the 2PC, but as their general paradigm
they would have the encryption circuit inside of the
CPU.
>>: [inaudible]. You handle the instruction stuff,
[inaudible] encryptions must be a [inaudible] is that
also outside of the --
>> Mike Rosulek: So the reshuffling, I mean, so
we're not -- I guess in our work, we're not really
looking inside the ORAM, but that's all
combinatorial. I mean, you're not doing a shuffle
based on, like, a pseudo random function, if you
imagine like path ORAM or tree ORAM.
>>:
[inaudible].
>> Mike Rosulek: Yeah, if you're using one of the
other ORAMs, it does some serious shuffling.
>>:
The original ones.
>> Mike Rosulek:
>>:
Yeah.
Okay.
>> Mike Rosulek: Yeah, so if you're thinking of the
path ORAMs and you take away the encryption circuit,
then it's all just like assign something to a random
leaf and just traverse buckets. It's all very
combinatorial. There's no crypto circuitry inside
that. I guess that's what I have in my mind for the
kinds of ORAMs you might use. But certainly some
ORAMs would have bad circuits, for sure. Yeah.
>>: The things you're writing to ORAM, Bob can cut
them or they're protected in some other way?
>> Mike Rosulek: Yeah. So in our protocol we'll
protect them in a different way. It's not clear what
that way is yet, but just I guess the point of this
slide is we don't want to add extra crypto circuitry
to this malicious secure 2PC. And also, we really
don't want to add lots of extra inputs, all right?
So we'll be inputting shares of the secret state and
we'll be inputting MACs. And in a malicious secure
2PC, there's a high cost to pay for input consistency
checks, if you're familiar with those techniques. So
inputs to a 2PC are expensive. You have to have OTs.
You have to have these kinds of various checks to make
sure that the same input is used in all the
cut-and-choose phases.
So, you know, raw garbled circuit material is
expensive, and inputs are expensive in the malicious
world. We want to minimize those as much as
possible, so that's kind of the theme.
So what is this mysterious way that I'm protecting
the data of the ORAM? So if you look at the CPU
circuit, it's outputting some internal state. It's
outputting some stuff that's being written to memory.
I need that. All that stuff needs to be kept
private. So no one should know, like, the actual
contents. And you need to prevent the bad guys from
tampering with it, from saying that it was something
that it wasn't.
I'm already doing the CPU. I'm already evaluating
the CPU inside of a 2PC protocol. It's already a
garbled circuit. The garbled circuit is outputting a
garbled representation of the state and memory in the
form of wire labels. Wire labels have these two
properties already, right? So if I have a garbled
circuit outputting wire labels for the -- these
wires, those wire labels hide zeros and ones of those
values. And if I -- if you evaluate that circuit and
you just get, like, the false wire label for this
wire, you can't then guess what the true wire label
was for that wire, right? So there's this
authenticity property. You can't guess the other
wire label given just one of the wire labels.
So that's kind of the theme of -- I'll have more to
say about this, but this is kind of our theme. We're
already using a garbled circuit mechanism. Garbled
circuits already have privacy and authenticity of
their garbled values. Those are the properties we
need to protect memory and state of an ORAM as it's
executing.
So that's our theme. We're going to reuse these
garbled values, these wire labels. So that's what we
do. We want to get malicious security. We don't
want to put any additional overhead inside the
garbled circuits because they're expensive to begin
with. We want to do as well as we can, basically
steal all the best ideas from circuit-based 2PC and
steal all the efficiency parameters, right? If you
can do security with overhead of 40 and malicious
secure circuits, we want to do overhead 40 for RAM as
well.
So I'm going to describe two protocols, if I have
time, with kind of different tradeoffs of their
parameters. And when I talk about efficiency, I
really care about how much more expensive is it than
semi-honest. So to get 2 to the minus S security, the
state-of-the-art is to make it S times more expensive,
and we match that here, as
well, by stealing the appropriate ideas from
circuit-based 2PC world.
If we allow an online/offline setup where there's an
offline preprocessing phase, which is actually very
natural in the RAM setting, then we can even do
better than S. We can do 2S over log T where T is
the total running time. So think of T being very
large. So log T is greater than two. In fact, like
2S over log T can be something like 7 concretely.
Okay. So malicious security that only costs like
seven times more than semi-honest.
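Illustrative arithmetic (my parameter choices, not the paper's): with the common statistical parameter S = 40, the 2S over log T overhead works out as follows:

```python
import math

S = 40                           # statistical security: cheating succeeds w.p. ~2^-S
for T in (2**10, 2**20, 2**30):  # total RAM running time
    overhead = 2 * S / math.log2(T)   # replication factor vs. semi-honest
    print(f"T = 2^{int(math.log2(T))}: ~{overhead:.1f}x semi-honest cost")
```

The concrete factor of around 7 quoted in the talk presumably corresponds to the particular parameter settings used in the paper.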
>>:
[inaudible] preprocessing depend on [inaudible].
>> Mike Rosulek: Yeah. So in general, yeah. So you
need to know the function that you're evaluating but
not the inputs. And actually, in the RAM setting it
becomes a little bit more flexible because really,
you only need to know what RAM construction you're
using in the offline phase. In the online phase, you
can decide -- so in the offline phase you decide, oh,
I'm using this path ORAM construction. Then in the
online phase I decide, okay, now we're going to
evaluate F on these inputs, and I could reuse the old
preprocessing, because most of the steps are ORAM
construction steps.
Okay. So hopefully some of this makes sense as we go
along. So there's still more to say about -- yeah,
go ahead.
>>: [inaudible] simple that I must be
misunderstanding. So doing this RAM model, you only
want to be as good as circuit-based 2PC? Don't you
want to do better than that?
>> Mike Rosulek: So the thing that I am measuring is
how much more expensive -- so in the RAM model, how
much more expensive is malicious than semi-honest.
>>:
Okay.
>> Mike Rosulek: And that's this S or 2S over log T.
So that is the -- that ratio is what we're matching.
But yeah, so if you have a RAM program for some F
that's much faster than a circuit for F, then -- so
this is S times semi-honest RAM computation. So
presumably, the RAM model could be better than the
circuit.
So just comparing the time of a RAM 2PC to a circuit
2PC, it could be better depending on the RAM and the
circuit. But this -- so this is a protocol for like
general purpose, general -- like any RAM program. So
it makes more sense for us to just talk about this --
the overhead of malicious relative to semi-honest.
So let me say a little bit more about what I mean
when I say, like, reusing wire labels, reusing
garbled values. So here is this CPU circuit. Has an
input for the state, an input for the data that's
being read from memory. It has outputs for the
state, for the memory access, like the read versus
write and address. It has an output for the data
that's being written to memory if it happens to be a
write instruction. Okay?
So we're in the -- we're going to be evaluating the
circuit over and over many times for each time step,
and so when I say that we're reusing wire labels, I
really mean that, you know, for these wires going out
and for these wires coming in, I choose the same wire
labels, right? When I garble a circuit, I get to
choose the true label and the false label for each
wire. I'll choose them to actually coincide right
here, right? So whatever I get out of here, I can
directly use as an input to this garbled circuit
without -- like, none of us need to talk or do
anything, right? Whatever garbled output you get
from the T minus 1 circuit, you can feed right into
the T circuit. So that's what I mean by these orange
arrows.
So conceptually, those wires are like the same wire
because they're the same wire labels in the garbled
circuit. So for the state, it's pretty
straightforward, right? For the state, the state
from time T goes in as the input to time T
plus 1. With the memory it's a little bit less
straightforward. So if I write to some address at
this time and then later I want to read from that
same address, I need to connect this data into this
data input. Right? So whatever was written here
needs to be read down here. I need to connect those
wires. And by connect, I mean they share common wire
labels in the garbled circuit. So this is what I
mean by reusing wire labels.
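A sketch of the garbler-side bookkeeping this implies (illustrative names, not the paper's actual data structures): remember, per address, which circuit last wrote it, and reuse those output labels as the input labels of a later read of that address:

```python
import secrets

class LabelManager:
    """Garbler's view: tracks which wire labels represent each address."""

    def __init__(self):
        self.last_write = {}   # addr -> (time step, labels used for that data)

    def fresh_labels(self, nbits):
        # One (false, true) label pair per bit of data.
        return [(secrets.token_bytes(16), secrets.token_bytes(16))
                for _ in range(nbits)]

    def on_write(self, t, addr, out_labels):
        # Remember the output-wire labels circuit t used for this write.
        self.last_write[addr] = (t, out_labels)

    def labels_for_read(self, addr, nbits):
        # Reuse the last writer's labels so the evaluator can feed the
        # garbled data straight into the next circuit with no extra
        # interaction; fall back to fresh labels for untouched memory.
        if addr in self.last_write:
            return self.last_write[addr][1]
        return self.fresh_labels(nbits)
```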
Since we have a small audience, I have a quiz.
Everyone's been paying attention very nicely. It
kind of looks like I just took the RAM computation
and unrolled it into a giant circuit, right? But I
just claimed that RAMs can do things that circuits
can't do, right? A RAM can be sublinear in its time.
That would give me a sublinear size circuit. So,
like, where is the paradox? There's a paradox that
needs to be resolved, right? It can't be true that I
just unroll a RAM computation into a circuit and get
something that's -- that doesn't touch every bit of
the input, right?
>>: You unroll it, but you still have memory access
[inaudible] right?
>> Mike Rosulek:
Uh-huh.
>>: Whereas if it was just a big circuit, all the
inputs would be coming in at the beginning.
>> Mike Rosulek: Yeah. There's -- it's this issue
of the memory addresses. When I was unrolling this
circuit, I was unrolling it -- these -- I was doing
it in a way that depends on these addresses, and
these addresses are determined at runtime. So
there's this idea of, like these outputs come as the
RAM program is running. Only then will I know how to
connect the parts of this circuit, right?
So good. We resolved the paradox there. I think
it's worth thinking about that. You know, you get to
a certain point writing the proof and you go, geez,
what have I done? I've proved something impossible.
So these connections are easy to predict. They don't
depend on any runtime behavior. The state just
always gets passed forward, but these, the lines that
connect up the memory accesses are determined at
runtime. So you can't just unroll the RAM
computation before you actually run the RAM. So this
is kind of -- I think this is an interesting way of
thinking about RAM computation in general. It's like
you're sort of building a circuit on the fly somehow.
>>: Some programs [inaudible] static and react to
this [inaudible]. They're not determined at runtime
[inaudible].
>> Mike Rosulek: Yeah. The RAM programs with
polylog overhead, though, so it's not just that these
depend on runtime, because maybe they could be
predictable, but they depend on the preprocessing
phase, at least, right? The preprocessing phase puts
some secret stuff in the state that tells you what
these connections will be and there's a distribution
over these addresses that's oblivious, but the actual
addresses, themselves, you know, maybe they only
depend on this secret key that's in the state or
something like that.
There's also this idea that you could have a RAM
that's like giving outputs. So aside from these
outputs, it could just be giving outputs as it goes
and taking inputs as it goes. It's not really kind
of a circuity thing to do, like giving outputs and
taking inputs as it goes. You think of a circuit as
like a one-shot deal.
Okay. So this is -- I want to do secure evaluation
of a RAM program. I really want to use this trick of
reusing wire labels, right? Because there's nothing
to do. Like I don't have to do any input consistency
checks. Like you already have the thing you need
that you're going to put into the next CPU. This is
awesome. So I need to find a way to get everything
else to work in a way that doesn't clobber this idea.
So that's going to be the challenge. So the first -- yeah.
>>: Can you explain how you connect, like, the
[inaudible] this curvy line.
>> Mike Rosulek: So, yeah. Maybe if I haven't
explained it well on this slide, maybe I need to do
it better here. So imagine we just executed this,
and I saw that output was read to this particular
address. And I look back in time and say when was
the last time I wrote to that address? Okay. It was
time T minus 1. So now when I garble this circuit, I
choose the wire labels for these guys to be
coinciding here, and I choose the wire labels for
these input wires to coincide with these wires.
>>:
So the address is in the clear?
>> Mike Rosulek: Yeah. The address is in the clear.
If I didn't know the address, I couldn't hook these
up.
>>:
Yeah, but those addresses may be data dependent.
>> Mike Rosulek: They're data dependent, but they're
oblivious, right? They don't -- information
theoretically they depend on the data, but they don't
leak information about the data.
>>:
So [inaudible].
>> Mike Rosulek: Yeah, because it's an ORAM. So
input dependent is maybe not the best term because
it's also dependent on the initialization phase, and
maybe that's the -- a better way to think about it.
>>: And so do you need to keep the whole history of
reads and writes as the program executes?
>> Mike Rosulek: You need to keep -- for each
address, you need to know when was the last time that
you wrote to that address.
>>:
Oh, okay.
>> Mike Rosulek: One person has to keep track of
everything. So say that Bob is the keeper of the
memory. He has to keep these wire labels around
because that's his representation of what's in
memory. So he has to keep the entire memory, which
for every bit of memory, there's a wire label now.
That's some overhead for sure.
And the person who's generated the garbled circuits
has to know when was the last time I wrote to this
address because I have to remember what the wire
labels were when I -- now it's time for me to garble
this circuit, I have to fetch those old wire labels
and make sure they're the ones I'm using here. Yeah.
So people have to keep track of some of the history.
Maybe not all of it, but some significant history.
Actually, the person who's generating the garbled
circuits can derive these wire labels using a pseudo
random function. So he doesn't have to remember the
wire labels. He just has to remember the last time
they were accessed. There are some optimizations you
can do where he can remember a little bit less
stuff, but still one of the parties has to remember
for every address in memory, the corresponding wire
label that he has. Times some overhead factor. This
is malicious security. It's not cheap.
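The PRF trick could look like this (a sketch with HMAC-SHA256 as my stand-in PRF, not the paper's exact construction): each label is derived from the address, the last-write time, the bit position, and the bit value, so the garbler stores only one timestamp per address instead of the labels themselves:

```python
import hashlib
import hmac
import secrets

K = secrets.token_bytes(32)   # garbler's secret PRF key

def wire_label(addr: int, write_time: int, bit: int, value: int) -> bytes:
    """Derive the label for one bit of memory from its coordinates."""
    msg = f"{addr}|{write_time}|{bit}|{value}".encode()
    return hmac.new(K, msg, hashlib.sha256).digest()[:16]

# Instead of storing labels, store only the last write time per address:
last_write_time = {0x1000: 17}

# When garbling a later read of address 0x1000, regenerate exactly the
# labels that the time-17 circuit used on its output wires:
addr = 0x1000
t = last_write_time[addr]
labels = [(wire_label(addr, t, i, 0), wire_label(addr, t, i, 1))
          for i in range(32)]
```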
Okay. So this is the reason why the CPU doesn't need
to encrypt the data. The garbled circuit is already
protecting the data from someone looking at it. So
we do at least have a chance of the garbled circuit
being like noncryptographic in nature. That's always
nice.
Okay. So how do we, in the circuit world, how do we
achieve malicious security? Well, we use a
technique called cut-and-choose. So I'll just
refresh everyone's memory about it. The idea is, let's
just say I'm evaluating one garbled circuit, but the
parties involved might be malicious. In particular,
the person who's generating garbled circuit might be
lying about it and generating some bad ones. So I'll
make him generate lots of garbled circuits. I'll
pick some random subset of them to open and check.
So he'll show me all the secrets in these garbled
circuits. If any of them are bad, I'll give up
because I caught him cheating.
So let's assume that all of them checked out. Then
I'll take the remaining garbled circuits and I'll
evaluate them somehow. There's more things that can
go wrong, but this is the general idea. And I have a
statistical guarantee that let's say I open half of
the circuits, all of them were good. And with very,
very high probability, at least the majority of the
unopened circuits were good. They were correctly
generated. So I can take the majority output, and
with very high probability it will represent a correct
computation.
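A quick calculation (mine, not from the talk) of why the majority rule works: a cheating generator only wins if every bad circuit lands in the evaluated half AND the bad ones still form a majority there:

```python
from math import comb

def cheat_success_prob(n: int, bad: int) -> float:
    """Probability that all `bad` circuits dodge the checked half and
    form a majority of the evaluated half, when a uniformly random
    half of the n circuits is opened and checked."""
    checked = n // 2
    evaluated = n - checked
    if bad <= evaluated // 2:
        return 0.0     # not enough bad circuits to sway the majority
    # All bad circuits must avoid the checked set.
    return comb(n - bad, checked) / comb(n, checked)

# With 80 circuits, the cheater needs at least 21 undetected bad ones:
p = cheat_success_prob(80, 21)   # roughly 2^-26: negligible
```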
>>: Do you get any difference in the output when you
know it's been checked?
>> Mike Rosulek: So the beauty of this -- the beauty
of malicious security is sometimes you know you're
being cheated, but you can't say -- you can't reveal
the fact that you know you're being cheated. He can
cheat in a way such that you'll only notice if your
input was a particular thing. So if you say, oh, I
see -- if you tell me, Mike, I see that you're
cheating me, then I'll say, aha, you could have only
known that if you had used this particular input.
Now I know your input.
>>: Okay. So I guess the other way that this could
be done in theory, at least, is all the unselected
circuits are proven to be equivalent to each other.
>> Mike Rosulek: Yes.
>>: Which is less efficient, but --
>> Mike Rosulek: Yeah. Yeah. So this is kind of
the standard paradigm for what you might even
consider implementing. But, yeah, there's lots of
different ways to prove -- the whole
purpose is to prove that some garbled circuit
is correct, basically. This is one way to do it.
And there are lots of subtleties that can go wrong,
like, oh, I see that you were cheating me, but you
can't say it. So that's why you have to take the
majority. You can't just give up and -- yeah. So
I'm glossing over most of the details just so
conceptually we can see that this is the
cut-and-choose approach.
So now let's take this to the RAM setting, okay. So
now we're going to -- this garbled circuit is the RAM
CPU. We're going to evaluate it many times over and
over again, okay? So let's say I did the
cut-and-choose for time step T minus 1, and now it's
time for me to do a cut-and-choose for time step T.
I claim this is going to be a problem or it's going
to conflict with my desire to reuse these garbled
wire labels. So let's look at one of the circuits
down below.
Suppose this circuit ends up being a checked circuit.
That means I'm going to open up the circuit and I'm
going to see all of its secrets. So that means it
better not share any wire labels in common with these
evaluation circuits. Right? The protocol is really
counting on these garbled values to be protected.
This would reveal its secrets.
On the other hand, if there's a circuit who's going
to be used in evaluation, I really want it to share
wire labels with some previous one, right? That's my
whole trick that I want to use. But at the time that
I'm generating these garbled circuits, I don't know
whether they're going to be checked or not. That's
the whole point of cut-and-choose. You generate them
all with the possibility that any of them could be
opened, right? So this kind of leads to a
contradiction, right? Does that make sense?
So the point of this is that like a naive application
of cut-and-choose is incompatible with this idea of
using wire labels. We have to be very careful about
what we mean by cut-and-choose in this setting.
So the thing that saves the day is this idea of blind
cut-and-choose, which Sonny knows all about. So the
idea is it's basically like one cut-and-choose to
rule them all. Instead of a cut-and-choose at every
time step, we do one cut-and-choose at the beginning.
So I want to call the different columns here threads
of computation, and each thread is going to be either
a check thread or an eval thread. And the receiver
is going to secretly determine whether it's check or
eval, and the guy who's generating the garbled circuits
won't know whether any
thread is a check thread or an eval thread.
So in time step T we generate garbled circuits. So
we generate garbled circuits in each time step, and
within each thread we reuse the wire labels. So I'm
only drawing, like, one arrow, but, you know, these
funky wire labels are hooked up from all over the
place, but only within a single thread. Okay?
Within the check threads, the receiver checks that
the garbled circuits are good and also checks that
the connections are good, right? He's checking
everything that the -- list off everything that the
sender could do wrong. The receiver checks that in
these threads. So somehow it's arranged so that in
a check thread, the receiver will get enough
information to check all of these things, including
the connections between them. He'll abort if any of
those turn out to be wrong. Right?
So the property of cut-and-choose says that if -- let's
say half of the threads are check
threads -- if all of them are correct in their
garbling and connections, then the majority of the
other threads will be correct in all aspects as well.
So using oblivious transfer, you can set it up so
that the sender doesn't know which is a check
thread and which is an eval thread. We use OT
so that if this was a check thread, the receiver
will pick up the checking information. If it was an
eval thread, the receiver only picks up enough to
evaluate on one input, which is the sender's input.
Right?
So there was this -- yeah, we sort of
compartmentalized this sharing of wire labels. So
wire labels are only shared between, you know, check
circuit to check circuit or eval circuit to eval
circuit this way. You don't have things crossing
over, which would sort of mess things up.
Probably need to move on a little faster from "oh, my
goodness, what's going on?"
So once you identify this as the cut-and-choose
technique you use, that's pretty much most of the
work. You have to solve all the other problems that
are inherent in cut-and-choose, but those are kind of
standard. If you want security two to the minus S,
you need about 3 times S threads for the statistical
argument to work.
But using new techniques that Yehuda Lindell
presented at Crypto, you only need S threads. So he
showed that in the setting of circuits. We show how
to do it in the RAM setting as well. It's this trick
where at the end, if someone was cheating, you
recover their input and then you just compute the
function all by yourself. We show how to do that at
the end with some small bootstrapping
computation whose size is independent of the RAM
running time. So that kind of amortizes this trick
away very, very nicely.
Okay. What else? So there's this problem that I
need to remember the wire labels of the previous
circuits, like we saw, and I can't -- this is a new
PDF viewer that rewinds differently than I'm
expecting.
I can't garble these circuits until I've garbled
these circuits because I don't know how I'm going to
hook these up yet. So I can't preprocess all these
garbled circuits. It really has to go sequentially.
Right? I have to know what these -- I have to know
where I'm going to hook up this other arrow that I
didn't draw. I have to know that before I start
garbling that circuit.
>>: So you mean you can't garble [inaudible] until
you've actually run the previous?
>> Mike Rosulek: Yeah. So I need -- oh, boy. I
need to get the outputs of this guy, which say what
address I'm reading from, before I garble this,
because I need to know where this other wire comes
from. So we can't preprocess the garbled circuits.
Why would we want to preprocess the garbled circuits?
Because two papers at Crypto last week showed that
when you can batch preprocess a lot of garbled
circuits, you get a good savings.
So imagine we're in the circuit world again. We want
to evaluate the same circuit lots and lots of times,
right? So we're going to do a cut-and-choose. Lots
of circuits are going to be generated and thrown
away, but we know that we're going to evaluate the
same circuit over and over again. Maybe there's a
savings to be had. So their idea is, okay, you're
going to evaluate a circuit
N times, so generate a ton of garbled circuits for that
circuit, check some of them all at once, and then
every time you want to evaluate the function, just
pick a random subset of the remaining ones and just
evaluate those. We'll call that a bucket. Right?
So I want to evaluate F the first time. I'll pick
these five circuits. That's my bucket. The next
time I want to run F, I have this pool of unused
circuits. I'll pick a random subset of them.
Evaluate those. There's a bucket of five.
And the cool thing is this extra step of taking the
unused circuits and randomly assigning them to
buckets, you know, the balls and bins problem is a
little bit different and it allows these buckets to
be significantly smaller, right?
So as before, if it's just one cut-and-choose, I
really can't do much better than having, like, S
circuits. But here I can do, like, S over log N
circuits. So in practical terms, when N is like in
the thousands, then the size of these buckets is like
five or seven or something for security of 2 to the
minus 40, instead of size-40 buckets. Right?
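Here is a toy version of that balls-and-bins calculation (my own simplified model with hypothetical parameters, not the precise analysis from those Crypto papers): with t bad circuits surviving into a pool of size p, the chance that one random size-b bucket gets a bad majority is hypergeometric, and a union bound covers all the buckets.

```python
from math import comb

def bad_majority(pool, bad, b):
    """P[a random size-b bucket from a pool containing `bad` corrupted
    circuits ends up with a majority of corrupted circuits]."""
    return sum(comb(bad, k) * comb(pool - bad, b - k)
               for k in range(b // 2 + 1, min(b, bad) + 1)) / comb(pool, b)

def cheat_bound(n_execs, b, check_prob=0.5):
    """Toy bound on cheating: each corrupted circuit must separately
    escape a check (prob 1 - check_prob); union-bound over all buckets."""
    pool = n_execs * b
    best = 0.0
    for t in range(1, pool + 1):
        escape = (1 - check_prob) ** t
        if escape < best:
            break  # corrupting even more circuits can't help anymore
        best = max(best, escape * min(1.0, n_execs * bad_majority(pool, t, b)))
    return best
```

In this toy model, each additional corrupted circuit halves the escape probability while the union bound only grows linearly in the number of buckets, which is roughly where the S over log N bucket size comes from.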
So this is why we'd really love to be able to
preprocess garbled circuits. It seems to be
incompatible with our idea of reusing wire labels. I
just want to talk about that a little bit.
The other reason it makes sense to preprocess garbled
circuits is like we are evaluating the same circuit
over and over again. It's the same RAM circuit over
and over again. We already have a preprocessing
phase for the ORAM, so it's not crazy for me to
propose that we do this huge cut-and-choose at the
beginning of time as part of a preprocessing phase.
I'm already doing preprocessing to set up the data
structures in the ORAM. It would be really great if
I could also make these buckets really small and only
deal with a couple garbled circuits at each time
step.
But again, this seems incompatible with our way of
hooking up wire labels because it's -- the connection
of the circuits is determined at runtime. If only we
had a way to connect wires on the fly. If only we
had a way to generate a garbled circuit and then
later hook up the wires, that would be ideal. And
fortunately, we do. It's called the LEGO approach
invented, I guess, by Nielsen and Orlandi in a LEGO
paper. They had the approach of generating a bunch
of individual garbled gates, do a cut-and-choose on
all the gates, and then sort of assemble them
together in a circuit by connecting the gates
together after the fact.
So we take that idea and extend it to garbling a
bunch of circuits and then kind of connecting
circuits up on the fly. So we call this the LEGO RAM
approach. I tried to find a picture of, like, an
actual RAM, you know, like the animal made out of
LEGOS, but this is all I could find. It's the
battering RAM, though. This is the LEGO RAM. I was
thinking, I may have had that set as a kid. I think
it was a different one that was -- I had a lot of the
castle LEGOS, in fact.
Okay. So how does the LEGO approach work? Let me
describe this idea of generating circuits and then
later hooking them up together. So the main
ingredient is this thing called an xor-homomorphic
commitment. So it starts out being a regular
commitment scheme. So I commit to A, I commit to B,
I commit to C. The stuff in the dotted boxes is a
commitment that's sitting there waiting to be opened.
So I can open A. It gets revealed to you, and I can
also open the xor of two of these values so that you
only learn the xor of the two values. You don't
learn the individual ones. Okay.
So these commitments are homomorphic in that way.
How does that help us here? So this is our trick of
connecting wire labels after the fact. So here is a
single wire in a garbled circuit. It has two wire
labels, A0 and A1, and they can either encode true
and false this way or the opposite way. And I'm
going to associate a bit, zero or one, depending on
how they encode true and false, okay? So imagine I
have this for each wire in the garbled circuit. I
have these two wire labels and this parity bit.
Okay?
So imagine when I generate a garbled circuit, here is
one wire in this garbled circuit and I have committed
to these three values in an xor-homomorphic
commitment. And then later on I generate another
garbled circuit and I commit to those values, and
they were totally independent of each other. They
have nothing to do with each other, and now later I
want to connect these wires up. This is the process
that they call soldering. So, like, they were
dealing with gates, really like you were building an
integrated circuit or something, because I'm sure
modern circuit fabrication uses soldering irons.
That's my understanding.
So I want to connect these two wires. The first
thing I do is open the xor of the parity bits
and that tells me whether the true/false mapping is
different between these two guys. So, yes, indeed,
for these two wires the true/false mapping was
opposite. So I need to -- I want to map A0 to B1. I
don't know whether A0 is true or false, and B1 is
true or false, but I know that they have the same
logical value because the parity bits were xor 1. So
I'm going to open the xor of A0 and B1. I'm going to
open xor of A1 and B0.
And now later if you're the one evaluating the
circuit, let's suppose you got to this point and you
got A0, which might be true or false. You don't
know. All you have to do is xor it with this value
and you'll get B1, which is the corresponding wire
label for down here. Right? So these are like the
glue that allows you to transfer wire labels from one
wire to another. And they don't leak anything else,
so giving this to you doesn't leak anything about the
other wire label. You still have this property that
you can't guess the other wire label.
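Here is a self-contained sketch of that soldering arithmetic (my own toy model with hypothetical names; in the real protocol, the xor-homomorphic commitments are what force the generator to publish these values honestly). A wire is two random labels plus a secret parity bit saying which label means true:

```python
import secrets

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def make_wire(nbytes=16):
    # label[i] encodes the logical value (i XOR parity)
    return (secrets.token_bytes(nbytes), secrets.token_bytes(nbytes),
            secrets.randbits(1))

def solder(wire_a, wire_b):
    """Generator's side: publish d = pa XOR pb and the XORs of the
    label pairs that carry the same logical value."""
    a0, a1, pa = wire_a
    b0, b1, pb = wire_b
    d = pa ^ pb
    if d == 0:
        return d, xor(a0, b0), xor(a1, b1)
    return d, xor(a0, b1), xor(a1, b0)

def translate(label, idx, solder_info):
    """Evaluator's side: holding wire A's label with public index idx,
    XOR with the published value to get wire B's label (index idx XOR d),
    without ever learning the logical value it encodes."""
    d, s0, s1 = solder_info
    return idx ^ d, xor(label, s0 if idx == 0 else s1)
```

XORing with the published value takes either of A's labels to the B label with the same logical meaning, and since the unused solder value stays masked by a label the evaluator never sees, nothing else leaks.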
This is the idea behind -- this is the LEGO approach
for soldering wires together. Okay. So let's just
apply this to our setting of RAM-based 2PC. So in
the preprocessing phase we generate lots of garbled
circuits for the CPU circuit, and for each -- for the
input wires and output wires generate these
homomorphic commitments, right? We'll do a
cut-and-choose on this, you know, huge number of
garbled circuits, and the cut-and-choose checks
the commitments also, right? So if you tell me to
open the circuit, I'll open it. I'll also open the
commitments and you can check that the commitments
correspond to the secrets inside the garbled circuit.
Okay? So with high probability, most of the rest of
them are good, and they also correspond to their
commitments, which is important.
So now here is some time step. I want to evaluate
the CPU on some time step. I pick a random bucket of
CPU circuits from the pool and I solder their input
wires together, right? So if there's a wire, like
the first input bit of here gets soldered to the
input bit of here and here and here. The first
output bit of all of them get soldered together. We
do this for all the input and output bits.
And now we kind of have -- so this bucket has five
circuits, so I kind of have five
paths from here to here, right? And with high
probability, a majority of these circuits are good.
So I just take those five paths and take the
majority, and that majority vote
will reflect, like, a correct garbled circuit. Okay?
So soldering this bucket together -- it
becomes, like, this mega-bucket that kind of acts
like a single garbled circuit after you take the
majority at the end. And then just solder the
incoming state data from the previous step, solder
the incoming memory input from the appropriate step
in the past. The soldering can be done on the fly.
So when it's time to execute this, I'll know how I
need to connect it up to the past. Yeah.
>>: A question as to why you have to sort of lump
every bunch together. Why can't you just end up with
separate threads and then solder -- or I mean, solder
together the gates in each thread but then put them
all together --
>> Mike Rosulek: Yeah. So in this bucket, like, any
of these could be bad. Imagine like one out of every
five circuits is bad. That's fine. So let me take a
bunch of buckets. One out of every five is bad, but
it's not like that means in every column there's
probably one bad one.
>>: Right.
>> Mike Rosulek: So I don't want to evaluate things
that way, right? I want to isolate the bad one in
here. I know that the majority of these are good, so
I'll take the majority of these guys. And I don't
know where the bad ones might live in another bucket.
Does that make sense?
So that was kind of an issue that comes up in the
other kind of cut-and-choose as well. Right? You
need, like, the majority of an entire thread to be
good. You need to consider an entire execution
sequence to be good to include it in the majority.
Okay. So overall we get this protocol. In the
offline phase we generate circuits. We do this huge
cut-and-choose. And in the online phase we solder
things together using these homomorphic commitments.
The offline phase is expensive. I'm not going to
lie, but the online phase is just related to the size
of these buckets, which is determined by this
combinatorial argument, right? And so the bucket
sizes are, like, S over log T, where S is the security
parameter and T is the number of times I'm doing
this, the running time of the RAM.
I actually wrote a Perl script to figure out the
exact probabilities. So some of these previous
papers had bounds. I thought I'm going to figure out
the exact bounds. And it takes my desktop computer
days and days to figure this out. But the numbers
aren't so bad. Like even for 5,000, like a RAM
program that talks to memory 5,000 times is very
reasonable. The bucket size is seven. That seems
pretty reasonable to me. It takes a while to go from
seven to five, and I can't get it to -- I don't know
where it crosses over from five to three. Probably
with many more digits in the number T.
Concretely these numbers are pretty good, so if
you're wondering how much more expensive is malicious
security compared to semi-honest security, like if
the answer is seven or five, that's not too bad. I
think that's pretty good.
There's some other things I'll probably have to skip
because I'm out of time. There are some things I
really lied about and omitted. Yeah. Mostly things
handled with standard techniques. Like, by
introducing cut-and-choose, you introduce a lot of
new things that can go wrong, okay? Those can really
be handled by standard techniques.
I still am glossing over how do you initialize the
ORAM. That's the elephant in the room for all this
notion of basing 2PC from RAMs to begin with. I
didn't talk about how to get inputs into the RAM. I
didn't talk about ORAM constructions need randomness
to randomize the location of things in memory. I
didn't talk about those sort of things. And I won't
because I'm out of time.
So this is the slide that I showed before. This is
just kind of a summary of our results. All right.
So if you compare the ratio of the cost of malicious
versus cost of semi-honest, we kind of match the
ratios that we can get in the circuit world and just,
you know, the big idea is just reuse wire labels as a
representation of the secret data. It's a very good
way to -- a very good representation in a malicious
secure protocol because the garbled circuit already
has, like, protection against malicious behavior.
So that's my last slide. Thanks for your time, and
thanks for coming to see my talk.
[applause]
>>: Any questions?
>>: So what are some programs that are thought to be
better than RAM [inaudible]? I think you had
[inaudible] binary search.
>> Mike Rosulek: Yeah. So the paper that we built
on that does RAM computation in the semi-honest
model, they do just a simple binary search. So big
data -- let's say a sorted list of ten million items.
didn't mention, and now I'll mention it, is that you
can set up the memory once and then run a RAM program
many times. So many invocations of the CPU is one
execution of the RAM program. You could do that many
times, right? So you can query the memory for this,
query -- do a binary search on this, do a binary
search on that. So that was one thing that the
previous work did was binary search.
>>: So is it generally programs that have really
large inputs?
>> Mike Rosulek: So I think that's where RAMs are
the biggest win is when -- RAMs can give sublinear
time algorithms, and if you don't have a secure
computation protocol that even allows that, then it's
not going to be much fun. So, like, binary search is
the canonical example, but anything like that that's
kind of a sublinear-time access to a large data set,
I think that kind of makes sense.
One other thing that people seem to mention a lot is
the stable marriage, stable matching algorithm
because it's very -- it's very iterative, like you
start with a matching and then you, like, kick people
out and then, you know, things people keep proposing
to other people and stuff keeps, you know -- the
matching keeps changing in memory. It's a very
random access kind of computation. I haven't thought
hard about what it would look like as a circuit. I'm
also not sure I want to think that hard about it, but
that's another example. It's not a sublinear-time
algorithm, but certainly it's better suited to a RAM
model than a circuit. Cool.
>>: Thank you again.
>> Mike Rosulek: Thanks, everyone.
[applause]