>> Kristin Lauter: Okay. So today I'm very pleased to have Vinod Vaikuntanathan visiting to speak to us
on side channels and clouds. So now I've gotten over my big hurdle for the day, pronouncing his last
name. So we're very pleased to have him visiting. He's currently a post-doctoral fellow at IBM, and he did
his Ph.D. at MIT, with Shafi Goldwasser, in 2009, where he won the Best Dissertation Award from the
computer science department.
Thank you, Vinod.
>> Vinod Vaikuntanathan: Thanks Kristin.
Before I start the talk, I have to tell you that Kristin told me before the talk that she wants to ask me a lot of
questions. I really want to encourage you guys to do the same. Let's make this interactive.
So the topic of this talk is side channels and clouds, new challenges in cryptography.
The development of cryptography has always gone hand in hand with the emergence of new computing
technology. Whenever there's a big change in the way we do computation or think about computation, it
has always led to problems and challenges for cryptography, and ultimately we have faced these
challenges. We have managed to come up with new techniques in cryptography.
In fact, if you look back at the history of cryptography, we can see a number of examples of this, such as the development of automated ways of doing cryptography, of encrypting messages -- for example, the Enigma machine of the 1930s -- the invention of public-key cryptography in the 1970s, and so on and so forth.
Today, in the 21st century, we stand at the brink of a new big change in the way we do computing. A lot of
computing these days is being done on small and mobile devices, which is computing out there in the world
in an environment that we do not necessarily trust.
Whereas we used to think of data being stored and programs being executed in a personal computer,
sitting right in our office under our control, these days we do a lot of our computing on devices such as
laptops, cell phones, RFID devices and so on and so forth.
The problem with this change in the way we do computing is that these devices store sensitive data and
they're just out there computing where the adversary, the attacker, has an unprecedented level of access to
these devices and control over these devices. He has a lot of -- the devices are becoming closer and
closer to the adversary, and that's becoming, that's making it much easier for the adversary to carry out his
attacks.
The second big change we see in computing is that the amount of data we need to store and the amount of computation we need to perform on this data are becoming so huge that end users are not willing to do all of it themselves.
Either because they simply do not have the resources to do so, or for various other reasons. Instead, what
they do is they outsource storage and computing to big third-party vendors such as Microsoft, Google,
Amazon and so forth.
This is called cloud computing, and again it's a big concern in terms of cryptography, because we're putting our data and programs out there, away from our trusted personal computers.
This new computing reality poses two new challenges for cryptography, which will be the focus of my talk.
First of all, in cryptography, we traditionally think of cryptographic devices as black boxes. In other words,
what this means is that the assumption is that the only way an attacker can access the cryptographic system is by providing the input and receiving the output of the computation. He has no clue what's really happening inside the device.
The real world is quite different. In fact it turns out there are various mechanisms by which computing
devices leak information to the outside world, not just through input, output access. In particular, an
attacker can avail of a large family of attacks called side channel attacks, which reveal extra information
about the storage, about the data stored inside the device and the computation being performed.
Such attacks exploit the fact that physical characteristics of the computation -- such as the amount of time a particular program runs for, the power consumed by running a piece of software, or, even more noninvasively, the electromagnetic radiation emitted by the device during the computation -- leak more information about the computation than plain input/output access does.
And this extra information leakage is actually being successfully utilized to break many existing
cryptographic systems.
So one thing I want to say before I go further is that these attacks are not completely new. They have
existed for well over a couple of decades. What makes them particularly acute and interesting in this
context is the physical proximity that the attacker has to these devices. This is the result of the new sort of
computing reality that we're facing.
Now the point I want to make is that these are not just theoretical attacks; they're actual attacks on implementations of standardized cryptosystems such as RSA and the Advanced Encryption Standard, and these have been attacked very successfully using side channel attacks. So this is a real concern for us.
So the first challenge, challenge number one that I will address is can you protect against this kind of
information leakage? Can you protect against side channel attacks?
Okay. So that's number one. The second challenge appears in the context of cloud computing, the context
of security in cloud computing. The general setting of cloud computing is one in which there's a client and
a server. The client has some input data and she wants to do some computation on the input data.
And the point is that she may not have enough resources to do the computation or might otherwise be
unwilling to do the computation herself so what she does she sends the data across to the server which
does the computation on her behalf.
This is a very general setting, and a specific, familiar example of this setting is searching the Internet -- using Bing, for example, to search the Internet. But this is also interesting in the modern context, in a more general scenario where the server performs an arbitrary computation on the input data.
So, first of all, the client may not be willing to disclose her entire data to the server. Why would she? Right?
So the client demands privacy of her data. And one way to achieve privacy of data is for her to encrypt the
data and send it over to the server. Don't send the data in the clear. Encrypt it.
But now we have a little bit of a problem, because what the server gets is encrypted data. It doesn't even
see what's inside the ciphertext and yet it's supposed to perform some -- it's supposed to execute the
program. It's supposed to perform some meaningful computation on the underlying data.
So the question is: can we reconcile these opposing goals of privacy and functionality? In more concrete terms, our challenge number two is: can we compute on encrypted data?
So these are the two challenges I'm going to address in this talk. And the talk will go in three parts. In the first part I will show a mechanism to protect against side channel leakage. This is a hugely active area in cryptography at this point, with many, many models of how side channels behave and many constructions of cryptographic primitives that are secure against side channels. In this talk I'll tell you about a simple model for side channels and a construction of a particular cryptographic primitive that's secure in this model. This is joint work with Akavia and Goldwasser. In the second part of the talk I'll describe a mechanism to compute on encrypted data. This is done using a very powerful cryptographic primitive called a fully homomorphic encryption scheme. This work is joint with van Dijk et al. And in the third part I'll talk about my research philosophy and open questions and future directions.
This is the structure of the talk. And probably this is a good point to stop, see if everyone is on the same
page. Okay. Wonderful. So let's jump right in. Part one. How to protect against side channels.
The traditional approach to handling side channel attacks has been to address side channels as they come along. In other words, for each particular kind of attack, design a tailor-made solution that handles that attack. So that has been the traditional way of looking at this problem.
This could be done in hardware, namely manufacturing specific types of hardware that protect against specific attacks, or in software, by modifying programs so they resist particular kinds of attack. One little drawback with both these solutions is that they are usually ad hoc, and they don't come with a proof or a guarantee that they actually do the job.
Solutions have also been explored in a higher level, algorithmic setting, where people have considered leakage of specific bits of the secret. So let's leak the first bit and the 50th bit and the 100th bit, et cetera. This is a specific kind of leakage, and people have explored it in a number of works.
So this general approach of solving side channels has two problems. One is the scalability of this kind of
solution. So every time you want to protect against a new side channel attack you have to incur the cost of
building a new type of counter measure.
In other words, the cost of your solution grows with the number of side channels that you want to protect against. Yes?
>>: Just curious about your distinguishing algorithmic and software attacks. There's a lot of overlap. I'm not sure I really distinguish them that well.
>> Vinod Vaikuntanathan: Sure. The line is a little bit thin in that case. But what I mean by software is that I take a specific program. So one approach that people have explored to counteract timing attacks is to make sure that the program runs for the same amount of time on every input.
>>: So it's not --
>> Vinod Vaikuntanathan: That's not algorithmic. So the line is a little bit thin. But the first problem with this line of attack is its scalability. The cost grows with the number of side channels that we need to protect against.
The second, more subtle and probably more serious problem, is composability. To roughly explain the problem: if I build a solution that protects against one side channel and another solution that protects against a second side channel, who is to say the effects of these two countermeasures won't cancel each other out?
The question is do the different protection mechanisms work well with each other. That may not be the
case. In fact, people have done a lot of work in studying the effect of these countermeasures, the effect
that they have on each other.
So there are a couple of problems with this traditional, attack-specific approach to counteracting side channels. What we will look at today is a modern approach. The modern approach says: let's step back and try to come up with a model that captures a wide class of side channels, and design a solution in this model.
And if your model captures a large class of side channels the solution will be secure against all these
attacks as well. This approach was advocated and pioneered by Micali and Reyzin.
So, in other words, instead of looking at each of these side channel attacks one by one, what we want to do is define a model of the adversary that interacts with the system, and the model should capture all these attacks in one go.
And once you come up with this model, you want to construct cryptographic primitives and prove they're
secure in this model. Yes?
>>: The military had an approach to side channel attacks: keep the crypto in a very secure location and don't let anybody in. And except for timing attacks, it seems to be a pretty good idea.
>> Vinod Vaikuntanathan: So it is. But that still doesn't prevent people from measuring the electromagnetic radiation that comes out of my cell phone when I'm walking around, for example.
>>: That's a cheap cell phone. If it were serious -- [indiscernible].
>>: But then you don't get affordability, which is his whole point.
>> Vinod Vaikuntanathan: Yes.
>>: If carried --
>>: For $5 you get affordability, too.
>>: So I put my phone on [indiscernible] I try to punch the buttons [indiscernible] anyways, keep going.
>> Vinod Vaikuntanathan: But good point. Okay. Good. So this is the general approach that we want to take. And as I said before, this is a very recent and very active area of research, with a number of models and matching results.
Aside from the results that I mentioned before, many of the researchers have come up with various
different kinds of models and constructions of cryptographic primitives. What I'm going to talk about today
is one specific result, the work with Akavia and Goldwasser where we proposed a separate model and a
construction of a particular cryptographic primitive that's secure in this model. That's what we're going to
talk about today.
So the model we propose is called the bounded leakage model. The starting observation behind this model is that if you allow the side channel to leak an unbounded amount of information, there's really nothing you can do, because the information could be the entire secret contained in the device itself.
So one condition that seems to be necessary is to put a limit on the amount of information that the side channel leaks. And that's our first restriction. We say that the amount of leakage is less than the length of the secret key. In fact, it's a parameterizable quantity: the larger the amount of leakage you can tolerate, the more security you get, and so forth. So this is the first requirement that we impose on side channels.
The second observation is that if you allow the side channel unlimited computing power, then it can break
the crypto system itself. Right? There's no way you can protect against such a side channel. And that
leads to our second restriction. We say that the computation done by the side channel -- which is, say, an antenna for electromagnetic radiation, or an oscilloscope -- should be polynomially bounded. That is the second restriction.
>>: Is it really enough to have the leakage be less than the length of the key?
>> Vinod Vaikuntanathan: No, a lot less than the length. In fact, it's a parameterizable quantity.
>>: Good.
>> Vinod Vaikuntanathan: I should say that it's possible to -- this is a very basic model. I'm presenting this
because of simplicity. In fact, it's possible to relax both these conditions, in particular the information
limitation quite a bit.
And I'll be happy to talk about this one-on-one. Okay. Good. So this is the model. This is sort of the
abstract model that we'll work with today. Yes?
>>: So the first restriction, is it a restriction on the adversary or on the --
>> Vinod Vaikuntanathan: It's a restriction on the -- if you think about side channels, there are two entities. One is me, the attacker; I'm using a particular hardware device to get some measurement on the computation. So there are two entities here: one is the attacker, the other is the device. The restriction is on
the device and the amount of information that the device gives me and the amount of computation that it
can do. Does that answer your question?
>>: So you're saying that this will only work if the device actually leaks less information than the length of the key?
>> Vinod Vaikuntanathan: Absolutely. In fact, you can see that if the amount of information the device leaks is more than the length of the secret key, it could potentially just leak the entire secret key itself.
So good. So this is the model that we'll work with today. For the rest of this part of the talk I'm going to use public-key encryption as my example primitive. Let's refresh our minds about what public-key encryption is. In this setting there are two people, Bob and Alice. Bob wants to send a message to Alice privately over a public channel.
The way he does it is by letting Alice choose two keys, a secret key and a public key. The public key she publishes; the secret key Alice keeps to herself. And Bob has a mechanism to use the public key to encrypt the message and send it over to Alice on the public channel.
Alice can use her secret key to decrypt. It's a very simple setting. The attacker that we will look at in this setting is an eavesdropping attacker. That's the most basic attack that you can perform on a public-key encryption scheme. The attacker gets access to the public key; it's out there in the open, so he can look at it.
He looks at the ciphertext, because it's going through a public channel, and he tries to gain some information about the underlying message. The classical security requirement is that the attacker gains no information about the underlying message.
>>: Your keys there, already looking, it's the direction --
>>: Public key?
>>: Because it's got the MIT marks on it. Your office keys.
>> Vinod Vaikuntanathan: Right. Aside from that. No. [laughter] I'm surprised you can see it from here.
>>: I can see it as well. I don't know what building it is. But security.
>>: No worries.
>>: Just joshing. Keep going.
>> Vinod Vaikuntanathan: We want to talk about the bounded leakage model. And the change that we make to the security game is very simple. In addition to letting the adversary get the public key and the ciphertext, we also let him ask the decryption box for a leakage function F. This is the leakage function that's computed by either the oscilloscope or the antenna; it's one function that models both of these leakages. And as a result he gets back the function F applied to the secret key. So this is a very simple extension to the classical security model.
>>: So F is polynomial-time bounded? Oh, I guess --
>> Vinod Vaikuntanathan: Right, following our discussion before, we need to have two restrictions on the leakage function F. One is that it has bounded output length. This is sort of a sliding scale: the longer the output is allowed to be, the more leakage you tolerate, and the stronger the security guarantee you get.
And the second restriction is that the function is polynomial time computable. Okay. So this is sort of the
concrete model for public encryption that we'll talk about.
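A minimal sketch of this security game in Python, just to pin down where the leakage query fits into the classical eavesdropping game. The scheme and adversary interfaces here are placeholders for illustration, not anything specified in the talk.

    import random

    def leakage_game(keygen, encrypt, adversary, leakage_bound):
        """One run of the bounded-leakage eavesdropping game; returns True if the adversary wins."""
        pk, sk = keygen()
        # The adversary picks a leakage function f; the model requires f to be
        # poly-time computable with |f(sk)| <= leakage_bound < |sk| bits.
        f = adversary.choose_leakage(pk)
        leaked = f(sk)
        assert len(leaked) <= leakage_bound
        m0, m1 = adversary.choose_messages(pk, leaked)
        b = random.randrange(2)
        c = encrypt(pk, [m0, m1][b])
        # Security: no efficient adversary should win noticeably more than half the time.
        return adversary.guess(pk, leaked, c) == b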
Now we have a model. The question is, are there encryption schemes that are secure in this model, that
protect against this class of attacks? So let's first look at well known encryption schemes. Encryption
schemes that we know and love.
So unfortunately it turns out that the RSA encryption scheme is insecure under bounded leakage. There are a number of results that show that if you leak a small constant fraction of the secret key, you can break the entire cryptosystem.
So RSA is out of the question. As for the El Gamal encryption scheme, there's no explicit attack that breaks the system with leakage, but there's no real guarantee or proof that it's secure either. So unfortunately these two systems are not good for us.
What we show is that a newer class of encryption schemes, based on mathematical objects called lattices, are in fact secure against bounded leakage. The mathematical basis for these systems is not factoring or discrete log, but a different kind of object called lattices.
So we show that an encryption scheme proposed by Regev is secure against leaking any constant fraction of the bits of the key -- 99%, or 99.99%. And subsequently, in work with Dodis, Goldwasser, Kalai and Peikert, we showed that a different encryption scheme based on lattices also has this property. The
interesting thing I want to point out is that these encryption schemes were not explicitly designed with
leakage in mind. They were designed for a completely different purpose.
It just so happens that leakage resilience is built into them. Okay. So that's what we show. And following
our work, a number of other constructions were shown to be leakage resilient, some based on more standard assumptions such as the Diffie-Hellman assumption, and also a class of constructions based on what are called hash proof systems, which were also shown to be secure against bounded leakage.
>>: Who are the authors on this, DHH?
>> Vinod Vaikuntanathan: This is Dan Boneh from Stanford, Shai Halevi, Mike Hamburg and Rafail Ostrovsky.
So there are all these constructions. In the rest of this part of the talk I'm not going to focus on any of these
specific constructions. But instead we'll step back and ask the question: What is it that makes these
schemes leakage resilient? Is there some kind of underlying structure or principle behind these schemes,
which makes them secure against these attacks?
In fact, I will show two principles, two ideas, such that if an encryption scheme satisfies these two
principles, two conditions, then it's automatically secure against bounded leakage. In fact, all these
encryption schemes I showed you in the previous slide actually satisfy these restrictions.
So you can think of this as sort of a way to explain why these schemes are leakage resilient. The first idea,
the first property that I require from an encryption scheme is that given a public key, there are many
possible secret keys. In fact, I will require that there are exponentially many possible secret keys for any
given public key.
If you're used to thinking about RSA or El Gamal, this is just a foreign concept, because in RSA the public key is a product of two primes and the secret key is the unique factorization. So although you can't compute the secret key efficiently given the public key, it's uniquely determined.
>>: What you use as the secret key in RSA is the decryption exponent, which comes from the encryption exponent.
>> Vinod Vaikuntanathan: Yes, you can add a multiple -- I'm going to get to it in a minute. So this is the
first property we require.
And, of course, the encryption algorithm, when it encrypts a message, it has no clue which secret key the
decryption box contains. It could be one of these many exponentially many different possibilities.
Therefore, correctness of the encryption scheme requires that any of these secret keys decrypt the ciphertext to the same message -- in fact, to the message that the ciphertext originally encrypts.
In other words, all these secret keys are functionally equivalent. They all decrypt the ciphertext to the exact
same message. And yet even though they're functionally equivalent, we will show that partial information
about any single secret key is totally useless to the attacker.
So let's actually think about this idea, sort of step back and think about this idea a little bit, to answer Jeff's
question. You can take any encryption scheme, any encryption scheme that you like, and you can make it
have this property. You can take the secret key and pad it with a bunch of random nonsense.
So the number of secret keys grows, but we haven't really done anything. Right? So this idea by itself is
not particularly useful. What makes it useful is the second idea. The second property that I require from the encryption scheme is that there are two distinct ways to generate a ciphertext. One is the way that Bob uses to encrypt his message: take a message, use the public key, and compute a ciphertext. This is what Bob does in the real world.
The second way to generate a ciphertext is what I will call a fake encryption algorithm. The fake encryption algorithm is fake: no one ever executes it in the real world. It's just an artifact of the proof, a mental experiment that exists because we want to prove this scheme secure. The fake encryption algorithm is a randomized algorithm that takes nothing, no input, and just produces a random ciphertext.
And this fake ciphertext behaves very differently from a real ciphertext, in the sense that if you take a fake ciphertext and decrypt it with each of the exponentially many possible keys, you'll get a different message every time. If I take a random one of these secret keys and decrypt the fake ciphertext, I'll get a random message. So functionally, the fake ciphertext behaves very differently from a real ciphertext.
>>: Are you going to show sufficiency of these? Because presumably one could ask about the necessity of these -- do you have an answer?
>> Vinod Vaikuntanathan: No, I don't. In fact, in a couple of slides before I mentioned that El Gamal is not
known to be secure in this setting. Turns out El Gamal does not satisfy either of these properties.
But we have no clue if it's secure or not. So I'm not sure. I'm not sure that these ideas are necessary. But
this is one way to achieve security.
>>: And that is the lattice case, because they're based on short vectors and you have a lot of equivalent short vectors? Is that where you're getting this from?
>> Vinod Vaikuntanathan: So just a brief overview of these lattice solutions. In the lattice-based encryption schemes the public key is a point in n-dimensional space, and the secret key is any lattice vector that is close to this point. So depending on how you define close, there could be an exponential number of points that are close. That's what contributes the many secret keys. And of course it still doesn't explain why the scheme satisfies the other property, but just to give you an idea.
So the fake ciphertext behaves very differently from the real ciphertext. And yet we require that if I hold a fake ciphertext in one hand and a real ciphertext in the other, I can't tell which is which. So a fake ciphertext is indistinguishable from a real ciphertext.
>>: Secret key.
>> Vinod Vaikuntanathan: Even if you know secret key. Even if you know a single secret key. So if you
know --
>>: The real one will decrypt to the same thing with different keys, but the fake one won't.
>> Vinod Vaikuntanathan: You can't tell. So if you know a single key -- but so this is the second property
we need. And now I'm going to show that if you have these two properties, that's that. You get leakage
resilience.
So the first step in seeing why this works is to look at the fake world and the fake ciphertext, right? And the fake ciphertext, remember, was generated without any knowledge of the public key or the message -- it contains no information about any message whatsoever.
In fact, the fake ciphertext decrypts to a random message under the choice of a random secret key.
This is what happens when the adversary is given just the fake ciphertext. What if it's also given some
amount of leakage from the secret key?
So let's look at that. Let's say he gets one bit of leakage from the secret key. Originally, from his view, the number of possible secret keys was exponentially large -- this huge set of secret keys. If I get one bit of information about the secret key, that narrows down the set by a factor of two: the number of possible secret keys is reduced by a factor of two. Two bits? It becomes a quarter. And if I get only a bounded amount of information about the secret key, there are still quite a few possibilities left for the secret key. And of course a random one of these secret keys decrypts the fake ciphertext to a random message. Therefore the attacker still has no clue what the message is.
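Written out, this counting is just the standard entropy bound; with $n$ the (min-)entropy of the secret key and $\lambda$ the number of leaked bits:

$$ \#\{\text{keys consistent with the leakage}\} \;\approx\; \frac{2^{n}}{2^{\lambda}} \;=\; 2^{\,n-\lambda}, \qquad \widetilde{H}_{\infty}\!\left(sk \mid f(sk)\right) \;\ge\; H_{\infty}(sk) - \lambda . $$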
Okay. So far everything happened in the fake world and we don't even care about what the fake world is,
right?
But the key thing to observe is that ciphertext, that happens -- that's produced in the fake world, is
indistinguishable from a ciphertext in the real world. So the adversary is not going to know the difference if
it gets a fake ciphertext or a real ciphertext. And therefore it has no information about the message either.
It didn't have any information about the message in the fake world. Why would it have any information in
the real world?
That's that. That's the end of the proof. All right. And that actually finishes what I wanted to -- I just told
you about a model, a simple model for side channels and a way to construct a public encryption scheme
that is secure in this model. So that finishes my section on side channels. Is there any --
>>: Supposing you -- this is the thing -- supposing you have corresponding plaintext and ciphertext, would that help them at all?
>> Vinod Vaikuntanathan: Potentially. So the attack you're mentioning is a stronger attack -- I guess a known-ciphertext attack. Potentially it could. But I think -- again, don't hold me to this -- I think I can show that this scheme is secure against a known-ciphertext attack, with ciphertext-message pairs. That's simply because it's a public-key encryption scheme: I can construct message-ciphertext pairs by myself. But a chosen-ciphertext attack is a different issue; this may not actually be secure. We can take this off line. Yes?
>>: How do I know if my system only leaks a very small amount of information? That's an engineer's
problem?
[laughter]
>> Vinod Vaikuntanathan: Exactly.
>>: If I put an antenna at your smart card, you're leaking continuous data, even over a bounded channel -- how do I know there's no bit of the key in there?
>> Vinod Vaikuntanathan: How do you know that the leakage doesn't let you recover the unique secret key? I'm going to borrow John's answer and say it's an engineer's problem.
>>: But your motivation for this was that all these countermeasures that engineers put in are heuristic and don't come with any formal proof, and you were going to avoid that problem.
>> Vinod Vaikuntanathan: One way to look at this solution is if you can guarantee to me that the amount of
information leaked by this particular attack is bounded, I'm going to tell you -- I'm going to show you a
provable way to protect against this attack.
So really -- you should think of the solution as a combination of the hardware countermeasure and an
algorithmic countermeasure. You give me the guarantee that the amount of information is bounded.
Here's the solution.
>>: Is it not just a reasonable imagine, it's just the channel.
>> Vinod Vaikuntanathan: Couple of different answers to your question. Maybe we can -- okay. So this is
what I wanted to say about side channels.
Let's move on. The second part is how to compute on encrypted data. So again let's remind ourselves of our motivating example. There's a client and a server. The client sends encrypted data to the server, and the server has to perform a computation on the encrypted data and send back the encrypted result.
This makes sense in various specific contexts, for example encrypted Internet search, and it turns out that for specific instantiations of this problem there are already tailor-made solutions that solve these cases.
For example, searchable encryption, a cryptographic mechanism that solves encrypted Internet search.
What we're concerned about today is performing general computation on encrypted data. A cryptographic
tool that is really helpful in this general context is called fully homomorphic encryption. That's what I'll talk
about for the rest of the talk.
>>: So it's the data sent by the user, not the data sent back to the user, correct?
>> Vinod Vaikuntanathan: Well, the question is what it means to -- it could be that we want to protect the program as well; it could be that the client doesn't know what program the server's running, and the server wants to hide this fact from her. But that's a different concern that we're not going to focus on.
>>: In your case you care about the encrypted results, that's what the --
>> Vinod Vaikuntanathan: Yes.
>>: [indiscernible].
>> Vinod Vaikuntanathan: Well, another way to look at -- another way to answer your question is that the
computation on the encrypted data is performed without the knowledge of the secret key. It's a public
computation. Anyone can do this.
So the information that the encrypted result leaks cannot be any more than what the original ciphertext
leaks. Because I'm performing some public computation on the encrypted data. Right? So okay. So this
is the problem that we're concerned with.
So we're going to model a program, an arbitrary program, as a boolean circuit. If you go down to the most basic level, you can write any computation as a combination of exclusive-or gates and AND gates. So I'm going to write the program as a big boolean circuit. Exclusive-or is addition in the field of size two, and AND is multiplication.
That's the equivalence I'm going to use. So what is a fully homomorphic encryption scheme? A fully homomorphic scheme is a scheme where you can compute the XOR and the AND of two encrypted bits -- add and multiply encrypted data.
Because you can write any computation as a combination of XOR and AND gates, if you can do both of these you can do everything. Right? So that's fully homomorphic encryption.
What do we know about fully homomorphic encryption schemes? This concept was first defined, under the name privacy homomorphism, by Rivest, Adleman and Dertouzos back in 1978, when they didn't really have a candidate scheme that satisfied the definition. It was quite an act of foresight to define this without having a candidate scheme. The motivation was actually the modern motivation for constructing these schemes, namely searching on encrypted data.
What we know so far are limited variants of homomorphic encryption. For example, the RSA scheme is multiplicatively homomorphic: if I give you an encryption of one message and an encryption of a different message, you can compute an encryption of the product of the two messages. That's something you can do. The El Gamal and Goldwasser-Micali encryption schemes are additively homomorphic: you can add encrypted plaintexts but you can't multiply them.
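A quick sanity check of the multiplicative homomorphism of textbook RSA mentioned here, with a toy modulus (purely illustrative, obviously not secure parameters):

    # Textbook RSA: Enc(m) = m^e mod n, so Enc(m1) * Enc(m2) = Enc(m1 * m2 mod n).
    p, q, e = 61, 53, 17
    n = p * q
    m1, m2 = 42, 55
    c1, c2 = pow(m1, e, n), pow(m2, e, n)
    assert (c1 * c2) % n == pow((m1 * m2) % n, e, n)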
There are also mechanisms that compute very limited classes of functions on encrypted bits, namely
quadratic formulas. The real question is: can you do arbitrary computation on encrypted data? And that had been open for nearly 30 years, until last year Craig Gentry constructed the first instantiation of a fully homomorphic encryption scheme. This construction was based on very sophisticated new mathematics involving ideals in polynomial rings.
So a fully homomorphic encryption scheme is a really very powerful object. And this lets you do a lot of very nice useful
things. The question that Gentry's construction raises is: Is there a simple way, simple and elementary
way of achieving fully homomorphic encryption? I don't want to have to learn algebraic number theory to
understand --
>>: It would help to learn algebraic geometry.
>> Vinod Vaikuntanathan: Algebraic X. [laughter].
>> Vinod Vaikuntanathan: So the question is: is there an elementary construction of this object? In other words, the question we want to ask is: what simple mathematical objects permit both addition and multiplication?
So what we show is a construction of a fully homomorphic encryption scheme based on the integers -- namely, the basic operations in the scheme are just plain addition and multiplication of integers. No ideals, no polynomial rings, just simple operations.
>>: Bounded in some way?
>> Vinod Vaikuntanathan: Integers mod something.
>>: How big is the thing? The penalty for the Gentry function -- that frightens me. I'm more afraid of the cost of the function than of having to learn new math.
>> Vinod Vaikuntanathan: The answer is kind of unfortunate. If you're frightened by the cost of Gentry's
function, you'll be really frightened by -- but the point is that -- the point I'm trying to make is that you can
improve on a construction only if you understand it.
You have sort of a good idea how this encryption scheme behaves. And therefore what we should do is
construct simple elementary encryption schemes. So we showed two constructions, one based on integers
and the security of the scheme is based on the approximate GCD problem. The second construction uses
addition and multiplication of matrices, and this scheme is not fully homomorphic. It can only evaluate a
limited set of functions, namely quadratic formulas. The security of this is based on standard well-studied
lattice problems.
So what I'm going to talk about is the integer-based construction. Let's jump into the scheme. First of all, we've been talking about public-key encryption all over the place, but what I'm going to describe to you is a secret-key homomorphic encryption scheme. Namely, to encrypt a message I need the secret key, and I decrypt the ciphertext using the same secret key. That's the setting. In fact, this setting is already useful in the context of cloud computing, because the sender and receiver are the same entity.
So how does the encryption scheme work? Well, the secret key is an N-bit odd number P, where N is pretty large. That's the secret key. How do I encrypt a bit B? (I'm going to talk about encrypting a single bit, for simplicity.) I pick a large multiple of P, let's say Q times P, where Q is a very large number. I pick a small even number, 2 times R, and I add all these quantities together with the message. So the ciphertext is Q times P, a large number, plus the small noise term, 2 times R plus B. That's the ciphertext.
Okay. How do I decrypt this ciphertext? The decryption algorithm has the secret key. If I take the ciphertext modulo P, what I get is the noise term, 2 times R plus B. In fact, the key thing to note is that I get the exact noise term: because the noise is much smaller than P, I get the number 2 times R plus B itself, not just some residue of it. That's an important thing to note. Now, once I have this noise term, I can just read off its least significant bit, and that's the decryption algorithm.
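A minimal Python sketch of the secret-key scheme as just described; the parameter sizes below are toy values for illustration only, nowhere near what the security analysis would require.

    import random

    def keygen(n=64):
        # Secret key: a large odd n-bit number p (toy size).
        return random.getrandbits(n) | (1 << (n - 1)) | 1

    def encrypt(p, b, noise_bits=8, mult_bits=256):
        q = random.getrandbits(mult_bits)     # large random multiple q*p
        r = random.getrandbits(noise_bits)    # small noise
        return q * p + 2 * r + b              # ciphertext c = q*p + 2r + b

    def decrypt(p, c):
        # c mod p recovers the exact noise term 2r + b (it is much smaller
        # than p), and its least significant bit is the message.
        return (c % p) % 2

    p = keygen()
    assert decrypt(p, encrypt(p, 0)) == 0 and decrypt(p, encrypt(p, 1)) == 1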
So this is the entire encryption scheme. Any questions? Questions? Comments? Good. Wonderful. So
this is the encryption scheme. What I didn't tell you is how to add and multiply encrypted bits, so that's the next thing we'll go over.
At a high level addition and multiplication works because if I take a ciphertext and another ciphertext and
add and multiply them together, the noise term is still going to be small. Assuming that the original noise is
really small, the noise terms are not going to grow by too much.
>>: Would the number P not leak if you encrypt successive messages with the same key?
>> Vinod Vaikuntanathan: No. So that's part of the assumption -- the approximate GCD assumption. If I give you many near multiples of P, a large multiple plus noise, you can't recover P. That's the assumption.
So good. So at a high level, addition and multiplication work because the noise terms do not grow by too much when you add and multiply numbers.
So let's look at it more concretely. Let's take two ciphertexts, one encrypting a bit B1 and the other encrypting B2. Addition of the underlying bits is done by simply adding these two numbers as integers. When you add these two numbers, I get a multiple of P, plus a noise term which is slightly bigger than the original noise (because I added two of them together), plus the sum of the two bits. So now if I take this sum modulo P and read off the least significant bit, I get the exclusive-or of the two bits. So that's that. How do you multiply? It essentially proceeds the same way, except that the algebra gets a little bit more complicated.
So when you multiply two ciphertexts, I again get a multiple of P plus a small noise term. It's much bigger than before, because if you start with sort of an N-bit noise --
>>: What's an N bit noise?
>> Vinod Vaikuntanathan: So I started with noise of N to the half bits in each ciphertext, and what I get is noise of two times N to the half bits. That's the resulting noise. And if I take this mod P and read off the least significant bit, I again get the product of the two bits mod two. So the homomorphic operations, homomorphic addition and multiplication, are simply adding and multiplying the underlying integer ciphertexts.
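Continuing the toy sketch from above (same keygen/encrypt/decrypt), homomorphic XOR and AND really are just integer addition and multiplication of the ciphertexts:

    p = keygen()
    for b1 in (0, 1):
        for b2 in (0, 1):
            c1, c2 = encrypt(p, b1), encrypt(p, b2)
            # c1 + c2 = (q1+q2)*p + 2*(r1+r2) + (b1+b2): the noise roughly doubles,
            # and the least significant bit of (c mod p) is b1 XOR b2.
            assert decrypt(p, c1 + c2) == b1 ^ b2
            # c1 * c2 = (multiple of p) + (2*r1+b1)*(2*r2+b2): the noise bit-length
            # roughly doubles, and the parity of (c mod p) is b1 AND b2.
            assert decrypt(p, c1 * c2) == b1 & b2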
Josh, question?
>>: You get N bit noise.
>>: N bits times two. Two N bits. So we're missing something --
>> Vinod Vaikuntanathan: N bits.
>>: Square root. Square root of.
>>: Oh, square root.
>> Vinod Vaikuntanathan: N bits.
>>: Never mind.
>>: Thank you.
>> Vinod Vaikuntanathan: Good. So --
>>: I made that mistake.
>>: What N do you need, if you want security?
>> Vinod Vaikuntanathan: That's a good question. It depends on how secure we believe this approximate GCD problem is. I don't have a good enough answer for you -- actually, I don't have any answer for you. We have to study this problem and see how difficult it is, some of which we do in the paper, but we don't have a concrete recommendation for what the security parameter should be.
>>: So the approximate GCD problem was just introduced a year or two ago?
>> Vinod Vaikuntanathan: Good. Good. So this problem has been known in various forms and shapes before. If you look at it a little more carefully, you can see this is a variant of the simultaneous Diophantine approximation problem: I give you a bunch of numbers and you find me a single number that simultaneously approximates all of them. So this was a problem that was fairly well studied. In fact, Jeff Lagarias has work on it going as far back as 1982, and people have studied this problem for a while.
The name approximate GCD was introduced by Howgrave-Graham like five years ago.
>>: I suppose you're going to get to this quickly. But you go through multiplications --
>> Vinod Vaikuntanathan: Good. So you see that the least significant bit is a product of the -- oh, great.
Good. So clearly this scheme has two problems. One is that if I multiply two ciphertexts, the size of the
resulting ciphertext grows.
And essentially it grows -- essentially when you compute a big program on the ciphertext, the resulting
ciphertext is as big as the size of the program itself.
So if you think in terms of the cloud computing example, the receiver gets this huge ciphertext, just reading
the whole thing takes it as much time as computing the program. So this is a huge problem.
I'm not going to talk in detail about how we solve this problem, but the rough idea is that we give the server another near multiple of P, and every time you do a homomorphic operation you reduce the result modulo this near multiple. I'm not giving away any extraneous information, because instead of giving you a hundred near multiples of P, I'm giving you 101 near multiples; the problem doesn't change that much, but it does decrease the size of the ciphertext. The more serious problem is that the underlying noise in the ciphertext grows whenever you add or multiply numbers.
This is a slightly more serious problem to handle. And in fact the first thing that we observe is that we can
do a limited number of additions and multiplications already with this basic scheme.
And then there's a way to bootstrap these limited additions and multiplications into computation of an arbitrary circuit. This is a very beautiful technique that Craig Gentry introduced in his original construction.
So putting these two things together essentially solves our problem. Good. So that's all I wanted to say, right? Any questions?
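Here is a rough sketch of the size-reduction idea mentioned a moment ago, building on the earlier toy code. To keep the arithmetic transparent it uses an exact multiple x0 = q0*p rather than the near multiple described in the talk: reducing a ciphertext modulo x0 only subtracts a multiple of p, so the decrypted bit is unchanged while the ciphertext shrinks back down.

    p = keygen()
    x0 = random.getrandbits(300) * p      # an extra multiple of p given to the server

    c1, c2 = encrypt(p, 1), encrypt(p, 1)
    big = c1 * c2                         # product ciphertext, roughly twice as long
    small = big % x0                      # reduce after each homomorphic step
    assert decrypt(p, small) == decrypt(p, big) == (1 & 1)
    assert small < x0 < big               # the ciphertext really did shrink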
>>: How important is that? I think the bound on R in your model is square root of N bits. So how important is that -- can you increase it, say, or how flexible is it?
>> Vinod Vaikuntanathan: So you can make it a polynomial, let's say N to the 0.99, and you will still get -- the larger you make that noise, the faster the noise grows when you do homomorphic operations, and the smaller the number of homomorphic operations you can do with this basic scheme.
So the amount of noise dictates the number of homomorphic operations that you can do. On the other hand, if you have a larger amount of noise, it's conceivable that the underlying problem is harder. So there's a trade-off here: the smaller you make the noise, the easier the underlying problem becomes, and the more homomorphic operations you can do.
So square root of N is sort of an arbitrary choice which lets me do log N multiplications. That's sort of the threshold I wanted.
>>: Log N, because --
>> Vinod Vaikuntanathan: Order of log N. Order of log N.
So that's that, the rest of the talk, two slides, I want to tell you a little bit about my research philosophy and
to tell you about a couple of open questions that this work leaves behind.
So I think cryptography is an exciting field to work in, because it takes practical problems, problems that arise in practice, and solves them using very exciting mathematical techniques. For me, this combination is what makes cryptography a very exciting thing to do.
What I told you about today is two applications, two exciting applications, that we solved using cryptographic techniques. The first is security against side channels, and the second is computing on encrypted data. Both fields are new; they're just in their infancy, and there's a lot of work to be done. A big question that arises in the context of security against side channel attacks is the following. I showed you a particular construction of a public-key encryption scheme that's secure against side channels.
The big question is: can you take any program, which is not necessarily secure against side channels, and immunize it against them -- a general way of transforming any program into a leakage-resilient program? And for this to be practical, I want to do it without killing the efficiency of the program.
This is I think a very big question in this field.
Number two: again, the big question in the context of computing on encrypted data is how efficient you can make the fully homomorphic encryption scheme. The scheme I presented to you is really slow. It's polynomial time, but it's a big polynomial. We did a back-of-the-envelope computation of the exponent, the exact polynomial, and I think the number that we came up with is N to the 8. In other words, if executing a program takes K steps, executing it under the encryption takes N to the 8 times K steps, where N is the security parameter.
>>: But the size of the [indiscernible].
>> Vinod Vaikuntanathan: Sure. So the really big question here is: can you take this algorithm and make some improvements in the implementation? That might already bring down the complexity quite a lot. And can you design completely different schemes that are dramatically more efficient? That's the big question.
One thing I didn't get to tell you about is my work on the mathematical foundations of cryptography. A lot of these works use, either explicitly or implicitly, the encryption schemes that I developed based on lattices. Lattices are geometric objects that look like periodic grids in space, but they embed very hard computational problems. This is a relatively new mathematical field that we have based cryptography on, and it turns out that the schemes we develop in this context are useful in protecting against side channels, and the insights are also very useful in constructing the homomorphic encryption scheme. So this is another big set of problems that I'm interested in. And that's that.
[applause]
>>: Your integer lattice scheme has a little bit of the flavor of the [indiscernible] schemes a few years back.
Is it just --
>> Vinod Vaikuntanathan: So there's no formal relation that I can prove yet. One sort of superficial
difference is that all lattice-based schemes use N dimensional lattices and here we're just working with
integers. So that's a very superficial difference. But the real answer to your question is that a lot of the
insights are the same in constructing both these schemes, but I don't know of a formal connection.
>>: Okay.
>>: I'll ask at this point, this is probably something we can do off line. I'd love to get a better sense for how
the smoothing works to keep the [indiscernible] down. I imagine that's more than a slide or two.
>> Vinod Vaikuntanathan: Yes. Yes. In fact, handling those two problems that I mentioned takes a full 15 minutes of the talk.
>>: So we'll do that later.
>>: [indiscernible].
>>: I was going to ask about your estimate of the N to the eighth, if that comes out of those details.
>> Vinod Vaikuntanathan: So it comes out of the details. The security parameter that we work with is the size of the noise; let's call that K, which is square root of N. The number P that I need to pick is of size K squared. And it turns out that this problem is actually easy if the multiple Q that I choose is too small, so I need to pick that to be roughly the square of the size of P -- that's K to the fourth. And then when I do the bootstrapping process, that roughly squares this number, so that's K to the 8. I'm not sure if this explains anything, but the complexity grows very fast when you put all these steps together.
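Restating that chain of parameter sizes as described in the answer above (the exact exponents depend on how the parameters are balanced against known attacks):

$$ |\text{noise}| \approx k, \qquad |p| \approx k^{2}, \qquad |q| \approx k^{4}, \qquad \text{size after bootstrapping (and per-gate cost)} \approx k^{8}. $$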
>>: Still, the size of the -- it sounds more like the size grows, but you're actually saying the number of steps grows.
>> Vinod Vaikuntanathan: That's probably -- the number of steps grows too; that's probably not apparent from what I described. But once you do this bootstrapping process, the time required to process every gate is essentially K to the power 8, the size of these numbers. I can explain this off line.
>>: Another thing, following up on one of Brian's comments in the first half of your talk, so does it actually,
instead of the shortest vector problem, does it reduce to the closest vector problem.
>> Vinod Vaikuntanathan: It's not the closest vector problem; it's something that looks like the closest vector problem. The closest vector problem says: I give you a point and you're supposed to find the really closest lattice point. The problem that I'm looking at here is: I give you a point in space and you're supposed to find some close point.
>>: Which is LQ does.
>> Vinod Vaikuntanathan: Yes. Some close point within a certain radius. There are exponentially many
close points. And you're supposed to find one of them.
>>: Kristin Lauter: Other questions? Okay. Thank you.
[applause]