>> Kristin Lauter: Okay. So today we're very pleased to have Cedric Fournet
visiting us from the MSR Cambridge Lab, that's Cambridge in England, and also
jointly from -- he's also at the MSR INRIA joint center in [inaudible] outside of
Paris.
Cedric was actually a student at École Polytechnique and INRIA, a student of
Jean-Jacques Lévy, and after getting his Ph.D. he went to MSR Cambridge as
almost one of the first members there. He's been there
for ten years. So today we're very pleased to have him visiting and speaking to
us about a cryptographic compiler for information-flow security. Thank you.
>> Cedric Fournet: Many thanks. Okay. So my talk is about one of the projects
we are doing at this new joint center in [inaudible], and this is joint work with
[inaudible] and [inaudible] based there. So that's a good picture of the place.
Otherwise I'm in Cambridge.
So the overall goal of our project there is to understand whether we can write
[inaudible] code with some security around it, with a reasonable amount of
effort from the programmer. So we are looking at programming tools, like
compilers, to try to get there.
So the difficulty is that in most cases the high-level security goals are
informal, so they are hard to express, and it's unclear what they mean for
programs. And the lower-level mechanisms can get quite complicated, especially
with cryptography, and they tend to be spread across the system, in some
libraries or in the network stack. So it's quite difficult to get a precise
mapping between what you would like to get in terms of security and the
mechanisms that are deployed.
As a remark here, we have to point out that programming frameworks and
programming languages don't help much here, because all those languages
and their implementations and tools were designed before security became
relevant. In particular, there is implicit trust built into programming: for
the writer of a program, and for the tools of a program, it is very easy
to assume that everything is going to work as in the language spec; but in a
distributed setting, with security and real adversaries, in particular as
a remote host gets compromised, parts of the program start to behave in
unexpected ways.
So the goal here is specifically to have programs where we are going to
assume that we want to comply with the program, or with the protocol, but we
don't know that everyone does; in practice, maybe parts of the program are
going to be hosted on machines run by principals that are going to break into
the host and run some other code or some other program.
So an example of that is web applications, where we have quite a few parties
running the programs: you can have the [inaudible] machine with some
system, some [inaudible], some JavaScript, and then part of that code is also
running on the server on the other side.
And for the functionality to happen, you want all of them to work together, but
for security it's more complicated. So you don't want the server's security to
depend on [inaudible] actually running that particular JavaScript, because it's
impossible to enforce remotely. And conversely, for the security between the
clients and the services, and especially when the client is talking with
different services, you would want to bound the interactions between them.
So another example, something that can be done today with the hardware we
have, is TPMs, where you can actually take advantage of the new hardware to run
part of your program in complete isolation, with in principle very good
[inaudible]: so you can have a program, and there is some code and some input,
and you can move that to the TPM, run it, get some signature, and get the
result.
And you have guarantees about that execution that are very strong and do not
depend at all on the operating system. And so in principle you can run part of
your program with very high confidence. In practice, getting that to work is
really hard, and maybe that's because there are not many [inaudible]; in fact
no one is doing it at that granularity today, it's used for [inaudible]
protecting the disk. But it's very hard to get it to work at this granularity,
that is, to have a bit of code that runs partly in the application and partly
in the TPM.
So what you would like to get is programs that are secure by construction: we
have some form of specification, say for [inaudible], for privacy, and for
what kind of distributed platform we want to use. And then we have some
transformers, compilers, verifiers, whatever, that generate local code that can
run and execute the program. And we would like to do that in such a way that
the result is secure, at least in some sense of security. And in particular, if
we generate that automatically, we would like also to verify it automatically
as we go, because otherwise we don't have any confidence in this machinery.
And as you can guess, cryptography is an essential part of this puzzle. So for
the talk today, I'm focusing on one particular compiler that takes as input
simple programs without cryptography, but with high-level security policies
expressed in terms of information flow control, and produces a more concrete
implementation that is going to use actual cryptography to protect accesses to
a shared memory. So starting from those programs we get distributed protocols
with cryptography, and the target is that all the properties that we can derive
from the information flow analysis of the source also hold in the
implementation.
Now, for cryptography, and that's the technical novelty of this work: in
previous work we were relying mostly on symbolic models, with a fairly abstract
model of cryptography. In this work we are relying on computational models,
which I believe to be the more standard model for cryptographers. And so we'll
have to find a way to go down from this high-level notion of information flow
to lower-level assumptions like CCA for encryption, say.
And we are not going to compare those approaches in detail, but if you look at
what has been done, there are really two styles for cryptography. One of them
is using symbolic cryptography, and that's really used to build automated tools
and to scale up to very large programs. But it's quite abstract in terms of
cryptography: there, cryptographic terms live in some abstract algebra, and so
they're easy to manipulate, but they're hiding a lot of information. And so in
particular, if you look at information flows, things like cryptographic
[inaudible] can be quite [inaudible] to somebody [inaudible], and so for us
it's important to go down to the usual assumptions.
The computational approach is more precise, but it's also more complicated
from a language viewpoint, because you have to have a probabilistic model, you
have to deal with complexity, and there are also many fewer tools. Maybe one
or two years ago there was no automated tool to do proofs in that setting, and
scaling to larger settings, as you can imagine, is quite interesting but much
harder than doing proofs in the [inaudible] model.
I didn't mention that, but please stop me and ask questions at any point if you
want.
>>: [Inaudible].
>> Cedric Fournet: Yes.
>>: You mentioned the first approach as successfully applied to crypto
protocols?
>> Cedric Fournet: Yes.
>>: Wasn't the [inaudible] success limited? I just have the sense that there
have been so many attempts to apply the formal approach to protocols that
haven't gone very deep.
>> Cedric Fournet: So here we are very -- we -- okay. So that's something
I'm going to present today, but I will claim that we have actually gotten quite
far. It's quite recent, within the previous four years maybe. But now we can
verify -- well, we have verified programs that implement [inaudible] and
serious protocols. So next week at CCS we are presenting some analysis where we
actually do both for an implementation of [inaudible], so we have an
implementation of [inaudible] that is almost completely verified using this.
And with computational [inaudible] we verify a small bit, but only the core of
it. For all the details it's only symbolic, but it's still -- it's [inaudible].
>>: Is there more recent stuff? Because I remember [inaudible] was proving this
approach years ago, many years ago, and years later a bug was found.
>> Cedric Fournet: Well, sure. As you've heard, you find bugs even in the
models. Okay. It's not a complete victory there, but I would claim that you can
actually verify those protocols symbolically without too much trouble these
days. Maybe one day that will be absolutely [inaudible], but technically this
is still much harder than the symbolic case.
>>: You say the computational approach [inaudible].
>> Cedric Fournet: Yes.
>>: [Inaudible] you're referring to is modeling the kind of games that are
[inaudible] to each type of protocol. Is that what you mean?
>> Cedric Fournet: Yes. That would be either based on games or based on --
well, on the [inaudible] model, where you express security as a [inaudible] of
things going wrong, which may be small but nonzero, and where you actually deal
with -- so you see, concrete assumptions about what the cryptography is doing,
but at least the domain of the cryptography is a finite [inaudible], as opposed
to [inaudible]. So, yeah, I would make that separation by whether the keys are
a finite number of bits or some fresh [inaudible]. Sure, here there are many
nuances, many different ways. For me, for example, the marker would still be
there, which would be [inaudible].
Okay. So I'm going to briefly review and present information-flow security in
our setting, that is, in the context we've actually verified. Then I'm going to
present how we can protect the abstraction, then describe how we [inaudible]
using computational cryptography, using our type system, and [inaudible]
prototype compiler.
So who knows about information-flow security? Yes. Okay. So I will just go
quickly through the review that I have here. Information-flow security is a
nice abstract way of specifying security in your system. You take variables and
you assign a security level to every variable, and then you allow or disallow
flows of information between those levels: whether the value of a secret
variable may flow to public variables, that's for secrecy; and for integrity,
whether tainted variables may influence trusted variables.
So the same technique tracks information flow, and you can read the levels
dually for secrecy and for integrity. Levels are organized in a lattice, so we
have security labels, and I'm always going to use this kind of orientation,
with secrecy on that side and integrity on that side, and so the central rule
is that the flow of information all goes up: you can go from public to secret
and you can go from trusted to tainted. But the converse flows are forbidden.
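As a minimal illustrative sketch (not the actual formalization from the talk;
all names are invented), the lattice of labels and the "flows only go up" rule
can be written out like this:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Label:
    """A security label: a pair of a secrecy level and an integrity level.

    Secrecy: 0 = public, larger = more secret.
    Integrity: 0 = trusted, larger = more tainted.
    """
    conf: int
    integ: int

def can_flow(src: Label, dst: Label) -> bool:
    """Information may only flow 'up' the lattice in both components:
    public -> secret is fine, trusted -> tainted is fine; the converse
    flows (secret -> public, tainted -> trusted) are forbidden."""
    return src.conf <= dst.conf and src.integ <= dst.integ

PUBLIC_TRUSTED = Label(0, 0)
SECRET_TAINTED = Label(1, 1)

assert can_flow(PUBLIC_TRUSTED, SECRET_TAINTED)      # going up: allowed
assert not can_flow(SECRET_TAINTED, PUBLIC_TRUSTED)  # leaking down: forbidden
```

The two components are independent: a label can be secret-but-tainted or
public-but-trusted, and a flow must be non-decreasing in both at once.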
And this model is quite flexible. For example, if you have a protocol with N
parties, you can give each party a point in the lattice that represents what
that party is supposed to read or write at a given point of the protocol. So
here I have two running examples. Here I'm assuming that the e-mail is more
secret than the book report. And then you don't want the collection of the book
report to take the e-mail address by mistake and send it on the network, so if
the e-mail is more secret than the book report, then you will get an error when
you try to compile that program. So typically it will be flagged as we try to
compile that program. You have similar things for integrity. Here I'm assuming
that group is a low-integrity variable and that login is a high-integrity one,
because it's used for controlling who has access to the machine and when. And
here we are not directly copying the group into the login, but we are setting
the login depending on what the group is, and so that's an implicit flow;
that's also forbidden, and so that will also be flagged as an error in that
system.
So that's the kind of security we get. And there is quite a lot of work that
has been going on in this area, with programming languages and systems: there
are reference implementations, for example there are also implementations for
JavaScript web applications. And on the formal side there are also fully
formalized Java subsets based on this security model.
So there are systems that use this. But some things are problematic with that
approach. One of them is that the policies are very demanding: by forbidding
any flow of information, they rule out useful programs. And so in some cases we
need to relax the model, to have loopholes and to [inaudible] control them, and
that's not easy. But also, in many of these systems cryptography is used but is
not modeled, so people assume that their use of cryptography is good enough to
preserve the abstraction, but that's actually not verified, and that's what we
are focusing on today.
Okay. So now the source language and the target language. My source language
for this talk is just a very small language, not F# as in the prototype, but a
small imperative language: I have shared memory, I can read an expression and
write it in assignments to this memory, and I can do branching and loops. So
it's a small but sufficient imperative language.
And I'm going to use commands and contexts in that language to represent
everything; I'm going to use them to represent [inaudible] and so on and so
forth. In a sense we are following the so-called [inaudible] approach, as we
are [inaudible] using this language, so that we are going to reason not about
[inaudible] but about commands, part of which we can interpret as part of the
protocol and other parts as the [inaudible], so that everything is nice and
formal in this small language.
So in this model, the only way of interacting -- the only way to interact -- is
the shared memory. So for example, if we talk about distributed applications,
it means that we assume that there is some transport that moves data from one
host to another, but we are not looking at that in particular. In particular,
once we have shared memory, we can use untrusted libraries for communications,
like the network stack.
Okay. So we want to represent systems where part of the program is correct and
another part of the program has been corrupted and is controlled by the
adversary, and so we have a notion of hosts. And all those hosts are sharing
the memory. So here I have a sequence of commands, and I'm using superscripts
to represent the hosts.
So here I have three hosts, A, C, and B. So A runs first, then C, then B, and
so on. And once we have such a program, we can specify a compromise level for
the system by saying which hosts are compromised; by setting the level alpha we
decide what the adversary can read and write. And since the adversary has full
control of a compromised host -- it can read and write everything there -- we
can just ignore that host's code and replace it by adversary code without
changing the model.
And so I'm going to use alpha for the adversary level: everything that is below
alpha can be read, and everything that is above alpha in terms of integrity can
be written. And so I erase the code that runs at hosts less trusted than alpha,
and I get a command with my code plus placeholders -- holes -- for the
adversary, and I want to understand the security of that code for any code that
the adversary can plug instead of a hole inside that command, and see what
security guarantees are maintained.
So the adversary can run any code there, but that code can only read and write
according to the policy fixed by alpha. So you can vary alpha, and the higher
alpha is, the stronger the adversary. Okay. So we have a definition of secrecy
that says that a command is secure at alpha if, when I start with two memories
that differ only in their secret values and run that command with any adversary
code, I reach final configurations where the public part of the memory is the
same.
So in a sense I want to show that there is no flow from the secret memory to
the public memory: by varying the high, secret part of the memory I don't get
any difference in the public part. And for integrity we get something very
similar, dually: we start from two different versions of the tainted memory, we
run the code with some arbitrary adversary, and if the trusted part is
unchanged, then the command preserves integrity.
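The secrecy half of this definition can be pictured as a brute-force check on a
toy model (an illustrative sketch only; here a "command" is just a Python
function mutating a memory dictionary, and the names are invented):

```python
# Toy noninterference check: run a command on memories that differ only in
# the secret part, and observe whether the public part can differ.

def leaky(mem):
    mem["pub"] = mem["sec"]   # explicit flow: secret value copied to public

def safe(mem):
    mem["pub"] = 42           # public output independent of the secret

def respects_secrecy(cmd, secret_vals, public_init):
    """True iff varying the secret input never changes the public output."""
    outputs = set()
    for s in secret_vals:
        mem = {"pub": public_init, "sec": s}
        cmd(mem)
        outputs.add(mem["pub"])   # observe only the public part of memory
    return len(outputs) == 1      # same public result for every secret input

assert not respects_secrecy(leaky, [0, 1], 0)  # the leak is observable
assert respects_secrecy(safe, [0, 1], 0)       # no flow from sec to pub
```

The integrity definition is the dual: vary the tainted part of the memory and
check that the trusted part of the final memory is unchanged.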
So to establish that a command preserves secrecy and integrity, one classic way
is to rely on a type system, and so I'm not going to detail the rules, but
essentially there is a small type system that we can use to check your command
and determine in advance whether it is going to be safe when run with the
adversary; and if typechecking succeeds, then the command is secure, meaning
that whatever adversary code we put instead of the holes is going to preserve
confidentiality and is going to preserve integrity.
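To give a flavor of such a type system (a hypothetical miniature, not the
actual rules from the talk; I only model the secrecy half here, with the
integrity half being dual), an assignment is accepted only when every variable
read, joined with the label of any enclosing branch condition, can flow into
the assigned variable. The branch-condition label (often called the "pc"
label) is what catches the implicit flows mentioned earlier:

```python
# Toy security type checker. Levels: 0 = public, 1 = secret.
# Commands: ("assign", target, [read vars])
#           ("if", guard_var, [then cmds], [else cmds])

def typecheck(cmds, levels, pc=0):
    for c in cmds:
        if c[0] == "assign":
            _, tgt, reads = c
            src = max([pc] + [levels[r] for r in reads])
            if src > levels[tgt]:          # downward flow: reject
                return False
        elif c[0] == "if":
            _, guard, then_c, else_c = c
            pc2 = max(pc, levels[guard])   # guard taints both branches
            if not (typecheck(then_c, levels, pc2)
                    and typecheck(else_c, levels, pc2)):
                return False
    return True

levels = {"email": 1, "report": 0, "login": 0}
# Explicit flow: report := email (secret into public) is rejected.
assert not typecheck([("assign", "report", ["email"])], levels)
# Implicit flow: "if email then login := 1" is rejected via the pc label.
assert not typecheck([("if", "email", [("assign", "login", [])], [])], levels)
# Upward flow: email := report (public into secret) is accepted.
assert typecheck([("assign", "email", ["report"])], levels)
```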
So there is one technicality that I can't really hide, which has to do with
run-time errors. In a sense, as soon as we have dynamic checks we cannot hope
to get complete integrity, because if I verify a cryptographic signature, say,
and the verification fails, then in a sense the adversary has changed the
output of the program, because the run failing is not the same as the run
succeeding with the right value in the end. And so complete integrity would
assume that the adversary cannot cause the protocol to fail, which would be too
demanding. Yes?
>>: Can't you fix that by thinking of it differently? So you've got a badly
signed message, you pretend to accept it and you say, here's a garbage message,
and let it [inaudible] through the whole system.
>> Cedric Fournet: So essentially that's what we do, yes. We have a notion of
error and an error value, and we guarantee the integrity of every value at the
end that is not an error value. So that's what we do. But at least we need to
have some conventions to deal with that. But indeed, that's essentially what
you do. So we want to distinguish between runs that fail explicitly, which are
okay, and runs where the result is changed without any error signal.
And from a typing viewpoint we can adapt the type system to do that. So there
are some restrictions, like you should be careful not to assign twice to a
variable, because otherwise you could forget that there was an error at some
point. But that's a detail; essentially it's okay to treat errors as a form of
weak integrity. Then again, by typing, you can get that property for all
programs.
Okay. So so far the situation is quite simple: we don't have cryptography, we
have a nice shared memory, and we can trust the memory to protect the values
that are stored in the variables. So here I have a command with just A and B;
I'm assuming that C has gone bad, so I definitely cannot [inaudible] go there.
But still, if the adversary has no access to X, Y, and Z, then we know the
value that is written to X here is going to be the one that is read there, so X
is equal to 1; hence Z is not going to be copied to Y, so the value of Z
remains completely secret; and in the end Y is going to be [inaudible]. So in a
sense here, although we are running adversary code, that code cannot touch
memory that is protected. So it's too easy. We are relying on this strong
assumption about the memory.
And so of course we are trying to get rid of that assumption, because in
practice, if you assume that A and B are on two different machines, then we
need to pass some protected message to carry the value of X from this machine
to that machine, and also to pass the value of Y back from that machine to this
one. To do that, we can use cryptography. We can, for example, verify
signatures and decrypt when we want to read the memory, and we can encrypt and
then sign whenever we want to pass a value to some remote host. And so this
command may be compiled to that command, where we are assuming some standard
cryptographic primitives for signatures and encryption.
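The pattern being described (encrypt, then authenticate the ciphertext when
writing a shared variable; verify, then decrypt when reading it) can be
sketched with Python's standard library. This is a toy stand-in only: HMAC
plays the role of the signature, a one-time pad plays the role of encryption,
and the function names are invented; the actual compiler targets proper
CCA-secure encryption and CMA-secure signatures.

```python
import hmac, hashlib, secrets

def write_protected(value: bytes, enc_key: bytes, mac_key: bytes):
    """Encrypt then authenticate: produce the pair (xe, xs) of unprotected
    variables that carry the value across untrusted hosts."""
    assert len(value) == len(enc_key)                  # one-time pad: same length
    xe = bytes(v ^ k for v, k in zip(value, enc_key))  # toy encryption
    xs = hmac.new(mac_key, xe, hashlib.sha256).digest()  # MAC over ciphertext
    return xe, xs

def read_protected(xe: bytes, xs: bytes, enc_key: bytes, mac_key: bytes):
    """Verify first, then decrypt; a failed check yields an explicit error
    value (None), matching the weak-integrity treatment of run-time errors."""
    expected = hmac.new(mac_key, xe, hashlib.sha256).digest()
    if not hmac.compare_digest(xs, expected):
        return None                                    # the 'error' value
    return bytes(c ^ k for c, k in zip(xe, enc_key))   # toy decryption

enc_key = secrets.token_bytes(4)
mac_key = secrets.token_bytes(16)
xe, xs = write_protected(b"\x00\x00\x00\x01", enc_key, mac_key)  # X := 1
assert read_protected(xe, xs, enc_key, mac_key) == b"\x00\x00\x00\x01"
assert read_protected(xe, bytes(32), enc_key, mac_key) is None   # tampered
```

Note that tampering is detected, not prevented: overwriting xs makes the read
fail with the error value, which is exactly the residual attack on integrity
discussed above.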
The question then is: in what sense is this implementation correct? And in
particular, what keys am I supposed to use, what are the assumptions about
those keys, can I use those keys with different messages? And in a sense, if
this is done by a compiler, how do I make sure that my compiler is going to
generate code that is good enough to get security similar to having this
magical shared memory between our hosts?
So here, for example, I'm relying on XE and XS, two unprotected variables that
store the encrypted value of X and the signature of the encrypted value of X.
Those values are unprotected. They can be passed on a network. We don't have to
trust the code that's going to communicate them.
And we do the same for Y. So -- yes?
>>: [Inaudible]. Since you have a shared secret anyway for computing X's
encrypted value, why do you need a signature on top of it? It costs a hell of a
lot more.
>> Cedric Fournet: So okay. For now I'm separating the encryptions and the
signatures. In fact, we use more efficient cryptography than that in the end:
we are going to use either authenticated encryption or symmetric-key encryption
plus MACs.
But the problem is that in real implementations you are going to mix things up
between encrypting and signing, and you don't want to sign exactly what you
want to encrypt and vice versa. So I agree that there are much more efficient
ways to do it than this one. This is just an example. But for now we want to --
okay. My goal is to have a setting where I can have a compiler that generates
that, that maybe tries different optimizations at different sites, and still
has the guarantee that it compiles to something exactly secure for that notion
of security.
But, yes, we would like it to be efficient. So in particular we would like to
allow the reuse of keys for encrypting several variables, otherwise we are
going to run out of keys very fast. But yeah, as we start to reuse keys for
different variables at different levels, then things are less obviously
correct.
So for example here, if I'm using the same key for X and Y, then some trivial
attacks show up: I can insert some code that is just going to blindly copy the
encrypted value and the signature from X to Y, and then Y ends up with X's
value at the end of the program instead of 3. And similarly, as I mentioned
earlier, I can still write zero to XS, which is going to cause that code to
detect an error and stop without completing the program. Okay.
So now we get to the cryptographic part, so we need to extend the language to
be able to represent both the cryptographic primitives and the hypotheses we
have about them.
So the way we do this: we take the imperative language and we add a
probabilistic sampling primitive. So f here is a probabilistic function, and
this samples the value of f on those parameters and puts the results in x1 to
xn. I'm allowing several results because that's convenient for generating key
pairs, but that's a detail. And in particular, I'm going to denote by this the
sampling function that returns 0 or 1 with equal probability. And with this
small extension, in fact, we have enough to express all the algorithms and
[inaudible] games of computational cryptography, so we have enough power to do
that.
So I'm not going to show you the probabilistic semantics of this small
language; essentially it's just a Markov chain, so nothing special there. It's
a probabilistic semantics. Okay. So this is not the exact cryptographic
assumption I'm using, but it's simpler to express. To explain what we do, here
is a simple variant of CPA encryption.
So we just take three commands, actual commands in our language, that encode
the algorithms for generating keys, for encrypting, and for decrypting. And we
are going to have two hypotheses. One of them is the usual functional property,
saying that if you decrypt with the key associated with the encryption key then
you get back the plain text, provided the plain text is short enough; so we
have to be careful there, we have to compute with a maximal length of plain
texts.
And then here we express CPA security, against chosen-plaintext attacks, as a
small probabilistic game that you can write in the language. So CPA is the
command that encodes the game: we sample a bit, then we generate keys, and then
we let the adversary run before and after a single encryption; the adversary
selects which of x0 and x1 it wants encrypted. So it's just the usual game,
except that here we let the adversary be arbitrary commands in the language
with access to the variables at the right level. So if we put the variables at
the right levels, then clearly the Boolean that controls the game is high
secrecy and high integrity, while the encryption key is high integrity but low
secrecy. The values that are provided by the adversary are low integrity, low
secrecy, and the value x that is encrypted is low integrity but high secrecy.
And then in the end we get g, which is low integrity, low secrecy, and we
compare whether the guess of the adversary is the same as the bit of the game,
and then we check that the adversary wins with probability close to one half.
Yes?
>>: The encryption key is low secrecy? That's counterintuitive, at least.
>> Cedric Fournet: Well, here it's public-key encryption, so the encryption and
decryption keys are held apart; I should have said that, yes. So the encryption
key here is public. But it has to be high integrity, because otherwise the
adversary could cheat by putting 0 instead of the key and observing what
happens.
>>: You said that the cipher text is low secrecy?
>> Cedric Fournet: So the cipher text is low secrecy, yes. The selected plain
text is high secrecy.
>>: Okay.
>> Cedric Fournet: But low integrity, because you are copying values from the
other side. Just to [inaudible], we can cast a strong notion of secrecy, while
what we are enforcing in that framework for integrity is up to a probability of
failure and [inaudible] information. So in practice we use something that is a
variant of CCA for encryption and CMA for signatures, so nothing special,
except that the [inaudible] are not complete, so we need to have two
[inaudible]; but again, the language has the expressiveness to state that
formally.
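The CPA game described a moment ago can be written out as a runnable sketch. To
be clear about what follows: this is a toy, the function names are invented,
and the "encryption" used in the demonstration is deliberately the identity
function, so the adversary wins every round and the scheme is visibly not
CPA-secure; a secure scheme would keep the win rate negligibly close to one
half.

```python
import secrets

def cpa_game(keygen, encrypt, adversary_choose, adversary_guess):
    """One round of the IND-CPA game. Returns True iff the adversary's
    guess g equals the hidden bit b (i.e. the adversary wins the round)."""
    b = secrets.randbelow(2)          # hidden bit: high secrecy, high integrity
    ek = keygen()                     # encryption key: public, high integrity
    x0, x1 = adversary_choose(ek)     # adversary picks two plaintexts
    c = encrypt(ek, x1 if b else x0)  # a single challenge encryption
    g = adversary_guess(ek, c)        # adversary guesses from the ciphertext
    return g == b

# With identity "encryption" the ciphertext reveals the plaintext, so the
# adversary distinguishes perfectly and wins every one of the 100 rounds.
wins = sum(
    cpa_game(
        keygen=lambda: b"k",
        encrypt=lambda ek, x: x,                  # insecure: leaks the plaintext
        adversary_choose=lambda ek: (b"0", b"1"),
        adversary_guess=lambda ek, c: int(c == b"1"),
    )
    for _ in range(100)
)
assert wins == 100   # advantage 1/2: as far from CPA-secure as possible
```

The point made in the talk is that in their setting the adversary's two phases
are not black boxes but arbitrary commands of the language, restricted only by
the security levels of the variables they may read and write.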
Okay. So we arrive at the target notion of security, at least for secrecy. It's
expressed as a game, a game between an adversary that picks a bunch of commands
and a program P, which is the one that's been generated by our compiler. And
the game goes as follows. The adversary picks one command I for initializing
the low-secrecy memory, then two commands B0 and B1 for initializing the
high-secrecy memory, then some commands that are the adversary code that is
going to run together with the process, and then the command G that makes a
guess at the end.
Then we pick the bit. We run I, then B0 or B1, then we run P with the adversary
code, and finally we run G. And if G can write into g something that's equal to
b, then we want to control that probability so that it doesn't deviate from one
half. So that's the computational counterpart of the notion of secrecy we had
before, and you can check that the abstract notion implies this one, but
[inaudible] is not correct because -- oh, I should say that everything is
polynomial here; all the commands are polynomial.
>>: [Inaudible].
>> Cedric Fournet: So B0 and B1 are the commands used to initialize the secret
part of the memory. So the adversary gets to choose the secret memory, but it
doesn't know which of the two commands has been used to initialize the memory.
So they play the same role as x0 and x1 in the single-shot encryption game.
Okay. So essentially I have defined the source language and the secrecy notion
there, the target language and the computational security notion there, and the
cryptographic hypotheses, so now we are ready for the compiler. And I'm just
going to outline what happens in the compiler rather than give a lot of
details. Essentially we start from a program that has commands plus annotations
saying on which host each bit of code will run. And for each variable and also
for each host we have a security level that gives its relative level of trust
for integrity and for secrecy.
So the first thing we do is split the program into a series of local programs,
so for example here we have one for the server that goes there, and then
control comes back to the client there. And at that point we are still relying
on global integrity for the control flow and for the shared memory.
So the next step is to make the control flow explicit. We add auxiliary
variables that represent where we are in the program; PC is the program-counter
variable. And now we check, before running a piece of code, that we are at the
place where we think we are in the control flow. And if that chunk of code is
called out of schedule, say a second time before the local code that should run
first has run, then the PC variable is different, and so this code is going to
fail and detect the problem. So we are hardening the control flow of the
program by making it explicit in auxiliary variables.
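This hardening step can be pictured as follows (a hypothetical sketch, with
invented names): each local fragment guards on a shared program-counter
variable and advances it, so a fragment that is replayed or called out of
schedule fails loudly instead of running.

```python
# Toy control-flow hardening with an explicit program-counter variable.

def make_fragment(expect_pc, body):
    """Wrap a local code fragment so it only runs at its scheduled point."""
    def run(mem):
        if mem["pc"] != expect_pc:          # out of schedule: detect and fail
            raise RuntimeError("control-flow violation")
        body(mem)
        mem["pc"] += 1                      # hand over to the next fragment
    return run

client1 = make_fragment(0, lambda m: m.update(x=1))
server  = make_fragment(1, lambda m: m.update(y=m["x"] + 2))
client2 = make_fragment(2, lambda m: m.update(z=m["y"]))

mem = {"pc": 0}
for frag in (client1, server, client2):     # the intended schedule succeeds
    frag(mem)
assert mem["z"] == 3

mem2 = {"pc": 0}
client1(mem2)
try:
    client1(mem2)                           # replaying a fragment is detected
    raise AssertionError("replay not detected")
except RuntimeError:
    pass
```

In the actual compiler the pc variable is itself authenticated like any other
high-integrity shared variable, so a compromised host cannot simply rewind it.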
Then the next step is to decompose all of those variables into local replicas
such that we have a single-assignment property for each of those replicas. And
that is quite important, because we want to distinguish between the different
values in the sequence of values that are assigned to a given variable; since
there is no global notion of the current value of a variable, you have to know
that this is the version of the variable that is written by thread two, which
should be enforced, as opposed to that version of the variable, which shouldn't
be.
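Single static assignment can be sketched as a renaming pass (an illustrative
toy, limited to straight-line assignments): each write to x gets a fresh
replica x_1, x_2, ..., and each read refers to the replica current at that
point, so "the value of x written at step 2" becomes a distinct variable that
can be protected and authenticated separately from the others.

```python
# Toy single-static-assignment renaming for straight-line assignments.
# Input: list of (target, [read vars]); output: the same program with each
# variable assigned at most once, so every write has a distinct identity.

def to_ssa(cmds):
    version = {}               # current replica index for each source variable
    out = []
    for tgt, reads in cmds:
        new_reads = [f"{r}_{version.get(r, 0)}" for r in reads]
        version[tgt] = version.get(tgt, 0) + 1
        out.append((f"{tgt}_{version[tgt]}", new_reads))
    return out

prog = [("x", []),       # x := ...
        ("x", ["x"]),    # x := f(x)
        ("y", ["x"])]    # y := g(x)
assert to_ssa(prog) == [("x_1", []), ("x_2", ["x_1"]), ("y_1", ["x_2"])]
```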
So once we have this single static assignment for variables, then at this stage
we can apply the cryptography: we add the signature verifications and
decryptions for reading, and an encryption and a signature for writing.
And here I'm just doing that for one variable, but the compiler tries to batch
all the encryptions and signatures into [inaudible] to make that less
inefficient. So at this stage, we also rely on some PKI for bootstrapping: we
need at least a verification key to get started. But everything else can be
programmed in the language, by essentially establishing session keys at the
beginning of the protocol, by encrypting and signing the further keys that are
going to be used later on in the protocol.
>>: How can you establish session keys in this language?
>> Cedric Fournet: I'm -- well, you can copy to -- okay. Let me.
>>: I don't understand how the key shows up on both sides.
>> Cedric Fournet: Well, so you can -- okay. So assume that you have a way to
implement one variable using cryptographic protocols. Then you can put in that
variable a fresh session key and then rely on the other keys that have been
established already to pass that key.
>>: All you have is a verification key. How can that work?
>> Cedric Fournet: Well, so I have a verification key and I know the
verification keys of the others, so then I can pass to the other party my
public encryption key, and then they can generate a key, a fresh key, and send
it to me encrypted, and then it's shared between us.
You can use a variety of established protocols; essentially, you need something
to bootstrap, like at least you need to know the verification keys of the other
parties. But once you have that, or if you have some keys, then you can
generate further keys. And it's important to generate new keys, because
otherwise the proof will not go through. We actually don't want keys to be
reused between different runs of the program, so we have to be very careful
there.
>>: Why is it bad to reuse the keys? [Inaudible] label the runs?
>> Cedric Fournet: Yes. So you need to label the runs. And one way to -- for
labeling them is to --
>>: Okay.
>> Cedric Fournet: Yeah, you can label them. But, yeah, that's another --
>>: [Inaudible] way to label the runs.
>> Cedric Fournet: Yes, we use tags, too. I'm going to get to what we can do
with the cryptography later. So there are many ways to establish keys for that
session. I'm just claiming here that it's possible to do it, that we have
implemented a simple way to do it, and that it is secure. I'm not claiming
it's efficient.
Okay. So before we care about efficiency -- yes?
>>: Could you go back to the previous slide.
>> Cedric Fournet: Sure.
>>: [Inaudible] about what happens to [inaudible] -- in particular, say you increment
the value of Y to --
>> Cedric Fournet: Yes.
>>: So you need to have the [inaudible] value by the time you get to the other
operation.
>> Cedric Fournet: Yes.
>>: So I'm a little bit -- I'm kind of missing where the logic is. You are doing that
operation on a host that you don't trust.
>> Cedric Fournet: Yes.
>>: Then you encrypt a value for it, and then if you want to increment it, you have
to go back to a host that you trust and then come back with the incremented
value, don't you?
>> Cedric Fournet: No. So if you -- there are two cases. You have to compare
the level of trust of that host to the level of trust of that variable.
>>: That's right.
>> Cedric Fournet: So if the variable is middle, middle integrity or middle secrecy
and you sort of trust the host, then that's okay.
>>: That's okay. But if you [inaudible].
>> Cedric Fournet: So now, if you read the value and you don't trust the host that
has written it last, then you are going to do the decryption and the signature
verification using keys of the previous -- of the last host you trusted in that chain.
And you know by typing that the only guys who could have written the variable are
guys you trust, because otherwise they would not be allowed to write the variable.
>>: That's fine. But you want to perform that increment operation. Right?
>> Cedric Fournet: Yes. Okay. But here those two variables are distinct
variables that are singly assigned. So I'm not incrementing -- okay, even if both
of them were X, I'm not incrementing the variable; I'm putting the value plus one
into the fresh variable that is holding the next correct value for that execution.
>>: That's okay. But -- well, okay, maybe we should take this offline. I'm still
missing how you arrange things so that by the time you perform these operations
the values are decrypted, because it could happen, it seems, in practice that if
you don't trust the host where you're going to perform these operations, then the
only choice is to go to one that you trust.
>> Cedric Fournet: Indeed, yes.
>>: Decrypt it there, then encrypt the result of that addition and send it over.
>> Cedric Fournet: Indeed, yes.
>>: All right.
>> Cedric Fournet: Yes.
>>: So I'm missing where those [inaudible].
>> Cedric Fournet: Well, so the constraint -- for that particular one, the constraint
comes from assuming that the source program is well typed.
>>: Okay. So basically it comes from that and --
>> Cedric Fournet: So the source being well typed, you know that the last
writer -- you don't know where the bad guy is going to run, but you know that if
he runs, then he can only tamper with values that he has access to, and so you
know that the last guy who has written the value that you are going to read must
be trusted to write that variable.
>>: So basically by the time you get that program, this is basically syntactic,
right?
>> Cedric Fournet: Yes. Yes. And we need -- so we need single static assignments
here so that the readers do not get confused about what is the last value of the
variable. So we are going to generate tags.
>>: [Inaudible].
>> Cedric Fournet: Sorry?
>>: I don't think [inaudible] is enough. What if there's a loop?
>> Cedric Fournet: So if there is a loop, then we are going to index the variable
by the loop indexes.
>>: Okay. So it's more than what people usually mean when they say --
>> Cedric Fournet: So it's a variant -- it's a variant of single static assignment
where we have to do something about the loops. So from the cryptographic viewpoint,
the thing that is verified is the unfolding of the program, up to a certain point. And
we need that; otherwise the tags would be ambiguous.
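The renaming he describes might look like this (an illustrative sketch; the real compiler operates on its own intermediate language, and this naming scheme is invented for the example):

```python
# Variant of single static assignment with loop indexing: each syntactic
# assignment gets a version number, and inside a loop the version is further
# indexed by the iteration, so every dynamic write carries a unique tag and
# the cryptographic tags are never ambiguous.

def dynamic_tags(assignments, iterations):
    """assignments: variable names in loop-body order, e.g. ["x", "y", "x"]."""
    tags = []
    for it in range(iterations):
        counts = {}
        for var in assignments:
            counts[var] = counts.get(var, 0) + 1
            tags.append(f"{var}.{counts[var]}@{it}")
    return tags
```

This is the "unfolding" point made above: the tags identify writes in the unrolled execution, not just in the program text.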
Okay. So before we care about efficiency, we want the thing to be sound, and we
want that to hold not just for our particular compiler -- which is a prototype, so
it's very unstable -- but for a class of programs that are going to do these kinds
of things with the cryptography.
And so our approach here was to design a type system that keeps track of just
enough of the correct use of cryptography so that, if you can type check the
generated code after running the compiler, without trusting the compiler, then we
know that it is at least as secure as the source program. And while that's -- it
took us a while; there are some problems, some of which we did not really
anticipate, but -- so here is a list.
So one of them is implicit flows through the use of cryptography. We are doing
information flow here, and you don't need to be able to decrypt to learn something
from the encryptions and decryptions, so we have to make sure that we never decide
to re-encrypt or not depending on some secret branch, because by observing
re-encryption -- which is possible in the [inaudible] -- you can infer the value of
the branch. Also, [inaudible] may detect secret branches by injecting a bad
signature and checking whether the program fails on it.
And so we have to make sure that in some cases we are going to rewrite the
variable even if it's [inaudible], and in that case that we are going to check
signatures before we get into high-secrecy branches in the code. Essentially, we
have to move the cryptographic operations so that their failures do not depend on
the secret part of the control flow.
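The hazard and the fix can be sketched as follows (illustrative Python; `verify` stands for any observable cryptographic operation, such as a signature check or a re-encryption, and the function names are invented):

```python
# If a cryptographic operation runs only on one side of a secret branch, its
# observable success/failure (or the re-encryption it causes) leaks the branch.
# The fix: hoist the operation so it runs on every path, before branching on
# the secret.

def leaky(secret_bit, verify):
    # BAD: verify() runs, and may observably fail, only when secret_bit is 1,
    # so an observer who sees whether verification happened learns the secret.
    if secret_bit:
        verify()
        return 1
    return 0

def hoisted(secret_bit, verify):
    # GOOD: verify() runs unconditionally, so its failure does not depend on
    # the secret part of the control flow.
    verify()
    return 1 if secret_bit else 0
```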
There are also some balances between integrity and secrecy. In particular, we
need to put levels on the keys: the signing keys must have secrecy that is high
enough, and the decryption keys must have integrity that is high enough, and so
there are some constraints and balances between the parts of the [inaudible] for
each of the keys used for secrecy and for integrity.
There are also some internal cryptographic problems, like key generation: if you
generate the same key pair twice and use it for the same encryption and
decryption, then things can go terribly wrong. So we are just going to exclude
that by enforcing, for every usage of a key pair, a single generation. We also
have to find a way of avoiding encryption cycles so that we can apply [inaudible]
our output.
A last remark is that for the keys we get something that is much weaker than
information-flow secrecy: when we encrypt, there is no guarantee that the key
itself remains a secret, only that it remains secret enough not to break the
cryptography. And so we have to distinguish very carefully between values that
are keys and values that are data, because the notion of secrecy for them is not
the same, and we don't want to get false results because we were assuming that
the keys were secret in the information-flow sense when they are only secret in
terms of protecting the actual payload that is encrypted.
And so, okay, I'm going to give a summary of the type system: essentially we
add specific rules for the policy functions and for the cryptography, and we get
this main theorem, which says that if you take a program that is going to be
[inaudible] computation, and if you can type check that program against a
policy that has levels for the variables and for the keys, then this program
satisfies the computational variants of secrecy and integrity against [inaudible].
And the proof, which I'm not going to detail at all, is in fact by a series of
games, and the protocol and the structure of the types tell us in which order to
apply the games so that we can eliminate the cryptography one level at a time.
So here are the types that we use for now. We have data types, and the types
consist of the type of the actual value, which we call [inaudible], and then an
annotation between the two. So a type is something that you can do with the data
plus a security label. Then we have data, which is the basic data like the actual
values used by the program, and then we can build that up. Then you have
primitive types for encryptions: that's the type of a value of type tau that is
encrypted; that's the type of an encryption key for encrypting values of type tau;
and the decryption key used for decrypting values of type tau.
And you can see here that we have a static label, and this label can be used only
once. It's used so that we don't mess up or mix the keys together, so that it's
[inaudible] if you try to decrypt a value with the wrong key, even if it's a wrong
key for the same kind of underlying data, because if you decrypt with a key that
is mismatched with the encryption key, then anything can happen. We don't know
what the primitives will be doing.
We have a variant of that for symmetric encryption, and then we also have
similar things for signatures and for MACs. And for signatures and for MACs,
notice that we don't give the type of what is being signed. Instead, we give a
function F from tags to the type of what is being signed. And because we have
this function that for each tag gives us a different type, it enables us to reuse
the same keys for signing values of any number of different types, provided that
at signing and verification time we use the same tag in the commands that use
that key.
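A sketch of this tag-indexed usage (illustrative Python, with an HMAC standing in for real signatures since the point here is only the tag discipline; the class and field names are invented):

```python
import hashlib
import hmac

class SigKey:
    """A key whose declared usage F maps tags to payload types, so one key
    can sign values of several different types, provided the signer and the
    verifier use the same tag."""

    def __init__(self, secret: bytes, usage: dict):
        self.secret = secret
        self.usage = usage  # F: tag -> expected payload type

    def _payload(self, tag: str, value) -> bytes:
        if not isinstance(value, self.usage[tag]):
            raise TypeError("tag/type mismatch")
        return f"{tag}|{value!r}".encode()

    def sign(self, tag: str, value) -> bytes:
        return hmac.new(self.secret, self._payload(tag, value),
                        hashlib.sha256).digest()

    def verify(self, tag: str, value, sig: bytes) -> bool:
        expected = hmac.new(self.secret, self._payload(tag, value),
                            hashlib.sha256).digest()
        return hmac.compare_digest(sig, expected)
```

Because the tag is bound into the signed payload, a signature made under one tag never verifies under another, which is what lets one key safely cover several payload types.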
What else can we say about that?
>>: What are those Ks?
>> Cedric Fournet: Okay. So the K is for correlating the different keys. It may
be the case that we have several keys for encrypting integers, say. But if you
generate a blue key pair and a red key pair, and you encrypt with the blue key
pair and decrypt with the red key pair, then you get something that you don't
know what it is, and so if what you encrypted was a key, you can get from the
[inaudible] something that you are going to use as data.
So you must make sure by typing that you always use the right decryption key,
the one associated with the encryption key, and the same for signatures: you
want to make sure by typing that you use the verification key that is the one
associated with the signing key. So that's just so as not to get the keys mixed
together.
Then F is independent: F is the declared usage of the signature in terms of tags.
And so it says that, with that particular key pair -- which at run time is going
to have the static label K -- the signer and the verifier agree on what is the
type of what is being signed, through this map from tags to types.
>>: And what's the K doing there?
>> Cedric Fournet: So the K is just, again, to make sure that you don't mix --
that you don't use the wrong verification key.
>>: So for every key value that's around, there's going to be a K in the type
system.
>> Cedric Fournet: Yes. Yes.
>>: A K in the type system.
>> Cedric Fournet: And so --
>>: Those Ks don't show up in the running program?
>> Cedric Fournet: Not at all. They're just [inaudible] for typing.
>>: Okay.
>> Cedric Fournet: It constrains what you can do. So for example, in this setting,
it's going to forbid us from allocating -- from generating keys inside loops, for
example. You can refine that system a little, but for now, with this type system,
you cannot generate keys at each iteration of a loop, because that's hard to keep
track of at compile time. That's the trade-off. In a sense, this type system is
adjusted to what we are doing now in the compiler: we want to have a reasonably
expressive class of cryptographic programs, but checked at compile time, so as
long as it covers the optimizations that we want to do, that's fine with us.
>>: So what is KM, the last one?
>> Cedric Fournet: K -- KM is just for MACs. So it's the key for MACs, the
symmetric variant of signing: the key is just a symmetric signing and verification
key. It's much more efficient; that's why you have the two. So we are going to
use [inaudible] and then we are going to use MACs with those keys for everything,
yes. We don't have to, but that's much more efficient than to use -- there is
maybe a factor of one thousand between [inaudible] and symmetric-key [inaudible].
>>: [Inaudible] model [inaudible].
>> Cedric Fournet: Okay. So we have -- okay, I'm going to skip that one. Okay,
I'm not sure I want to do that one either, but essentially, just to give you a
flavor: each time you add a new crypto primitive, you have to say what are the
constraints that you are going to enforce by typing. So for example, if you want
to sign, you have to check that the thing that is signed is M, and M has type
tau. And you are sending the tag, so you have to check that the static map maps
the type tau to the tag t.
Then you have to check that the type of X, which is going to contain the value of
the signature, is a signature type with that level, and a security that is
compatible with the parameters. And you can see in particular that the secrecy
of X cannot be lower than the secrecy of M, because we are not assuming that
signatures are going to hide any secrets. But in terms of integrity, the
important part is that we don't [inaudible] the integrity of tau as compared to
the integrity of X: tau, which is the type of [inaudible], can never have
integrity that is higher than the integrity of X, which is the whole purpose of
using a signature here.
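Those constraints can be sketched over a tiny two-level lattice (a hypothetical encoding of the rule as described, not the paper's formal syntax):

```python
# Checking `x := sign(k, t, M)` where M has type tau:
#   1. the key's declared usage F must map the tag t to tau;
#   2. x's secrecy must cover M's secrecy (signatures do not hide payloads);
#   3. tau's integrity may not exceed x's integrity (the signature is what
#      justifies trusting the payload, not the other way around).

LEVELS = {"L": 0, "H": 1}  # L = public/untrusted, H = secret/trusted

def leq(a: str, b: str) -> bool:
    """Lattice order on the two levels."""
    return LEVELS[a] <= LEVELS[b]

def check_sign(F: dict, tag: str, tau: str, sec_M: str, sec_x: str,
               int_tau: str, int_x: str) -> bool:
    if F.get(tag) != tau:
        return False  # usage mismatch: the verifier would expect another type
    if not leq(sec_M, sec_x):
        return False  # the signature value reveals M
    if not leq(int_tau, int_x):
        return False  # payload integrity cannot exceed the signature's
    return True
```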
Okay. So then, once we have the compiler and the type system, we put them
together. Again, we don't have to trust the compiler, which is good news
because it's a complicated bit of [inaudible]. So we start from source programs,
we type check them so they are well typed, then we generate well-typed code --
and not only is the code well typed, but it's well typed for an extension of the
source policies: we have a lot of new variables for all the new cryptographic
values, but at least the source variables have the same security levels.
And then we type check the result. And so by construction we have the property
that an attacker that controls the rest of the code at the lower levels can gain
[inaudible] information, or make the program go wrong, only with [inaudible]
probability. And these days we are trying to extend that work so that we can
deal not only with programs that are secure from the start, but with programs
that so far are not really secure. Here the goal is to say: okay, the source
[inaudible] leaks some information, and we don't want the cryptographic
implementation to leak anything else, anything more on top of that. So if there
is a [inaudible] cryptographic [inaudible] that can break secrecy or integrity
starting from two different [inaudible], then there must be a source [inaudible],
with no access to cryptography, that can break the same secrecy or integrity
with the same probability. So that's the final property that deals with programs
that declassify some information. But here we can't [inaudible]; we have to use
a [inaudible] of typing, and we can only type the cryptographic part of the
program, not the [inaudible] part of the program.
Okay. So we have a first implementation. It's in F# -- or in fact a small
fragment of F#: we have the first-order imperative subset of F#, which is very
much like the while language that we had. The implementation also generates the
code for actually doing the messaging and the control flow after we do the
cryptography. That part of the code is untrusted, but you need it for the actual
distributed program to run correctly if there is no attacker.
So the target is also F#, plus [inaudible], so we use a section of the .NET
libraries to do that. I think we use the standard AES for now, with -- I think
it's CBC, but we are considering that it's maybe not too sure, in fact. And for
now we use quite a simple allocation scheme, so we probably allocate too many
symmetric keys, but it's relatively cheap. We allocate symmetric keys, then we
push them in the first messages, and then we keep using the symmetric keys later
on. And we type check in fact several times during the compilation; after each
stage of the compiler we have to check we are not going wrong. Yes, Ben?
>>: So this is all source [inaudible] -- there's no native code anywhere?
>> Cedric Fournet: No. So we assume that the F# implementation is perfect.
It's [inaudible]. For the [inaudible] we could probably -- it's quite simple, so
we could -- okay. On top of .NET it's impossible to say anything formal
[inaudible], but of course it's [inaudible] that we could have a much, much
smaller run time for this particular part. So, yeah, we just trust the local run
time. We don't trust the remote run times, but the local run time is trusted. If
someone can access the keys in shared managed memory, we are in trouble.
>>: That's clearly one concern; [inaudible] concern is whether they would be able
to take something in, say, C# and [inaudible].
>> Cedric Fournet: So I don't think whether it's C# or F# would make a big
difference right now, because essentially we are in a language that is at the
intersection of the two. It's a very small language so far. It's [inaudible];
we don't have [inaudible], for example, that we would have with those. So we
are just beginning the experiments on that side, really. Okay.
So in summary, we compile programs plus policies. And the property that we get
is that all the secrecy and integrity properties satisfied by the source are
preserved in the cryptographic implementation, and therefore there's no
[inaudible] hypothesis. And to do that, we had to develop compilers and
[inaudible] to reason about computational cryptography [inaudible] using this
small command language.
So we are trying to extend the compiler to get more performance out of it: we
recently [inaudible] cryptography, and we are also trying to deal with more
structures these days. We would like to do that for an existing language, so
that we can just plug our compiler in as the final stage of an existing
compiler. And we would also like to mechanize the proofs, to get stronger
guarantees about what we are doing, although that is part of a joint effort
with people formalizing crypto primitives in the [inaudible]. Okay. So that's
my talk. Thanks.
[applause].
>> Cedric Fournet: Are there questions or remarks? Okay. So we also have other
projects, so if you are especially interested in some of them, I will also be
happy to discuss them. I'm around until the end of the week. We are doing TLS,
CardSpace, and we're also doing typing for -- all of this work is around
cryptography. Okay. Thanks.
[applause]