>> Kristin Lauter: Today we're very happy to have Yevgeniy Dodis visiting us.
He's a professor at NYU, and this month he's a visiting researcher here. He's
done a lot of really interesting work in a variety of different areas: leakage
resilience, randomness, designing block ciphers, hash functions, things like that.
But today he's going to tell us about some new results on the leftover hash
lemma.
>> Yevgeniy Dodis: Thank you very much. So just a small kind of warning. At
normal speed the talk will be an hour and 15 minutes, but I think I will say the
most important things in the first hour, so I'll let people leave who want to, but,
you know, for people who want to stay, it will be another 15 minutes, roughly.
All right. And also feel free to ask me questions and so on.
Okay. So a very quick introduction to the [inaudible] of this talk. So we all know
that we need randomness, and we especially need randomness for cryptography
because secret keys have to be random. And the question is where do we get
this randomness. And there are many ways to get randomness. Maybe we
have some physical sources of randomness, maybe we want to use biometrics to
get our secret keys from biometrics. Maybe we will do some key exchange.
And the point is, in many of these situations the randomness is not going to be this
perfect ideal sequence of independent unbiased bits, but a distribution where
the randomness [inaudible] is going to be imperfect. That's why we call it imperfect
randomness.
And even in cases where we have perfect randomness, sometimes
because of this [inaudible] the attacker can get partial information about our perfectly
uniform secrets. So from the perspective of [inaudible], once again, the secret is going
to be imperfect.
So the question is can we base cryptography on this imperfect randomness
because we have to deal with it, so we better base cryptography on this
imperfect randomness.
And so first we need to say what is a necessary assumption to succeed, because
if the randomness is, like, a sequence of all zeros, this is not very good
randomness; we're not going to do anything.
So it turns out it's easy to see the necessary assumption is that no particular value of
the source is too easy to predict. Because, for example, if you have a source
where some value you can predict with probability a half and the rest is, like,
uniformly random and so on, then with this probability of a half the attacker knows your
key, and the attacker is going to break your system.
So this value m is called the min-entropy of the source, meaning that no value
happens with probability more than 2 to the minus m. And the question is can
we extract -- so we'll assume this because [inaudible] so we'll assume this, and
the question is can we turn this situation into that situation. Can we extract
randomness from these sources.
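The min-entropy condition can be made concrete with a small sketch (Python is my choice here; the toy distribution is made up for illustration, echoing the "predict with probability a half" example above):

```python
import math

def min_entropy(probs):
    """H_infinity(X) = -log2(max_x Pr[X = x]): the source has m bits of
    min-entropy exactly when no value occurs with probability > 2^-m."""
    return -math.log2(max(probs))

# The bad case from the talk: one value is predictable with probability 1/2,
# the remaining mass spread uniformly over 8 other values.
probs = [0.5] + [0.5 / 8] * 8
m = min_entropy(probs)  # only 1 bit of min-entropy, despite 9 possible values
```

So an attacker who guesses the most likely value wins with probability 2 to the minus m, which is why min-entropy is the right measure for secret keys.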
So these are called randomness extractors. They, of course, were known before
Nisan and Zuckerman, but roughly speaking, this object takes a secret x, a source
of randomness x, and extracts a new uniform string R. Unfortunately this is
too good to be true if you don't want to assume anything else about the source
besides the min-entropy, so we actually need to define what is called a seeded
extractor, but because unseeded extractors are impossible, we'll just call
these objects extractors.
So the extractor will take another input, which we'll call the
seed, and extract randomness R. And the seed is assumed to be truly
uniform. And now you can ask several kinds of questions. You can say, well, if
you already have true randomness, you know, why don't you just use S? Well,
there are two reasons -- several reasons why it's [inaudible] interesting.
First, I wanted to say that this seed -- think of this seed as public. So, yes,
maybe [inaudible], but somebody kind of powerful who has a very good source of
randomness was just nice enough to give you this true randomness, but he
also gave it to the attacker.
So, yes, we're going to use this public seed S to extract secret randomness from
X. So formally the extracted bits are uniform even if I give you in particular, the
attacker, this seed S. So this is what we are going to call an extractor.
Sometimes we call strong extractor, but we'll just call it an extractor.
And it has many, many uses in complexity theory and in cryptography, so for us the
most important will be key derivation, but this is like a cool concept [inaudible]
like a couple of papers about shaving some log factor, some constant factor,
some stuff. So this is like a really important primitive.
Okay. So the parameters, like a very painful slide, but essentially all the
parameters except for one will be introduced on this slide. So it's painful, but
hopefully not -- it's just one slide so we concentrate all the pain here.
So min-entropy is going to be m. These are, like, the five letters that I'll introduce. Output
length -- so how many bits we need to extract. So think of v as kind of a security
parameter, but in general it's the number of bits we want to extract. And for us an
equivalent measure, which will be more convenient to write our bounds, will be
the entropy loss, which is the difference between the amount of entropy we
have and the number of bits that we extract.
So ideally we would like the entropy loss to be essentially zero, so we want to
extract everything we have, but that might be, like, too good to be true, so we'll
try to come -- essentially we try to make the entropy loss as close to zero as
possible. But as we'll see, sometimes the entropy loss could even be negative:
you have some randomness, it's less than what you
need, but, still, you want to do something so you will extract more than what you
have.
Okay. Good. Add our epsilon. So we're not going to be able to extract
completely perfect, you know, independent uniform bits, so there will be some
kind of statistical error epsilon that we will achieve, which is, roughly speaking,
the [inaudible] distinguishing probability between uniform bits and our extracted
bits. So think of epsilon as, like, 2 to the minus 8; that will be good enough.
And then in the second part of the talk the seed length n will also be important,
so how many random bits we need. So for cryptographic applications it's often
not that important, but it's kind of nice if you don't need to rely on some guy to
give you, like, a lot of random bits. So we would like to make it small.
And it turns out the optimal parameters for this problem are known, so the
important one to remember for our purposes is the entropy loss: it's 2 log 1 over
epsilon. And the seed length, essentially, is like logarithmic in the [inaudible]
parameter, so we actually don't need, in principle, a lot of random bits.
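As a quick sanity check on these parameters, here is a hedged back-of-the-envelope sketch (the concrete numbers are my own illustrative choices, not from the talk):

```python
import math

# Optimal entropy loss for extracting v bits at statistical error eps is
# L = 2 * log2(1/eps), so the source needs min-entropy m >= v + L.
def required_min_entropy(v, eps):
    return v + 2 * math.log2(1 / eps)

# e.g. a 128-bit key with error 2^-64 needs a 256-bit min-entropy source
m = required_min_entropy(128, 2**-64)
```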
>>: [inaudible]?
>> Yevgeniy Dodis: X is the source.
>>: So X is m?
>> Yevgeniy Dodis: Is defined -- usually just to pick -- [inaudible] everything is
defined, so I will give you 10 dollars if you find the [inaudible] and define those on
the previous slide. But it's a good question.
Okay. So we have, like, very good parameters, but unfortunately this comes from a
probabilistic argument that shows a random function with high probability will give
you these parameters, but this is an exponential [inaudible], so can we match it
efficiently? And [inaudible] -- yes?
>>: [inaudible]
>> Yevgeniy Dodis: That's the thing that everybody knows. It's close, but -- yeah. All right.
So sometimes when you kind of have a clean question, sometimes before
solving a clean question you should, like, look around. Maybe somebody already
did this for you, so you can look at some leftovers. And if you look at the
leftovers -- because this question was formally defined in this setting in '96 -- so if
you look at what was known before, you actually arrive and see that somebody
actually pretty much at least came close to solving this question: you get exactly
what is called the leftover hash lemma.
So I need to give you one definition. So we'll say that a family of functions H is
universal -- so it's a family of functions, and it's from whatever the source is, you
can wish arbitrary variable-length inputs [inaudible] or something like that. The output length
will be v, exactly the number of bits we want to extract. And the property of the
universal hash family is very nice. It just says that for any two distinct inputs, the
probability that they collide under a randomly chosen function h should be small,
and this is the best you can achieve, and we can achieve it [inaudible]. You
cannot go lower than this, so this is kind of the best.
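As an illustration (a toy example of my own, not from the talk), one of the simplest universal families is multiplication modulo a prime, and the defining collision property can be checked exhaustively for a small prime:

```python
# Toy universal family over Z_p: h_a(x) = a*x mod p, seed a chosen uniformly.
# For distinct x1 != x2 (mod p), a collision requires a*(x1 - x2) = 0 mod p,
# which forces a = 0, so Pr[collision] = 1/p = 1/|range|.
p = 101  # small prime, small enough for an exhaustive check

def h(a, x):
    return (a * x) % p

def collision_prob(x1, x2):
    return sum(h(a, x1) == h(a, x2) for a in range(p)) / p

# check every distinct pair of inputs
worst = max(collision_prob(x1, x2)
            for x1 in range(p) for x2 in range(x1 + 1, p))
```

Every distinct pair collides for exactly one seed (a = 0), so the worst-case collision probability is exactly 1 over the range size, matching the definition.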
Okay. So this is what a universal hash function is. So the leftover hash lemma
essentially says that universal hash functions are good extractors. It wasn't
proven exactly in this form, but kind of translating the proof, this is what it tells
you: that even if I pick a random universal hash function and give you the
description, it's a good extractor.
All right. So let's look at the parameters. So the first observation is [inaudible]
this leftover hash lemma, which has been known since, like, you know, '89, with an
earlier version of, like, '86 or '85 even -- it achieves optimal entropy
loss.
So essentially -- so from this perspective we're kind of almost done. Okay. So
the leftover hash lemma achieves optimal entropy loss. Unfortunately it achieves
suboptimal seed length. It turns out this is like a very nice definition, but if you
think about it, it's not very hard to prove that to satisfy this definition, the seed has
to be at least as long as the source. So if you think of a source as -- like, you
collect a lot of entropy, you're in a low entropy environment, you need a lot of
measurements to get the entropy -- then the seed will be
pretty bad.
All right. So this is --
>>: [inaudible]
>> Yevgeniy Dodis: Yeah, yeah, yeah. You killed the joke, right [laughter].
>>: So this seed is the specification of this hash?
>> Yevgeniy Dodis: The seed is the description of the hash function in this
particular case.
Okay. Good. So this is not so good, but this is still a useful theorem despite the bad
seed length -- well, first of all, it already gives you optimal entropy loss, which is
what we care about in any situation. It's very simple. There are many simple
constructions of universal hash functions -- [inaudible] multiplication, inner product
and, you know, many, many others, [inaudible], all kinds of stuff.
It's very fast. So here is kind of the dilemma we'll see in the second part of the
talk. So in theory, it should be faster than [inaudible] by a big factor, like probably
100 or something like that, than cryptographic hash functions, because in practice
people anyway use cryptographic hash functions and don't bother to optimize
universal hash functions. It's kind of -- it's simply that people think that, you
know, cryptographic hash functions are faster, but, you know, if I can convince the
hardware guys to help me out with this, I think it will be, like, an order of magnitude
faster than a cryptographic hash function, because it's just a much weaker property. In
particular, we hope that cryptographic hash functions -- it depends on what you call
the seed -- will probably satisfy it anyway, or come close.
And it also has very nice algebraic properties: for some of the schemes, when we
design, like, let's say, an [inaudible] encryption scheme, [inaudible] algebraic
properties actually come into play and the leftover hash lemma is just a technical tool.
But the [inaudible] -- well, the first one is obvious: the seed length is large. And
the entropy loss -- even though it's 2 log 1 over epsilon, I will argue in a
second that this is also large.
So essentially the question is can we kind of get rid of these disadvantages, and
in fact I also need to explain to you why the entropy loss is large.
So essentially the talk will consist of two different parts from now on. The first
part will be how to reduce entropy loss of left over hash lemma, the second part
will be how to reduce the seed length, and then we'll combine them.
Yes?
>>: [inaudible]
>> Yevgeniy Dodis: Yes, well, it has to be. Yeah. So there is a lower bound
proven already, and since [inaudible] that universal hash function with seed
lengths has to be at least the length of the [inaudible]. And I'll talk about some of
the [inaudible] a little bit later.
All right. So the first part of the talk is improving the entropy loss of the leftover
hash lemma. So this guy has a very, you know, severe entropy loss, so we want
to kind of improve it. Okay.
So I need to answer two questions. The first question: is it important? Because
it looks like such a small entropy loss, 2 log 1 over epsilon, so maybe we don't
care. I'll hopefully explain on the next slide why it's important.
And the second, well, it's a great thing that maybe I want to reduce it further, but
because it's optimal, how can I do it? So that will be a little bit longer. It will take
a little bit longer.
>>: So is there a reason that [inaudible] loss because you're hoping to do better
than giving that seed length?
>> Yevgeniy Dodis: So the next slide -- so exactly. So let me answer this
question. If I don't answer it in the next slide, ask it again.
All right. So let me ask -- so let's address this question of improving the entropy
loss. First, why do we -- maybe 2 log 1 over epsilon is really nice already. Well,
unfortunately in practical situations it often becomes a deal breaker, because
many sources have so little entropy to begin with, or, you know, waiting for
a lot of entropy would make things very inefficient, so they just don't
have this extra 2 log 1 over epsilon just to throw around. Like, biometrics barely
have enough entropy, physical sources [inaudible], and a very important
example, especially here probably, arguably what could be the most important
application: if you do, like, a Diffie-Hellman key exchange -- forget about the
imperfect [inaudible], you just agree on a Diffie-Hellman key -- it's like a group
element over some really [inaudible] group, especially if it's an elliptic curve group,
so you need to really hash it down to extract 128 bits.
And if you need this big -- well, this entropy loss, 2 log 1 over epsilon, it means
that the size of the elliptic curve group -- just the size of the group -- has to be, you
know, essentially 128 plus 2 log 1 over epsilon bits, let's
say, 300 bits or something like that, [inaudible] maybe you could have a 160-bit
elliptic curve, which is, like, very fast to implement, [inaudible] or something like
that.
So essentially what I'm saying is for Diffie-Hellman key exchange, what you care
about is what is the minimal size of the elliptic curve groups that you can hope to
use, and that really corresponds to efficiency in this case.
So in this case entropy loss, reducing the entropy loss will make things much
more efficient because you can use a smaller elliptic curve group in this case.
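To make the group-size arithmetic concrete, here is a small illustrative calculation (the error target 2 to the minus 80 is my own assumption, chosen to land near the "300 bits" figure mentioned above):

```python
import math

v = 128                        # bits of key we want to extract
eps = 2**-80                   # illustrative extraction error target
loss = 2 * math.log2(1 / eps)  # LHL entropy loss: 160 bits
lhl_bits = v + loss            # ~288-bit group order: roughly a 300-bit curve
ideal_bits = v                 # with zero entropy loss, a far smaller curve suffices
```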
>>: [inaudible]
>> Yevgeniy Dodis: Because you have a group element, but -- so essentially
you have a random -- a bit string, but it's not random. It's like a random element
of a group, but the group is, like, a weird group. It's like an elliptic curve group. So
probably its representation will -- it's not going to be random, you know -- so if
you have, let's say, a random 128-bit -- if you have an elliptic curve of size 2 to
the 128th, right, its elements are not going to be neat, you know, 128-bit strings. To
write one down it will be, like, you know, a much longer string, because of -- and so
you need to extract randomness.
So in this case you're right. So I know exactly the distribution. My distribution is
perfectly uniform over elliptic curve group, but believe it or not, ironically -- so in
principle, I don't need randomized extractors. I can have a deterministic extractor
that extracts those 128 bits.
Believe it or not, we understand so little -- or maybe -- not you guys -- I
understand so little about elliptic curve groups, and the people who write state of the
art papers currently -- maybe you guys will improve it -- maybe they understand so
little, that if you look at the best deterministic extractors for elliptic curve groups,
they actually have worse entropy loss than the leftover hash lemma.
And maybe -- so it's like it takes some really significant bits of the [inaudible] or
something like that, but what they can prove using, like, currently, like, Weil
bounds, is actually much, much, much worse than what you get from
[inaudible]. So [inaudible] paper [inaudible], but the bounds are actually much
worse than the leftover hash lemma.
So in this case, because the attacker cannot control the distribution of the source to
make it dependent on the seed, you're much better off just fixing, like, I don't
know, like, 500 bits and just applying some very fast universal hash function and
doing your [inaudible] extraction. So I'm not sure [inaudible], but -- yes?
>>: [inaudible]
>> Yevgeniy Dodis: Well, you get a random group element, right? So you
have some -- you have, like, x and y, some kind of, I don't know, [inaudible] elliptic
curves that will be more than -- so x and y itself, if you write it as a bit string, it will
be much more than -- you know, it will be some element of the group, but the
group is weird, right? So how to hash it down -- [inaudible] the description of a
group element to a random bit string -- because we don't know how to do it.
So the leftover hash lemma says you can do it, because all you need is entropy in
the group. But, I don't know, does it make sense?
>>: [inaudible]
>> Yevgeniy Dodis: [inaudible].
>>: [inaudible]
>> Yevgeniy Dodis: Yeah, I want to extract, and I want to get a [inaudible].
We do key exchange, I get a weird group element, but the group element is not a
128-bit string. It's a weird string which is, you know, not uniform in its
representation, so I hash it down.
All right. Good. So I took more time than I wanted on this. So let's look at the
case study to see what is our hope.
So here it just kind of says that this is -- we don't have this [inaudible], it would be
good to get rid of it, but a much bigger competition for us, as we'll see, comes
from cryptographic hash functions, as I will just explain by looking at this case
study.
So assume I want to derive a v-bit key for an epsilon-secure application. So
what I mean, I mean like I have an application like AES or AES encryption or
something like that, which is epsilon-secure if I have a random 128-bit AES key.
Right?
And unfortunately I'm not going to have a random key, I'm going to derive the key
from -- you know, I'll have some physical source or something like that for the key
exchange and I hash it down. Right?
So the leftover hash lemma says, look -- and it's probably reasonable in this case --
it will say, look, I shoot for error epsilon in the ideal case, so let me allow an extra
error epsilon for the extraction, so the final error will be, like, 2 epsilon, if you
wish. So this is, like, a reasonable goal.
So if I have epsilon security in the ideal model, I allow the extraction the same error
as my ideal security, right?
So the question is how much do I need? So left over hash lemma says that if
you want to have provable security, you need the min-entropy of the source to be
the number of extracted bits plus 2 log 1 over epsilon.
So here, the stiff competition we are facing is from the random oracle model.
So this is a little bit comparing apples and oranges, but I will try to convince you,
using this kind of ugly formula, that in the random oracle model, if I just apply a
random oracle to my source -- you know, heuristically, just apply the random
oracle -- I don't need to lose any entropy in this case.
So I'm going to try to convince you. So this will be, like, much better, right?
So in the random oracle model, the only way I can tell random oracle of x from
random is if the attacker manages to query x to the random oracle. Because if he
doesn't query it, random oracle of x looks random. Right?
So what is the probability of querying x? Well, for each query, because the source x
has [inaudible] entropy m, you cannot predict it with probability better than 2 to the
minus m. So for each query you have probability 1 over 2 to the m of hitting the
source, so if you make some number
of queries, the probability of hitting the source is at most the number of queries you're
allowed divided by 2 to the m.
But what I'm saying is the number of queries you're allowed is at most the
biggest running time you can have. Right? And within the biggest running time you
shouldn't be able to break security with probability better than epsilon, because my
application was epsilon-secure with ideal randomness.
So even if you try to do an exhaustive key search attack -- essentially, you know that
in time epsilon times 2 to the v you can make an exhaustive key
search attack which succeeds with probability epsilon, by trying this many keys
and hitting, you know, kind of at random. So you know that the number of queries
should be upper bounded by this, and for many settings this is actually kind of a
[inaudible] bound. In some sense that's exactly what one kind of assumes: that for
some applications like AES, exhaustive search essentially is the best up to a
factor of 8; you just need to show that, you know, it's kind of close enough to the best.
So this is a pretty tight bound anyway.
But if you plug in this bound, if you set v equals [inaudible], you already get the
security that you want. Okay? So essentially -- and this is something -- this is
maybe a [inaudible], but intuitively what this random oracle gives you, it says -- it
just magically smooths your entropy without any kind of -- it just says you don't
need to know where your entropy is located. If you use the
random oracle model, it's as if you have a random key of the same length, kind of.
This is kind of the intuition. Okay.
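The random-oracle accounting in the last few paragraphs can be sketched as follows (a heuristic, as the speaker stresses; the concrete numbers are illustrative):

```python
# In the random oracle model, the derived key R = RO(X) only fails if the
# attacker queries X itself.  Each query hits X with probability <= 2^-m,
# and a sensible attacker makes at most T ~ eps * 2^v queries, since an
# exhaustive key search of that length already succeeds with probability eps.
def ro_error(eps, v, m):
    T = eps * 2**v          # bound on useful queries / running time
    return eps + T / 2**m   # real error <= ideal error + query probability

# With m = v, i.e. zero entropy loss, the error merely doubles:
e = ro_error(2**-64, 128, 128)   # 2 * 2^-64, still essentially epsilon
```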
So as an end result, if you look at the practitioners -- like [inaudible], but he couldn't
make it -- they will say, you know, forget about the leftover hash lemma. It's crazy to
have a proof. I'll just apply a cryptographic hash function, and I will apply it in all
these kinds of cool settings where we don't have this 2 log 1 over epsilon extra
loss, right? So actually I'm just saying, (a), you know, in some settings this is a big
loss and, (b), heuristically we don't need to suffer this loss. So as a result, nobody
will use [inaudible] security, right?
So our goal is: can we provably reduce this entropy loss of 2 log 1 over epsilon, to at
least come closer to the random oracle? And just jumping ahead, we will not be able
to quite match it -- it will be kind of halfway in between -- but hopefully we'll make
it much closer, so practitioners can maybe at least give a second thought to using
a provable key derivation.
So this is the first question. The second question is --
>>: [inaudible]
>> Yevgeniy Dodis: The random oracle is a seed. Sure. Yeah.
So there are some other subtle advantages it can extract from a predictable
entropy, but from this clean setting, the main advantages are entropy loss.
>>: Have you talked to any [inaudible] and asked them if they've looked at
analyzing hash functions as extractors for low entropy --
>> Yevgeniy Dodis: You mean for low entropy things?
>>: Yeah, to --
>> Yevgeniy Dodis: So we have a paper at Crypto 2004 -- oh, yeah, here,
this is DGH plus, where we actually said -- we said, look, if you want to look at the
particular design, the way the current cryptographic hash functions -- you know,
use some kind of [inaudible] design -- is it reasonable to assume -- how
universal are they, or how good of an extractor are they?
And you have a bunch of conditions, and our results were kind of mixed. Under
[inaudible] assumptions, like the compression function is random, or an ideal
[inaudible] is used to build the compression function, you know, [inaudible] you
don't quite get that it matches the parameters of the leftover hash lemma. So
essentially when you [inaudible] bounds you get suboptimal, and also you don't --
I mean, you need to -- you either need to [inaudible] keys out to the hash function
or you need to assume that every block has entropy or the last block has
entropy.
So essentially we had caveats, under extra assumptions which are unnecessary
in the case of the leftover hash lemma. And assuming that the whole compression
function is kind of a seed, if you wish, we could argue [inaudible] is impossible in
[inaudible] properties, but, you know, they're not yet quite -- so essentially part of
the thing --
>>: But I'm just asking if you had talked to different analysts, right? This is a
very clean question --
>> Yevgeniy Dodis: Universal --
>>: Well, not how universal, but can you show that there's bias in the output
given the min-entropy source.
>> Yevgeniy Dodis: Well, what I'm saying currently is, the only analysis I know is
that you argue -- to prove that cryptographic hash functions are good
extractors, you see how universal they are, under some heuristic about what the
seed is. And it seems like the only reasonable thing -- so this is -- I mean, this is
the only thing that I know, and we initiated this study at Crypto 2004 --
>>: [inaudible]
>> Yevgeniy Dodis: Well, but, you know, if the analysts, like, [inaudible], because
they could have looked in our paper and gotten a lot of good questions, but at least
this question was, you know, analyzed at a major cryptographic conference,
and, you know -- but I'm not aware of people following up on it.
So, anyway, good. So essentially our -- and jumping ahead, you know, the
results will show we can plug in, you know, the Crypto 2004 paper and say that even if
you actually use cryptographic hash functions under appropriate kinds of
conditions, you can actually -- you know, the same assumptions that we made
there will allow us to reduce the entropy loss, you know, for cryptographic hash
functions themselves.
But anyway, you know, it's more provable than just assuming that the whole hash
function is a random oracle. So does it make sense [inaudible]? Sorry, I see
that I was supposed to be finished with this slide 20 minutes ago, but I'm glad that
people are interested. Does it make sense? All right.
So let me finally come to the third slide of this talk -- okay, so is this entropy
loss really optimal? Hopefully I've motivated that it is important
to reduce it, [inaudible] already optimal. And the answer is yes -- I mentioned it
before -- but it's yes only if you care about all kinds of distinguishers.
So what I mean -- only [inaudible] -- because what is the statistical distance, you
know, what is this epsilon? Epsilon is the best advantage the attacker can
have, some distinguisher can have, in telling apart extracted bits from random bits.
So if we care about all possible distinguishers and define epsilon this way, this is
what we get. But in cryptographic applications I claim -- and at least for the
setting of key derivations, very often we care about restricted distinguishers.
And by restricted I don't necessarily mean in its running time, but restricted in
more ways. Essentially I claim that the kind of distinguishers we care about -- so
we derive a key for a cryptographic application, like a signature scheme or an
encryption scheme -- the kind of distinguisher we care about is a distinguisher
who will run the cryptographic game: he will run the attacker, acting as the
challenger, let's say for a signature scheme, and will output 1 if the attacker
won the game.
So let me actually -- if this is too abstract, let me give you a case study.
So assume I actually use the leftover hash lemma to derive a key for a MAC or a
signature scheme. All right? So what do I know? I know that any attacker
forges a signature -- if I had a truly random key, if I didn't have to use the leftover
hash lemma -- with [inaudible] at most some epsilon. So I'm using this same epsilon,
which, hopefully, is kind of negligible. Think of this as negligible.
And the only thing I care about in this application, for the case of signatures, is
that the same attacker's success probability doesn't suddenly jump high if, instead
of a truly random key, he was given the extracted key and also the seed. That's
the only thing I care about.
I don't care about -- so the point is -- so if you look at the distinguisher, the
distinguisher in this case essentially runs the attacker -- so he kind of
says, okay, you know -- so the distinguisher says, okay, I'm given a
key, a secret key. I don't know if it's random or extracted. I run the usual
unforgeability game against this attacker, and if I see the attacker won the game,
then I output 1.
So the only thing I care about in this particular application is that this epsilon prime
didn't suddenly jump too much from epsilon, right? Does it make sense?
If I can argue -- so in particular, this distinguisher almost
never succeeds, because epsilon is very small. So in this particular case I claim
that we only care about distinguishers which almost never succeed on uniform
keys in the first place.
So, yes, in principle -- you know, if there was an attacker who already forged a
signature with probability a half, maybe this probability will become three quarters,
which would be a huge gap, like one quarter.
But let's say I don't really care, because this attacker I assume doesn't exist or
takes too many resources for me to care about. Right? So I only care that the
attackers who almost never succeed in the first case still almost never succeed.
And for encryption schemes it will be a similar thing. Right? But hopefully this is
[inaudible]. For the cryptographic setting we don't care about all distinguishers. We
only care about some kind of restricted distinguishers, which the security of the
application tells us. Is it clear?
So the hope is that maybe we can have better entropy loss in this case. Right?
So any questions about philosophical kind of -- so this is our philosophical hope.
Yes?
>>: So the security of the [inaudible] on the side of the distinguisher?
>> Yevgeniy Dodis: So the distinguisher -- when I say restricted, in this case
restricted means both the running time, because it implicitly bounds the number of
[inaudible] of the attacker, but more importantly, as we'll see for our particular case, it
means that the distinguisher almost never succeeds.
So abstractly -- I mean, we'll talk about particular kinds of distinguishers in more
detail. All I'm saying is if you derive a key for a cryptographic application, the
kind of distinguishers you care about are not all distinguishers. They're restricted
in some way. Maybe time. I don't know. Whatever. Okay?
All right. Good.
>>: [inaudible]
>> Yevgeniy Dodis: So let me make a statement here. I hope it answers the
question here.
All right. So let's recap our setting and then I will tell our theorem.
The setting is the following. So I have some application P -- a signature or
encryption scheme -- which needs a v-bit secret key R. Okay? You know,
perfectly random; think of [inaudible] if you need, like, coins for the key generation
algorithm, whatever, right?
In the ideal model I just sample this R uniformly at random. But in the real model I
will not have [inaudible], so I will take some secret source X which only has
min-entropy, I extract a key R, and I will assume that the min-entropy of my
source is v plus L. So whatever this L is -- you know, hopefully it's large, but
whatever, right? So this is my setting.
So the assumption that I will make is that my application P is epsilon-secure in
this ideal model, whatever the security means -- unforgeable, you know, chosen
[inaudible], whatever. Okay?
So the conclusions that I hope to derive is that my application P will be epsilon
prime secure in this real model. This is my goal. And I want to make epsilon
prime very close to epsilon. Okay?
So this is our goal and this is our question. Does it make sense?
Do you have a question? Good.
So first let's apply standard left over hash lemma. So let me translate the
[inaudible] of the standard left over hash lemma because it was stated differently
to this setting.
Translated to this setting, I'm saying epsilon prime is at most epsilon plus the
statistical distance between the uniform key and the extracted key. And the
resulting statistical distance -- you can just [inaudible]; this error form is more
convenient for us -- is just the square root of 2 to the minus L. So, for
example, if you plug in L equal to 2 log 1 over epsilon, you get exactly another epsilon.
So this is -- I'm just saying if you try the parameters of left over hash lemma, this
is exactly what it tells you. It tells us that the only thing you lose is the statistical
distance in your security. Right?
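As a quick sanity check on this accounting, here is a minimal numeric sketch (the parameter values are my illustrative choices, not from the talk):

```python
import math

def standard_lhl_bound(eps, L):
    # eps' <= eps + sqrt(2^-L), where L is the entropy loss:
    # source min-entropy minus the number of extracted bits.
    return eps + math.sqrt(2.0 ** (-L))

# Plugging in L = 2 * log2(1/eps) makes the statistical term equal eps,
# so the real-model security is exactly another eps on top.
eps = 2.0 ** (-40)
L = 2 * math.log2(1 / eps)         # = 80
print(standard_lhl_bound(eps, L))  # 2 * 2^-40
```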
>>: [inaudible]
>> Yevgeniy Dodis: This is the min-entropy. Sorry. This is the min-entropy. So
you cannot predict any element with probability more than 2 to the minus of this.
Sorry, this is [inaudible]. He has actually -- I think he might want them back
because -- I'm not sure. Maybe I didn't define it. But I think it's universally known
because, you know, I'm only talking about, like, letters, like random letters. H
infinity is like universally known notation, okay? But we'll talk about it later.
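Since the H-infinity notation just came up, here is a minimal sketch of what min-entropy computes (the function name is mine, for illustration):

```python
import math

def min_entropy(probs):
    # H_infinity(X) = -log2(max_x Pr[X = x]); no predictor guesses X
    # correctly with probability better than 2^(-H_infinity).
    return -math.log2(max(probs))

print(min_entropy([1 / 256] * 256))  # uniform byte: 8.0 bits
print(min_entropy([0.9, 0.1]))       # biased bit: ~0.152 bits
```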
All right. So here is our theorem, to hopefully answer [inaudible] question, or
partially answer it: for a wide range of cryptographic applications we can prove
a tighter bound. We can argue that epsilon prime is at most epsilon plus the
square root of epsilon times 2 to the minus L -- so there is an extra epsilon
inside the square root.
>>: [inaudible]
>> Yevgeniy Dodis: Sorry?
>>: [inaudible]
>> Yevgeniy Dodis: The seed of the pseudorandom generator? What do you
mean? You mean like if I need a longer key?
>>: Yeah.
>> Yevgeniy Dodis: So the PRG -- think about the [inaudible], but essentially
think of v in this case as the security parameter. Even in the standard model, I
can always -- maybe -- I think this is what you're asking. I'm saying in the
standard model you can extract something short and apply a [inaudible] to extract
something long. Sure, you can do it. But you still need to extract at least, like,
security-parameter many bits.
So if you think of v as the security parameter, the PRG approach is not superior.
>>: [inaudible]
>> Yevgeniy Dodis: I see. Okay. So hopefully -- okay. Hopefully we'll talk, but
for now is everything on this slide clear? I mean, aside from the word wide
range. Okay. Which I'll define.
So this is our bound. So the moral of the story is: if you know why you're
extracting, you can extract more bits, right? If the security level is, for example,
fixed, you can kind of extract more bits from the same entropy -- essentially,
right? So it's kind of a special case of a general moral in life: if you know why
you're doing something, you can achieve better results. So this is kind of a
special case of this.
>>: [inaudible]
>> Yevgeniy Dodis: Yeah.
>>: [inaudible]
>> Yevgeniy Dodis: Sure. Yeah, L is the entropy loss. So L is the difference
between the min-entropy and the [inaudible] extracted bits. But here I want to
think about this way. I'm saying I just have some -- I have some min-entropy, I
don't know what it is, and I want to extract v bits. So let's define L to be the
difference and let's write the bound as a function of L.
So if L is, like, really bad I guess I will get a really bad bound, but whatever it is,
okay?
So let's compare the bounds that we have. So the standard left over hash
lemma gives you this bound. So in particular, I'm just repeating: if you want a
comparable level of security, L has to be at least 2 log 1 over epsilon if you just
plug into this formula.
But on the other hand, if you have negative entropy loss you get, like, nothing.
You get a meaningless bound. So if you want to extract exactly as many bits as
the min-entropy you have, you get nothing useful out of this, right? Even if your
application was super, super secure -- suddenly, you know, if you have
negative or non-positive entropy loss, you're screwed. You don't get anything.
All right. So let's compare it with the heuristic bound, the [inaudible] heuristic. So
let me derive -- remember this derivation about querying x and so on? It turns
out that if you plug it into this bound, the right bound becomes epsilon plus
epsilon times 2 to the minus L. Why? Because I just kind of -- maybe I'll go very
quickly back.
So 2 to the v minus m -- it's 2 to the minus L. So it's kind of the same bound. I'm
just translating it, putting not L equals zero but [inaudible]. So the bound that
we have over there is good.
So this is the bounds of the heuristic [inaudible] hopefully. Just trust me, but, you
know, I kind of derived it before.
So here is the good part -- as I told you, the moment the entropy loss is
non-negative, even if it's, like, zero, you already get a comparable level of
security. So the random oracle is this kind of magic primitive.
And another thing: the random oracle is also meaningful even if you have
negative entropy loss -- in other words, even if you extract more bits than the
entropy you have. Whereas for this guy, at least this proof just breaks down.
So intuitively the reason you get something, you kind of borrow security from
application. So this application is super, super secure, it's not like it will suddenly
become completely insecure. You can kind of borrow a few bits of security, kind
of one bit of security for every extra bit that you extract which you don't have. So
this is the intuition.
So if you look at our bound, our bound is somewhere in between. So on the
other hand, our bound is obviously better than this that has epsilon. On the other
hand, it's worse than this because it has a square root, and if you take a square
root of a number less than 1 you get a bigger number.
But it's still hopefully -- so, first of all, if you want comparable security we only
need entropy loss log 1 over epsilon as opposed to 2 log 1 of epsilon. So this is
kind of halfway in between. But also, it's kind of interesting that, just like here,
we can borrow security [inaudible] entropy loss. For the first time we argue that
an existing extractor has the same properties, kind of [inaudible], in the sense
that you can gracefully borrow security from the application. It's not like you
suddenly die, or kind of flip a switch and there is no security. So this is kind of
another -- yeah?
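To make the three bounds concrete, here is a small side-by-side sketch (the parameter values are my illustrative choices):

```python
import math

eps = 2.0 ** (-40)  # ideal-model security, illustrative choice

def standard_lhl(L):   # eps + sqrt(2^-L)
    return eps + math.sqrt(2.0 ** (-L))

def ro_heuristic(L):   # eps + eps * 2^-L (random-oracle-style accounting)
    return eps + eps * 2.0 ** (-L)

def new_bound(L):      # eps + sqrt(eps * 2^-L) (the talk's bound)
    return eps + math.sqrt(eps * 2.0 ** (-L))

# Comparable security (2 * eps) needs L = 2*log2(1/eps) = 80 for the
# standard bound, but only L = log2(1/eps) = 40 for the new bound.
assert standard_lhl(80) == 2 * eps
assert new_bound(40) == 2 * eps
# At zero entropy loss the standard bound is vacuous (about 1), while
# the heuristic and the new bound degrade gracefully.
print(standard_lhl(0), ro_heuristic(0), new_bound(0))
```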
>>: [inaudible]
>> Yevgeniy Dodis: [inaudible].
>>: [inaudible]
>> Yevgeniy Dodis: [inaudible].
>>: [inaudible]
>> Yevgeniy Dodis: Well, our result is for a wide range of applications, P epsilon
prime is this. This is our result. And I will tell what wide range of applications on
the next slide. But for now I just wanted to kind of compare before confusing you
with some technical things or I'm confusing you. I don't know. You'll see.
Yeah?
>>: [inaudible]
>> Yevgeniy Dodis: But I'm saying it has nothing to do with the application. I'm
just saying, in general: this application is super secure. Look, if I have slightly
non-uniform things, the security -- if epsilon is so low, it cannot just suddenly
jump to 1 just because, like, one bit isn't secret. It's like if you know one bit of
the secret key: for example, if I generate a secret key which always starts with
zero, I know that I lose at most a factor of two in security, because the attacker
can guess it. So it's nothing super magical, but yeah.
>>: I don't really understand when you say borrow security from the application.
>> Yevgeniy Dodis: Well, I'm saying the epsilon is the security of the application
at hand. Remember, in our setting epsilon was the ideal security. So I'm saying
if epsilon is very good, epsilon times 2 to the minus L could still be a small
number. So if L is, like, minus 17, it will be epsilon times 2 to the 18. So kind
of -- as long as epsilon is better than 2 to the minus 18, I kind of borrow a little
bit of security to get a useful bound.
Yes?
>>: [inaudible]
>> Yevgeniy Dodis: It's not like -- as we'll see, we are not going to derive -- I
mean, we actually argue that standard LHL suffices. That's what we're saying.
We're saying use standard LHL, but for a wide range of applications standard
LHL -- so this is back to John's question. Yeah, I'm using the same thing. I'm
just saying previous analysis of LHL was very crude. You cannot separate the
security from application from statistical distance and it just sums them up.
We're saying we have a tighter security analysis for the LHL itself, yes. Sorry, I
misunderstood John's question. We're not deriving a new extractor; we're just
giving a better proof technique for the existing extractor.
Okay. Good. So which applications, now, hopefully. So it turns out pretty much
everything, but let me put some -- to be formal. So all unpredictability
applications -- unpredictability applications are all applications where the goal
[inaudible] is to guess the whole long string correctly.
So like MAC, [inaudible] signature, one-way function, identification scheme --
anything like that actually works. And it also includes prominent
indistinguishability applications, including stateless chosen plaintext or chosen
ciphertext secure encryption, [inaudible] doesn't matter; and also something that
we'll see is important in a second, something called weak pseudorandom
functions, where the attacker only learns the value of the function at random
points, as opposed to adversarially specified points.
But unfortunately -- so in red I'm going to put things which we cannot cover, so
maybe this is to [inaudible]'s question. I cannot cover deriving a key for
pseudorandom functions, block ciphers, you know, stream ciphers or the
one-time pad, at least not directly.
So this is the boundary, and I will explain in a second why the boundary is
where it is. But now let me make one remark. You can ask: what do we do if
we want to derive a key for these applications?
So the first answer is, you know, nobody uses PRFs and PRPs directly, right?
They always use these kinds of building blocks for something else. So I'm
saying, for this theorem, all I care about is what your final application is. It
doesn't matter that you use PRFs, IBE or whatever, like, as building blocks to
build the final application. If the final application is green, this is fine.
So, for example, if you derive a key for AES encryption in CBC mode: yes,
you'll get a [inaudible] key for AES, but because you're going to use it for CBC
encryption, and CBC encryption is green, this is fine. You still get the improved
entropy loss.
But if you invent some application which you don't tell me -- where you say it's
based on PRFs, but you're not telling me what the application is -- a PRF by
itself is not an application; it's just a security definition, not a useful end
application by itself. If you don't tell me, I say: look, until you tell me, I cannot
tell you what you want to do.
But a stream cipher is a useful application -- it's like a stateful encryption. So
I'm saying if you want to derive those, you cannot use the left over hash lemma
directly, but here is a [inaudible] that you can do. Because we include weak
PRFs -- because weak PRF is a [inaudible] application -- what you can do is
first extract a key, assuming this application is green, which weak PRF is, and
then just apply the weak PRF at a random point to your key.
So essentially -- let me see, is it clear what I'm saying? So more specifically,
I'm saying: make the seed a little bit longer. In addition to the seed of LHL, put
a random point in the domain of, let's say, AES. And then first you apply LHL,
and then you evaluate AES, keyed with the extracted key, at this random point,
and you have your final key.
So now you can actually use PRFs, PRPs, stream ciphers. You still cannot use
the one-time pad, because this last step was only computational. So I'm saying
if you just compose it, you can now use any application which is computationally
secure. So you can use the one-time pad -- it's just that now the one-time pad
will only be computationally secure, so it will become like a stream cipher. Okay.
And the cost is kind of manageable. It's like one extra PRF [inaudible], plus you
make the seed a little bit longer: you include one more point in the seed.
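A rough sketch of this two-step derivation, where the concrete universal hash family and the use of SHA-256 as a stand-in for an AES-based weak PRF are my illustrative choices, not the talk's:

```python
import hashlib
import secrets

P = 2 ** 127 - 1  # Mersenne prime; the weak source is an integer below P

def universal_hash(a, b, x, v):
    # h_{a,b}(x) = ((a*x + b) mod P) mod 2^v: a classic (almost-)universal
    # family, the kind of extractor the Leftover Hash Lemma applies to.
    return ((a * x + b) % P) % (2 ** v)

def weak_prf(key, point):
    # Stand-in for a weak PRF such as AES at one random point; SHA-256
    # here is only an illustrative placeholder, not the talk's choice.
    return hashlib.sha256(key.to_bytes(16, "big") + point).digest()

# Seed = hash description (a, b) plus one extra random point w in the
# PRF domain -- exactly the "slightly longer seed" described above.
a, b = secrets.randbelow(P), secrets.randbelow(P)
w = secrets.token_bytes(16)

x = secrets.randbelow(P)          # weak secret: min-entropy only
r = universal_hash(a, b, x, 128)  # step 1: LHL-extract a 128-bit key
final_key = weak_prf(r, w)        # step 2: evaluate the weak PRF at w
```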
>>: [inaudible]
>> Yevgeniy Dodis: So a weak PRF is a pseudorandom function where the
attacker is only given the values of the function at random points. He cannot
choose the points. He just sees: here is random point 1, value 1; random point
2, value 2.
So this is -- yes?
>>: [inaudible]
>> Yevgeniy Dodis: No. So all I'm saying, the suggestion about -- so stream
ciphers, I think it's a well-known cryptographic thing --
>>: [inaudible]
>> Yevgeniy Dodis: [inaudible] if you read like Animal Farm, it's almost like you
have this rule, but then there are always -- like, after the [inaudible], you're free
to put clarifying, like, footnotes. The footnotes only talk about single letters,
like a, b, c, d. Good point. Thanks for the typo. Good. Steam ciphers. Good.
So anyway, now I don't want to talk about this point, I just want -- I mean, this is
like a little [inaudible] to include all applications, but now I just want to tell you
what is the magic boundary between -- what is the magic boundary between red
and green because it seems like a very weird boundary.
Any questions? Yes?
>>: I have a question about why it doesn't work for the PRF [inaudible] piece.
>> Yevgeniy Dodis: That I'm explaining now.
>>: So you said it works for [inaudible] encryption, but let's say if my encryption
scheme is just the output of a PRF [inaudible] with my message, why --
>> Yevgeniy Dodis: The output of the PRF on what? On the random point? If
it's on the random point, great. Because in fact, for this case, all you need --
you actually only need weak PRF security.
So I'm saying it doesn't matter. If you talk about output over PRF [inaudible] that
becomes kind of a stream cipher, then it's not covered, and indeed this is not
covered for a good reason. It's kind of tight in this case.
But the question why it's not clear, that's what I'm explaining in the next two
slides.
Yes?
>>: [inaudible]
>> Yevgeniy Dodis: That's actually a good question. I think it is -- I mean, I
cannot --
>>: [inaudible]
>> Yevgeniy Dodis: Yeah. So it is tight. [inaudible] really stupid. So, I mean,
I'm giving you a fixed distribution. So you have a number p which is somewhere
in between 2 to the k and 2 to the k plus 1, right? And you have a uniform
random number from 1 to p. But now I'm forcing you to extract a bit string, so
the output range is a power of 2. If you just truncate it you get, like, constant
[inaudible]. So essentially you need to truncate log 1 over epsilon more
significant bits, intuitively. So you can kind of argue just by counting: because
[inaudible] in between, and I'm forcing you to output bits, you have to lose log 1
over epsilon for a stupid reason, just because you need to output bits.
So, yes. So essentially -- yeah, that's actually a good question. So essentially
[inaudible]'s point was that, you know, unless you make some kind of heuristic
assumption, that's the only way to achieve something. Outside the [inaudible]
heuristic, with a provably secure extractor, you must lose log 1 over epsilon.
Yes. So there is a lower bound: even if I tell you a very beautiful, nice
distribution, as long as it's not friendly to bits, you have to lose at least log 1
over epsilon. So in this sense we are matching it, yeah.
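The "stupid reason" shows up already in the smallest possible example; a sketch with tiny parameters of my choosing:

```python
from fractions import Fraction

# Smallest example: p = 3 lies between 2^1 and 2^2, X is uniform on
# {0, 1, 2}, and we are forced to output whole bits -- take the low bit.
p = 3
probs = {0: Fraction(0), 1: Fraction(0)}
for x in range(p):
    probs[x % 2] += Fraction(1, p)

# Pr[0] = 2/3, Pr[1] = 1/3: a constant statistical distance from uniform,
# no matter how secure the application is.
stat_dist = (abs(probs[0] - Fraction(1, 2)) + abs(probs[1] - Fraction(1, 2))) / 2
assert stat_dist == Fraction(1, 6)
```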
>>: [inaudible]
>> Yevgeniy Dodis: Just for stupid -- I mean, yeah. Good question. Yeah.
So now let me tell you the magical condition. Hopefully it's not that technical, but
it's a little bit technical. So the condition is I can look at the advantage of the
attacker on a particular key. So given a key R, let me define the function F of R,
which is application specific, which is intuitively -- what is the advantage of the
attacker on this particular key R? How well does he break my scheme on this
particular key R?
And let me look at -- for whatever reason -- the variance of this attacker's
advantage. So I square the attacker's advantage. And now I can look at this
as a variance when the key is really uniform.
So I'm saying if I can upper bound the variance of the attacker's advantage by
epsilon by the security of my original application, then my bound applies.
So more generally -- I mean, you know, this is really this variance. In general, I
can put a bound on the variance of the attacker's advantage, whatever it is for
the application. For the green applications I will argue that this variance is at
most the security of the application. But this is kind of the magical condition.
>>: [inaudible]
>> Yevgeniy Dodis: It's coming up, yes. For now this is just a statement, and I
will tell you the intuition.
So this is the magical condition separating green and red. So as I'll show on the
next slide, this condition is true for green applications but may be false for red
applications.
So this is the condition. So first, I claim that for unpredictability applications it's
trivial, because for unpredictability applications the expected value of the square
is at most the expected value of the advantage itself, which is epsilon, because
the advantage is between 0 and 1.
So if you square a number which is, you know, between 0 and 1, you know, it's
less than the number itself, and the number itself is -- the expected value of the
number itself is epsilon. So it's kind of immediately true for all unpredictability
applications.
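The pointwise fact just used is easy to sanity-check numerically (a trivial sketch; the sampled "advantages" are synthetic):

```python
import random

random.seed(1)
# For unpredictability applications the advantage f(R) lies in [0, 1],
# so f(R)^2 <= f(R) pointwise, and E[f^2] <= E[f] = eps follows at once.
adv = [random.random() for _ in range(100_000)]
e_f = sum(adv) / len(adv)
e_f2 = sum(a * a for a in adv) / len(adv)
assert e_f2 <= e_f
```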
So now, I know you might be a little bit confused, so let me show why it's false
for the one-time pad, and hopefully you will see what the issue is.
So for the one-time pad we expect it to be false. Essentially, if I could avoid the
entropy loss for the one-time pad, you would know that something is wrong,
because intuitively the one-time pad is kind of equivalent to uniform randomness.
I mean, it's kind of a one-to-one correspondence. But let's see.
So let's look at a particular attacker. It's like a [inaudible] one-time pad, just to
make it simple: the attacker just takes the ciphertext and outputs the ciphertext
itself. What is the advantage of this attacker on different keys?
If the key is zero, he is correct, because he indeed outputs the message, right?
So his advantage over random guessing is a half. So his advantage is a half.
But on the other hand, if the key is 1, he's always wrong; his advantage is minus
a half. So the expected value of the advantage is zero -- so it's a perfectly
secure scheme, well, you know, over a random key R -- but if you square it,
unfortunately it becomes a quarter. Right?
So indeed, we are not getting -- we cannot bound, you know, a quarter by zero.
So it's kind of -- I mean, I didn't tell you why the condition is sufficient, but
assuming you believe this condition, that means it's justified, right?
And indeed, similar examples can be made for pseudorandom functions and
stream ciphers. It's actually easier to at least argue -- you know, to give you
examples. At least -- I mean, they may not follow from this one, but even for
[inaudible] I can give you some examples like that.
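The one-bit one-time-pad counterexample in code form, a direct transcription of the argument above:

```python
# One-bit one-time pad; the attacker outputs the ciphertext itself.
# Signed advantage on key r: +1/2 if r = 0 (always right), -1/2 if r = 1.
f = {0: 0.5, 1: -0.5}

expected_adv = (f[0] + f[1]) / 2        # 0: perfectly secure on average
variance = (f[0] ** 2 + f[1] ** 2) / 2  # 1/4: the squared condition fails

assert expected_adv == 0.0
assert variance == 0.25
```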
>>: [inaudible]
>> Yevgeniy Dodis: Well, the advantage in this for a particular application, so
you're close to making some money here, but -- so the advantage is -- I should
maybe lower my thing like 10 cents. And so I'll get broke here. Okay. It's too
late. It's recorded, I guess.
Right. So what was your question? Sorry. I lost --
>>: Never mind.
>> Yevgeniy Dodis: Never mind. I asked it. Okay. Good [laughter].
Oh, yeah. It's the advantage [inaudible]. Yeah, so in this case, for a particular
cryptographic application, the advantage will be the probability of, let's say,
guessing, minus a half -- the absolute value of that, yeah. Well, here without
the absolute value, sorry.
Yes?
>>: [inaudible]
>> Yevgeniy Dodis: Right. But for [inaudible] it's -- let me see.
>>: [inaudible]
>> Yevgeniy Dodis: So let me define it as absolute value. So for one-time pad
you know it's secure, so on average this expected value of this particular
attacker -- I mean, trust me, you translate it, you know that this attacker is not
winning the one-time pad game. All right.
Okay. Good. So -- but now here is an interesting question -- well, everything is
taking longer than I thought, but at least to finish this thing: why is this true for
CPA-secure encryption? For unpredictability applications it was easy. It seems
like it could be false for encryption applications, because the attacker could
flip-flop around the half, so overall his expected value is zero, but -- oh, yeah.
The answer is because the absolute value of the attacker's [inaudible], but the
probability is taken, without the absolute value, over the coins of the key
generation algorithm. So in the computed advantage, over the coins of the key
generation algorithm, I don't take the absolute value. Yeah.
All right. So let's look at the CPA encryption case. So here is the insight -- yes?
>>: [inaudible]
>> Yevgeniy Dodis: Yes. So CPA-secure encryption is the encryption where
you can ask encryption [inaudible]. So actually you will see it [inaudible] why
one-time pad is excluded on this slide.
So here is the insight. The insight is that for any attacker A making q encryption
queries -- think about symmetric-key encryption, because it's going to be a little
bit cleaner -- there is another attacker B making 2q plus 1 encryption queries,
such that the advantage of this new attacker on any particular key R is
essentially the square of the original attacker's.
So essentially, for any attacker A, there is another attacker B whose advantage
is non-negative. Because otherwise an attacker could have negative advantage
on some keys and positive advantage on other keys, and maybe they kind of
balance out while the square makes it positive. But I claim that this is
impossible, using this argument.
So let me just explain why this is the case. So here is attacker B. Attacker B
kind of says: okay, I have -- so let's say John is my [inaudible], John is my
challenger. But first I, you know, [inaudible] use attacker A. So essentially,
first I say: okay, let me see if Melissa is any good. I'm going to use Melissa to
answer [inaudible]; let me see if Melissa knows what she's talking about.
So first I'll kind of simulate the attack between John and Melissa myself. So
whenever Melissa asks an encryption query, I'll ask John the encryption
[inaudible] forward it to Melissa. Melissa says: I can distinguish encryptions of
two messages. I pick one of those two messages myself, I ask John to encrypt
this message, I forward it to Melissa. Then she, you know -- she tells me which
message it is, and I see if she succeeded or not.
If she succeeded, I say: oh, she probably knows what she's talking about.
Otherwise, you know, you will see [inaudible]. And then there's a second stage,
where I actually run it for real, with Melissa [inaudible]. So I again forward
Melissa's encryption queries to John. Now when Melissa says she can
distinguish encryptions of two messages, I tell John I can distinguish encryptions
of two messages, I get the ciphertext, Melissa gives me the answer, and now I
decide: should I trust Melissa's answer or not?
If she was right in the first stage, then probably she's likely to be right in the
second stage. Otherwise, she might be confused, and [inaudible] her answer.
So this is the thing.
And it's very easy to see, like in the [inaudible] computation: once you fix the
key R, essentially I'm running the same experiment twice, because the key R is
fixed. It's very easy to see what my advantage is, because I kind of --
intuitively, I succeed if Melissa is right both times or Melissa is wrong both
times. So this is kind of a simple calculation.
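A sketch of the calculation behind "essentially the square", assuming the two simulated runs are independent once the key is fixed (the exact constant 2 is my accounting of this simplified version; the talk only claims "essentially"):

```python
def b_advantage(f):
    # Fix a key R; Melissa is right with probability p = 1/2 + f on R.
    # B keeps her second answer if she was right in the warm-up stage
    # and distrusts it otherwise, so B is correct when Melissa is right
    # twice or wrong twice.
    p = 0.5 + f
    p_correct = p * p + (1 - p) * (1 - p)
    return p_correct - 0.5  # = 2 * f^2: non-negative, "essentially" f^2

for f in (-0.5, -0.1, 0.0, 0.2, 0.5):
    assert abs(b_advantage(f) - 2 * f * f) < 1e-12
    assert b_advantage(f) >= 0
```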
All right. So why -- why is it useful? Because now I know that what I care
about is the expected value of the advantage of A squared; this is what I care
about to apply my result.
But because A squared is simulated by the valid attacker B, I know this is at
most this -- I just proved it to you. And B is a valid CPA attacker, right? So his
advantage must be at most epsilon, right? He made double the number of
queries, but this is still reasonable.
Now, of course, so this is epsilon; therefore, this is also at most epsilon. So
here is the specific corollary for encryption schemes. It's a very weird
statement. I'm saying if the encryption scheme is secure against attackers
running in time 2t plus 1, making 2q plus 1 queries, with security epsilon in the
ideal model, then it is secure with parameters t, q but a much better parameter
epsilon prime -- I can put this epsilon inside here -- with this LHL-extracted
randomness.
So you can see here the tradeoff: I lose a completely insignificant factor of 2 in
the running time and the number of queries -- instead of maybe, like, 2 to the
80 I get, like, 2 to the 79 -- but I gain a huge factor in the entropy loss, because
previously, if I didn't want to lose those things, I wouldn't be able to put epsilon
here, but now I can. So this is like a favorable tradeoff.
So now, to answer [inaudible]'s question why the one-time pad is not included:
because of the 2q plus 1, even if q is 0 -- even if you want to prove security
against 0 queries -- you need security against at least one query, and the
one-time pad obviously is insecure against one query.
So this is the result. So the technical tool is [inaudible] left over hash lemma --
well, okay, I'll say something. Don't worry. This is the most technical slide, but
I'll just tell you what you need to know about it.
Essentially, what you need to know about it is the following: we didn't invent a
new proof of the left over hash lemma; we just did the standard proof, being
very careful about not losing some terms which were originally just upper
bounded by 1. So if you do actually -- I'm cheating a little bit because there are
some subtle issues -- but if you do the proof in the right sequence of steps, you
argue that the advantage of the distinguisher in telling the extractor's output
[inaudible] is what the left over hash lemma gives.
And there is one more term which is exactly -- intuitively, it kind of corresponds
to, essentially, the variance of the advantage of the attacker, where [inaudible]
with the constants that I choose. And usually you can just upper bound it by 1,
because you say: this is some kind of probability, it doesn't matter what it is, it's
some kind of probability, so it's at most 1. And then you get a bound which is
independent of the distinguisher, and you get the standard left over hash lemma.
And we are saying, you know, for green applications we don't want to lose this
term. For green applications this term actually corresponds to a
cryptographically meaningful quantity, which is, you know, the square of the
advantage of the attacker, which for cryptographic applications could be small.
This is kind of a high-level thing, so I'm not going to give you the proof. There
is no novelty in the proof; it's kind of the observation that we don't need to lose
this term.
All right. So now I guess we have a question to answer, and this was maybe,
like, 60 percent of the talk. So it's almost -- I guess I started at 2:35, so shall I
continue now? It will take me 20, 25 minutes.
All right. So maybe it's a good -- so people who want to leave, I guess, and I'll
take another 20 minutes and I'll try to finish within 20 minutes.
>> Kristin Lauter: [inaudible]
>> Yevgeniy Dodis: Okay. Good.
All right. Any questions before I move to the second part of the talk which is
essentially how to improve the seed lengths? Okay. Good.
So a better seed length. So first let me very quickly talk about some
approaches before I tell you our approach. The first approach, you can say:
maybe we don't need the perfect universality of the left over hash lemma's
universal hash function. Maybe we can relax universality. And it turns out that
if you relax universality -- if you optimize this value gamma, if you make gamma
[inaudible] with 2 to the v -- you can reduce the seed length from being
[inaudible] in the input length to being [inaudible] in the output length, which is a
significant saving.
So this is very good [inaudible] for extracting, let's say, an AES key, because v
is, like, you know, 128. This is very good. But at least in theory, if I have a lot
of entropy and I want to extract all of it, it's not very good -- the entropy is
[inaudible] in the source length, so it's still not very good.
And also it's possible -- well, anyway, let me -- almost-universal hash functions
actually seem to be slow to construct. You know, if you really want a short seed,
almost-universal hash functions seem to be a noticeable slowdown compared to
perfectly universal hash functions with a long seed. But this is [inaudible]; I'm
not sure if I'm correct here. That's based on my knowledge.
And you lose algebraic properties. An almost-universal hash function [inaudible]
product and so on.
product and so on. So anyway, we'd like to do better. And recall in theory I can
do really, really well, right? In theory I can do well. And there is essentially an
infinite list of papers, including two of these top folks, which kind of tries to make
the seed lengths kind of closer and closer to this number. I think that it actually
stopped in the last few years. I think the current record is this GUV07. It comes
very close to it, but it -- in practice, also beautiful things. There's a lot of
theoretical applications, but this is really an inefficient construction so nobody's
going to use them in practice.
So the question is: can we really have something with the same simplicity as
the left over hash lemma, but with a short seed length? And what we're going
to do is something very natural. We're going to define the expand-and-extract
approach, which essentially -- I will come back to this -- okay, there are some
funny pictures, so let me explain what it means, I guess.
So recall we are trying to get a very low seed length. And now I'm introducing
one more letter, k, which is a security parameter, because I'm going to use
some cryptographic scheme in a second. I'm going to use [inaudible] in a
second.
So everything is kind of upper bounded by security parameters, so in theory we
can hope to achieve seed length which is kind of [inaudible] in security parameter
if you want to extract for cryptographic applications. But left over hash lemma
forces us to have this huge seed.
So an idea, of course, for cryptographers is very natural: use a pseudorandom
generator. A pseudorandom generator is something that takes a short seed
and kind of allows you to get a big seed which is computationally
indistinguishable from a truly random long seed.
So this kind of is a very natural suggestion. You have a very good extractor
which requires, you know, a long seed length, red. Red is bad. Green is good.
So what I'm going to do, I'm going to take a short seed, I will expand it using a
cryptographically secure pseudorandom generator. This is how I define my
extractor.
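Structurally, the proposal looks like the following sketch, where the counter-mode SHA-256 "PRG" and the modular universal hash are my placeholder choices; note that the talk's punchline, coming up, is that publishing the short seed can make exactly this kind of construction insecure:

```python
import hashlib
import secrets

def prg(seed, out_len):
    # Toy counter-mode PRG; SHA-256 stands in for the fast stream
    # cipher the talk mentions (an illustrative placeholder only).
    out = b""
    ctr = 0
    while len(out) < out_len:
        out += hashlib.sha256(seed + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:out_len]

P = 2 ** 521 - 1  # Mersenne prime, big enough for a long hash description

def expand_and_extract(short_seed, x, v):
    # Expand a short seed into the long description (a, b) of a universal
    # hash, then extract v bits from the weak source x.
    expanded = prg(short_seed, 132)
    a = int.from_bytes(expanded[:66], "big") % P
    b = int.from_bytes(expanded[66:], "big") % P
    return ((a * x + b) % P) % (2 ** v)

key = expand_and_extract(secrets.token_bytes(16), secrets.randbelow(P), 128)
```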
>>: [inaudible]
>> Yevgeniy Dodis: So you're jumping to the next slide, but as a seed, it has to
be public, right? It will have to be public eventually, but, yeah. Good point.
But for now the idea hopefully is natural. We'll try to see if the idea is secure in a
second.
So this, hopefully, is a way to get a short seed while avoiding reading all the
[inaudible] papers, right? It's very friendly because stream ciphers are
pseudorandom generators you can kind of expand online, so it's very friendly to
streaming sources -- a very practical idea. And it can result in a very fast
implementation if you use a very fast stream cipher and a fast extractor.
So our hope is, of course -- because we use a computational assumption, our
hope is that the extracted [inaudible] will be pseudorandom, because intuitively,
I mean, we will give up, probably, statistical security, but for cryptographic
applications it should be good enough.
And the question is, is this idea sound? Unlike cryptographic hash functions
where we need to make some random oracle things, here it will be, like, based
on a very nice assumption like pseudorandom generators.
So shall we take a poll? Who thinks that this idea is sound and who didn't read
my abstract? All right. Let me not take the poll. So here is a trivial observation
which is easy to show. The trivial observation is that the extracted bits are
pseudorandom if I only give you the expanded seed: if they were not, you could
tell apart a truly random long seed from a pseudorandom seed. So this is easy.
This is like a simple hybrid argument, right?
But unfortunately this just tells me that if I want to use pseudorandom bits instead
of 1 million random bits in the left over hash lemma, I need to publish the
1 million pseudorandom bits, so the effective seed length is still 1 million. I want
a very short seed length.
So what I need to argue is that even if I give you the short 128-bit
seed used to expand my stuff, even then the attacker cannot tell apart random
from pseudorandom. So is it clear? To really call it a short seed, I need to
argue this. I need to argue that it's safe for me to publish the seed of the
PRG.
And now it's not clear because, you know, if I give you the seed, it's no
longer pseudorandom, right? Okay. So this is kind of the subtlety. And indeed
our results will be three theorems that I will go through quickly.
Our first result says that under a standard cryptographic assumption called DDH,
there exists a pseudorandom generator G, in fact, a very nice pseudorandom
generator, and a very natural extractor X -- in particular, it's a universal hash
function -- such that this conclusion is not only false, it is as false as it can get.
Namely, there exists a distinguisher who distinguishes those
two distributions. His advantage is essentially 1 on any source, even the uniform
distribution. So this is, like, really, really bad.
So therefore in general the expand-and-extract approach is insecure. So let
me give you the counterexample, and then, despite saying that it's insecure in
general, I will show you two ways to salvage it, which show this approach is
secure despite the counterexample.
So first let's see the counterexample. It will fit on one slide. I hope most people
know what the DDH assumption is. Let's say you have some prime order group
with generator g. It just says that those two distributions are indistinguishable.
So under the DDH assumption I need to tell you two things, a pseudorandom
generator and an extractor. So let's start with the pseudorandom generator. The
pseudorandom generator is this ugly guy, but it's arguably not so bad. It takes
three numbers -- three exponents -- and outputs six group elements.
I won't give the full proof of why this is a good PRG, but it is a special case of
what is known as the Naor-Reingold pseudorandom function, and there is a
hybrid argument which shows that this is a secure PRG.
So just take it for granted that under the DDH assumption this is
pseudorandom; I won't prove it here.
So now I need to define the extractor. And the extractor I define is essentially
matrix multiplication in the exponent. So my source is going to be three
exponents, my seed is going to be six bases. It's a long seed, but never mind, it
doesn't matter. In fact, we know it should be a long seed. This is my
extractor. It outputs two group elements in this case.
So I need to check that this is a universal hash function. This is easy. For any
two distinct inputs x, y, z and x prime, y prime, z prime, it's easy to see that if I
choose A, B, C at random, the probability of a collision in the first coordinate is
1 over the group size. The same thing with D, E, F, so overall the probability of
collision is 1 over p squared, which is exactly 1 over the range size. Here the
range is not bit strings but pairs of group elements, but let's ignore this little detail.
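This universality claim can be checked exactly in a toy group. The parameters below (q = 11, P = 23 = 2q + 1, and g = 4 generating the order-q subgroup of squares mod 23) are my own toy choices for illustration; since the two output coordinates use independent halves of the seed, the overall collision probability is the square of the per-coordinate one.

```python
from itertools import product
from fractions import Fraction

q, P, g = 11, 23, 4  # toy prime-order group: g generates the
                     # order-11 subgroup of squares mod 23

def coord_collision_fraction(inp1, inp2):
    """Exact fraction of seed triples (a, b, c) on which one output
    coordinate (g^a)^x (g^b)^y (g^c)^z collides for the two inputs."""
    x, y, z = inp1
    xp, yp, zp = inp2
    hits = sum(
        pow(g, (a*x + b*y + c*z) % q, P) == pow(g, (a*xp + b*yp + c*zp) % q, P)
        for a, b, c in product(range(q), repeat=3)
    )
    return Fraction(hits, q**3)

# Two independent seed halves, so the full collision probability is the
# product of the per-coordinate ones.
f = coord_collision_fraction((1, 2, 3), (4, 5, 6))
print(f, f * f)  # → 1/11 and 1/121: exactly 1 over the range size p^2
```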
All right. So this is a perfect universal hash function. Now let's see what happens
if I try to extract not using a random seed but using a pseudorandom one. All right?
You plug it in and you get some ugly formula, and the way to parse it is
to notice that essentially our PRG has this property that every element on the
right is the matching element on the left raised to the power c: g to the a and
g to the ac, g to the b and g to the bc, g to the ab and g to the abc. So the same
thing happens here in combination, and the point is the second output here has
an extra c compared to the first output.
So essentially the second output is the first raised to the power of c, so I'll let
you stare at it for a second. Hopefully it's an easy verification.
In particular, if I give you the PRG seed, which in particular includes c, you can
just easily test. You get two group elements, you take the first one and raise it to
the power of c. If you get the second, you know that you're in the extracted world
because in a truly random world this essentially never happens. So this is the
counterexample.
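The whole counterexample fits in a few lines of code. The group below is a toy one chosen for illustration (P = 2039 = 2q + 1 with q = 1019 prime, g = 4 generating the order-q subgroup); real DDH groups are of course far larger, but the structure is the same.

```python
import random
random.seed(1)

q, P, g = 1019, 2039, 4  # toy DDH-style group: g generates the
                         # order-q subgroup of squares mod P = 2q + 1

def prg(a, b, c):
    """The Naor-Reingold-style PRG: three exponents in, six group
    elements out; each right-hand element is the matching left-hand
    element raised to the power c."""
    A, B, C = pow(g, a, P), pow(g, b, P), pow(g, a * b % q, P)
    return (A, B, C, pow(A, c, P), pow(B, c, P), pow(C, c, P))

def ext(seed, x, y, z):
    """Matrix multiplication in the exponent: a perfect universal hash,
    hence a good extractor by the left over hash lemma."""
    A, B, C, D, E, F = seed
    return (pow(A, x, P) * pow(B, y, P) * pow(C, z, P) % P,
            pow(D, x, P) * pow(E, y, P) * pow(F, z, P) % P)

def distinguisher(c, out):
    """Knowing the seed component c, test whether the second output
    is the first raised to the power c."""
    u, v = out
    return v == pow(u, c, P)

a, b, c = (random.randrange(1, q) for _ in range(3))
x, y, z = (random.randrange(1, q) for _ in range(3))

# With the expanded (pseudorandom) extractor seed the test ALWAYS passes:
assert distinguisher(c, ext(prg(a, b, c), x, y, z))

# With a truly random extractor seed it essentially never does:
hits = sum(
    distinguisher(c, ext(tuple(pow(g, random.randrange(q), P)
                               for _ in range(6)), x, y, z))
    for _ in range(200)
)
print(hits)  # roughly 200/q, i.e. almost always 0
```

Publishing c here corresponds exactly to publishing the short PRG seed, which is what the expand-and-extract approach requires.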
So now I will tell you two kind of positive results to complement this -- yeah?
>>: [inaudible]
>> Yevgeniy Dodis: So I need something like DDH. You'll see on the next slide.
>>: [inaudible]
>> Yevgeniy Dodis: Yeah, yeah.
>>: Why do I care about this? Can you say it again?
>> Yevgeniy Dodis: Oh, so essentially I'm showing -- the question is is the
expand and extract approach secure, is it safe for me to publish the seed for the
PRG and then you expand the PRG, and I'm saying it's not secure because there
is a counterexample.
I give you some -- you know, not so artificial PRG, not so artificial extractor such
that this approach is very, very insecure.
>>: Why is this extractor actually statistically secure?
>> Yevgeniy Dodis: Because it's a universal hash function. So I argue that this is
a perfect universal hash function.
>>: [inaudible]
>> Yevgeniy Dodis: So what I'm saying is I need to do three independent things.
The first independent thing is I need to tell you a pseudorandom generator. This
is a pseudorandom generator.
The second thing, I need to tell you an extractor. And I will tell you even more
than an extractor: I will give you a perfect universal hash function.
So I need to prove that this is a perfect universal hash function. Here is the
proof. This is a perfect universal hash function. Left over hash lemma tells me
any perfect universal hash function is an extractor. This gives me this bullet.
Now the third bullet is why is the composition kind of insecure? I gave you the
composition. It's insecure. So okay, good.
All right. So now how can we salvage this result? So first observation: we claim
that you can actually prove security of this approach if you extract very few bits.
What I mean, intuitively, is that we show the expand-and-extract approach is
secure when the number of extracted bits v is, informally, less than the log of
the PRG security. There are quotation marks because that's undefined; I'll define
it so it's legal.
So more formally, what it means is that if my PRG is secure against circuits of
size essentially exponential in v over epsilon, then this is okay. So this is what I
mean by log of the PRG security. So roughly speaking, the number of extracted
bits is at most the log of the PRG security, where the PRG security means the
size of the circuits against which the pseudorandom generator is secure.
So this is a statement.
And the second observation is actually very strange. Remember that I told you
that when you use a PRG it seems like, well, you have to settle for computational
indistinguishability. The extracted bits will be computationally secure. It turns out
we're using the PRG security in a different part of the proof in some weird way.
We actually get a regular extractor. We can show that as long as this is a good
PRG, we get a regular extractor. We don't have to settle for computational
indistinguishability. The bits are statistically random. So this is in some sense a
little bit weird, but -- so this is the statement.
And the third note is, yeah, the min-entropy is preserved, but we lose a little bit
on the error, at least in our proof. Our proof shows that the error has to, you
know, potentially increase, but it's still reasonable.
So this is essentially the theorem statement. The corollary, if you believe the
theorem statement, is that if k is the security parameter, then because
pseudorandom generators are secure against polynomial-size circuits, it's
always safe to extract a logarithmic number of bits in the security parameter.
And if you assume an exponentially hard pseudorandom generator, you can
even extract some small constant fraction of the security parameter, depending
on the security of the PRG.
So I put a smiley face. It looks like it's a great result, so maybe we'll just take a
PRG with a big enough seed length: let's assume an exponentially strong
pseudorandom generator, let's hope that AES in counter mode is a strong
enough pseudorandom generator, and let's just, I don't know, get good
parameters.
Unfortunately, think about what seed lengths n you can achieve if you believe
that theorem. I didn't tell you how to prove it.
I claim that the best seed length n -- if I need a PRG which is secure against
circuits of this size, its seed has to be at least the log of that size, because I can
always enumerate over all the seeds. So if the seed is very short, I can just
explicitly enumerate over all the seeds until I hit the right one. Right?
So if you take the log of this number, you get order of v plus log 1 over epsilon.
And now, to check if you guys are paying attention, you'll know how to read the
last bullet. This number appeared somewhere in the talk. It appeared when I
used almost universal hash functions, which are maybe slightly slower -- well,
actually, they can be faster -- anyway, if I use almost universal hash functions
with a small key, which is not such a huge loss, and there are some pretty
efficient constructions, I can already achieve this without assuming exponentially
strong pseudorandom generators.
So I view the theorem as just a nice structural result, but if you
want to use it in practice it's much better to just implement almost universal hash
functions at appropriate parameters than, you know, use an exponentially secure
PRG and a perfect universal hash function. So that's why there is a frowny
face here.
Let's see some of this interesting result. So in the interest of time, I had two
slides how to prove it, but if there are questions I can either go through slides or
just tell you intuitively. Maybe I'll tell you the intuition and I'll skip to the next
thing.
But just intuitively, the security of the PRG is used to argue that at this circuit
size you can test whether a particular seed for the extractor is good or not. This
is kind of the high-level idea of the theorem. Otherwise, let me move to the most
interesting part -- let me move to the last theorem.
The last theorem is arguably the most interesting for this part of the talk. I
claim -- so I'm going to show that this approach is secure in Minicrypt. So let me
explain what it means.
So notice, as the earlier question pointed out, that our counterexample used the
decisional Diffie-Hellman assumption. It's a nice assumption, but it is a public
key assumption. Our expand-and-extract approach has nothing to do with
public key cryptography. We use a pseudorandom generator, which is a
symmetric-key object, and an extractor, which is a combinatorial object. So why
did we use an assumption from the public key world?
And we kind of show that this is necessary. So let me explain why this is
necessary and then what are the ramifications.
So Minicrypt is one of those imaginary worlds where we believe that
pseudorandom generators exist but public key encryption doesn't exist. So
let's just assume for a second we live in Minicrypt. It's a consistent world, as was
shown by Impagliazzo and Rudich.
So the third result is that this approach is secure in Minicrypt, and I will tell you
on the next slide why this is an interesting result. But for now let's believe in
Minicrypt and let's look at the result.
And this is true for any number of extracted bits, so you don't need to settle for a
small number of extracted bits; everything is great. But you do have to settle
for pseudorandom bits, just as we expected originally. And also I can only prove
it for efficiently samplable sources, which is arguably reasonable for most
applications.
So let's translate. This is a beautiful one-line statement, but it's very confusing,
so let's see what we need to prove by contrapositive. I need to argue that if, for
the sake of contradiction, the expand-and-extract approach is insecure, then I'm
not in Minicrypt. Namely, if there exists some efficiently samplable X
and some pseudorandom generator G such that some distinguisher can tell apart
extracted bits from true random bits, then I'm not in Minicrypt. And what does it
mean that I'm not in Minicrypt? Public key encryption exists. So this is kind of a
weird way to say it, but do you agree that this is the right statement?
>>: In other words, to construct a public key encryption scheme, I just need to
construct a pseudorandom generator and show that this approach works. That's
what you're saying?
>> Yevgeniy Dodis: Yeah. Exactly. He's jumping ahead to the slide, the next
one. Right. But for now -- so good, excellent. So I'll give you another kind of
twist on this result.
All right. So in particular -- so this result I'll show you on the next slide is, like,
one line, very short proof despite the long statement. This is kind of a win-win
result.
Essentially we're saying that either our approach is secure or intuitively we can
construct public key encryption from pseudorandom generator, which would be
also a surprising result which we believe is actually false.
So this is kind of a win-win result. Either something we hope is true or something
else we hope is true. So at least one of the good things is true.
So this is similar in spirit to some other results in the literature, which are very
cool results. Those results were unfortunately proven before our result, but the
history should have been the other way around, because our result is the
simplest one. Each of those results was like a separate beautiful paper. Our
thing is one slide.
So let me just tell you the proof.
>>: Maybe I'm confused, but if we do believe PKE exists, then what is this
theorem statement?
>> Yevgeniy Dodis: Next slide. Let me prove the theorem on the next slide and
I'll give you all the essentials. So despite [inaudible] I'll finish in time. I'm almost
done.
So this is the statement of the theorem, and here is my public key encryption.
Given the assumption, here is my public key encryption. The secret key is going
to be the seed for the pseudorandom generator. The public key is going to be
the output of the pseudorandom generator.
To encrypt the bit b, here is what I do. If bit b is 1, I just send a random string. If
bit b is 0, I sample x, which is efficiently samplable by assumption, I simply
use the public key to extract from x, and I send that out.
So now first I need to tell you how to decrypt. Well, I don't know how to decrypt,
but luckily the assumption tells me that there exists some magic
distinguisher who can tell those things apart, and the magic distinguisher needs
to know the seed, which is the secret key, so he's going to decrypt for us. So
luckily he exists, because I have no idea how to do it myself. Good.
Why is it secure? Well, that's exactly the trivial argument that I showed at
the beginning. The eavesdropper only knows the public key, which is the
expanded string, so by the security of the PRG and the security of the extractor
he cannot tell the two ciphertext distributions apart: if he could, he would either
break the extractor, which he can't because it's statistically secure, or he would
break the security of the PRG.
All right. So this is the proof of the theorem.
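Because the DDH counterexample supplies a completely concrete distinguisher, this encryption scheme can be instantiated end to end. The sketch below uses a toy group chosen for illustration (P = 2039 = 2q + 1, q = 1019 prime, g = 4 of order q); decryption is just the counterexample's distinguisher run with the secret seed, not the paper's general construction.

```python
import random
random.seed(2)

q, P, g = 1019, 2039, 4  # toy group: g generates the order-q subgroup mod P

def prg(a, b, c):  # the counterexample PRG: 3 exponents -> 6 group elements
    A, B, C = pow(g, a, P), pow(g, b, P), pow(g, a * b % q, P)
    return (A, B, C, pow(A, c, P), pow(B, c, P), pow(C, c, P))

def ext(seed, x, y, z):  # the universal-hash extractor from the counterexample
    A, B, C, D, E, F = seed
    return (pow(A, x, P) * pow(B, y, P) * pow(C, z, P) % P,
            pow(D, x, P) * pow(E, y, P) * pow(F, z, P) % P)

def keygen():
    sk = tuple(random.randrange(1, q) for _ in range(3))  # sk = PRG seed
    return sk, prg(*sk)                                   # pk = G(sk)

def encrypt(pk, bit):
    if bit == 1:  # bit 1: send a truly random "string" (two group elements)
        return (pow(g, random.randrange(q), P), pow(g, random.randrange(q), P))
    # bit 0: sample the efficiently samplable source, extract with seed pk
    x, y, z = (random.randrange(1, q) for _ in range(3))
    return ext(pk, x, y, z)

def decrypt(sk, ct):
    """The 'magic distinguisher': with the secret c it tells extracted
    pairs (second = first^c) apart from truly random pairs."""
    u, v = ct
    return 0 if v == pow(u, sk[2], P) else 1

sk, pk = keygen()
assert decrypt(sk, encrypt(pk, 0)) == 0   # always correct on bit 0
# on bit 1, decryption errs only when a random pair happens to satisfy
# the relation, i.e. with probability about 1/q
assert sum(decrypt(sk, encrypt(pk, 1)) for _ in range(200)) >= 190
```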
>>: [inaudible]
>> Yevgeniy Dodis: [inaudible] so assuming distinguisher is perfect, yeah. Good
catch. Yeah.
>>: So the distinguisher might only work for infinitely many [inaudible]
>> Yevgeniy Dodis: Good. So, yeah, in general this is the gap between
what we call a break and what we call security. For security we want it to hold
for everything. For a break -- you know, because of this gap it's a subtle
point. Good.
Yes?
>>: [inaudible]
>> Yevgeniy Dodis: [inaudible] so now the interpretation. Back to Tom's question:
why do we care about Minicrypt, since hopefully we don't live in Minicrypt?
Otherwise all this e-commerce and [inaudible] credentials would be dead. So why
do we care?
So here is the real corollary of our result. In an equivalent restatement of the
result, we will use the following. Let G be a pseudorandom generator. I make a
big assumption: I assume there exists no public key encryption with the following
properties. The secret key is S, the public key is G of S, the ciphertext is
pseudorandom, and the security of the encryption scheme is tightly related to the
security of the PRG.
So essentially I'm trying to tie your hands in terms of constructing encryption. I'm
not giving you arbitrary freedom. I'm saying: assume you cannot build an
encryption with these really weird and restrictive properties -- pseudorandom
ciphertexts, the same security as the PRG, and such very weird things, right?
Which is exactly the encryption we constructed in the theorem, if you think about
it.
Then the expand-and-extract approach is secure with this pseudorandom
generator. So this is kind of a contrapositive -- yes?
>>: [inaudible]
>> Yevgeniy Dodis: Yeah. In particular, we'll hopefully use AES here in a
second.
So one thing is, if G is, let's say, the DDH-based pseudorandom generator, then
the theorem is vacuous, because there does exist such an encryption scheme: I
can build a public key encryption which has these properties. So, yeah, if the
pseudorandom generator is based on some kind of number theory, the theorem
is vacuous because the precondition is false and we can build an encryption.
And indeed we have a counterexample.
But if these are PRGs for which we don't believe we can build such an
encryption, then the expand-and-extract approach is secure.
So now look at practical PRGs and ask the question: which PRG will I use in
practice? Will I use a slow number-theoretic pseudorandom generator which
does exponentiations, or will I use AES or some other stream cipher like RC4?
Probably you will want to use a fast stream cipher. And for fast stream ciphers,
luckily, this precondition seems to be true: from what we know, it's very unlikely
that those guys will yield such a public key encryption.
And here are a few more justifications. For example, even forgetting about which
PRG I use, if I just tell you, look, give me a black-box construction of a public key
encryption from a PRG with these properties -- use anything you want, fully
homomorphic encryption, anonymous credentials, whatever you want -- we don't
know how to do it, because the public key is so restrictive. You're not allowed to
put, like, a CRS or anything like that in the public key. I don't know how to do it.
Okay. And, also, as far as we know, there might not be any encryption
scheme which is as secure as AES. It's entirely conceivable that, for example, if
quantum computers exist, I cannot get tight security for encryption. So
maybe I can build it from AES, but from all we know, the security loss is
actually nontrivial. In fact, it's conceivable there are worlds, like some kind
of extension of Minicrypt, where no public key encryption could be as secure
as AES.
So let me move past this point. Hopefully it's a believable assumption. It's a
practical thing. And I think if you could do it from AES, it would be a major
breakthrough. It would be a very surprising encryption scheme based on
AES.
So the moral is we give formal evidence that the expand-and-extract approach
might be secure in practice, where you actually use practical ciphers as opposed
to beautiful mathematical ciphers, despite it being insecure in theory and not
having a reductionist proof of security.
So very briefly, I'm moving to some conclusions, just some extensions of our
results. You can ask questions in one minute when I'm done.
So it extends to almost universal hashing: so far I stated all the results for perfect
universal hash functions, but everything just works for almost universal ones.
And here is a subtle but important extension. In cryptography we never have a
source which is sampled out of the blue; there is usually side information. Let's
say for DDH the source will be g to the xy, and the side information will be g to
the x and g to the y.
So you need to handle side information, which leads to a notion of conditional
entropy, and our results extend to conditional entropy. This matters because the
generic way to go from conditional min-entropy to worst-case entropy loses a
factor of log 1 over epsilon, which is exactly what we saved. So if our results
didn't extend, it would be kind of pointless: you save the factor, but then you
have to lose exactly the same factor. But luckily everything works.
And it extends beyond key generation. I talked only about key generation, but
the left over hash lemma has other applications, and as far as I know this
technique extends to those applications: if you can bound the variance of the
distinguisher for whatever application you have, you will get an improved
result.
And the final point is, you know, there were two disjoint parts of the talk,
and you can combine them together to get something very practical which will
have both a short seed length and, if you believe the theorem and the corollary
on the previous slide, improved entropy loss. So here is a very quick example
just to show how easy it is to construct, and then I move to conclusions.
So here is, for example, a suggestion. It might not be the fastest one, but it's so
simple, and it overcomes all the limitations. So the seed could be just a random
128-bit string, but you view it as a field element in the field GF(2 to the 128) --
sorry, what's written on the slide is wrong, but it's the field of size 2 to the 128.
But, roughly speaking, you take the source, you split it into chunks of 128 bits
each, and you compute a temporary key which is the inner product of the chunks
with AES of 1, AES of 2, AES of 3, and so on under the seed -- this is kind of a
stream cipher -- and you take the inner product. The inner product is a good
universal hash function.
And here I'm just using a pseudorandom seed. So you do this kind of thing to get
a temporary key. This would already be good for every green application, but if
you want to be good even for red applications, you compose it with a weak
pseudorandom function, which AES also is. So you just take this temporary key
and evaluate AES under it at another point, AES of 0, which is your final key.
All right. So this would be a very particular, hopefully very fast, and very simple
extractor. If you believe what I stated today, it will have both a short seed length
and improved entropy loss.
And the final key could be used for any application, including PRFs and stream
ciphers, because I use a weak PRF step over here, and we get the entropy
savings by using this random point because it's a weak PRF.
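The construction can be sketched as follows, under loudly flagged assumptions of my own: SHA-256 truncated to 128 bits stands in for AES as the (weak) PRF, and GF(2^128) is realized with the GCM reduction polynomial; neither choice is necessarily what the slide specifies.

```python
import hashlib

def gf128_mul(a: int, b: int) -> int:
    """Multiplication in GF(2^128) modulo x^128 + x^7 + x^2 + x + 1
    (the GCM polynomial; an assumed choice of field representation)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> 128:                # reduce when the degree reaches 128:
            a ^= (1 << 128) | 0x87  # x^128 = x^7 + x^2 + x + 1
    return r

def prf(key: bytes, point: int) -> int:
    """Stand-in for AES_key(point), truncated to 128 bits."""
    d = hashlib.sha256(key + point.to_bytes(16, "big")).digest()
    return int.from_bytes(d[:16], "big")

def extract(seed: bytes, source: bytes) -> bytes:
    """Split the source into 128-bit chunks m_1..m_t, take the inner
    product sum_i m_i * PRF_seed(i) over GF(2^128) (the universal-hash
    step with a pseudorandom pad), then evaluate the PRF at point 0
    under the temporary key to get the final key."""
    src = source + b"\x00" * (-len(source) % 16)  # pad to 128-bit chunks
    tmp = 0
    for i in range(len(src) // 16):
        m_i = int.from_bytes(src[16 * i:16 * i + 16], "big")
        tmp ^= gf128_mul(m_i, prf(seed, i + 1))  # XOR = addition in GF(2^128)
    return prf(tmp.to_bytes(16, "big"), 0).to_bytes(16, "big")

final_key = extract(b"\x13" * 16, b"imperfect randomness from a physical source")
```

Swapping the stand-in PRF for real AES-128 would give the fast, streaming-friendly extractor suggested in the talk.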
All right. So let me move to the summary. In one minute I'll be done.
So we can improve the large entropy loss and the seed length. For the entropy
loss we can go from 2 log 1 over epsilon to log 1 over epsilon, and using this
trick with the double PRF, it essentially works for all applications. And for the
seed length, the most interesting result is that even though the expand-and-extract
approach is generically insecure, it is plausibly secure if you use it with practical
ciphers. And the paper can be found on eprint.
[applause].
>>: [inaudible]
>> Yevgeniy Dodis: [inaudible].
>>: [inaudible]
>> Yevgeniy Dodis: Oh, this is like paid sick leave [laughter]. I'll trade you the
source if it's okay for me not to pay you 10 bucks for this [laughter]. Okay. I
don't know.
>>: Question.
>> Yevgeniy Dodis: Yes?
>>: Why did you work on this project? What led to you work on this?
>> Yevgeniy Dodis: Oh, because -- well, if you didn't see, I mean, I had, like, 10
papers before on this subject, but -- well, I mean, essentially I just cared -- you
know, I like left over hash lemma. We use it for all kinds of things. But, you
know, when I talked to [inaudible], who is a practitioner, he always says -- you
know, he has this new standard. He says that people should use cryptographic
hash functions and so on, and we even worked out under which conditions
these cryptographic hash functions are good extractors.
So essentially -- well, it was just I like randomness extraction, and I guess -- but
this project was kind of a combination of two independent collaborations with two
different people. We just realized that [inaudible] topics that are interesting.
So the seed length thing was essentially -- notice that none of it uses the
beautiful extractor literature. It started at some point when, I think, one of my
co-authors asked me: why do you need to use all this complicated stuff? Why
don't you just expand with a PRG? And then I explained to him, I said, no, no, it
cannot be true because it reveals the seed, and that's how it all started.
>>: So in this corollary you have this thing in the second part.
>> Yevgeniy Dodis: Yeah.
>>: So, I mean, can you derive concrete kind of security statements in this
setting? And how would you set security parameters based on such a weird
assumption?
>> Yevgeniy Dodis: So the short answer is -- if I make an assumption explicit,
then I can. And actually in the paper there is a section --
>>: [inaudible]
>> Yevgeniy Dodis: Right.
>>: [inaudible]
>> Yevgeniy Dodis: Right. So essentially it's a heuristic assumption -- right. I'll
try to answer now, but, you know, in the paper, for the extractors built with AES
and so on, I have some heuristic security bounds. But you're right: strictly
speaking, it is an incomplete thing.
So -- let me see. I'm just blanking out because, I mean, this question is
explicit -- we'll talk offline, but it's a good question. So you do need -- sorry, I'm just
blanking out a little. But you do have to make a heuristic jump, and the heuristic
jump that you make is essentially this: you assume that when you use the
expand-and-extract approach, you lose essentially the security of the PRG. So
epsilon drops by the security of the PRG. Again, there is some justification in the
paper why. But essentially, to get complete security bounds, whatever PRG you
use, whatever security you assume from AES or whatever, you just add it to the
final bound.
There is a heuristic step here, but in the paper we kind of argue that if you look at
the proof of the theorem, this seems to be a reasonable thing to do. So with
heuristically, yes. There is some heuristic [inaudible] --
>>: [inaudible]
>> Yevgeniy Dodis: So, yeah, based on --
>>: [inaudible]
>> Yevgeniy Dodis: Right. So in the paper it kind of argues that if you look at
the proof of the theorem, the public encryption is kind of tightly secure so --
>>: [inaudible]
>> Yevgeniy Dodis: So it doesn't really matter. So here we can have an
asymptotic statement, I mean, aside from this tight security, but if you just trace
the proof of the theorem -- I mean, it's argued in the paper that the heuristic step,
if you want to have complete bounds, is indeed as I told you. Just add the
security of PRG to the final bound. And there is some justification in the paper.
I'll tell you offline, yeah. I'll tell you offline. But that's an excellent question.
It bothered me also that we didn't have this quantitative statement, so, I mean,
it's [inaudible] out in the paper.
>>: [inaudible]
>> Yevgeniy Dodis: You mean for the entropy loss?
>>: Yeah.
>> Yevgeniy Dodis: [inaudible] they just showed that for any extractor -- it
doesn't have to be the left over hash lemma -- for any extractor there exists a
distinguisher. Roughly speaking, they show by purely combinatorial means that
it is impossible to have an extractor whose entropy loss is less than 2 log 1
over epsilon. It's a counting argument: you can always exhibit an attacker
with a --
>>: In your case, like in the work here you have log 1 over epsilon?
>> Yevgeniy Dodis: Yeah.
>>: You're saying if you don't --
>> Yevgeniy Dodis: Oh, I see. Yeah.
>>: [inaudible]
>> Yevgeniy Dodis: Yeah. It's possibly a silly reason, but believe it or not, in
practice -- I'm just saying that if I really want a random bit string, even if I know
the distribution exactly. Say the distribution is uniform from 1 to p, where p is not
an exact power of 2, so p is between 2 to the k and 2 to the k plus 1. For stupid
reasons, the best you can do is essentially to drop some of the more significant
bits. And if you work it out, you see that if you drop fewer than log 1 over epsilon
bits, the statistical distance will be more than epsilon. It's a stupid
counterexample, but unfortunately -- I guess, yeah, it shows that this is optimal.
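This can be checked numerically. Below is a small exact computation with an assumed toy modulus p = 1366 (any p strictly between two powers of 2 behaves similarly): keeping j low-order bits of a uniform sample from {0, ..., p-1}, i.e. dropping the more significant bits, roughly halves the statistical distance for each extra dropped bit, so reaching distance epsilon costs about log 1 over epsilon dropped bits.

```python
from fractions import Fraction

def tv_from_uniform(p: int, j: int) -> Fraction:
    """Exact statistical distance between (X mod 2^j), for X uniform on
    {0, ..., p-1}, and the uniform distribution on j-bit strings."""
    n = 2 ** j
    total = Fraction(0)
    for r in range(n):
        count = p // n + (1 if r < p % n else 0)  # samples landing on r
        total += abs(Fraction(count, p) - Fraction(1, n))
    return total / 2

p = 1366  # uniform source on {0, ..., p-1}, with 2^10 < p < 2^11
dists = [float(tv_from_uniform(p, j)) for j in (10, 9, 8, 7)]
print(dists)  # each extra dropped bit roughly halves the distance
```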
>> Kristin Lauter: Thank you.
>> Yevgeniy Dodis: Sure. Thank you.
[applause]