>> Ben Zorn: It's a great pleasure to introduce Emery Berger from the University
of Massachusetts. Emery is a long-time friend and colleague. He's worked with
us in many different capacities. Emery's gotten numerous, numerous awards,
including an NSF CAREER Award, an MSR Fellowship Award as a graduate student,
and teaching awards, et cetera.
Emery's done work, as probably many know, in areas related to security,
performance, reliability. Today he's going to be talking about security; in
particular how to use randomization to thwart the bad guys.
>> Emery Berger: Thanks, Ben. So the title, of course, as you can see is
DieHarder, securing the heap. So obviously before DieHarder came DieHard, if
any of you have watched the films. I certainly hope you have.
So I'm not really talking about the movies, of course, I'm talking about a project
called DieHard. DieHard is work that I did together with Ben a few years back.
And this work eventually ended up being the inspiration for Windows 7's fault
tolerant heap. Hence the little logo here. I really need to get one of those T-shirts.
Maybe not the whole thing was my idea. But anyway, so what is
DieHard? So DieHard is about probabilistic memory safety for C or C++
programs. So it's a reliability story. And it's all about dealing with programs with
bugs. So the goal of DieHard was to be able to withstand memory errors. That
is, the kind of bugs you get in sort of run of the mill C, C++ programs. So here is
an example where we've got some code and let's see if the clicker works.
Slowly. Maybe. This is a good sign. Nothing is happening and the elapsed time
clock on my, on the presentation timer is stopped. Yes, that's a good sign.
Yeah. It's one of those. Yeah. It came up. Awesome.
>>: Running on top of DieHard.
>> Emery Berger: That's right. There you go. So actually in fact what's
happening is, this is maybe not a very interesting technical detail, but it turns out
that the Mac OS has a really, really broken virtual memory subsystem. So it very
aggressively caches pages that you touch. So aggressively that it pages out
other stuff.
So presumably some part of the presentation went away. And probably the
PowerPoint code itself then got paged back in. It's pretty bad. It's so bad that
every now and then you actually, to get it to restore, I go to the command line
and type in the word "purge". So and that forces it to dump all of its pages, and
you can end up literally where you're running in free space where there's just a
handful of K available. K. Okay. And then you type "purge" and suddenly
there's 500 megs. So meanwhile it's swapping. It's horrible. All right.
Anyway, back to DieHard. So here is an example of a kind of bug you can get in
C, C++. Here printFoo actually calls delete. If you invoke printFoo twice on the
same object you get a double free. And in plenty of heap organizations this can
result in heap corruption. Another example, here are two examples. So you are
free to invoke delete or free on anything.
And that can include things that are part of the stack. Many memory allocators
will happily accept a pointer to some region on the stack and make it available to
you for use to satisfy future malloc requests which leads to some very
entertaining and terrifying bugs. Like when all of a sudden your stack variables
change value because there's a heap object changing. It's exciting. You can
also inadvertently free an object in the middle. So you can pass free a pointer
that points at a heap object but somewhere inside the heap object, and this can
screw things up as well.
So you have to be very, very careful. And, of course, there's dangling pointers.
So dangling pointers where you free something too soon. You actually had a
pointer left to it that you forgot about.
So here we have a pointer f to some foo object that's initialized with "happy". We
have a pointer x that points to the same thing. We delete f. Now we make a
new object, g. So we've allocated a new object, and in most allocators the memory
that was just freed up will be recycled. So now when I print out x's info over here,
it will probably say "sad" and not "happy".
All right? So and then, of course, the classic error. So that was a bunch of
errors. Then there's -- I saved the best for last -- the classic memory
overflow. Right? You underallocate an object, with ten here. Ten is just a
random choice of number. And then you go beyond it and you write something
and you land on some other data. All right. So here's all these bugs.
So what's the deal with DieHard? The deal with DieHard is DieHard is meant to
prevent or probabilistically tolerate these bugs. All right? So it replaces the heap
entirely with a new heap organization.
And the heap organization is bitmap-based, which is quite unusual. So there's
no actual pointer metadata. There's no graph in the heap. It's just these bits.
And each bit corresponds to an allocated object. If it's set to one it means it's
allocated. If it's 0, it's free. And so you can see right away, for example, that
double frees go away because when you free something you just set it to 0. If
you do it twice, you're just setting it to 0 again, it has no effect.
What about allocation? What happens? So when I need to do allocation, I
actually do it at random. So I don't just pick, say, the first empty bit. I randomly
stir around. Thank you PowerPoint. And I find something and then I set the bit
to one. All right? And return the corresponding pointer.
So when I go to allocate another object, I'm going to do the exact same thing.
I'm going to randomly choose once again and I alight on this one and I return that
object.
So what does this buy you? So it avoids double frees like I mentioned before.
The heap, I didn't really say this before but we also make sure the heap is bigger
than you actually require. We grow the heap on demand. So there's always
going to be a high likelihood of some empty space. So, for example, if you have
an overflow from here, and it lands on nothing, then it's benign. So this overflow,
it's sort of as if it didn't happen. I should also mention that the issue of invalid
frees can be dispensed with quickly by DieHard as well, because it knows where
the addresses of objects are and it can cheaply identify them. So it never -- if
you say free a stack object it says no. It just ignores it. If you try to free in the
middle of an object it says oh, you meant this object that's the enclosing object
and it takes that instead.
So a lot of these problems go away. Some of the protection like this protection I
showed just a second ago with the buffer overflow is just probabilistic. All right.
There's one other thing which is about a dangling pointer protection.
So suppose this object which was allocated you freed but you actually have a
pointer into it? So now it's a dangled object. So the way that DieHard protects
you in this scenario is totally based on the fact that you're doing random
allocation. So the problem arises with the dangling pointer when you recycle the
object. You recycle the object, you fill it with something. Now that object is
corrupt. Here your chances of hitting that one red 0 are in this particular instance
only 1 in 8 for the next time. Of course for a realistically sized heap we have
millions of objects. The chance is one in a million and so on and so on. So you
have a high likelihood this object will remain untouched until you actually are
done with it.
All right. And it stays that way for any foreseeable circumstance. All right? So lots of -- yes?
>>: When you do this random choice, in fact you increase the size of the chunk,
right, as well? You have overhead of the memory allocated?
>> Emery Berger: So I'm not exactly sure I understand the question.
>>: Okay. Let me rephrase. When you say that, which one of those bits is
actually being chosen for the next allocation, this is like a buddy system where
you say, oh, the first is two to the power of i and then the next is i plus one, et
cetera. If you [inaudible] then you're going to allocate a chunk of size two to the
power of i plus or minus something, plus something obviously, and then it's not
exactly the size that you needed but something slightly bigger because it goes
into a different place or something.
>> Emery Berger: So I guess the question you're asking is: If I allocate objects
of size eight and I allocate objects of size ten, are those in the same space? And
in fact it turns out that you can't really do this at all if you don't segregate things
by sizes. And the reason is that you would otherwise get catastrophic
fragmentation.
The intuition is very simple. If you allow people to allocate one object, one tiny
little object at random, and you allocate enough of them, then you'll completely
fragment the heap and you'll never be able to put a big object anywhere. So they
are segregated. Right now for the purposes of this example. I think in the
current implementation it's base two but it's not required. It could be any base.
>>: So my question was --
>> Emery Berger: Any reasonable base.
>>: Would the protection imply that you're going to allocate something to the
power higher than necessary? So say, for example, you needed 16 bytes and
finally you're going to allocate 32 or 64 because of this probabilistic choice.
>> Emery Berger: Okay. So I think the short answer to your question, which I'll
try to rephrase is do I overallocate to protect? So I do overallocate in the sense
that I overallocate the entire aggregate heap, right? But I don't overallocate each
individual object.
So there is a probability -- let me go back a little bit for this buffer overflow.
There's a probability that this object will remain unallocated. But it's not a certainty.
Okay?
So I'm still allocating, say, in this particular case pretend it's 16 bytes. When I
allocate or request 16 bytes then I get 16 bytes. It's just the fact that it falls out
from the fact that I'm randomly allocating in this bigger than necessary heap that I
get empty space. So I have a likelihood of empty space. The size of this
expansion is actually a tunable factor.
Right now this expansion by default is something like four thirds, but you could
make it bigger, and the bigger you make it, the higher the probability you have of
success -- of, let's say, correct execution in the face of a buffer overflow. I should
add there's other work that followed on this, called Exterminator, which finds the bugs and
fixes them. But I'm not going to talk about it today. I'll be happy to talk about it
off line.
So there's all this good probability stuff for resilience. Let me advance past the
animation. The runtime, you know, as you can all see, the runtime is really great.
So the runtime is fine. That's not really the point of this discussion.
I brought the little space shuttle image here because the actual original
motivation here was to try to see how you could do redundant execution and get
something meaningful out. So the space shuttle used triple modular redundancy:
every single thing you do, you do three times, and you take the majority vote. It's not
clear how -- it's clear for independent hardware errors that this is fine, but if you
have a bug, bugs are deterministic. Right?
So then the votes would always be the same. So if you actually run multiple
replicas with DieHard and they all have their own independent random number
generators or random seeds, then their heap layouts will be distinct. They'll be
independent by design, and then voting would actually work. So we built a little
prototype system that could do this, too.
But you don't need to use replication. I just mentioned that because it's part of
the paper. You can just use this as a stand-alone replacement for the heap.
So the goal here was resilience or fault tolerance. Right? And it got brought into
Windows 7 in a sense as the fault tolerant heap. A little bit watered down. In
particular, all this randomness stuff I mentioned earlier was eliminated; the fault
tolerant heap doesn't rely on it. The bitmap representation and all these things
are great, you get some resilience, but there's no randomization.
So one of the things that we thought about, and actually I came to Microsoft to
talk to some of the Windows folks about this, when they're talking about Windows
8, now we care about security. Okay?
So security is important. And my argument was: Look, DieHard is random.
Random is really good for security. And it may be annoying for other reasons,
but it's really, really great for security. So then the argument was: Well, maybe
we don't need -- all right. Your corner, you're talking about randomness. Fine.
But in our corner, what we're going to do is pepper the heap with
canaries, little indicators that something has gone wrong and maybe we'll add
some page protection here and there and we'll do this and do that and do this
and it will be fine.
So you don't need the randomization. The randomization is to make it really
difficult for somebody who is exploiting a bug to actually locate objects or get
their exploit in whereas these things are made to detect or raise the bar or make
it just as hard.
So I contended that randomness was really important. And then we had this
discussion. And I realized we really didn't know -- we had no way of saying who
was the winner, right? It's just argument by intimidation. You know, I'm right.
That's not really very convincing.
So let me tell you, not the way we discussed it in the paper -- I welcome anybody to
go look at the paper and see all the arguments, which actually are mathematical;
they talk about these exploits and the threat model and all this stuff.
But instead I'm going to give you a flavor of the problem with a geographical
analogy that I hope will not offend too many people.
>>: I'm already offended. [laughter].
>> Emery Berger: Well, if you were Belgian you'd be more offended. But we'll
wait for that.
>>: Wait a minute. [laughter].
>> Emery Berger: All right. So here's France --
>>: [inaudible].
>> Emery Berger: That's good. That's good. You'll be happy, because France
is going to represent Windows. Okay? And Germany is going to represent the
hackers. All right. So these are the black hats. All right. So, again, no
aspersions meant on Germans or French. But it's a great analogy. Right?
So Germany and France, some of you may recall, if you read your history books,
that France and Germany have a checkered history. And maybe not the best
relations throughout history. So in particular there have been lots of invasions,
so Germany invades. And then eventually there's a counter-attack and Germany
is pushed back.
And then there's a period of intervening peace. And then Germany attacks
again. And then there's a counterattack. So really this is, of course, meant to
represent the threat of hackers, right? So the problem is that the hackers come
and discover some vulnerability. Right? They're trying, during this period of
peace, there's sort of quiescence while they're trying to come up with the
vulnerability. They discover it. And there's some sort of countermeasure. In
fact, there's a whole history of countermeasures that have been put into heap
organizations both in Linux and Windows as well as in Mac, to try to deal with
these issues.
All right? So it used to be you could do this one thing and there would be a great
vulnerability. You could exploit it really well. They said: Oh, we'll fix that. We'll
make sure you can't do that. All right? So but then it's back and forth, back and
forth. And, you know, people at Windows and people -- Microsoft and people in
France got tired of constantly being invaded and having to push the invaders
back.
So what did they do? They decided to come up with the ultimate solution that
would prevent any further attacks. Or at least invasions, which is...the Maginot
line. So the Maginot line is, by the way, an amazing feat of engineering. We
tend to make fun of it in the United States, I think without understanding the
scope. In 1939, 1940 dollars, a little bit before '40, this was an unbelievable
amount of money. It was billions and billions of dollars back then. So it was
massive, massive undertaking, and each of these things, so if you go through the
countryside along the border with Germany, you'll occasionally see these
pillboxes. All you see are the cement pillboxes; what you don't realize is all the stuff
underground.
It's really amazing. They have separate air supply. There's armored posts to
repel invaders, they have trains so they had electric railroads to connect the
stations. They had food. Air locks to deal with gas attacks. So really, really very
impressive.
So let's replace France with Windows. So let's say we go and we build the
Maginot line for Windows and we're going to prevent all further attacks. Okay.
We're done. Why do we know we're done? Because we put together this
impregnable fortress. And then we block all ways for any attacker to get in,
right? So here's Germany and here's France, and these little dots, these sort of
dashed line indicate less fortification, less dense, but really, really dense here
and less dense over here but we're really protected.
Okay? So what happened?
>>: Technology changed.
>> Emery Berger: No.
>>: People.
>> Emery Berger: No, there was a hole. So there was an unexpected
circumstance. The unexpected circumstance was that Germany invaded
Belgium. They invaded Poland. That was different. They came through
Belgium. It was neutral and therefore untouchable. You didn't really need as
dense protection on the border with Belgium. And the Germans very quickly
went through Belgium which wasn't in a position to put up much resistance,
effectively went around the Maginot line. It wasn't that the Maginot line didn't
work. It just really wasn't tested. Because they found another way.
And so this is, you know, really a pretty fair analogy to the situation that we often
find ourselves in. We're always sort of fighting the last war. So the last war is,
man, the Germans just come across the border and come right in and we're
going to stop them from doing that. Then it's oh they come through the Belgian
border. Next time we'll seal off the Belgian border too and then next time next
time.
>>: One subject in 1914 they came from Belgium and what's this? [inaudible]
creating that?
>> Emery Berger: I'll say it came from Encarta. No, I have no idea where I got
the map. I don't remember.
Anyway, so here we are. So the question is today: Are we done? Are we
secure? Right? Or is there another hidden channel that we're unaware of?
Right?
So the challenge is to actually know that we've done something. Right? If we
build some defense, have we actually improved security? Are we just putting
bars on the front windows and leaving the back door wide open?
All right. So for the rest of this talk, as before with DieHard, I'll just be talking
about the heap, which is scary. As you can see. So the heap is -- there's lots of
work that focuses on code. Code is much more structured. It's a much more
well understood problem. The heap is more problematic. And again a lot of
these problems go away in the world of managed code. So if you're a managed
code bigot, you know, this talk is over for you now.
Because it's only, you know, or if you believe programs can never have bugs, if
you can get a program with no errors, then it's also not a problem. Right? So
every exploit depends on the unsafe programming language substrate and an
error.
No errors, no problems. All right? So, again, rather than go into gory technical
details, but to stick with my belligerent theme, I'll do an analogy based on Stratego. How
many have played the game or are familiar with that game?
>> Emery Berger: Really. Man. Yes, every time, it's getting worse and worse.
But there's a clear age divide. So Stratego is a board game, which means it was
played with physical objects.
People would get together and not be near electrical sources. [laughter] and play
these games. So the idea behind Stratego is you have two players. And the two
players face off. And you can see that there's red pieces and blue pieces.
Pretend we are the blue player.
So the blue player can see these little faces on his side, but on his opponent's
side can't see anything. Just red. And obviously it's the same situation for the
other one.
And in effect Stratego is a fancy game of capture the flag. All right? So there are
all these different pieces. And the pieces have these different powers. You have
a small number of very powerful pieces and a large number of cannon fodder,
let's say, and basically the higher number indicates that they're weaker. So if a 9
touches an 8 the 8 wins and the 9 piece is taken off the board.
And should you find the flag, when you move your piece, and you can move your
piece up and hit it, if you get the flag you win. All right?
So in addition to all these little moving pieces, there are also bombs.
also bombs. So think of these as mines. If you hit a mine, unless you're the
miner piece, then your piece blows up. All right?
So let's use a very simple model where we only have two kinds of pieces. Okay?
We have flags and we have bombs. All right? And our opponent, instead of
actually having to move a piece linearly, right, going tap, tap, tap, they can reach
down like the hand of God and touch a piece. Okay?
If they touch a flag, they win. All right? And if they touch a bomb, then they lose.
Okay? So right now you can see in this board, they have a 50/50 chance of
winning. 50/50 chance of losing.
So this is meant to represent a pretty conservative threat model, right? The
attacker can exploit any object, can reach any object on the heap, right, at will.
That is, they can go and pick -- I shouldn't say any object on the heap. I mean
any location.
So this is not always the case. But trying to preclude the Maginot line scenario,
let's imagine that our attacker is really, really powerful. What can we do? Okay.
So to be concrete, the flags, of course, represent sensitive data. All right? So
there's some data that's somehow exploitable. And we argue that all data is
sensitive. This is clearly not true in general. However, luckily for this example it
turns out that on conventional heap organizations, it really is pretty much true
that all data is sensitive, because right next to all data are these yellow objects
which are heap metadata. So the heap metadata lives right next to these
objects. And the heap metadata has information like a pointer, the size of the
current object. So this is essentially the underlying graph representation, the
data structure that holds the heap.
And if you go and you overflow into it, as here, right, I've written the first byte of a
pointer. That is bad news. Okay? This will not end well.
Right? So in effect the heap is already peppered with sensitive data in that kind
of an organization. So it turns out that the analysis basically is very simple. If
you have metadata on the heap, then the game is already lost.
All right? So but we'll continue. All data, metadata are sensitive. This bomb
here is going to represent unmapped memory or a guard page. So if you go to
access memory that is unmapped, this is what happens to you.
Okay? You get one of these scary messages. And this is basically a segfault or
an AV (access violation) in Microsoft terminology.
Let's dive down. So here is the layout in Stratego terms of Windows and Linux. I
highly recommend for the amateur or beginner Stratego players that you do not
employ this layout at home. This is a bad layout. So everywhere I touch,
everywhere I see this stuff, right, it's all flags. So I just go and I touch it and I win.
Okay. Yes?
>>: How do you know you're not touching free space?
>> Emery Berger: That's a good question. So the question is how do you know
you're not touching free space? So the question free space is actually not such a
problem for me, from the perspective of this analogy, because free space, the
freed objects also have metadata in front of them. So basically a freed object is
just as vulnerable as an object that's in use.
All right. So this is not entirely true, this picture. This is really bad. So Microsoft
and the Linux community recognize that this is maybe not the best heap
organization. So they did this. Did you see that? Did you see. Look, look. See,
it moved. Okay, the whole thing moves. This is called ASLR. Or address space
layout randomization. The idea is you pick up the whole heap and you place it at
someplace at random.
So it turns out that you don't do anything else with the objects. You just need to
know where the start of the heap is. So the amount of entropy that's involved is
actually surprisingly low.
So if you know the address of any one object on the heap, you now know the
address of all the other objects on the heap.
So you can randomly find it fairly quickly, or you can just know one object and
now you've given away the store. Okay. So back to DieHard. So when I first
looked at this problem, I said: You know, DieHard has randomization. It has no
metadata on the heap. Done. Right? Perfect. It's resilient and secure. Right?
It's motherhood and apple pie. Who can argue with these things?
So here is an example DieHard layout. Now I should add DieHard actually looks
more like this. Okay? So DieHard unmaps all memory that's not currently in use.
Right? And it can randomly allocate objects across the entire address space.
Now, so here's this one layout, I'll remove the bombs but pretend the bombs are
there. If you randomly go and pick a place in the heap you're probably going to
hit a bomb. But you'll notice there's one, there's two, there's four, here's another
possible layout.
Again, one, two, four. Right? So the problem here is that we didn't really think
about this, because we were not worried about security when we designed
DieHard. But DieHard has this strategy I mentioned before that it grows the heap
on demand. And the way it grows the heap is by doubling.
So once one of these -- this is size 8 object, size 16 object. Once this one
becomes too full, like half full, then you need to get more space. And the way we
did it is we just got twice as much. Right? So this turns out to be very efficient.
It actually allowed us to locate objects on the heap in amortized constant time.
Very, very nice. In fact, all of DieHard is big O of 1 amortized. So we were like,
yeah, we're geniuses. However, from a security standpoint not so much,
because if you allocate four, then eight, then 16, then 32, at any one time,
50 percent of those objects live in the exact same block.
So that means that all you need to know is one of the addresses and if you can
overflow from there, you can trash half of the heap. So that's not so good. All
right. So we said, all right, maybe we need to rethink things a little bit and see
what we can do.
But before we did that, we also looked around. We said: Well, there's Windows,
there's Linux, not the most famous for security. But this little guy represents
open BSD. This is their logo. And open BSD is by design this -- it's actually
pretty annoying. Nobody really wants to use it because it's so crippled. Like they
won't ever update things, because they're scared of introducing security leaks,
right? Security vulnerabilities.
But they're tremendously, tremendously anal about security. And they created a
new allocator which was designed to enhance security. So this is roughly
speaking how their allocator works. First, pretend there's bombs again, which
you'll see in a second. But you break up the heap into pages.
So there are 4K chunks for the heap. And in effect in between all of these
things there's bombs all the time. So if you are here and you overflow from this
object, you're quite likely to hit a bomb. All right? So that's pretty good.
It turns out that they also optionally -- that's what the asterisk is for, when you
free an object, they can -- it's not set by default, they can overwrite the contents
with crap.
This turns out to be required. All right? It's not necessarily obvious; the paper
has more details. But there's a kind of attack called heap spraying, which is an
awesome, awesome attack where you fill up the heap, like you go to a Web page
and it's JavaScript. JavaScript can allocate as many objects as it wants, fill them
with whatever, you fill them with shellcode, and you just jump to an arbitrary
place in the heap. And you're likely to hit shellcode.
So the problem here is that if you actually don't destroy things as soon as you
delete them, then all your free space can be filled with shellcode. All right? And
this means you can have a gigantic sort of unbounded fraction of stuff that is all
waiting to be jumped into.
All right. So here's that. But still it's a little unsatisfying because the objects --
like here's an object, here's an object, it's deterministic. In fact, I jumped back.
It's not entirely deterministic. So what they do is they do some little bit of
randomization. And the way they do the randomization is actually quite cute.
What they do is they keep a few slots that correspond to like the low order bits of
the objects. So you hash them in.
When you go to free an object, you look in the corresponding slot. If there's no
object there, you put the object there. If there is, then you free that object that
was sitting there in its place. So it adds some uncertainty to where the objects --
like when the objects get freed. It turns out, unfortunately, it's not that much
entropy.
So when you see entropy here, 2 to that power means sort of how long or how
many times I'm going to have to like hit this to find my object that I really wanted
to attack.
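The delayed-free cache just described can be sketched like this; it's a simplification, and OpenBSD's actual scheme differs in detail:

```c
#include <stddef.h>
#include <stdint.h>

/* A small cache of slots indexed by a hash of the pointer. Freeing swaps
 * the incoming pointer with whatever is parked in its slot, so *when* a
 * chunk is really released becomes slightly unpredictable. */
#define CACHE_SLOTS 16
static void *cache[CACHE_SLOTS];

/* Returns the pointer that actually gets freed now, or NULL if deferred. */
void *delayed_free(void *p) {
    size_t slot = ((uintptr_t)p >> 4) % CACHE_SLOTS; /* low-order-bits hash */
    void *evicted = cache[slot];
    cache[slot] = p;    /* park the new pointer... */
    return evicted;     /* ...and release whoever was parked there before */
}
```

With only 16 slots, the entropy added is small, which is the point made above: it shuffles release order a bit, but an attacker who knows one address doesn't have to search long.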
Now, having all these bombs out here means that's great, right? The bombs are
fantastic. It means purely random, I don't know anything, I'm going to hit a bomb.
But if I do know one object address, and I need to reach another one, then this is
really not very much. Right? This means that on the order of thousands I can go
ahead and win.
So what did DieHarder do? We adapted the other layout, but we basically take
the best of both worlds with the exception that we can't really support dangling
pointer protection anymore because we have to destroy the contents of freed
objects. But these objects are completely independent of location.
So one object is totally decorrelated with any other object. You leak practically
no information at all. You can't actually do better, it turns out. That is, the
amount of entropy we get from these objects you could actually use as a random
number generator. So if you drop like the bottom six bits or something, you have
no way of knowing where those objects came from.
So knowing the address of one object reveals no other information about other
object locations. Yes?
>>: The heap spraying -- if you're not worried about something executing on
the heap, can you do dangling pointers? Dangling pointer protection?
>> Emery Berger: That's a good question. So the question is: Could we go
back to protecting against dangling pointers if we don't worry about heap
spraying, right?
>>: Or you protect the heap spraying via no execute [inaudible].
>> Emery Berger: Right. So if the heap is not executable and there's no way of
making the heap executable, like mmap or mprotect gets taken over, it's not clear to
me. I actually would want to do the math, to be honest. There's math in the
paper, and math keeps you sane.
So a lot of -- these results actually fell out from doing the analysis. So I would
want to be more careful. Intuitively it seems like it's probably fine, because then
you can still have a bunch of stuff lying around and it's free. Maybe it doesn't
matter if it's not code. But I'm not entirely sure. It seems like information leakage
for sure could be a problem.
So the way that we did this was we extended the layout. We did the traditional
computer science solution of one more level of indirection. So these miniheaps,
instead of being contiguous, are contiguous pointers to pages, right?
Then these pages are chosen randomly. I should note that you cannot rely on
VirtualAlloc or mmap to give you sufficiently randomized pages. Far from it.
So we actually map a gigantic space and then randomly activate pages within it.
Otherwise you get very, very little entropy. Much better on Windows, by the way,
than on the Mac. So if you want to pat yourself on the back.
Actually, with the new OS X Lion they made some hay about having increased the
randomization. I was like: Really? It's like maybe two bits. So not very impressive.
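The reserve-then-activate trick described above can be sketched as follows. This is a simplified model in Python; the real system reserves one giant mapping and commits pages inside it, whereas here page indices simply stand in for the mapping. The class and method names are my own, not DieHarder's.

```python
import random

PAGE_SIZE = 4096
SPAN_PAGES = 1 << 28  # pages in a huge reserved span (~1 TB of address space)

class RandomPageSource:
    """Hand out page offsets chosen uniformly from one huge reserved
    span, rather than trusting mmap/VirtualAlloc to randomize placement."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.active = set()  # page indices already activated

    def activate_page(self):
        while True:
            idx = self.rng.randrange(SPAN_PAGES)
            if idx not in self.active:   # retry if that page is already live
                self.active.add(idx)
                return idx * PAGE_SIZE   # byte offset within the span

src = RandomPageSource()
offsets = [src.activate_page() for _ in range(100)]
```

Because every activated page is drawn uniformly from the whole span, the entropy per page address comes from the size of the reservation, not from whatever little randomization the OS mapping call happens to provide.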
All right. So you might be asking yourself about performance. Here's the ritual
performance graph. I'll just wrap this up. Boring. So basically this is GNU libc.
So that's the default allocator for Linux.
This is a newer version of dlmalloc, which in principle one day will become part
of Linux. This is OpenBSD's more secure allocator. This is DieHard and this is
DieHarder. A lot of these are the SPEC benchmarks. For most of them it's
a wash. There's a couple where it's not, notably xalancbmk and omnetpp. The
geomean is roughly 20 percent, I believe. It's around 20 percent. So there's some
degradation. Most of the performance degradation comes really from the TLB.
Right? So now we have pages. These objects are located in a bunch of random
pages. That means that the TLB footprint is larger. And that's mostly what kills
you. And really this is more -- it says more about Intel's annoyingly baked-in TLB
organization than anything else.
It has a very -- it works great if things are dense. It doesn't work well if it's
sparse.
>>: Could you go back to the previous slide? You mentioned earlier the
fragmentation within the heap. This adds fragmentation across the address space.
So you end up limiting the maximum size of a contiguous allocation, because
you're selecting pages at random locations.
>> Emery Berger: Right. So we do it over -- so it's true. But because we're
randomly allocating over so many pages, right -- I don't know if it's
terabytes, but it's a huge, huge amount -- fragmentation doesn't really
matter. I should also add that we directly mmap large objects. So that
mitigates things further.
>>: So are you saying that you prereserve a big chunk of the address space in
which you do the randomization.
>> Emery Berger: We pre-reserve a big chunk of the address space. If your
object exceeds -- I can't remember what the threshold is, but I think it's something
like 8K -- then we invoke VirtualAlloc or mmap directly, which means it will pull it
off somewhere else.
>>: Then so on a 32 bit process, how big is that space?
>> Emery Berger: Yeah, 32 doesn't work as well. There's a limited amount of
entropy to exploit with 32 bits. 64 works great. 32 we get as much as we can.
>>: [inaudible] AMD?
>> Emery Berger: I expect it's the same. Basically the x86 architecture dictates
a particular layout of the TLB. So when you go -- if you have something that's
not satisfied by the TLB, it has to go out and walk the page table entries. The
page table entries have to be laid out in memory a particular way. That's how
Linux and Windows have to lay them out.
>>: Does that have any implications for concurrency [inaudible]? If you're doing
more walks, are you more likely to interrupt threads, serializing execution?
>> Emery Berger: Excuse me. So obviously threads share the same TLB, right?
So I'm not exactly sure. I mean, the footprint is larger, right? So footprint being
larger is just a problem in general. I'm not sure concurrency is really much of an
issue since they share the TLB. It's just think of them as being serial but just
visiting a bigger address space. So it probably does exacerbate it.
Did you have a question?
>>: I think I heard Intel is implementing a multi-level TLB.
>> Emery Berger: Yes, that's a good point.
>>: Is that the processor you used, or is that --
>> Emery Berger: No, I don't recall which one this was we used. It was some
Xeon. So it's true that Intel is actually putting in some level of multi-level TLB,
which will mitigate the cost of TLB misses by having the same sort of thing you've
come to expect for caches: you do the same thing for the TLB. You have
slower memory, but it's better. That said, you know -- basically
the way that it works, it expands out a level and fills in all the entries whenever
you visit a page in this space.
And so it consumes a huge amount of RAM just for maybe one entry. Okay? So
for us this is pessimal. There are sparse layouts that would be fine, and on
systems like SPARC you have software TLBs and software walkers, and
then we could have done whatever we wanted and it would have been great. But
it is what it is.
>>: So what allocation strategy do you use for the pages that contain the
bitmaps themselves?
>> Emery Berger: Those are themselves randomly allocated.
>>: Within the same span.
>> Emery Berger: Yes, that's right.
>>: So there have to be references to those pages.
>> Emery Berger: There are. That's right. At the end of the day somebody has
a pointer that points to the actual heap. And to all these things, right? And so if
you can somehow get in and change some globals, then I'm screwed. But like I
said this is about protecting the heap. Protecting globals is maybe not as tricky.
>>: I guess that's true, but it's also the case that, depending on what the entropy
is for the image that contains the references to those pages, discovery of where
all the pages are mapped may be possible. If the entropy is very low where that
image is located, that's --
>> Emery Berger: That's the place to attack, that's right. I agree. The other
obvious place to attack would be the random number generator seed itself.
Right? So if you can force things to be deterministic, then you win.
So we actually have other work that's not security-related but has security
implications, which randomizes everything and periodically rerandomizes everything
again. So this would avoid that sort of problem. But it's an excellent point.
So I'm going to give this the time it deserves. Okay and move on to an
application somebody might actually care about. All right. So we decided SPEC,
SPEC, who cares, right? I want to look at real applications. So the question was:
All right. We're going to run Firefox. We ran Firefox -- I can't remember how many
pages -- but we loaded up like the top 100 websites from Alexa and copied them
all onto a local machine to avoid network effects, and then just scripted Firefox to
load all these pages. So then the question was: What happens when you plug
in DieHarder? What is the performance overhead? So hey, look, I have a talk.
Okay. So this is how long it took without DieHarder. So this is on Linux.
So it took 44.2 seconds. So now the question is: What would you be willing to
tolerate for these great security benefits I just outlined? All this protection. How
much overhead would you be willing to pay for your Firefox session, pretending
you all use Firefox?
>>: No more than five percent.
>> Emery Berger: No more than five percent.
>>: How many pages?
>>: Are we talking runtime? So this is runtime costs, we're not talking about like
footprint as well.
>> Emery Berger: This is not talking about footprint. That's right. All right. Less
than five percent, that's it? Okay. Nobody's willing to -- tough crowd. I should
add -- yes?
>>: I'd say anybody not willing to pay eight percent -- state sales tax -- is not
serious. [laughter].
>> Emery Berger: It should be related to taxation. So Europeans here should be
much more comfortable. Whatever your value-added tax is, that's the acceptable
overhead. And Canadians, too, of course. When I gave this talk -- it's very
interesting -- I gave this talk at CCS and recently at WOOT, which is a USENIX
workshop, and I asked the same question, and somebody in the crowd said 2x.
So that tells you there's some different cultures going on here.
All right. So you ready? Here's the reveal. Okay. So it took less time. All right.
Now, it turns out that this number is not statistically significant. Okay? So you
can't rule out that this was just a random occurrence, right? But let's
face it. It's close enough that we don't care, right? The differences are in the
noise. I should add this is not going to work all the time. There are definitely
cases where it won't: if you're running SunSpider, exercising JavaScript, you're
allocating tons and tons of objects, way more than normal. And Firefox allocates a lot
of objects, by the way. It allocates thousands of objects every time you move your mouse.
I wish I were making that up. Okay? So it allocates tons of objects. But the
runtime overhead is actually pretty acceptable. On the footprint question, there is
definitely more memory consumed. If you are increasing the amount of space
between objects and you're randomly allocating things, the footprint is larger -- not
just the overall memory usage but also the VM resident set size. That said, we
see something on the order of maybe 20 percent. You actually win a little bit by
getting rid of the metadata. So metadata for objects on 64-bit systems basically
has to be 16 bytes per object. And our overhead per object is one bit.
So we kind of -- we win there and then we lose with the randomness.
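The arithmetic behind that trade is quick to check. The 16-bytes-per-object and one-bit-per-object figures are from the talk; the sketch itself is mine.

```python
objects = 1_000_000

# Conventional 64-bit allocator: ~16 bytes of per-object header metadata.
header_bytes = objects * 16

# Bitmap allocator: 1 bit per object slot = 1/8 byte per object.
bitmap_bytes = objects // 8

# The bitmap is 128x smaller; this saving partially offsets what the
# randomization, slack, and sparse page layout cost in footprint.
ratio = header_bytes / bitmap_bytes
```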
>>: If it's not significant, could you run it --
>> Emery Berger: Yeah, we ran it -- these represent -- this was 100 invocations
or -- it was 100 pages. I can't remember how many iterations, but then we
repeated the experiment. Because you have to; there's a lot of
nondeterminism. I can't remember the number of times; I think it was 20.
>>: Average across the 20 runs?
>> Emery Berger: That's right.
>>: A lot of variance during that period?
>> Emery Berger: There's variance for sure, yeah. On both sides, actually.
>>: So what's the expansion factor?
>> Emery Berger: I believe the expansion factor for these experiments was four
thirds. So you need -- I should add that if you don't have an expansion factor
here, the bitmap random probing doesn't work. So you do need some slack. It's
easy to explain: you have to have some constant fraction of zero bits to have a
constant probability of finding something whenever you do random probing. So if
you're full, then you'd run forever. Okay?
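The slack requirement is just geometric-distribution arithmetic: if a fraction f of the bitmap is free, random probing finds a free bit in 1/f expected probes. With an expansion factor of 4/3 the heap is at most 3/4 full, so you expect at most 4 probes per allocation. A quick simulation of that claim (my own sketch, not the DieHarder source):

```python
import random

def probes_until_free(bitmap, rng):
    """Count random probes until an unset (free) bit is hit."""
    n = len(bitmap)
    count = 1
    while bitmap[rng.randrange(n)]:  # True = occupied, keep probing
        count += 1
    return count

rng = random.Random(42)
n = 1 << 16
occupied = int(n * 0.75)  # expansion factor 4/3 => at most 3/4 of bits set
bitmap = [True] * occupied + [False] * (n - occupied)

trials = 10_000
avg = sum(probes_until_free(bitmap, rng) for _ in range(trials)) / trials
# Geometric distribution: expected probes = 1 / (1 - 0.75) = 4.
```

And at 100 percent occupancy the loop never terminates, which is the "run forever" case from the talk.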
>>: Sorry. You said this was -- was this Linux 64 bit Firefox?
>> Emery Berger: This was.
>>: Okay.
>>: Did you have to change any source code?
>> Emery Berger: No. No. So if you get to know me, you'll understand how little
I like to touch source code. So almost everything I do is
automatic. You don't have to change any source code. Mostly because I just
like that. It's a nice world to live in. So, no. And God forbid we'd have to change
Firefox. Just recompiling Firefox takes like an hour, let alone putting in a
change and hoping it worked.
>>: Did you try this on more things, like --
>> Emery Berger: I'll tell you what the challenges are. Yes, it works for Chrome.
So we have it -- I mean I have this available, I haven't released it yet. DieHard is
available, and it works for Mac. It works for Windows. It works for Linux. The
biggest challenge with any of this stuff is custom allocators. So there are many
applications that have their own allocators.
And sadly, the Mozilla Foundation people have taken the misguided step of
baking in a particular memory allocator, called jemalloc, for their upcoming and/or
new Windows builds of Firefox. The bad part is, of course, now it's baked in.
Now everything is deterministic, hurray. And it makes it difficult to replace,
although you can replace it. Luckily on the Mac this is still quite easy, and on Linux
it's very easy.
But many applications also just have their own: I will call malloc, get a big
chunk of memory, and do with it what I please. And then my randomization doesn't
help you at all -- well, it helps you a little, but not as much as I'd like. Other than
that, there are no technical limitations to it. All right. Any other questions? All right.
Well, that's it. Thanks for your attention.
[applause]