>> Ben Zorn: It's a great pleasure to introduce Emery Berger from the University
of Massachusetts. Emery is a long-time friend and colleague. He's worked with
us in many different capacities. Emery's gotten numerous, numerous awards,
including an NSF CAREER Award, an MSR Fellowship Award as a graduate student,
and teaching awards, et cetera.
Emery's done work, as probably many know, in areas related to security,
performance, reliability. Today he's going to be talking about security; in
particular how to use randomization to thwart the bad guys.
>> Emery Berger: Thanks, Ben. So the title, of course, as you can see is
DieHarder, securing the heap. So obviously before DieHarder came DieHard, if
any of you have watched the films. I certainly hope you have.
So I'm not really talking about the movies, of course, I'm talking about a project
called DieHard. DieHard is work that I did together with Ben a few years back.
And this work eventually ended up being the inspiration for Windows 7's fault
tolerant heap. Hence the little logo here. I really need to get one of those T-shirts.
Maybe not the whole thing was my idea. But anyway, so what is
DieHard? So DieHard is about probabilistic memory safety for C or C++
programs. So it's a reliability story. And it's all about dealing with programs with
bugs. So the goal of DieHard was to be able to withstand memory errors. That
is, the kind of bugs you get in sort of run of the mill C, C++ programs. So here is
an example where we've got some code and let's see if the clicker works.
Slowly. Maybe. This is a good sign. Nothing is happening and the elapsed time
clock on my, on the presentation timer is stopped. Yes, that's a good sign.
Yeah. It's one of those. Yeah. It came up. Awesome.
>>: Running on top of DieHard.
>> Emery Berger: That's right. There you go. So actually in fact what's
happening is, this is maybe not a very interesting technical detail, but it turns out
that the Mac OS has a really, really broken virtual memory subsystem. So it very
aggressively caches pages that you touch. So aggressively that it pages out
other stuff.
So presumably some part of the presentation went away. And probably the
PowerPoint code itself then got paged back in. It's pretty bad. It's so bad that
every now and then you actually, to get it to restore, I go to the command line
and type in the word "purge". So and that forces it to dump all of its pages, and
you can end up literally where you're running in free space where there's just a
handful of K available. K. Okay. And then you type "purge" and suddenly
there's 500 megs. So meanwhile it's swapping. It's horrible. All right.
Anyway, back to DieHard. So here is an example of a kind of bug you can get in
C, C++. Here printFoo actually calls delete. If you invoke printFoo twice on the
same object you get a double free. And in plenty of heap organizations this can
result in heap corruption. Another example, here are two examples. So you are
free to invoke delete or free on anything.
And that can include things that are part of the stack. Many memory allocators
will happily accept a pointer to some region on the stack and make it available to
you for use to satisfy future malloc requests which leads to some very
entertaining and terrifying bugs. Like when all of a sudden your stack variables
change value because there's a heap object changing. It's exciting. You can
also inadvertently free an object in the middle. So you can pass free a pointer
that points at a heap object but somewhere inside the heap object, and this can
screw things up as well.
So you have to be very, very careful. And, of course, there's dangling pointers.
So dangling pointers where you free something too soon. You actually had a
pointer left to it that you forgot about.
So here we have a pointer f to some foo object that's initialized with "happy". We
have a pointer x that points to the same thing. We delete f. Now we make a
new object, g. So we've allocated a new object, and in most allocators the memory
that was just freed up will be recycled. So now when I print out x's info over here,
it will probably say "sad" and not "happy".
All right? So and then, of course, the classic error. So that was a bunch of
errors. Then there's -- I saved the best for last -- the classic memory
overflow. Right? You underallocate an object, with ten here. Ten is just a
random choice of number. And then you go beyond it and you write something
and you land on some other data. All right. So here's all these bugs.
So what's the deal with DieHard? The deal with DieHard is DieHard is meant to
prevent or probabilistically tolerate these bugs. All right? So it replaces the heap
entirely with a new heap organization.
And the heap organization is bitmap-based, which is quite unusual. So there's
no actual pointer metadata. There's no graph in the heap. It's just these bits.
And each bit corresponds to an allocated object. If it's set to one it means it's
allocated. If it's 0, it's free. And so you can see right away, for example, that
double frees go away because when you free something you just set it to 0. If
you do it twice, you're just setting it to 0 again, it has no effect.
What about allocation? What happens? So when I need to do allocation, I
actually do it at random. So I don't just pick, say, the first empty bit. I randomly
stir around. Thank you PowerPoint. And I find something and then I set the bit
to one. All right? And return the corresponding pointer.
So when I go to allocate another object, I'm going to do the exact same thing.
I'm going to randomly choose once again and I alight on this one and I return that
object.
So what does this buy you? So it avoids double frees like I mentioned before.
The heap, I didn't really say this before but we also make sure the heap is bigger
than you actually require. We grow the heap on demand. So there's always
going to be a high likelihood of some empty space. So, for example, if you have
an overflow from here, and it lands on nothing, then it's benign. So this overflow,
it's sort of as if it didn't happen. I should also mention that the issue of invalid
frees can be dispensed with quickly by DieHard as well, because it knows where
the addresses of objects are and it can cheaply identify them. So it never -- if
you say free a stack object it says no. It just ignores it. If you try to free in the
middle of an object it says oh, you meant this object that's the enclosing object
and it takes that instead.
So a lot of these problems go away. Some of the protection like this protection I
showed just a second ago with the buffer overflow is just probabilistic. All right.
There's one other thing which is about a dangling pointer protection.
So suppose this object which was allocated you freed but you actually have a
pointer into it? So now it's a dangled object. So the way that DieHard protects
you in this scenario is totally based on the fact that you're doing random
allocation. So the problem arises with the dangling pointer when you recycle the
object. You recycle the object, you fill it with something. Now that object is
corrupt. Here your chances of hitting that one red 0 are in this particular instance
only 1 in 8 for the next time. Of course for a realistically sized heap we have
millions of objects. The chance is one in a million and so on and so on. So you
have a high likelihood this object will remain untouched until you actually are
done with it.
All right. And it stays that way for any foreseeable circumstance. All right? So lots of -- yes?
>>: When you do this random choice, in fact you increase the size of the chunk,
right, as well? You have overhead of the memory allocated?
>> Emery Berger: So I'm not exactly sure I understand the question.
>>: Okay. Let me rephrase. When you say that, which one of those bits is
actually being chosen for the next allocation, this is like a buddy system where
you say, oh, the first is two to the power of i and then the next is i plus one, et
cetera. If you [inaudible] then you're going to allocate a chunk of size two to the
power of i plus or minus something, plus something obviously, and then it's not
exactly the size that you needed but something slightly bigger because it goes
into a different place or something.
>> Emery Berger: So I guess the question you're asking is: If I allocate objects
of size eight and I allocate objects of size ten, are those in the same space? And
in fact it turns out that you can't really do this at all if you don't segregate things
by sizes. And the reason is that you would otherwise get catastrophic
fragmentation.
The intuition is very simple. If you allow people to allocate one object, one tiny
little object at random, and you allocate enough of them, then you'll completely
fragment the heap and you'll never be able to put a big object anywhere. So they
are segregated. Right now for the purposes of this example. I think in the
current implementation it's base two but it's not required. It could be any base.
>>: So my question was --
>> Emery Berger: Any reasonable base.
>>: Would the protection imply that you're going to allocate something to the
power higher than necessary? So say, for example, you needed 16 bytes and
finally you're going to allocate 32 or 64 because of this probabilistic choice.
>> Emery Berger: Okay. So I think the short answer to your question, which I'll
try to rephrase is do I overallocate to protect? So I do overallocate in the sense
that I overallocate the entire aggregate heap, right? But I don't overallocate each
individual object.
So there is a probability -- let me go back a little bit for this buffer overflow.
There's a probability that this object will remain unallocated. But it's not a certainty.
Okay?
So I'm still allocating, say, in this particular case pretend it's 16 bytes. When I
allocate or request 16 bytes then I get 16 bytes. It's just the fact that it falls out
from the fact that I'm randomly allocating in this bigger than necessary heap that I
get empty space. So I have a likelihood of empty space. The size of this
expansion is actually a tunable factor.
Right now this expansion by default is something like four thirds, but you could
make it bigger, and the bigger you make it, the higher the probability you have of
success -- of, let's say, correct execution in the face of a buffer overflow. I should
add there's other work that followed on this, called Exterminator, which finds the bugs and
fixes them. But I'm not going to talk about it today. I'll be happy to talk about it
off line.
So there's all this good probability stuff for resilience. Let me advance past the
animation. The runtime, you know, as you can all see, the runtime is really great.
So the runtime is fine. That's not really the point of this discussion.
I brought the little space shuttle image here because the actual original
motivation here was to try to see how you could do redundant execution and get
something meaningful out. So the space shuttle used triple modular redundancy:
every single thing you do, you do three times, and you take the majority vote. It's not
clear how -- it's clear for independent hardware errors that this is fine, but if you
have a bug, bugs are deterministic. Right?
So then the votes would always be the same. So if you actually run multiple
replicas with DieHard and they all have their own independent random number
generators or random seeds, then their heap layouts will be distinct. They'll be
independent by design, and then voting would actually work. So we built a little
prototype system that could do this, too.
But you don't need to use replication. I just mentioned that because it's part of
the paper. You can just use this as a stand-alone replacement for the heap.
So the goal here was resilience or fault tolerance. Right? And it got brought into
Windows 7 in a sense as the fault tolerant heap. A little bit watered down. In
particular, all this randomness stuff I mentioned earlier was eliminated; the fault
tolerant heap doesn't rely on it. The bitmap representation and all these things
are great, you get some resilience, but there's no randomization.
So one of the things that we thought about, and actually I came to Microsoft to
talk to some of the Windows folks about this, when they're talking about Windows
8, now we care about security. Okay?
So security is important. And my argument was: Look, DieHard is random.
Random is really good for security. And it may be annoying for other reasons,
but it's really, really great for security. So then the argument was: Well, maybe
we don't need -- all right. Your corner, you're talking about randomness. Fine.
But in our corner, what we're going to do is pepper the heap with
canaries, little indicators that something has gone wrong and maybe we'll add
some page protection here and there and we'll do this and do that and do this
and it will be fine.
So you don't need the randomization. The randomization is to make it really
difficult for somebody who is exploiting a bug to actually locate objects or get
their exploit in whereas these things are made to detect or raise the bar or make
it just as hard.
So I contended that randomness was really important. And then we had this
discussion. And I realized we really didn't know -- we had no way of saying who
was the winner, right? It's just argument by intimidation. You know, I'm right.
That's not really very convincing.
So let me tell you, not the way we discussed it in the paper -- I welcome anybody to
go look at the paper and see all the arguments, which actually are mathematical;
they talk about these exploits and the threat model and all this stuff.
But instead I'm going to give you a flavor of the problem with a geographical
analogy that I hope will not offend too many people.
>>: I'm already offended. [laughter].
>> Emery Berger: Well, if you were Belgian you'd be more offended. But we'll
wait for that.
>>: Wait a minute. [laughter].
>> Emery Berger: All right. So here's France --
>>: [inaudible].
>> Emery Berger: That's good. That's good. You'll be happy, because France
is going to represent Windows. Okay? And Germany is going to represent the
hackers. All right. So these are the black hats. All right. So, again, no
aspersions meant on Germans or French. But it's a great analogy. Right?
So Germany and France, some of you may recall, if you read your history books,
that France and Germany have a checkered history. And maybe not the best
relations throughout history. So in particular there have been lots of invasions,
so Germany invades. And then eventually there's a counter-attack and Germany
is pushed back.
And then there's a period of intervening peace. And then Germany attacks
again. And then there's a counterattack. So really this is, of course, meant to
represent the threat of hackers, right? So the problem is that the hackers come
and discover some vulnerability. Right? They're trying, during this period of
peace, there's sort of quiescence while they're trying to come up with the
vulnerability. They discover it. And there's some sort of countermeasure. In
fact, there's a whole history of countermeasures that have been put into heap
organizations both in Linux and Windows as well as in Mac, to try to deal with
these issues.
All right? So it used to be you could do this one thing and there would be a great
vulnerability. You could exploit it really well. They said: Oh, we'll fix that. We'll
make sure you can't do that. All right? So but then it's back and forth, back and
forth. And, you know, people at Windows and people -- Microsoft and people in
France got tired of constantly being invaded and having to push the invaders
back.
So what did they do? They decided to come up with the ultimate solution that
would prevent any further attacks. Or at least invasions, which is...the Maginot
line. So the Maginot line is, by the way, an amazing feat of engineering. We
tend to make fun of it in the United States, I think without understanding the
scope. In 1939, 1940 dollars, a little bit before '40, this was an unbelievable
amount of money. It was billions and billions of dollars back then. So it was
massive, massive undertaking, and each of these things, so if you go through the
countryside along the border with Germany, you'll occasionally see these
pillboxes. All you see are the cement pillboxes; what you don't realize is all the stuff
underground.
It's really amazing. They have separate air supply. There's armored posts to
repel invaders, they have trains so they had electric railroads to connect the
stations. They had food. Air locks to deal with gas attacks. So really, really very
impressive.
So let's replace France with Windows. So let's say we go and we build the
Maginot line for Windows and we're going to prevent all further attacks. Okay.
We're done. Why do we know we're done? Because we put together this
impregnable fortress. And then we block all ways for any attacker to get in,
right? So here's Germany and here's France, and these little dots, these sort of
dashed line indicate less fortification, less dense, but really, really dense here
and less dense over here but we're really protected.
Okay? So what happened?
>>: Technology changed.
>> Emery Berger: No.
>>: People.
>> Emery Berger: No, there was a hole. So there was an unexpected
circumstance. The unexpected circumstance was that Germany invaded
Belgium. They invaded Poland. That was different. They came through
Belgium. It was neutral and therefore untouchable. You didn't really need as
dense protection on the border with Belgium. And the Germans very quickly
went through Belgium which wasn't in a position to put up much resistance,
effectively went around the Maginot line. It wasn't that the Maginot line didn't
work. It just really wasn't tested. Because they found another way.
And so this is, you know, really a pretty fair analogy to the situation that we often
find ourselves in. We're always sort of fighting the last war. So the last war is,
man, the Germans just come across the border and come right in and we're
going to stop them from doing that. Then it's oh they come through the Belgian
border. Next time we'll seal off the Belgian border too and then next time next
time.
>>: One subject in 1914 they came from Belgium and what's this? [inaudible]
creating that?
>> Emery Berger: I'll say it came from Encarta. No, I have no idea where I got
the map. I don't remember.
Anyway, so here we are. So the question is today: Are we done? Are we
secure? Right? Or is there another hidden channel that we're unaware of?
Right?
So the challenge is to actually know that we've done something. Right? If we
build some defense, have we actually improved security? Are we just putting
bars on the front windows and leaving the back door wide open?
All right. So for the rest of this talk, as before with DieHard, I'll just be talking
about the heap, which is scary. As you can see. So the heap is -- there's lots of
work that focuses on code. Code is much more structured. It's a much more
well understood problem. The heap is more problematic. And again a lot of
these problems go away in the world of managed code. So if you're a managed
code bigot, you know, this talk is over for you now.
Because it's only, you know, or if you believe programs can never have bugs, if
you can get a program with no errors, then it's also not a problem. Right? So
every exploit depends on the unsafe programming language substrate and an
error.
No errors, no problems. All right? So, again, rather than go into gory technical
details, but to stick with my belligerent theme, I'll do an analogy based on Stratego. How
many have played the game or are familiar with that game?
>> Emery Berger: Really. Man. Yes, every time, it's getting worse and worse.
But there's a clear age divide. So Stratego is a board game, which means it was
played with physical objects.
People would get together and not be near electrical sources. [laughter] and play
these games. So the idea behind Stratego is you have two players. And the two
players face off. And you can see that there's red pieces and blue pieces.
Pretend we are the blue player.
So the blue player can see these little faces on his side, but on his opponent's
side can't see anything. Just red. And obviously it's the same situation for the
other one.
And in effect Stratego is a fancy game of capture the flag. All right? So there are
all these different pieces. And the pieces have these different powers. You have
a small number of very powerful pieces and a large number of cannon fodder,
let's say, and basically the higher number indicates that they're weaker. So if a 9
touches an 8 the 8 wins and the 9 piece is taken off the board.
And should you find the flag, when you move your piece, and you can move your
piece up and hit it, if you get the flag you win. All right?
So in addition to all these little moving pieces, there are also bombs.
also bombs. So think of these as mines. If you hit a mine, unless you're the
miner piece, then your piece blows up. All right?
So let's use a very simple model where we only have two kinds of pieces. Okay?
We have flags and we have bombs. All right? And our opponent, instead of
actually having to move a piece linearly, right, going tap, tap, tap, they can reach
down like the hand of God and touch a piece. Okay?
If they touch a flag, they win. All right? And if they touch a bomb, then they lose.
Okay? So right now you can see in this board, they have a 50/50 chance of
winning. 50/50 chance of losing.
So this is meant to represent a pretty conservative threat model, right? The
attacker can exploit any object, can reach any object on the heap, right, at will.
That is, they can go and pick -- I shouldn't say any object on the heap. I mean
any location.
So this is not always the case. But trying to preclude the Maginot line scenario,
let's imagine that our attacker is really, really powerful. What can we do? Okay.
So to be concrete, the flags, of course, represent sensitive data. All right? So
there's some data that's somehow exploitable. And we argue that all data is
sensitive. This is clearly not true in general. However, luckily for this example it
turns out that on conventional heap organizations, it really is pretty much true
that all data is sensitive, because right next to all data are these yellow objects
which are heap metadata. So the heap metadata lives right next to these
objects. And the heap metadata has information like a pointer, the size of the
current object. So this is essentially the underlying graph representation, the
data structure that holds the heap.
And if you go and you overflow into it, as here, right, I've written the first byte of a
pointer. That is bad news. Okay? This will not end well.
Right? So in effect the heap is already peppered with sensitive data in that kind
of an organization. So it turns out that the analysis basically is very simple. If
you have metadata on the heap, then the game is already lost.
All right? So but we'll continue. All data, metadata are sensitive. This bomb
here is going to represent unmapped memory or a guard page. So if you go to
access memory that is unmapped, this is what happens to you.
Okay? You get one of these scary messages. And this is basically a segfault or
an AV (access violation) in Microsoft terminology.
Let's dive down. So here is the layout in Stratego terms of Windows and Linux. I
highly recommend for the amateur or beginner Stratego players that you do not
employ this layout at home. This is a bad layout. So everywhere I touch,
everywhere I see this stuff, right, it's all flags. So I just go and I touch it and I win.
Okay. Yes?
>>: How do you know you're not touching free space?
>> Emery Berger: That's a good question. So the question is how do you know
you're not touching free space? So the question free space is actually not such a
problem for me, from the perspective of this analogy, because free space, the
freed objects also have metadata in front of them. So basically a freed object is
just as vulnerable as an object that's in use.
All right. So this is not entirely true, this picture. This is really bad. So Microsoft
and the Linux community recognize that this is maybe not the best heap
organization. So they did this. Did you see that? Did you see. Look, look. See,
it moved. Okay, the whole thing moves. This is called ASLR. Or address space
layout randomization. The idea is you pick up the whole heap and you place it at
someplace at random.
So it turns out that you don't do anything else with the objects. You just need to
know where the start of the heap is. So the amount of entropy that's involved is
actually surprisingly low.
So if you know the address of any one object on the heap, you now know the
address of all the other objects on the heap.
So you can randomly find it fairly quickly, or you can just know one object and
now you've given away the store. Okay. So back to DieHard. So when I first
looked at this problem, I said: You know, DieHard has randomization. It has no
metadata on the heap. Done. Right? Perfect. It's resilient and secure. Right?
It's motherhood and apple pie. Who can argue with these things?
So here is an example DieHard layout. Now I should add DieHard actually looks
more like this. Okay? So DieHard unmaps all memory that's not currently in use.
Right? And it can randomly allocate objects across the entire address space.
Now, so here's this one layout, I'll remove the bombs but pretend the bombs are
there. If you randomly go and pick a place in the heap you're probably going to
hit a bomb. But you'll notice there's one, there's two, there's four, here's another
possible layout.
Again, one, two, four. Right? So the problem here is that we didn't really think
about this, because we were not worried about security when we designed
DieHard. But DieHard has this strategy I mentioned before that it grows the heap
on demand. And the way it grows the heap is by doubling.
So once one of these -- this is size 8 object, size 16 object. Once this one
becomes too full, like half full, then you need to get more space. And the way we
did it is we just got twice as much. Right? So this turns out to be very efficient.
It actually allowed us to locate objects on the heap in amortized constant time.
Very, very nice. In fact, all of DieHard is big O of 1 amortized. So we were like,
yeah, we're geniuses. However, from a security standpoint not so much,
because if you allocate four, then eight, then 16, then 32, at any one time,
50 percent of those objects live in the exact same block.
So that means that all you need to know is one of the addresses and if you can
overflow from there, you can trash half of the heap. So that's not so good. All
right. So we said, all right, maybe we need to rethink things a little bit and see
what we can do.
But before we did that, we also looked around. We said: Well, there's Windows,
there's Linux, not the most famous for security. But this little guy represents
open BSD. This is their logo. And open BSD is by design this -- it's actually
pretty annoying. Nobody really wants to use it because it's so crippled. Like they
won't ever update things, because they're scared of introducing security leaks,
right? Security vulnerabilities.
But they're tremendously, tremendously anal about security. And they created a
new allocator which was designed to enhance security. So this is roughly
speaking how their allocator works. First, pretend there's bombs again, which
you'll see in a second. But you break up the heap into pages.
So there are 4K chunks for the heap. And in effect in between all of these
things there's bombs all the time. So if you are here and you overflow from this
object, you're quite likely to hit a bomb. All right? So that's pretty good.
It turns out that they also optionally -- that's what the asterisk is for, when you
free an object, they can -- it's not set by default, they can overwrite the contents
with crap.
This turns out to be required. All right? It's not necessarily obvious; the paper
has more details. But there's a kind of attack called heap spraying, which is an
awesome, awesome attack where you fill up the heap, like you go to a Web page
and it's JavaScript. JavaScript can allocate as many objects as it wants, fill them
with whatever, you fill them with shellcode, and you just jump to an arbitrary
place in the heap. And you're likely to hit shellcode.
So the problem here is that if you actually don't destroy things as soon as you
delete them, then all your free space can be filled with shellcode. All right? And
this means you can have a gigantic sort of unbounded fraction of stuff that is all
waiting to be jumped into.
All right. So here's that. But still it's a little unsatisfying because the objects --
like here's an object, here's an object, it's deterministic. In fact, I jumped back.
It's not entirely deterministic. So what they do is they do some little bit of
randomization. And the way they do the randomization is actually quite cute.
What they do is they keep a few slots that correspond to like the low order bits of
the objects. So you hash them in.
When you go to free an object, you look in the corresponding slot. If there's no
object there, you put the object there. If there is, then you free that object that
was sitting there in its place. So it adds some uncertainty to where the objects --
like when the objects get freed. It turns out, unfortunately, it's not that much
entropy.
So when you see entropy here, 2 to that power means sort of how long or how
many times I'm going to have to like hit this to find my object that I really wanted
to attack.
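The delayed-free cache just described can be sketched like this; it's a simplification, and OpenBSD's actual scheme differs in detail:

```c
#include <stddef.h>
#include <stdint.h>

/* A small cache of slots indexed by a hash of the pointer. Freeing swaps
 * the incoming pointer with whatever is parked in its slot, so *when* a
 * chunk is really released becomes slightly unpredictable. */
#define CACHE_SLOTS 16
static void *cache[CACHE_SLOTS];

/* Returns the pointer that actually gets freed now, or NULL if deferred. */
void *delayed_free(void *p) {
    size_t slot = ((uintptr_t)p >> 4) % CACHE_SLOTS; /* low-order-bits hash */
    void *evicted = cache[slot];
    cache[slot] = p;    /* park the new pointer... */
    return evicted;     /* ...and release whoever was parked there before */
}
```

With only 16 slots, the entropy added is small, which is the point made above: it shuffles release order a bit, but an attacker who knows one address doesn't have to search long.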
Now, having all these bombs out here means that's great, right? The bombs are
fantastic. It means purely random, I don't know anything, I'm going to hit a bomb.
But if I do know one object address, and I need to reach another one, then this is
really not very much. Right? This means that on the order of thousands I can go
ahead and win.
So what did DieHarder do? We adapted the other layout, but we basically take
the best of both worlds with the exception that we can't really support dangling
pointer protection anymore because we have to destroy the contents of freed
objects. But these objects are completely independent of location.
So one object is totally decorrelated with any other object. You leak practically
no information at all. You can't actually do better, it turns out. That is, the
amount of entropy we get from these objects you could actually use as a random
number generator. So if you drop like the bottom six bits or something, you have
no way of knowing where those objects came from.
So knowing the address of one object reveals no other information about other
object locations. Yes?
>>: The heap spraying -- if you're not worried about something executing on
the heap, can you do dangling pointers? Dangling pointer protection?
>> Emery Berger: That's a good question. So the question is: Could we go
back to protecting against dangling pointers if we don't worry about heap
spraying, right?
>>: Or you protect the heap spraying via no execute [inaudible].
>> Emery Berger: Right. So if the heap is not executable and there's no way of
making the heap executable, like mmap or mprotect gets taken over, it's not clear to
me. I actually would want to do the math, to be honest. There's math in the
paper, and math keeps you sane.
So a lot of -- these results actually fell out from doing the analysis. So I would
want to be more careful. Intuitively it seems like it's probably fine, because then
you can still have a bunch of stuff lying around and it's free. Maybe it doesn't
matter if it's not code. But I'm not entirely sure. It seems like information leakage
for sure could be a problem.
So the way that we did this was we extended the layout. We did the traditional
computer science solution of one more level of indirection. So these miniheaps,
instead of being contiguous, are contiguous pointers to pages, right?
Then these pages are chosen randomly. I should note that you cannot rely on
VirtualAlloc or mmap to give you sufficiently randomized pages. Far from it.
So we actually map a gigantic space and then randomly activate pages within it.
Otherwise you get very, very little entropy. Much better on Windows, by the way,
than on the Mac. So if you want to pat yourself on the back.
Actually, with the new OS X Lion they made some hay about having increased the
randomization. I was like: Really? It's like maybe two bits. So not very impressive.
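The reserve-then-activate trick described above can be sketched as follows. This is a simplified model in Python; the real system reserves one giant mapping and commits pages inside it, whereas here page indices simply stand in for the mapping. The class and method names are my own, not DieHarder's.

```python
import random

PAGE_SIZE = 4096
SPAN_PAGES = 1 << 28  # pages in a huge reserved span (~1 TB of address space)

class RandomPageSource:
    """Hand out page offsets chosen uniformly from one huge reserved
    span, rather than trusting mmap/VirtualAlloc to randomize placement."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.active = set()  # page indices already activated

    def activate_page(self):
        while True:
            idx = self.rng.randrange(SPAN_PAGES)
            if idx not in self.active:   # retry if that page is already live
                self.active.add(idx)
                return idx * PAGE_SIZE   # byte offset within the span

src = RandomPageSource()
offsets = [src.activate_page() for _ in range(100)]
```

Because every activated page is drawn uniformly from the whole span, the entropy per page address comes from the size of the reservation, not from whatever little randomization the OS mapping call happens to provide.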
All right. So you might be asking yourself about performance. Here's the ritual
performance graph. I'll just wrap this up. Boring. So basically this is GNU libc.
So that's the default allocator for Linux.
This is a newer version of dlmalloc, which in principle one day will become part
of Linux. This is OpenBSD's more secure allocator. This is DieHard and this is
DieHarder. A lot of these are the SPEC benchmarks. For most of them it's
a wash. There's a couple where it's not, notably xalancbmk and omnetpp. The
geomean is roughly 20 percent, I believe. It's around 20 percent. So there's some
degradation. Most of the performance degradation comes really from the TLB.
Right? So now we have pages. These objects are located in a bunch of random
pages. That means that the TLB footprint is larger. And that's mostly what kills
you. And really this is more -- it says more about Intel's annoyingly baked-in TLB
organization than anything else.
It has a very -- it works great if things are dense. It doesn't work well if it's
sparse.
>>: Could you go back to the previous slide? You mentioned earlier the
fragmentation within the heap. This adds fragmentation across the address space.
So you end up limiting the maximum size of a contiguous allocation, because
you're selecting pages at random locations.
>> Emery Berger: Right. So we do it over -- so it's true. But because we're
randomly allocating over so many pages, right -- I don't know if it's
terabytes, but it's a huge, huge amount -- fragmentation doesn't really
matter. I should also add that we directly mmap large objects. So that
mitigates things further.
>>: So are you saying that you prereserve a big chunk of the address space in
which you do the randomization.
>> Emery Berger: We pre-reserve a big chunk of the address space. If your
object exceeds -- I can't remember what the threshold is, but I think it's something
like 8K -- then we invoke VirtualAlloc or mmap directly, which means it will pull it
off somewhere else.
>>: Then so on a 32 bit process, how big is that space?
>> Emery Berger: Yeah, 32 doesn't work as well. There's a limited amount of
entropy to exploit with 32 bits. 64 works great. 32 we get as much as we can.
>>: [inaudible] AMD?
>> Emery Berger: I expect it's the same. Basically the x86 architecture dictates
a particular layout of the TLB. So when you go -- if you have something that's
not satisfied by the TLB, it has to go out and walk the page table entries. The
page table entries have to be laid out in memory a particular way. That's how
Linux and Windows have to lay them out.
>>: Does that have any implications for concurrency [inaudible]? If you're doing
more walks, are you more likely to interrupt threads, serializing execution?
>> Emery Berger: Excuse me. So obviously threads share the same TLB, right?
So I'm not exactly sure. I mean, the footprint is larger, right? So footprint being
larger is just a problem in general. I'm not sure concurrency is really much of an
issue since they share the TLB. It's just think of them as being serial but just
visiting a bigger address space. So it probably does exacerbate it.
Did you have a question?
>>: I think I heard Intel is implementing a multi-level TLB.
>> Emery Berger: Yes, that's a good point.
>>: Is that the processor you used, or is that --
>> Emery Berger: No, I don't recall which one this was we used. It was some
Xeon. So it's true that Intel is actually putting in some level of multi-level TLB,
which will mitigate the cost of TLB misses by having the same sort of thing you've
come to expect for caches: you do the same thing for the TLB. You have
slower memory, but it's better. That said, you know -- basically
the way that it works, it expands out a level and fills in all the entries whenever
you visit a page in this space.
And so it consumes a huge amount of RAM just for maybe one entry. Okay? So
for us this is pessimal. There are sparse layouts that would be fine, and on
systems like SPARC you have software TLBs and software walkers, and
then we could have done whatever we wanted and it would have been great. But
it is what it is.
>>: So what allocation strategy do you use for the pages that contain the
bitmaps themselves?
>> Emery Berger: Those are themselves randomly allocated.
>>: Within the same span.
>> Emery Berger: Yes, that's right.
>>: So there have to be references to those pages.
>> Emery Berger: There are. That's right. At the end of the day somebody has
a pointer that points to the actual heap. And to all these things, right? And so if
you can somehow get in and change some globals, then I'm screwed. But like I
said this is about protecting the heap. Protecting globals is maybe not as tricky.
>>: I guess that's true, but it's also the case that, depending on what the entropy
is for the image that contains the references to those pages, discovery of where
all the pages are mapped may be possible. If the entropy is very low where that
image is located, that's --
>> Emery Berger: That's the place to attack, that's right. I agree. The other
obvious place to attack would be the random number generator seed itself.
Right? So if you can force things to be deterministic, then you win.
So we actually have other work that's not security-related but has security
implications, which randomizes everything and periodically rerandomizes everything
again. So this would avoid that sort of problem. But it's an excellent point.
So I'm going to give this the time it deserves. Okay and move on to an
application somebody might actually care about. All right. So we decided SPEC,
SPEC, who cares, right? I want to look at real applications. So the question was:
All right. We're going to run Firefox. We ran Firefox -- I can't remember how many
pages -- but we loaded up like the top 100 websites from Alexa and copied them
all onto a local machine to avoid network effects, and then just scripted Firefox to
load all these pages. So then the question was: What happens when you plug
in DieHarder? What is the performance overhead? So hey, look, I have a talk.
Okay. So this is how long it took without DieHarder. So this is on Linux.
So it took 44.2 seconds. So now the question is: What would you be willing to
tolerate for these great security benefits I just outlined? All this protection. How
much overhead would you be willing to pay for your Firefox session, pretending
you all use Firefox?
>>: No more than five percent.
>> Emery Berger: No more than five percent.
>>: How many pages?
>>: Are we talking runtime? So this is runtime costs, we're not talking about like
footprint as well.
>> Emery Berger: This is not talking about footprint. That's right. All right. Less
than five percent, that's it? Okay. Nobody's willing to -- tough crowd. I should
add -- yes?
>>: I'd say anybody not willing to pay eight percent -- state sales tax -- is not
serious. [laughter].
>> Emery Berger: It should be related to taxation. So Europeans here should be
much more comfortable. Whatever your value-added tax is, that's the acceptable
overhead. And Canadians, too, of course. When I gave this talk -- it's very
interesting -- I gave this talk at CCS and recently at WOOT, which is a USENIX
workshop, and I asked the same question, and somebody in the crowd said 2x.
So that tells you there's some different cultures going on here.
All right. So you ready? Here's the reveal. Okay. So it took less time. All right.
Now, it turns out that this number is not statistically significant. Okay? So you
can't rule out that this was just a random occurrence, right? But let's
face it. It's close enough that we don't care, right? The differences are in the
noise. I should add this is not going to work all the time. There are definitely
cases where it won't: if you're running SunSpider, exercising JavaScript, you're
allocating tons and tons of objects, way more than normal. And Firefox allocates a lot
of objects, by the way. It allocates thousands of objects every time you move your mouse.
I wish I were making that up. Okay? So it allocates tons of objects. But the
runtime overhead is actually pretty acceptable. On the footprint question, there is
definitely more memory consumed. If you are increasing the amount of space
between objects and you're randomly allocating things, the footprint is larger -- not
just the overall memory usage but also the VM resident set size. That said, we
see something on the order of maybe 20 percent. You actually win a little bit by
getting rid of the metadata. So metadata for objects on 64-bit systems basically
has to be 16 bytes per object. And our overhead per object is one bit.
So we kind of -- we win there and then we lose with the randomness.
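The arithmetic behind that trade is quick to check. The 16-bytes-per-object and one-bit-per-object figures are from the talk; the sketch itself is mine.

```python
objects = 1_000_000

# Conventional 64-bit allocator: ~16 bytes of per-object header metadata.
header_bytes = objects * 16

# Bitmap allocator: 1 bit per object slot = 1/8 byte per object.
bitmap_bytes = objects // 8

# The bitmap is 128x smaller; this saving partially offsets what the
# randomization, slack, and sparse page layout cost in footprint.
ratio = header_bytes / bitmap_bytes
```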
>>: If it's not significant, could you run it --
>> Emery Berger: Yeah, we ran it -- these represent -- this was 100 invocations
or -- it was 100 pages. I can't remember how many iterations, but then we
repeated the experiment. Because you have to; there's a lot of
nondeterminism. I can't remember the number of times; I think it was 20.
>>: Average across the 20 runs?
>> Emery Berger: That's right.
>>: A lot of variance during that period?
>> Emery Berger: There's variance for sure, yeah. On both sides, actually.
>>: So what's the expansion factor?
>> Emery Berger: I believe the expansion factor for these experiments was four
thirds. So you need -- I should add that if you don't have an expansion factor
here, the bitmap random probing doesn't work. So you do need some slack. It's
easy to explain: you have to have some constant fraction of zero bits to have a
constant probability of finding something whenever you do random probing. So if
you're full, then you'd run forever. Okay?
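The slack requirement is just geometric-distribution arithmetic: if a fraction f of the bitmap is free, random probing finds a free bit in 1/f expected probes. With an expansion factor of 4/3 the heap is at most 3/4 full, so you expect at most 4 probes per allocation. A quick simulation of that claim (my own sketch, not the DieHarder source):

```python
import random

def probes_until_free(bitmap, rng):
    """Count random probes until an unset (free) bit is hit."""
    n = len(bitmap)
    count = 1
    while bitmap[rng.randrange(n)]:  # True = occupied, keep probing
        count += 1
    return count

rng = random.Random(42)
n = 1 << 16
occupied = int(n * 0.75)  # expansion factor 4/3 => at most 3/4 of bits set
bitmap = [True] * occupied + [False] * (n - occupied)

trials = 10_000
avg = sum(probes_until_free(bitmap, rng) for _ in range(trials)) / trials
# Geometric distribution: expected probes = 1 / (1 - 0.75) = 4.
```

And at 100 percent occupancy the loop never terminates, which is the "run forever" case from the talk.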
>>: Sorry. You said this was -- was this Linux 64 bit Firefox?
>> Emery Berger: This was.
>>: Okay.
>>: Did you have to change any source code?
>> Emery Berger: No. No. So if you get to know me, you'll understand how little
I like to touch source code. So almost everything I do is
automatic. You don't have to change any source code. Mostly because I just
like that. It's a nice world to live in. So, no. And God forbid we'd have to change
Firefox. Just recompiling Firefox takes like an hour, let alone putting in a
change and hoping it worked.
>>: Did you try this on more things, like --
>> Emery Berger: I'll tell you what the challenges are. Yes, it works for Chrome.
So we have it -- I mean I have this available, I haven't released it yet. DieHard is
available, and it works for Mac. It works for Windows. It works for Linux. The
biggest challenge with any of this stuff is custom allocators. So there are many
applications that have their own allocators.
And sadly, the Mozilla Foundation people have taken the misguided step of
baking in a particular memory allocator, called jemalloc, for their upcoming and/or
new Windows builds of Firefox. The bad part is, of course, now it's baked in.
Now everything is deterministic, hurray. And it makes it difficult to replace,
although you can replace it. Luckily on the Mac this is still quite easy, and on Linux
it's very easy.
But many applications also just have their own: I will call malloc, get a big
chunk of memory, and do with it what I please. And then my randomization doesn't
help you at all -- well, it helps you a little, but not as much as I'd like. Other than
that, there are no technical limitations to it. All right. Any other questions? All right.
Well, that's it. Thanks for your attention.
[applause]