>> Weidong Cui: So it's my great pleasure to introduce Trent Jaeger. He's an
associate professor at Penn State. Before that, Trent worked for many years at
IBM Research, Watson, and Trent has done a lot of interesting work on operating
systems security, like access control and some SELinux work. So today Trent will
talk about his recent work. Thank you very much.
>> Trent Jaeger: Thank you, Weidong. Thanks, everybody, for coming. So I'm
going to talk about taking the various system components that exist now that
enforce security policy -- so virtual machine monitors as well as operating
systems as well as applications -- and trying to put these things together in
some coherent way into a system that can enforce some well-defined security
goals.
And hopefully all of this will be measurable in some sense. So we'll be able to
measure whether the goals are enforced, and if they're not enforced, you know,
where they're not enforced; or if they're enforced based on accepting some risk,
such as accepting some bad data in a way that won't break your program, that we
can identify all of those risks. So we get a comprehensive view of what's going on.
So, yeah, I wanted to say a little before that. So this work has been built up over
the last probably seven, eight years, based on what I was doing at Watson. And
when I was at Watson, for the last four years I was, as Crispin [phonetic] is
familiar, the security liaison from IBM Research to the Linux Technology
Center. And so we had some interactions there.
A lot of the examples will be in terms of Linux; the work is basically very Linux
focused. But we'll just sort of assume -- I mean, the problems are the same
across different systems. And so we'll just, you know, use Linux names to
protect the innocent or something like that, or blame the other guilty people.
So at Penn State we have a security lab, with three faculty in it. Patrick
McDaniel and I are the co-directors. Adam Smith joined the lab a couple of years
ago. Adam won what's called a PECASE award, which is an award for the best --
there's more than one, but it goes to the best set of NSF CAREER awards. So his
CAREER award was accepted, and it was one of the best, so he gets a firm
handshake from our president at some point.
We work on basically all facets of host and network security except for hardware
security, and in addition Adam does cryptography. We focus a lot on system
security. That's my main focus. And then Patrick has done work on
telecommunication security. I've looked at that mostly from a systems
perspective.
In terms of language based security, we mostly -- we aren't developing so many
new language based tools. We're building on fundamental language features
and trying to do useful things with those.
Okay. A shameless plug for my book. This came out last year. I don't
know if you're familiar -- Morrie Gasser has a book that he wrote in 1988 on how
to build a secure system. It's out of print, so you have to scrounge around to find
it. It's actually a pretty interesting book. One of the main principles in
that book on how to build systems was a thing called the reference monitor. And
we'll come back to this today, also.
And so he described how you would go about building an entire system,
including authentication and other features, based on reference monitor principles
and some others. And so my book looks specifically at just the reference
monitor principle and how well individual systems have or have not achieved the
guarantees required of a reference monitor. And if you don't know what I'm
talking about, I'll get to it at some point in the talk and explain it.
But anyway, it looks at a variety of systems and a variety of different operating
system mechanisms and how they are and are not amenable to achieving the
guarantees of a reference monitor.
Okay. So I have a lot of slides, and we have a fair bit of time. But I want to tell
you where I'm going first so you have some idea what to look for, what to ignore,
and what to focus on.
So basically I'm going to start out talking about classical information flow -- so
multilevel security and Biba -- and how this has, you know, formed the foundation for
computer security in the DOD environment for, you know, 35 years or so. And in
terms of building what people would consider, you know, bulletproof secure
systems, this has been one of the foundations for building those systems.
These policies are really an idealization, I'm going to claim, and I'll show you why
and give you an idea of where we need to go. What's going to come out of that first is
that we do depend on a lot of application-level policy enforcement. So we started
out with some applications outside the OS doing security enforcement. You
know, Multics had more than two rings; it had more than just the OS
and the rest of the system. It had several rings, and they assumed they'd have
other trusted programs -- maybe not quite as trusted, but they would still be
trusted. Whereas for many years we pushed security in the security community
further and further down into the operating system and tried to trust less and less.
But what of course is happening, and you guys see this here, is that there's a lot
of security decision making going on at the application level, and we really need
to take that into account when we're building systems at large. We can't just
ignore that.
So we're going to look at application-level security, and then we're going to look
also at adding virtual machine monitor enforcement into the picture. So now
we have virtual machine monitors and operating systems and applications. And
of course if we go further, we also have network enforcement as well.
And so ultimately we'd like to get our head around all of these things. And so I'm
going to talk about what it takes conceptually and some progress that we've
made. That's the really very recent stuff that we're still sort of
formulating, working on the paper in our heads. And then an earlier piece of
work that reconciles the security of a particular type of application
with the system that it's running on in an automated way.
Okay. So here's sort of what I just spoke about. Historically, the
security community looks at it as: hey, we have an operating system, it enforces the
policy, we don't have to depend on the applications to do anything. And what
happened in the '70s is we found that with multilevel security, in fact, we could
show that even if there's a Trojan horse running in one of these applications,
the operating system policy and its enforcement mechanisms --
modulo covert channels -- could prevent secrets in the Trojan-horse-infected
program from leaking to some other application, based on this MLS policy and
this enforcement at the system level.
And so the bias seemed to go more and more toward pushing things into the
operating system -- really focusing on, you know, making this small and
making this the full basis for enforcement in the system, and trying to take
anything away from the applications, assuming that we don't trust the
applications. Things like minimizing the TCB became maybe over-minimizing.
So what we had then for these kinds of systems, in the DOD security
community, were these two types of approaches to security, one used much
more than the other. Secrecy was a much heavier focus than integrity, and it still is.
So, the policies: multilevel secrecy basically says we're not going to allow any
leakage of secrets to unauthorized principals. Okay. And so we're going to set
things up based on clearances and access classes of objects so that if you're not
cleared to access certain objects, you won't be able to gain access to that data.
Biba is just the dual of this for integrity. It says, hey, if you have some low
integrity process, it's not going to be able to send information to a higher
integrity process, because we don't want that higher integrity process to depend
on the data that it's receiving from the low integrity process, and so Biba will try
to prevent this.
And so both of these policies are described as information flow policies, and so
they get interpreted in the community this way. So for secrecy we have some
secret entity, and it may or may not have a flow to some other entity, and if,
the way you configured the system, there's a mechanism such that the secret
entity can cause information to flow to the public entity, this would be a
violation of the MLS policy.
And it's exactly the same thing, just in reverse, for Biba integrity. So if there's a
flow from the low integrity process to the high integrity process, this would imply
a dependency. So I mean there may not actually be a leakage, right? You
know, you enter your password into a program. That program that's
receiving the password that's entered knows the secret password. Maybe you're
an attacker, you don't really know the secret password. So you enter the
password into the program, and the program tells you that was the wrong password.
Well, that's actually a leak of information. That's saying, hey, that password you
entered isn't the right one. And it narrows the space of searching that you'll have
to do for the right one -- maybe not very much if you have a reasonable size
password, but it is, from an MLS perspective, a leak of information. Similarly, you
know, you have some high integrity process. It's going to provide
services such as changing your password, for example. And when you ask to run
that program, you're providing input to that program from a process
that's -- you know, from a shell or something that's not necessarily trusted.
But we still depend on the password program to do the right thing. But both of
these would be violations of these policies. And so what you would need is some
fully assured guard process to make sure it's okay to communicate
between these two parties in both of these cases. And obviously in commercial
systems we don't have such processes.
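To make the two flow rules concrete, here is a minimal sketch, not from the talk, of how an MLS check and a Biba check each decide whether a flow is allowed; the level names and the integer ordering are purely illustrative.

```python
# A minimal sketch (not the speaker's code) of how MLS and Biba decide
# whether an information flow from a source to a destination is allowed.
# Levels are ordered integers here purely for illustration.

LEVELS = {"public": 0, "secret": 1}       # secrecy levels (MLS)
INTEGRITY = {"low": 0, "high": 1}         # integrity levels (Biba)

def mls_allows_flow(src_secrecy: str, dst_secrecy: str) -> bool:
    # MLS: information may only flow "up" in secrecy, never from
    # secret to public (no leak to uncleared principals).
    return LEVELS[src_secrecy] <= LEVELS[dst_secrecy]

def biba_allows_flow(src_integrity: str, dst_integrity: str) -> bool:
    # Biba is the dual: a higher-integrity process must not depend on
    # (receive a flow from) lower-integrity data.
    return INTEGRITY[src_integrity] >= INTEGRITY[dst_integrity]

# The two violations from the slides:
assert not mls_allows_flow("secret", "public")    # MLS violation
assert not biba_allows_flow("low", "high")        # Biba violation
```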
So what we have is a situation where, with secrecy, we've had some
success building systems for the DOD environment. Trusted Solaris has been
around for a while, which is essentially a commercial system that provides
enforcement of multilevel security in a mostly general
purpose system. And we have other systems that also enforce MLS. Whereas
we really haven't had any luck with integrity. So Biba and Clark-Wilson and
integrity models and variations on them are still things that we're toying with,
figuring out how we're going to make them work. You know, you guys have the Vista
model, the UAC model, and that gives us a partial view of Biba. It doesn't give us
the other end, where the trusted process might read bad stuff, but it prevents the
bad guy from writing up. And so we're still trying to figure out how we're going to
implement integrity in a comprehensive way that is practical.
And so what happens is we do things like just trusting the applications or just
trusting the programs to do the right thing and hope things work out. And, you
know, we all know the story there.
Okay. So what's an example of a trusted program? The simplest one that
we have in the Linux environment is something called logrotate, and all it does is age
logs. Okay? So it's a pretty simple thing. But it ages logs across the whole
system. And because there's a set of logs, the logs may include information from
processes that are processing secret data. And so the logs may contain secret
data. And so in SELinux MLS, logrotate is a trusted program. We designate it
as trusted -- trusted to handle secret data and not leak it to low secrecy
processes.
As well, the program receives this log data, which could also be from a low
integrity process -- it could be from any process on a Linux system -- and we're
depending on it not to be compromised by that log data. And so logrotate is
trusted to protect its integrity.
Okay. So it's a trusted program. And it turns out when the SELinux folks
configured their MLS system, there were 34, I think, at the time I
looked at it, which was a couple of years ago. There may be more now. There are
never fewer in SELinux. That's one thing we learned: rules never shrink.
So it's always more than whatever it was when I looked at it. So there were 34
processes that were labeled as trusted. And so what was the basis for labeling
these processes?
The basis for "trusted" was essentially that we had to trust these processes. There
wasn't any sort of analysis showing that they were really doing the right thing. But
we just have to depend on them. And so we're going to have to go back later and make
sure things are okay and, you know, keep an eye on them and this sort of thing.
But, you know, we want to get our head around this a little better.
And for integrity, you know, it's just much, much worse; there are many more -- yes?
>>: [inaudible].
>> Trent Jaeger: Sure.
>>: I mean here I don't see the Linux --
>> Trent Jaeger: Yeah, yeah. Well, the Linux kernel is clustered there, yeah.
>>: [inaudible].
>> Trent Jaeger: So Linux kernel is definitely trusted too. So, yeah, this is
probably -- I'm sure this isn't a comprehensive list, and definitely the kernel is part
of this.
>>: Is it only for [inaudible].
>> Trent Jaeger: Well, they do have a label for kernel too.
>>: You do have?
>> Trent Jaeger: Yes.
>>: So that means all the drivers?
>> Trent Jaeger: Well, it refers to user level processes. So processes running in
simpatico with the kernel are labeled as kernel.
Let's see. So then for integrity, you know, obviously there are a lot of client
processes -- e-mail clients and browsers and so forth -- that receive low integrity
data and we're expecting them to do the right thing, as well as many, many
server processes, network-facing daemons, which you're aware of, of course, from
AppArmor.
And we're expecting this all to work out. And we haven't gotten our head around
this yet. So what we end up with is this kind of
situation where we have policies being enforced in the applications as well as the
systems. And, you know, certainly the system
administrators have enough trouble configuring a system policy, so we're not
necessarily doing a whole lot of work with application policy. So currently these
are separate. And by policy -- you know, it could be a procedural thing.
We'd like a declarative policy, but many applications have procedural
mechanisms to enforce whatever security they have.
And now we have virtual machine monitors, you know, hoorah. We have another
layer, which is coarser grained, where we can maybe make better decisions about
security and isolate these bad things that aren't working so well at the higher
layer.
But what happens in many cases is that we have authority here, and we
delegate some authority to the operating system to do the right thing, and then we
delegate some authority to the application. So we refine the problem a bit, but
we don't eliminate the problem. We still have applications that are
depended upon to do the right thing, and we still need to do
something about this. And we want to get a picture or an idea in our mind about
how all of this is working together and whether it's all achieving the security we
have in mind.
And of course networking is in the picture as well. I'm just not talking about it
here specifically.
So with respect to information flow policies, we've done, as a community, two
things. One I've talked about a lot, which is marking the exceptions. We label
it trusted, and we hope for the best. We allow it to receive data we know is
dangerous, and we hope for the best.
Or we go to a policy that's not information flow aware. We just say, you know,
information flow is not working, it's too restrictive, and so we need to do
something else. And so we go to some more access-matrix-oriented policy, and
then we often have problems coming up with meaningful security goals. So
AppArmor probably had one of the few meaningful security goals, in the sense
that we want to confine network-facing daemons. That's a security goal.
Saying we want least privilege is not a security goal.
>>: [inaudible] criticisms the SE people -- SELinux people level that the security
goal wasn't [inaudible].
>> Trent Jaeger: Okay. But the targeted policy is essentially the same -- yeah,
yeah -- the targeted policy is aimed at essentially the same security goal.
Yeah, I don't know what to say. But I believe it's a security goal.
>>: The difference is that the targeted policy has a very meaningful goal for
people who have to actually maintain real machines. But it's too dirty for a
mathematical purist to get information flow out of it so it's [inaudible].
>> Trent Jaeger: Well, we try to get information flow out of SELinux-style policies,
but the targeted policy, because unconfined is --
>>: Yeah.
>> Trent Jaeger: Yeah, yeah, exactly. But we're going to try to get our head
around this a bit today. We'll see how it goes.
So what we end up with are systems where we have incomplete policies -- policies
that provide some security but don't provide full coverage of an information flow
goal; they partially cover it. And, also, we have
complex policies -- policies where we have a lot of rules and it becomes
challenging for people to figure out, did I do the right thing? So if you have
thousands of rules, you know, is there one that's wrong? This is a, you know,
nontrivial problem. There are probably not a lot of SELinux systems here,
but if you try to administer them, it's, you know, a very challenging task.
So figuring out how to administer them is what we're going to work toward. In
terms of related work -- clearly I'm not going to spend a lot of time on this -- there
are clearly a lot of applications that enforce access control. I don't think that's
news to anybody, databases being the most well known. Atomicmail is just an
e-mail client. It's an old system saying, oh, e-mail's receiving data that we care
about, so we want to enforce security there. So it's more of a seminal thing
rather than a system you currently work with. But clearly all of these programs
and many, many others enforce access control.
And then there are languages that have -- yes?
>>: I just want to cater to the audience. Microsoft IRM in the apps.
>> Trent Jaeger: IRM? What is IRM? Sorry.
>>: Intellectual rights management.
>> Trent Jaeger: Okay.
>>: It goes by a bunch of names.
>> Trent Jaeger: Okay.
>>: It's basically --
>> Trent Jaeger: Yeah, I'm DRM still.
>>: Sort of mandatory crypto e-mail so you get an e-mail that somehow says
IRM, the e-mail client won't let you print it or cut and paste it or anything.
>> Trent Jaeger: Okay.
>>: This of course is totally ownable if you crack your machine, but in principle it
didn't [inaudible].
>> Trent Jaeger: Right. Right. Yeah, I'm not really going to get quite that far. The
systems that I'm looking at, I'm assuming that the administrator
and the user have compatible goals. I'm not going to go as far as those
kinds of goals, but that kind of system is appropriate.
>>: Yeah.
>> Trent Jaeger: Yeah, definitely. So you're all familiar with, I'm sure, Java
security, and many other systems have security mechanisms for their runtimes,
and then of course there are languages that have information flow security built in
that you can connect to the system, such as Jif, and of course Perl taint tracking.
And then what we're seeing recently is work where the systems developers are trying
to get their heads a little bit around what's going on in the application. So we see
in the SELinux community they have policy servers for the applications as well. So
you can sort of bring SELinux up to the applications.
And then there's a variety of work in the research community -- some of ours, and
Ninghui Li's, and Sekar's [phonetic] at SUNY Stony Brook -- looking at, okay,
we're depending on applications to make decisions. How do we manage the
overall security of the system given that these applications are going to make
specific decisions, and how do we reason about system security there?
And then an interesting piece of work is this DIFC work from Stanford and from
MIT, where they're allowing applications to create new labels that the system will
be aware of, to delegate those labels to other processes, and then to try to get an
idea of what's going on in the system at large, even taking into
account certain things the application can do.
What this doesn't enable us to do is know what's really going on inside the
application. Does it have the same semantics for the labels as the system that it's
interacting with? Are the information flows within the application compatible
with the information flows that the system is trying to enforce? And do we really
trust the application, relative to some security goals, to do the things that it's going
to do? So, you know, the applications are making decisions about
delegation and things like that, and do we really want, just because it can
communicate with this sort of thing, do we really want it to delegate this label
access to this other process? Yes?
>>: Could you clarify the second [inaudible].
>> Trent Jaeger: This one?
>>: [inaudible] programming system [inaudible].
>> Trent Jaeger: Yes. So what I'm talking about is like the Java access control.
So in Java, if you have a -- you know, a class loader and you're running some
program in Java, it can access the underlying file system, and you can write
permissions for that particular set of code or that particular class loader
to access your file system. And so you can control code that's
been loaded into the same Java --
>>: So this thing is code access security.
>> Trent Jaeger: Yes. Thanks. So, yes, C#.
>>: [inaudible].
>> Trent Jaeger: Yes. Thank you for being here. So, yeah, a lot of
programs have mechanisms for you to describe in your program what individual
threads of execution can have access to in terms of resources from the
system -- what files they can access, and sockets, and so forth. Okay.
So we're going to look at how these layers of security, which may be deployed in
independent ways and may have independent label spaces, can be put together
in a way that means something coherent. Okay? So that's basically what
I'm saying here.
So what we're going to do is we're going to build -- we're going to go back to
information flow. We're going to start there. Because that's really a definition of
a goal. And we're going to build a model of system policies and goals and
assess whether the system consisting of applications, operating systems, VMs is
really doing the right thing with respect to the goal.
And so the hard part here is going to be identifying the goals. Because we don't
have any specific definition of goals anywhere. We have policies. And whether
the policies really reflect the goal or not is ambiguous.
Then we're going to evaluate the compliance of the policies with respect to the
goal. And evaluating compliance of policies -- whether one policy does what
another policy expects -- is not itself a new problem, but couching it in terms of
the reference monitor, so that we can ensure that the enforcement is really being
done according to the guarantees that are expected of an enforcement
mechanism -- and again, I'll talk about that in a minute -- is important and new.
And then we're going to look at the problem in a more general way and look at
generating constraints on how the policies relate to the goals, which is necessary
to evaluate general purpose systems.
Okay. So I've been working on this kind of stuff for a while. We've
looked at trusted programs, for example, and what it takes to enable
them to enforce security goals in practical systems, and being able to tell the
system, hey, you know, I'm handling this untrusted data, but I know what I'm
doing in terms of -- I know I'm expecting untrusted data. And you can check
whether you believe that I'm handling it in the right way, but this is where I'm
doing it.
So we've made Clark-Wilson integrity practical and loosened the requirement for
full formal validation in Clark-Wilson by only requiring that programs that receive
some untrusted input only do so through limited interfaces that they declare to
the system. So the operating system will only give them the untrusted input
through those interfaces, and those programs must filter it. And so you could
check whether you believe that the filters are acceptable, or things like this. But
we are explicitly defining, in a way that connects the application to the system,
where the untrusted data is coming into the application.
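As an illustration of the idea just described, here is a minimal sketch, with hypothetical names rather than any real API, of a program declaring the one interface through which untrusted input may arrive and filtering it there.

```python
# A minimal sketch of the CW-Lite-style idea described above: a trusted
# program declares the only interfaces through which it accepts untrusted
# input, and each such interface runs a filter (endorser) before the data
# is used.  All names here are hypothetical, not an actual API.

DECLARED_UNTRUSTED_INTERFACES = {"read_log_entry"}   # declared to the system

def endorse_log_entry(raw: bytes) -> str:
    # Filter/endorser: reject anything that is not printable ASCII text.
    text = raw.decode("ascii", errors="strict")
    if any(ord(c) < 0x20 and c not in "\n\t" for c in text):
        raise ValueError("rejected low-integrity input")
    return text

def read_log_entry(raw: bytes) -> str:
    # Untrusted data may only enter through a declared interface, and it
    # is endorsed before the rest of the program ever sees it.
    assert "read_log_entry" in DECLARED_UNTRUSTED_INTERFACES
    return endorse_log_entry(raw)
```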
And so what we're doing now, actually, is looking at all the places where bad data
goes into trusted programs in a similar way. So we have written a tool -- we're just
starting to collect data with it now -- where we can find all of the places where
a program receives data from some other program, data that some other program
can modify and that we consider untrusted.
And so, you know, a question is, well, how many interfaces are there on a system
that receive untrusted data? Are there 10, are there a hundred, are there a
thousand, are there a million? So we're going to get an idea of how many there
are and start to assess also whether there are commonalities in how untrusted
data is handled. Right now it looks like, you know, obviously things like the
declassifiers and endorsers are ad hoc. How you handle untrusted data depends
on what your program does.
But there should be, you know, things like type safety. You should handle
untrusted data certainly in a type safe way. And so there are some things, some
libraries probably, that we can build that can help define how you handle
untrusted data, and programmers can use those across programs.
So we're trying to get an understanding of that.
And then the other thing we're doing, I mentioned to Weidong earlier, has to do
with mediating security-sensitive operations. So if you have a program that's
supposed to enforce a policy, it's supposed to mediate all of the ways that you
can access this unsafe data. And I think actually talking to Crispin
[phonetic] probably started us down this path, when you guys were working on the
LSM. I asked you, you know, well, how did you do it, and you said, well, you know
-- whether you believed --
>>: [inaudible].
>> Trent Jaeger: Yeah, yeah. I mean, was there a principle on which it
was done? Did you know it was correct? And you said, of course, the correct
answer, which was, well, no, we didn't know it was correct. So we built
some tools to look at whether placements of reference monitor hooks mediated
all the security-sensitive operations in the system.
We've gone from finding bugs, to trying to identify in code what security-sensitive
operations are, to now having a mechanism for automatically -- modulo some
annotations -- placing mediators in code to do runtime checks, to locate
where the endorsers should be and where the declassifiers should be. And so
we have a prototype that works for Java. There aren't so many high-trust security
programs deployed in Java that we have source code for, since I don't work for
IBM anymore. But probably you guys have many in C# or something like
that which would be interesting to us. But we're now looking at C programs. So we
have the tool starting to work with C programs.
We have our first intraprocedural analysis going, you know, so we're getting there.
So I'm going to talk about two experiments, though, for the rest of the day. One is
taking a specific kind of program, which I'll call a trusted program. So this is a
program that runs on your system that you're entrusting with some authority to
process data on your system according to some security goals. And so you want
to be able to take these programs from packages. You want to be able to
download the program and verify that this program
is going to be able to enforce the policy that the system has in mind when you
deploy it, in a mostly automated way.
So you'd like to take the package, and when you install it, you want to not only
know that you've installed the files, but you want to know that, okay, this thing is
going to enforce the system policy every time it runs. So that's what we're going
for, for limited programs. And then we're going to try to generalize some of the
ideas that we found here to, you know, all programs across the virtual machine
monitor, operating system, and application layers. So the
second part will actually focus more on the virtual machine monitor and OS
rather than applications. But the principles are the same. Okay. So we're going
to dig in.
Okay. So basically we have a program, and we want to download this program,
and it's going to enforce some policy for the system, and, you know, when you
install it, is it really going to enforce the policy the system has in mind? That's the
question.
So in our context, we're going to download some Linux package onto a system
with a mandatory access control policy, in this case SELinux. So it could be, you
know, any kind of package of any kind of program downloaded onto any system
that enforces mandatory access control. And we're going to verify whether the
program and policy -- so the program is going to have its own policy
that it's going to enforce. And so is that program with that policy going to enforce
the system's policy? That's the question.
So what's interesting is that the package will contain not only files such as
executables, libraries, data, configurations for that program, but also in this case
it's going to contain a module that's going to extend the system policy. So the
system policy exists, but it doesn't know anything about this program. The
program may have additional information that it wants to label. And so it's going
to extend the system policy with that. And we want to know what the implications
of that are. And then the program also is going to have its own policy. So it
could be a Jif program, which has its own labeling, or it could have a user level
reference monitor of some kind.
So we looked at these two, but it could be a Java program with Java monitoring, or
C# or Ruby or whatever.
Okay. So what's going to happen, we hope, is that we want to
compose a program policy that enforces the system's goals. Okay. That doesn't
sound too bad. So we have a program policy. We have a system policy. So we
want to generate, in the end, a program policy that satisfies the
system security goals and, at the bottom, protects the program. So if there are
program data that have protection requirements, we want to also ensure those
within the program. We'll make this a little more concrete in a minute.
Now, why the system policy is important is, you know, clearly the
program has been written in some, you know, environment. But it doesn't know
what your data is on your system. So we're going to deploy it on the system.
Your system is going to have some policy. And we want to know that this policy
is really being enforced by a program that didn't know about that policy before it
got deployed.
And then we're going to extend the system policy. So we have an SELinux
module for the program, or some policy module, that's going to extend the system
policy, and we want to know that this program, when we deploy it, is going to be
tamper protected on that system.
So why do we want to know these things? They seem a little ad hoc. We
want to know these things because of this reference monitor concept, okay? So
this concept was identified by James Anderson's panel in the early '70s -- 1972
is when they wrote the panel document. There were about 10 people that worked
on this. And they identified these three requirements for a reference validation
mechanism -- a reference validation mechanism being a mechanism that enforces
the security policy.
So we have to have complete mediation. So, you know, every operation that's
security sensitive has to be mediated. The reference validation mechanism has
to be tamper protected. So we don't want untrusted processes messing with the
policy or the code, right? That would circumvent security. So if we have a
security-sensitive operation that's not mediated, obviously security is
circumvented. If we can mess with the reference validation mechanism, then
obviously we can't enforce anything.
And then lastly, we want to know that this thing is actually correct in some way.
So the way they stated it was that it was simple enough to be verified in some
sense.
They referred to code primarily, but we're also going to look at policy. Does the
policy that comes with this actually correspond to what we expect? So these are the
requirements. And so basically from the early '70s through the mid '80s, this was
the driving force behind secure system design. And then it was sort of a little
more on the periphery for a while, until the early part of this decade when people
like Crispin [phonetic] started adding code that satisfied these guarantees, or at
least worked toward these guarantees. I mean, it's very hard to satisfy this last
one in a general sense -- you have to have a very small system -- but they worked
toward these guarantees for a commercial system.
So certainly we want complete mediation, certainly we are aiming for tamper
proofing when we configure this thing, and we're hoping that we're correct, but,
you know, systems can be somewhat complex, and it's hard to assure that in a
formal sense. But this is what we're aiming for.
So complete mediation I'm not really going to talk about. We're going to assume
that the program already has that, and so some techniques that we've worked on,
such as the automated placement of security code, work toward
this, but we're going to assume this has been done in some way. We'll have
some signed Jif program or something like that, and we can see that it has
complete mediation.
So we are going to test tamper proofing. We want to know that the trusted
program components that are downloaded are protected from untrusted programs.
And this is -- you know, it seems straightforward in one sense,
in that we know some trusted program components, but we don't really know how
many of them need to be protected from tampering and how this works on a
particular system. And, more so, we don't know what the set of untrusted
programs is -- or maybe the converse, what programs we need to trust in order
to make a tamper-proofing guarantee for our process.
And then we want to know, in the running program, not only the permissions of the
process, but also that the program policy protects the integrity of the program data
when it's been loaded into the program.
So we're going to test this. And we're going to test the policy. We want a method
that will ensure that the program policy we're enforcing does enforce the system
policy. Okay? And we'd like to do this automatically. So we're going to build some
package. And we're going to build it somewhere -- maybe here, maybe there, you
know. And we want to then download it on some specific machine. And so we're
going to compose a system policy, for which we're going to validate that there's
tamper protection automatically on that particular machine, and we're going to
compose a program policy for that particular machine, also, that we know will
enforce the system goals. And then you can just run the program, run, run, run,
on that particular platform. And so it doesn't matter which platform.
So what we found is that both of these correspond to what are called compliance
tests. That is, there's a policy, and then there is some security goal or some
requirement. And so what we want to do is test whether the policy satisfies the
requirement. And so there are two different tests that we want to do. There's
one for tamper proofing, where we want to know that the system policy that
includes the new program prevents -- yeah, I thought it said it permits
untrusted subjects; it says does not permit -- it prevents untrusted subjects from
modifying that program. So we'll have to figure out what that means in some
concrete terms.
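One concrete reading of that tamper-proofing test, as a sketch rather than the actual tool, is: in the composed system policy, no subject outside a small trusted set may have a write-like flow into the program's labels. The subject and label names below are illustrative.

```python
# A minimal sketch of the tamper-proofing test described above: in the
# system policy extended with the package's module, no subject outside a
# small trusted set may be able to modify the program's labels.  The
# policy representation and the label names here are illustrative.

TRUSTED_SUBJECTS = {"init_t", "installer_t"}           # e.g., init and the installer
PROGRAM_LABELS = {"logrotate_exec_t", "logrotate_etc_t"}

def tamper_proof(write_flows) -> bool:
    """write_flows: iterable of (subject_label, object_label) pairs meaning
    'subject can modify object' under the composed system policy."""
    for subject, obj in write_flows:
        if obj in PROGRAM_LABELS and subject not in TRUSTED_SUBJECTS:
            return False    # an untrusted subject could tamper with the program
    return True
```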
And then for verification we want to know that the program policy enforces the
system's policy and protects the program from untrusted data that it might use.
And so these are two policies and two requirements that can be expressed all in
terms of information flow. Let me just go ahead here. So we're going to take the
program policies and represent them in information flow terms, as well as the
system policy. And then we're going to want to evaluate those policies against
security goals.
And so compliance is defined in terms of information flow this way, where the
flows of one policy are all authorized by the goal, okay. So you have a policy
and it has a set of information flows that are authorized, and the question is
whether your goal also authorizes those flows. And if there's any flow that's not
authorized by the goal, the policy is not compliant. So this is pretty easy to
test.
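A minimal sketch of that compliance test, assuming the labels have already been mapped into a common space (the label names below are illustrative):

```python
# A minimal sketch of the compliance test just described: a policy is
# compliant with a goal if every information flow the policy authorizes is
# also authorized by the goal.  Flows are (source_label, sink_label) pairs,
# with both policies assumed to be over the same (already mapped) labels.

def compliant(policy_flows: set, goal_flows: set) -> bool:
    # Any flow allowed by the policy but not by the goal is a violation.
    violations = policy_flows - goal_flows
    return not violations

# Example: the goal does not authorize log data flowing into the config label.
goal   = {("logrotate_t", "var_log_t")}
policy = {("logrotate_t", "var_log_t"), ("var_log_t", "logrotate_etc_t")}
assert not compliant(policy, goal)
```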
The tricky part is setting up the property -- coming up with the policies, right? I
mean, because testing it is easy. But you have all of this stuff, and you have
these ideas, and so the question then is, well, how do you know whether
your goal is expressing the right thing so that you can do the test, and how do
you interpret the flows from the policy that are relevant to the goal? And so
ultimately we're going to talk a bit about mapping.
So basically the idea is we have -- we have a program policy, and the program
policy may include its own labels that the system has never seen before. And
they provide these with this SELinux module. Okay. It should be allowed to do
that. And then the system has its own labels.
Actually, I have it backwards, right? So this is the system, and this is the
application. And they have their own -- or no, I guess that's the system. Yeah, it
says underscore T. And this is the application. But so the system has its own
labels, and it has some semantics for what the labels mean in terms of information
flow. And the program has its own labels, and it has its own semantics for what
information flow means there. And you can interpret them separately. And we've
done this for a long time, right?
All of these user-level programs -- databases, Java programs, C# programs,
whatever -- have their own policies, and they enforce them, and the system sort of
doesn't know what the heck's going on. It's hoping for the best. It set that bit.
You can do it. Good luck.
So now here we're going to try to figure out what the mapping should be. Okay?
So we're going to try to come up with, basically, constraints that imply how these
individual labels are mapped. And so we want to come up with constraints that
will ultimately, as much as possible, give us a full mapping, or at least give us a
mapping that indicates that the combination is safe. So we don't have to get a
complete mapping, right? In SELinux we have thousands of types. We don't
need to map every label in the application to every SELinux type.
We don't even necessarily have to map it to one specific type. We just have to
map it to types that result in compliant systems, so that we can test compliance
and it works out. And I'll talk more about that general idea later. Okay. But let's
stick -- right, I'm jumping ahead a little bit. So let's stick with this. This is more
concrete.
So the basic intuition of the mapping -- we have a paper from USENIX
Security last year that talks about this intuition, but not in as formal a form as I'll
ultimately get to. But the idea is a simple one. It just takes the
basic observation that, hey, trusted programs are programs that are going to
run on your system, and you're going to provide system data to these programs,
and you're going to depend on the trusted programs to do the right thing every
time they run. So these trusted programs are not supposed to lower the
integrity of any of the data they receive.
So from an integrity standpoint, from a Biba standpoint, the trusted programs are
higher integrity than any data they receive. So it's a little counterintuitive that
the system and the system's data are lower integrity than the application and the
application's data. But that's what's going on. And I think that's -- let me know if
you have any questions about that.
And so the idea is then that the program components must be tamper proof.
They must not be modified by anything that's untrusted. And so we're going to
define a very small number of things -- as small a number of processes as we can.
Basically things like init in the Linux environment, so the initialization, and the
installer. And that's about it. So there are eight programs we're going to identify
as trusted. And nothing else is allowed to tamper with these programs, modify
any of their files.
And then the system policy is in a sense isolated from the program. The
program is higher integrity, the flows are -- you know, the program flows. But
whatever we're supposed to do with the system data, that's sort of defined by the
system, and we should just be able to plug that in independently and tell the
program, hey, this is your policy, enforce that system policy. That's your policy
now. And, you know, make sure you protect your data from that. But just
enforce the system policy, please. Okay?
So that's the intuitive idea. It's a simple idea, but we're going to talk about
whether it works. So this is the program's view now. So the program is going to,
in a sense, have this sort of model where -- and really these are sort of all
templates. So it's going to put its stuff here in HIGH, and whatever the system
policy is, it's just going to jam it in the middle. And we have this catch-all for low
integrity stuff. We haven't really found anything from trusted programs that goes
there, but we still have it. This is an integrity graph, so these are the high integrity
items, so the program stuff can flow to the system stuff. But whatever data is on
the system outside the program data can't flow to the program. And so in this
model anything in the program could flow to the system, but that of course isn't
true, because you may have secrets in your trusted program, like SSH keys or
something like that. So there are a few exceptions that you may have as high
integrity and high secrecy, and they can't flow down. But that's the basic idea.
Okay?
And so for a program like logrotate -- a simple little program that has some
executables and some configuration files and then receives log files as input --
we're going to end up with this kind of integrity lattice for this program. And so
logrotate will have these labels for these guys, and these are higher integrity than
any of the labels for any of the log file data. But whatever the labels are for the log
file data -- if you have secret data, or we're talking mainly about integrity here --
whatever the policy is for the system flows, it's going to enforce that on behalf of
the system. You just have to plug it in. You don't have to do anything fancy to
map it together in this particular case. Okay?
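A sketch of that three-level integrity template, with illustrative label names, might look like this:

```python
# A minimal sketch of the three-level integrity template described above:
# the trusted program's own labels sit above whatever system labels get
# plugged in, with a catch-all LOW level at the bottom.  Label names are
# purely illustrative.

PROGRAM = {"logrotate_exec_t", "logrotate_etc_t"}   # HIGH: program components
SYSTEM  = {"var_log_t"}                             # plugged-in system labels
                                                    # everything else: LOW

def rank(label: str) -> int:
    if label in PROGRAM:
        return 2
    if label in SYSTEM:
        return 1
    return 0

def flow_allowed(src: str, dst: str) -> bool:
    # Biba-style: flows may only go from higher (or equal) integrity to
    # lower, so log data can never flow into the program's own files.
    return rank(src) >= rank(dst)

assert not flow_allowed("var_log_t", "logrotate_etc_t")
```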
So that's the basic idea. And we built some tools to test this out. There's a
TISSEC paper coming out on the analysis tool. It takes policies -- it used to
take Jif policies, now it can take SELinux policies for both this guy and this guy.
It basically just needs mandatory access control policies and a translator to
translate those into information flows. So there are a number of policies that
could be supported.
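The translation step mentioned here can be sketched roughly as follows; the rule format is simplified, not actual SELinux syntax. Read-like permissions become flows from object to subject, and write-like permissions become flows from subject to object.

```python
# A minimal sketch of translating mandatory access control rules into
# information-flow edges.  A rule (subject, object, permission) becomes
# object -> subject for read-like permissions and subject -> object for
# write-like ones.  Simplified rule format, not real SELinux syntax.

READ_LIKE  = {"read", "getattr", "recv"}
WRITE_LIKE = {"write", "append", "send"}

def rules_to_flows(rules):
    flows = set()
    for subject, obj, perm in rules:
        if perm in READ_LIKE:
            flows.add((obj, subject))   # data flows from object to subject
        if perm in WRITE_LIKE:
            flows.add((subject, obj))   # data flows from subject to object
    return flows

flows = rules_to_flows([("logrotate_t", "var_log_t", "read"),
                        ("logrotate_t", "var_log_t", "write")])
# -> {('var_log_t', 'logrotate_t'), ('logrotate_t', 'var_log_t')}
```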
And then we defined, in the case of the TISSEC paper, a specific mapping
manually. But what I'm talking about here today is generating these mappings
automatically, without manually specifying them. So this is a key thing. We
want to come up with mapping constraints now rather than a specific mapping.
Of course a specific mapping is a mapping constraint -- it tells you this
maps to that, that maps to that, that maps to that. But we're going to come up
with higher, more abstract constraints and do the mapping.
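As a rough illustration of constraint-driven mapping, one could search candidate assignments of program labels to system labels and keep those under which the mapped flows are all authorized by the goal. The brute-force search here is only for exposition; realistic policies would need constraint solving rather than enumeration.

```python
# A minimal sketch of generating a mapping under constraints rather than
# writing one by hand: enumerate assignments of program labels to system
# labels and keep any assignment for which every mapped program flow is
# authorized by the goal.  Brute force is only for illustration.

from itertools import product

def find_compliant_mappings(program_labels, system_labels,
                            program_flows, goal_flows):
    program_labels = list(program_labels)
    for assignment in product(system_labels, repeat=len(program_labels)):
        mapping = dict(zip(program_labels, assignment))
        mapped = {(mapping[s], mapping[d]) for s, d in program_flows}
        if mapped <= goal_flows:        # every mapped flow is authorized
            yield mapping
```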
And then we have a system that will only load programs when they pass
compliance tests, and this was published at USENIX Annual; it's called SIESTA. So
the idea is you build your program, and you generate compliance information,
and then when you give the program to SIESTA, it will only run it if it passes the
compliance tests. And of course SIESTA is configured on the system so that you
can't circumvent it and get some other program loaded that
would be able to get these permissions without going through SIESTA. So it
becomes an assured pipeline, if you will, for this.
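A sketch of such a load-time gate, reusing the tamper_proof and compliant helpers sketched earlier; the arguments are precomputed flow sets, and none of this is the actual SIESTA interface.

```python
# A minimal sketch of a SIESTA-style load gate, reusing the tamper_proof()
# and compliant() helpers sketched earlier.  The argument structure is a
# placeholder, not the real SIESTA interface.

def gate_load(composed_write_flows, composed_flows, goal_flows, install):
    """Install/run the package only if both compliance tests pass."""
    if not tamper_proof(composed_write_flows):
        raise PermissionError("untrusted subject could tamper with the program")
    if not compliant(composed_flows, goal_flows):
        raise PermissionError("program policy does not comply with the system goal")
    install()   # only reached when the composed policy passes both tests
```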
And so one question is, you know, would you want this sort of thing in, you know,
a Microsoft-style environment? Would you want to be able to have individual
programs that you can download to your system and configure relative
to system policies and deploy with that authority in an automated way?
We have some of these mechanisms. We haven't done it in a completely
general sense, in the sense that we've looked at specific programs one at a time,
but obviously you'd like to look at all of the programs and make sure all of them
are tamper proof with respect to each other, and when you add a new program
you're doing the right thing with respect to that program given the history of the
system. And so we haven't fully developed that history.
But all of the pieces are there to do that sort of thing for these trusted programs.
I should emphasize these are very specific kinds of programs; this won't work for
every kind of program. Okay? But so think about that.
Okay. So what does it take? We built this PALMS tool, and it takes about four
seconds to go through and evaluate whether the program is compliant. And we
evaluated it on eight trusted packages. We would have liked to have done more,
but we needed SELinux policy modules. And of the 34, most of them don't even
have policies explicitly defined. All the policy that they enforce is procedural. I
mean, they're trusted, so they must enforce some policy. But 26 of them don't
have -- trusted packages, I presume -- don't have, I'm sorry, don't have SELinux
policy modules. So they don't have things that they define.
I think I'm overstating a little bit. It may be that they just don't define any new
labels, which would be required of policy modules.
So in any event, we tested these eight, and we found that the policy was mostly
set up correctly for these trusted programs, but there were still a couple of cases
where any program that was loaded in etc could mess with any other program
that was loaded in etc. What we wanted was for each of the programs to be
completely separate. And if you split a couple of permissions in etc -- and for the
programs we looked at, it looked like that would be perfectly feasible; the files
that were being added were independent for those particular programs -- then
these programs would be isolated from each other. Yes?
>>: So would an example of [inaudible].
>> Trent Jaeger: Yes. Yes. Exactly. So the program in the package has a
configuration file that gets loaded into the etc directory, and because it's sort of a
pain in the neck to introduce more labels, and all of these programs that load stuff
into etc are sort of trusted anyway -- but there's, you know, no reason that, you
know, some other program should be able to modify this other -- this logrotate,
for example, configuration file. I mean, there just isn't any --
>>: Is that -- isn't that the problem of the [inaudible] though, I mean, some of it
[inaudible].
>> Trent Jaeger: Right.
>>: [inaudible].
>> Trent Jaeger: Well, you could have different labels for each of the program
files. This is the logrotate configuration file. And so it's in etc. And it, you know,
it's just a place. So it can't really matter so much where exactly it is, if your
labeling of that is fine grained enough. You know, making the labeling more fine
grained -- obviously making SELinux even more fine grained -- believe it or not,
they are trying to resist that in some cases. But here, you know, making this more
fine grained would enable you to say that this program is in fact tamper proof
except for the installer and a couple of initializations.
>>: [inaudible] label them definitely in SELinux is one of the most difficult
[inaudible].
>> Trent Jaeger: Yeah. And so not too many people have really -- well, a lot of
people have looked at it. But, you know, people came up with ideas, and it
seems to work. And as long as it doesn't cause a
specific error, then it's okay. But if this program would somehow get compromised,
then it can go through and compromise all the others, and vice versa.
>>: [inaudible].
>> Trent Jaeger: Yes.
>>: Okay. So read-write permission for the entire registry?
>> Trent Jaeger: Well, there are some programs, I think, that have some
specialized etc files, so maybe it isn't everything in etc, but it's basically the
directory and all the files in there, with a couple of exceptions probably. But that's
what it's tantamount to.
Okay. So we looked at these specific programs -- and I'm running a little behind --
and they have a natural relationship. And we use this to determine mapping
constraints that we could then test compliance on for the program, so we could
come up with an automated, soup-to-nuts way of deploying these programs to
enforce policy. And now we want to know, well, is there an
approach where we can leverage this for general purpose systems, and can we
determine whether a system in a broad sense, rather than just a single program,
is in fact compliant? So that's what we're looking at now.
And so we're doing that by looking at the virtual machine monitor policies in
Xen. So recently the Xen community introduced a reference monitor interface for
the Xen hypervisor called Xen Security Modules, and -- I guess we
had done a prototype of this called sHype, but they generalized it to cover
more security-sensitive operations. We made some assumptions about what
operations would be security sensitive in the sHype work. And they've been
more literal about what security-sensitive operations are. And so they've
identified more places in the kernel -- or, sorry, in the hypervisor.
And they have a policy model which is very similar to the SELinux policy model for
that. And our sHype stuff was also supported by this.
But we're looking at the Flask policy. So Flask is in fact a precursor to SELinux.
So you could think of it as SELinux. Same kind of policies.
So we want to determine if the VMM policy and the policies of the VMs together,
and ultimately the policies of the applications, lead to a system that enforces the
security goals you have in mind. So, you know, you have a system, and it has
some VMs, right? So we might have a privileged VM that gives you access to
hardware. And then you have some server VMs that might provide resources to
a number of applications, enabling applications to work together. And then you
have some application VMs, user VMs; they may be isolated, or maybe a few of
them may work together as peers. And you can think of a cloud environment
where you may have a few VMs that work together, or you may have an isolated
VM deployed in the cloud.
So we have this kind of system. All of the VMs have their own mandatory access
control policies, and then within them they have applications that enforce policy
and may have authority to do things that protect the system as well.
And then all of this is administered by the XSM/Flask policy. So the Flask policy
determines what VMs can talk to what other VMs. Okay? So -- yes?
>>: Is a VM part of the system [inaudible]?
>> Trent Jaeger: Yes. So the Xen hypervisor is the virtual machine monitor for the
Xen system. And so it has a policy. So it's going to do things like enforce which
VMs can use the low-level Xen mechanisms -- grant tables and event channels --
to communicate with which other VMs. And then, you know, how memory is
distributed among the VMs, and the file system, and things like that. And so that
will be controlled by the Flask policy.
And then within the VMs, the policy will say, you know, sort of the normal things
about, okay, this process can access these files, and these files of course are
backed by, you know, the XSM policy -- whatever the partition was that the
particular VM got.
>>: In this case, a VM is [inaudible] becomes kind of the OS. VM [inaudible]
application was this kernel.
>> Trent Jaeger: Yes. Yeah. So it's essentially the same kind of problem.
And this is -- I haven't shown you any of the policies before, but those are the
policies.
So we're going to look at VMM policies -- I think that's what those are. And then
we have network policies that connect communication channels between VMs
with labels. So networking is integrated with SELinux in a few different ways.
And then mandatory access control within the VMs, using the SELinux policy
within the system, which is the same kind of policy that I was just talking about
before with respect to the applications. I'm not showing any application policies
here.
And so we have a virtual machine monitor and many virtual machines, many
of them with mandatory access control policies. And we're going to look at
whether we have compliance in this context using information-flow-based
analysis.
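One way to picture the composition, as a sketch rather than the actual analysis, is to take each VM's flow graph plus the VMM-level flow graph and join them wherever a VM-internal label is backed by a VMM-level resource; working out that boundary mapping is the system-specific part.

```python
# A minimal sketch of composing the layers: each VM contributes its own
# flow graph, the VMM (XSM/Flask) contributes flows between VMs and
# resources, and a boundary map says which VM-internal labels are backed
# by which VMM-level labels.  All structures here are illustrative.

def compose_layers(vm_flow_graphs, vmm_flows, boundary_map):
    """vm_flow_graphs: dict vm_name -> set of (src_label, dst_label) pairs
    vmm_flows: set of (entity, entity) pairs at the VMM layer
    boundary_map: dict mapping (vm_name, vm_label) -> vmm_label"""
    composed = set(vmm_flows)
    for vm, flows in vm_flow_graphs.items():
        # Qualify VM-internal labels by their VM so label spaces stay separate.
        composed |= {((vm, s), (vm, d)) for s, d in flows}
    for (vm, label), vmm_label in boundary_map.items():
        # Data crossing the VM boundary is governed by both layers.
        composed.add(((vm, label), vmm_label))
        composed.add((vmm_label, (vm, label)))
    return composed
```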
So obviously this is more complex. We have more policies, and bigger
policies. So each SELinux policy we're using right now is 35 megabytes of
source. So you thought Vista was big. Yes.
>>: [inaudible].
>> Trent Jaeger: The policy was written by hand, yes. Or no, no, no. I'm sorry.
The -- this is the macro expanded policy.
>>: Oh, okay.
>> Trent Jaeger: So there's a policy that's smaller than this that was written by
hand. It's still fairly substantial.
>>: [inaudible] policy [inaudible].
>> Trent Jaeger: I think we're still at megabytes. I'm not sure exactly. But it's still --
>>: How many [inaudible].
>> Trent Jaeger: People have been working on it for a long time.
>>: Okay. Thank you.
>> Trent Jaeger: And so basically when a program that's security relevant is
introduced to a Linux system, somebody comes up with a policy for that, and
then that becomes a policy module now, so we can extend the policy easily. And
these policies -- I mean, I've been working with SELinux for many
years, and I haven't said what I really like about it, but what I like about it is
they're trying to get their head around everything that's going on in the system.
And so from a measurability standpoint, you need to know every security-relevant
decision at some level. Hopefully you don't, you know, have to assess that
manually, but you treat it like an assembly language. Here's your assembly
language of what's happening from a security perspective. And hopefully we'll
come up with higher level tools to assess what's going on, so we can work at
this higher level.
And so they work on policy design at a higher level, but it's still a fairly low level
because it's per application, and I'm sure they reuse a lot of the rules from other
applications. So if you have a networking application, you'll reuse networking
rules from some other application. So there's a fair bit of reuse going on in that
respect. But the rules have to be cut and pasted.
>>: [inaudible].
>> Trent Jaeger: They have rules that are kind of like that, yes. So they have
networking rules where you say, okay, this is going to have network access, and
then that will generate a bunch of rules for having network access. And it will
generate the same rules -- not the same, but, you know, tweaked -- across
applications.
>>: Did that mean like [inaudible].
>> Trent Jaeger: Well, these are changes at the system level and with respect to
the labels. So if you add a new file, you may not have to change the policy.
Because you can use an existing label. And you say okay, well, this is a
configuration file. So you can say it's, you know, a configuration type for the
application. And so you'll have to, you know, make sure that it's put together in such a
way so that the package ensures that it's labeled correctly when it's installed. But
there are different levels of changes to your program that -- and many of which
are robust.
>>: Where -- if I add a new function in the program as a new modular program
and someone [inaudible] does --
>> Trent Jaeger: Within the program?
>>: [inaudible].
>> Trent Jaeger: Currently, yeah.
>>: And so -- and you just said -- that's part of saying okay, make sure there's a
policy and the [inaudible] all the reference monitor [inaudible] so that the policy
can be enforced correctly.
>> Trent Jaeger: Right. That's a little different level than what we have been
talking about. We have been talking about the system policy and changes to the
program that would affect the SELinux policy would be at the system resource
level. So it would be in terms of files and networking and this sort of thing.
But if you change the program so that you would need to change how the
program enforces security, that's -- that's within the program policy and the
program hook placements. And there aren't that many programs right now that
have hooks, which is why we're working on techniques to automate them. So we
have 34 programs that need to enforce security for MLS, many more for integrity,
and, you know, the X server people have been working on reference monitoring
in the X server for a while, and that's fairly mature. And then GConf and D-Bus
and Linux also have reference monitoring explicitly. But not -- I don't think --
there might be another program I'm missing, but there aren't many more. So this
is a very deliberate manual task for adding reference monitoring to programs
specific to those programs.
So those kinds of changes, yeah, will be less robust. And so, yeah, if those affect
the system, that becomes problematic too -- if by changing the program it
won't deploy on any systems anymore, then you have to change the system
policy, too, and then you have to change the VMM policy too.
Yeah, so we don't want to go there if we can help it.
Then the other issue is that the goals are ambiguous. We don't know what they
are in general. We came up with some for this other case for the trusted
program. We're going to have to come up with some for a system in general.
How are we going to do that?
So -- let's see. So how much longer?
>>: 10 minutes.
>> Trent Jaeger: 10 minutes. Okay.
>>: 15.
>> Trent Jaeger: So I'm going to talk about graph isomorphism a little bit,
subgraph isomorphism, to be specific. And I'm not going to claim that our
problem is NP-hard, but what I want to do is ensure we understand why the
problem is what it is. And we're going to leverage some knowledge from looking
at isomorphism and helping us to understand what the problem is in a general
sense.
So in subgraph isomorphism, you have a graph and you have another graph.
And the question is whether a subgraph of this graph has the same flows as
this graph.
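Just to make the intuition concrete, here is a rough sketch -- with hypothetical names, not our actual tool -- of the naive mapping search that subgraph isomorphism suggests: find an injective mapping from goal nodes onto policy nodes that preserves every goal flow.

```python
# A minimal sketch of the naive mapping search suggested by subgraph
# isomorphism. Graphs are adjacency dicts: node -> set of nodes it flows to.
# All names here are hypothetical illustrations.

def flows_preserved(goal, policy, mapping):
    """Every goal flow between already-mapped nodes must exist in the policy."""
    for src, dsts in goal.items():
        for dst in dsts:
            if src in mapping and dst in mapping:
                if mapping[dst] not in policy.get(mapping[src], set()):
                    return False
    return True

def find_mapping(goal, policy, mapping=None):
    """Return a goal-node -> policy-node mapping that preserves flows, or None."""
    if mapping is None:
        mapping = {}
    unmapped = [n for n in goal if n not in mapping]
    if not unmapped:
        return dict(mapping)                 # every goal node has been placed
    node = unmapped[0]
    for candidate in policy:
        if candidate in mapping.values():    # keep the mapping injective
            continue
        mapping[node] = candidate
        if flows_preserved(goal, policy, mapping):
            result = find_mapping(goal, policy, mapping)
            if result is not None:
                return result
        del mapping[node]
    return None

# Example: does the goal "low flows to high" embed in this tiny policy graph?
goal = {"low": {"high"}, "high": set()}
policy = {"user_t": {"sysadm_t"}, "sysadm_t": {"kernel_t"}, "kernel_t": set()}
print(find_mapping(goal, policy))            # e.g. {'low': 'user_t', 'high': 'sysadm_t'}
```

This backtracking search is exponential in general, which is exactly why we want the structure of the system to constrain the mappings rather than searching blindly.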
And what we -- this -- you know, intuitively corresponds to what we're doing at a
certain level in the sense that hey, we want to test whether this policy is okay, so
there has to be a subgraph in here that corresponds to this. And we have to
figure out what a mapping is. Now, it turns out the mapping problem isn't exactly
the same for reasons that I'll talk about later. But the problem is figuring out, hey,
what's the mapping between these. Yes?
>>: Try to understand. In your graph is every node the same, or do they have
different --
>> Trent Jaeger: Right, the nodes have different semantics. There are different
constraints on different nodes.
>>: Okay. So when you map, you have to [inaudible].
>> Trent Jaeger: Right, right. And so the notion of how systems are built will
advise us in how we can map this. And so we won't end up with a pure, you
know, subgraph isomorphism problem. But I just want to -- I find it useful to think
about it in terms of okay, we need to come up with a mapping, we need to
constrain the possible mappings. And we need to of course look at this -- come
up with principles that we can -- that are provable that will show why this -- why
our problem is not this problem.
And looking at it at this level will help us come up with those principles. So why
does it even start to look like this problem? It's that the goals -- you know, they may
be in terms of completely different labels than the policy. Of course we want goal
labels -- goal policies to be much simpler than the existing policy. So there are
going to be many, many more policy labels in an SELinux system than there are
in a -- in our goal. At least we would certainly hope so.
And the -- and what we have with this hierarchical notion is that one policy label
such as a label of a VM such as this domU label actually represents a set of
labels initially. And so we have layering going on here that we don't have
normally. But the layering may obscure what the mapping is between this --
this policy, the underlying labels, and the goal labels.
So the good news is, as Weidong was indicating, we don't have a completely
general problem. There are only certain mappings that are legal. The good
news is that that will help us constrain the problem. Although we don't yet
understand exactly how and why. And the bad news is that the constraints aren't
specified. So we have to sort of figure them out sort of like we did with the other
program.
So we have to mine some constraints, we have to figure out what it is that tells us
what the possible constraints are. We can determine compliance. And we're
going to refine the system. So I'll tell you. We're going to take a top down view.
There's a lot of work that's been done actually in policy compliance. I'm sort of
surprised I hadn't -- you know, I've worked in policy for a long time and written a
lot of policy papers. But I hadn't really come across work in policy compliance
until I dug into it a bit more. So a lot of people have tried to test policies. But
they do what you would expect to do as a researcher, they come up with the
specific mapping between them themselves manually and they test them.
And you know, we don't really want programmers to have to come up with these
mappings for these systems if we can avoid it, so, you know, we want to make
what they specify high level as well. The goals should be high level. Whatever
mapping hints are given should also be high level, not labels on individual
variables and programs or something like that.
So what we're going to try to do is utilize the VM system configuration to infer
some security goals, fairly high level goals, and we're going to see if we can push
those through all the way to the individual applications.
So we're going to use tamper proofing as a guide. And tamper proofing is good
because tamper proofing gives us a guideline that we can get from the
configuration automatically without much information. And if there are specific
constraints on applications on user VMs and how they communicate, those will
probably have to be application derived. We'll have to get some hints about that
from the application programmer, I think. But it hopefully can be fairly high level.
And then we're going to take a top down view. When I say top down -- if you're
thinking of the VMM at the bottom and then the others at the top -- what I mean
is starting from the VMM policy, which is coarser grained, so I think of it as big
and at the top, and working down through the OS to the individual applications,
which I consider at the bottom because they're finer grained. So what I mean is
from VMMs to applications. And the result is we don't need to integrate the
policies into a single information flow graph. So we don't have to take all the
SELinux policies, all the applications policies, all the VM policies and put them all
into one graph and then analyze it like a subgraph isomorphism problem. In a
general sense, we can work from the top down and deal with the finer grain
problems where we need to, where there's authority that governs what's going
on.
Okay. So I'm going to blast through these slides. So the point of this is we have
a formal model for expressing what our system is. We can infer goals on that
model. I'm just going to blow through these slides if you don't mind. If you have
questions later, by all means ask me, and you guys have the talk, too, if you want
to look at these slower.
We set up a set of possible mappings for the individual objects, called nodes. So we
start with VMs as nodes and we're going to refine down to whatever level of
granularity is necessary to determine whether things are safe.
So when we do this conservatively, we can figure out what the possible
data is at each individual place. We then have a way of testing compliance. So
basically what we're going to do is push the labels around the graph. So each
VM generates labels of its own data initially, and they're going to be pushed
around the graph based on the flows that are there. And we're going to assess --
the key thing is identifying whether a flow between a VM and another VM is safe,
or between a node and another node -- a set of principals is really what a node is.
Whether it's safe, whether it's unsafe, so it violates the goal, or whether it's
ambiguous and we can't tell yet. That is it may send multiple labels of data to the
receiver. The receiver may be unable to handle some of them. But we don't
know which flows are going where, and we'll have to look at it more carefully.
And so what we compute, basically, is the disclosure of labels as we push them around the graph.
We introduce flow constraints. So these are constraints to restrict what data can
be sent where because not all data can be sent everywhere. And when we find
ambiguous flows, we're going to decompose the nodes. So dom0, in SELinux --
in Xen, rather -- is a privileged node, and it receives lots of data with lots of
labels, and so we need to look at that more carefully to determine whether it's
secure.
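To make that concrete, here is a minimal sketch -- hypothetical names, not our actual tool -- of pushing labels around an information flow graph to a fixed point and then classifying a flow as safe, unsafe, or ambiguous:

```python
# A minimal sketch of label propagation over an information flow graph and
# of classifying flows against a goal. All names are hypothetical.

def propagate_labels(flows, initial_labels):
    """For each node, compute the set of labels that can reach it."""
    labels = {n: set(s) for n, s in initial_labels.items()}
    changed = True
    while changed:                            # iterate to a fixed point
        changed = False
        for src, dst in flows:
            labels.setdefault(src, set())
            before = len(labels.setdefault(dst, set()))
            labels[dst] |= labels[src]
            changed = changed or len(labels[dst]) > before
    return labels

def classify_flow(src, dst, labels, accepts, aggregates):
    """SAFE if the receiver can handle every label it may get; otherwise
    AMBIGUOUS when either endpoint stands for many principals (a bad label
    might not actually reach the sensitive part), else UNSAFE."""
    bad = labels[src] - accepts.get(dst, set())
    if not bad:
        return "SAFE"
    if src in aggregates or dst in aggregates:
        return "AMBIGUOUS"
    return "UNSAFE"

# Tiny example: dom0 aggregates many principals, so a flow carrying a label
# it is not cleared for comes back as ambiguous rather than flatly unsafe.
flows = [("domU_web", "dom0"), ("domU_db", "dom0")]
initial = {"domU_web": {"web_data"}, "domU_db": {"db_data"}, "dom0": set()}
labels = propagate_labels(flows, initial)
print(classify_flow("domU_db", "dom0", labels, {"dom0": {"web_data"}}, {"dom0"}))
```

The real distinction between unsafe and ambiguous depends on the goal and on whether the node is an equivalence class of principals, which is what the decomposition step below addresses.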
And so what we end up with is this kind of approach, where we start with an
information flow graph, we deduce and apply constraints on the flows and on the
individual mappings between the nodes or sets of principals. We check
compliance. If we find safe and unsafe flows, then -- if we find the whole system
is safe, then we're done. Everything is cool. If we find unsafe flows, we need to
fix them, resolve them. If we find ambiguous flows, then we need to look in more
detail. And so then we repeat from the beginning for the ambiguous principals.
And so we dig down deeper and deeper. Yes?
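That loop, roughly, looks like the following sketch, reusing the hypothetical propagate_labels and classify_flow helpers from the earlier sketch; the decompose step is a stand-in for cracking a node open into finer-grained principals.

```python
# A rough sketch of the top-down iterative refinement loop, reusing the
# hypothetical helpers from the previous sketch. `decompose` is a caller-
# supplied stand-in that splits the named ambiguous nodes into finer-grained
# principals and returns the refined graph.

def check_system(flows, initial_labels, accepts, aggregates, decompose):
    while True:
        labels = propagate_labels(flows, initial_labels)
        verdicts = {(s, d): classify_flow(s, d, labels, accepts, aggregates)
                    for s, d in flows}
        unsafe = [f for f, v in verdicts.items() if v == "UNSAFE"]
        ambiguous = [f for f, v in verdicts.items() if v == "AMBIGUOUS"]
        if unsafe:
            return "UNSAFE", unsafe          # these flows must be resolved
        if not ambiguous:
            return "SAFE", []                # the whole system is compliant
        # Crack open the ambiguous nodes and repeat on the refined graph.
        nodes = {n for f in ambiguous for n in f if n in aggregates}
        flows, initial_labels, accepts, aggregates = decompose(
            nodes, flows, initial_labels, accepts, aggregates)
```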
>>: [inaudible].
>> Trent Jaeger: Basically -- I'll talk about it in a second. So let me show you.
So the fact that this works top down is based on the fact that nodes really
represent equivalence classes of principals and that these principals all share an
upper bound of flows and they all share a set of possible mappings.
And if we find a safe solution at some level, then it's also safe for any
decomposition. And we're working out the formal proofs for that. We have a little
sketch here. And it's also the same for unsafe. If it's unsafe at a certain level, it's
also going to be unsafe below. And so we only need to look to low level where
we can determine safety or unsafety.
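One way to write down the safe direction of that claim -- my notation, not the talk's, and assuming decomposition only narrows a node's flows and label set -- is roughly:

```latex
% Safe-side monotonicity, assuming decomposition only narrows flows and labels.
\[
\forall n' \in \mathit{decomp}(n):\;
  \mathit{flows}(n') \subseteq \mathit{flows}(n) \;\wedge\; L(n') \subseteq L(n)
  \;\Longrightarrow\;
  \bigl(\mathrm{safe}(n) \Rightarrow \mathrm{safe}(n')\bigr)
\]
```

The unsafe direction needs a little more care -- the offending flow has to survive the decomposition -- which is part of what the formal proofs have to establish.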
For resolving issues, we'll have to look at the whole graph. Resolving is a much
harder problem. But I think I have a slide -- yeah, the next slide.
So we're finding that resolution at this level, because it's an abstract level in
terms of sets of principals and flows, it works out to, hey, you can -- you can deal
with the flows reducing what's been sent or changing what's been sent, or you
can, as you were asking, decompose the principles. I'll talk about -- I'll get to
your question on the next slide, I think. And so you have these two choices, and
the individual rules in the policy that affect these are what we're going to be
gathering and showing to the user for resolution. Hopefully we can order them in
some way, because there may be a lot in a SELinux policy.
But it looks like looking at things at a higher level may help with suggesting
resolutions, that we can search at a higher level and then just show them the
rules behind that.
Okay. Yeah. So what we have -- so this is the basically the breakdown. So we
start at the VMM level, we look at the XSM/Flask policy and the network policy
that describes what flows are possible between VMMs or VMs rather, I should
say.
And so this results -- and with the flow constraints we infer and the goals we infer
what ends up happening is that VMs with authority like dom0, other privileged
VMs such as a VTPM or server VMs because they may support multiple clients,
these have ambiguity. But individual user VMs that don't have security issues,
we can determine whether they're safe or unsafe right away. And so that's a
nice thing.
And so the fact that we're looking at the ones in detail for which detailed security
decisions are made makes sense, and so that's kind of a nice thing. But for
ambiguous flows, we have to break it open. So what we do is we break it open
at the next level by looking at the SELinux network permissions. So these are the
permissions that connect the OS labels to the network.
So the idea is that the mechanism, the algorithm that I showed you before,
doesn't really tell you what the decomposition is, and we have some policy for
describing it. And so what I'm talking about now is the policy. So we're saying,
okay, let's look at the OS labels that are connected to the network and see what
kind of things are going on there. And so what we found there is that there are
almost no constraints on network communication. So everybody can
communicate with everybody over the network. And so the question is, you
know, is that -- are there or should there be more cases, can we encourage the
definition of okay, I've got this VM and it's supposed to talk, you know, it's on a
cloud and it's on a cloud with some other VMs. And so it should only talk to
those specific VMs. And so we could leverage these firewall policies in order to
constrain the communication so that we could still prove safety in environments
like the cloud, where we can't really now, because now everything is communicating
with everything.
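As a toy illustration of that kind of flow constraint -- hypothetical names and policy format, not an actual Xen or SELinux interface -- a firewall-style "who may talk to whom" policy can be turned into edge constraints before the compliance check:

```python
# A toy illustration of deriving flow constraints from a firewall-style
# policy. The policy format and names are hypothetical.

# Which VMs each VM is supposed to talk to, e.g. within one cloud tenant,
# instead of letting every VM communicate with every other VM.
allowed_peers = {
    "web_vm": {"app_vm"},
    "app_vm": {"db_vm"},
    "db_vm": set(),
}

def constrain_flows(candidate_flows, allowed_peers):
    """Keep only the VM-to-VM network flows the firewall policy permits."""
    return [(src, dst) for src, dst in candidate_flows
            if dst in allowed_peers.get(src, set())]

# Without constraints the network policy allows everything-to-everything;
# with them, only the intended paths feed the information flow analysis.
all_flows = [(s, d) for s in allowed_peers for d in allowed_peers if s != d]
print(constrain_flows(all_flows, allowed_peers))
# -> [('web_vm', 'app_vm'), ('app_vm', 'db_vm')]
```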
And then for those network facing daemons that are just there, that Crispen
[phonetic] is so familiar with, that are there receiving information from anybody and
doing things on behalf of the whole Internet, you know, what we'd like to do is
codify in our system this confinement policy that you guys did in AppArmor as the
goal that we want and then see if we can, you know, leverage that to keep
refining the problem. So, you know, codify these sorts of things in an explicit
way.
If this doesn't work, then we're going to have to look at the OS layer and see
okay, you know, what's happening at the SELinux label level and dig into these a
little bit more. We're hoping -- I've done a fair bit of work on generating trusted
computing bases for SELinux policies, so the idea would be to come up with
a trusted computing base, and also we're thinking that in a VM sense, in the cloud, you
have a very specific application you're trying to deploy on the VM. So there's
specific stuff that you really care about, and we can utilize these -- the focal
points to also help us define, okay, this is what defines the integrity. As long as
this application is protected on its system, so all the things the application
depends on and the application itself maybe form a base that we're going to try to
protect. And then, you know, we're going to try to see if we can protect them
from these guys, for example. I'm sure this isn't enough constraints. But this is
sort of a starting point for generating these kind of constraints. And this is the
direction we're going down.
So our goal is that we'll have a top down iterative refinement, and we'll come up
with techniques for generating these mapping constraints and generating flow
constraints so that we can understand what's happening in the policy at large and
that the -- the solution will be, you know, provably efficient from a top down
perspective, not the full isomorphism because of the properties of the systems at
-- in our formal model.
>>: [inaudible].
>> Trent Jaeger: We're hoping so. Yeah. So right now we're just looking at one
box at a time. But so what we'd like is to say, okay, this VM is only talking to
other VMs that communicate at this label, and then something would have to
justify that that communication is satisfying some integrity criteria, you know,
through, you know, TPMs haven't really caught on but, you know, through some
kind of integrity verification or something along those lines.
>>: [inaudible].
>> Trent Jaeger: Yeah. You're thinking on the right path. That's where we're
trying to go down. Now, we haven't built this model yet, but that's the direction
we're working down. But, yeah, that's exactly the intuition that we're trying to
convey. So that's a good point.
Okay. So I think we're basically out of time. So we have a tool for assessing
this, and it shows you what flows are ambiguous and safe, and then you can
crack open the ambiguous ones and look at what kind of flows are going on
within VMs or within the labels that are ambiguous. And it will tell you what
principals correspond to those and so forth.
We have a system where when a VM image -- the idea is the VM image is
loaded and say some daemon that loads VMs receives it and it says well, you
know, I don't know whether this is compliant with my system so I'll ask the
management VM here if it's compliant, so we'll push the necessary information
up to the daemon that receives it. It will push it into the compliance checker. It
will check whether it's compliant, if it's not compliant then, you know, it may get
pushed into the tool to fix it, which obviously will then become manual. But if it's
all compliant, if everything's cool, then you should just be able to load the thing.
And so we're also studying, you know, what defines compliance in a robust sense.
Because we're finding, you know, SELinux, macro-expanded policies are big.
There really aren't a lot of differences among them. You know, just a few rules
are different generally, and some of them are the same.
So we should, once we prove that something is compliant, be able to prove that
other things are compliant, you know, if they are in fact compliant, without
involving the policy designers.
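A bare-bones sketch of that load-time flow -- hypothetical names, not the actual system's interfaces -- might look like this:

```python
# A bare-bones sketch of the load-time compliance check: the daemon that
# receives a VM image asks a compliance checker before launching it, and
# non-compliant images get handed off for (manual) resolution. All names
# are hypothetical stand-ins.

from collections import namedtuple

Verdict = namedtuple("Verdict", ["compliant", "problem_flows"])

def check_compliance(image_flows, goal_flows):
    """Compliant if every flow the image's policy allows is permitted by the goal."""
    problems = [f for f in image_flows if f not in goal_flows]
    return Verdict(compliant=not problems, problem_flows=problems)

def on_vm_image_received(name, image_flows, goal_flows):
    verdict = check_compliance(image_flows, goal_flows)
    if verdict.compliant:
        print(f"launching {name}")                 # everything checks out
    else:
        # These would be pushed into the resolution tool for a human to fix.
        print(f"rejecting {name}: {verdict.problem_flows}")

# Example: the goal only allows web -> app and app -> db flows.
goal = {("web_vm", "app_vm"), ("app_vm", "db_vm")}
on_vm_image_received("tenant-image-1", [("web_vm", "app_vm")], goal)
on_vm_image_received("tenant-image-2", [("web_vm", "db_vm")], goal)
```

The point made above is what would make this practical: once one macro-expanded policy has been shown compliant, most other images differ by only a few rules, so the check can be largely incremental.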
The only other issue I wanted to make you aware of I thought was kind of
interesting was, you know, the notion of completeness versus soundness. So we
want a system that, you know, that we want to assess where it fits on this
completeness soundness notion. So a complete system would indicate no false
negatives and would tell us that, you know, hey, if you deploy this way, it's going
to be secure. Whereas soundness tells us, okay, there are no false positives, so
it tells us that there aren't going to be any errors that are not real errors. And so
obviously -- if you're familiar with intrusion detection systems then you know
these are issues, especially the false positives come in to play.
Now, what we have right now in the real world is a completeness issue. We
don't really have all the constraints. We say okay, we'll turn on this bit and we'll
trust this program, but we don't really say why we trust it or, you know, as I've
been talking about, what details justify that trust. So we don't really have
constraints for that that should be satisfied before you would trust it. And so we
have an incompleteness problem, and we're trying to use this kind of view to help
generate a complete enough picture that you can say something about the
system. But we can't really guarantee that you're going to collect all the
constraints, you know.
If a program receives data and it's untrusted, do we have enough constraints on
how that program handles that data correctly? You know, we'd like to have a
very precise definition that says it's handling that data correctly, but we don't
have it as a community yet. So we're lacking there. So we're going to work
toward this, and we'll be able to use more constraints as people come up with
them, but we're not going to be able to guarantee completeness on our own.
So we are going to try to focus on soundness because, you know, in these
complex policies, if we start introducing false positives as we're looking
for errors, that is really going to be the death of this, so we really have to be very
conservative about this and look at sound constraints and only refine things if we
know for sure that this is a legal refinement. And so that's going to be a
challenge for us. Because the -- you know, the obvious thing would be to start,
you know, oh, it must be that one, oh, it wasn't that one, you know, and then
backtracking. But with this kind of approach, that leads to potential
computational issues and undoes some of the guarantees we're trying to make.
So we're definitely targeting sound constraints, and so far we've been okay there.
All right? So thanks for listening to the story here. Started from the first model of
the atom, I think, and worked our way around.
So basically we started with information flow. It's an idealization, but as far as
the security goal, it's still the most useful one I think we have. At least at a
general level. And we can apply, you know, other things as constraints relative
to this and be able to get our head around the problem. But we need to account
for application level enforcement, we can't just ignore it, we have to make it part
of the picture, we have to understand what applications are doing and whether
they are there satisfying -- helping us satisfy the security goals we have in mind
in order to really be able to say comprehensive things about security. And then
we'll have to address all of the layers once we -- you know, now that we have VM
systems. So the VMMs should help us. They should provide nice coarse grain
boundaries for VMs, and we don't have to depend on the operating system to
have everything, all of the data, and keep it all straight. But there's still going to
be applications that will have authority to do things that could break this system's
security goals, so we need to be able to do it all the way from the top to the
bottom.
So thanks. I'll take any further questions, if you have them.
[applause].
>>: I'm just curious [inaudible] composition and decomposition [inaudible] higher
levels of policy [inaudible] think in terms of these different players from [inaudible]
do you have [inaudible] tools to [inaudible] these different players and so
[inaudible] essentially given these different policies? I mean [inaudible] much
more bigger composition essentially talk about [inaudible].
>> Trent Jaeger: You bring up an interesting point. Composition, composition of
policies has been a problem people have worked on in the threat community, but
they didn't -- they found that composition didn't necessarily lead to security, and
so we have to be careful about how we compose and decompose things I think.
But --
>>: [inaudible].
>> Trent Jaeger: Yeah, eventually we are going to add the network underneath, and
so then you could have, you know, multiple machines. And I guess in this case
you wouldn't want to look at the whole Internet, but you'd want to look at the
machines that are connected at least with respect to the VMs that you're
interested in. And so we should be able to add another layer that says, okay,
you can have flows from this machine to that machine or this subset of VMs on
this machine to that subset of VMs on that machine, and, you know, are we
achieving the security goals we have in mind?
So I think we should be able to --
>>: [inaudible].
>> Trent Jaeger: Yeah. Well, the problem is going the other way, I think, is
getting enough constraints going up the stack. We seem to have -- I mean
network policies are a bit loose, but we seem to have more people doing network
policies and more experience with network policies in terms of fire walls than we
do with application policies and how applications handle data correctly. And so --
so we haven't tried to get our whole head around it yet as a community. And,
you know, each of the layers is sort of handled by different people with different
expertise, so bringing them together is what we're hoping for. But I think the
harder problems will be higher at the application layer.
So we should be able to do the networking. Knock on wood. Yes?
>>: So [inaudible] and then hope for the best sort of, is that, you know, these
components that are going to ensure the -- you know, access control, they will do
their job right in the future. So is that when a VM is started, you do checks and
say okay, is this compliant, then you let it go. So like it's the load time versus
runtime kind of integrity [inaudible].
>> Trent Jaeger: Yeah, we are working on the enforcement and where the
placement of the reference monitor and that sort of thing. I didn't talk about it here.
So that -- so that would affect --
>>: [inaudible].
>> Trent Jaeger: And clearly, as much as you can prove at load time or at
compile time, you know, the better off you will be in terms of work. So you'd like
to minimize the runtime checking. But I'm sure there will still be runtime
checking.
So we're working with some people at Georgia Tech who do VM introspection
and trying to see how that will integrate into this sort of thing, to look at the
runtime behavior of the system. So they've been looking at it for intrusion
detection. And we're trying to look at, you know, proving these kinds of things, you
know, for the cases where we're not sure -- can we get runtime checks for those? We
haven't done a whole lot on that front yet. But we're just gathering up information
for them now.
>> Weidong Cui: All right. Thank you very much.
>> Trent Jaeger: Okay. Thanks.
[applause]