>> Weidong Cui: So it's my great pleasure to introduce Trent Jaeger. He's an
associate professor at Penn State. Before that, Trent worked for many years at
IBM Research, Watson, and Trent has done a lot of interesting work on operating
systems security, like access control and some SELinux work. So today Trent will
talk about his recent work. Thank you very much.
>> Trent Jaeger: Thank you, Weidong. Thanks, everybody, for coming. So I'm
going to talk about taking the various system components that exist now that
enforce security policy -- so virtual machine monitors as well as operating
systems as well as applications -- and trying to put these things together in
some coherent way into a system that can enforce some well-defined security
goals.
And hopefully all of this will be measurable in some sense. So we'll be able to
measure whether the goals are enforced, and if they're not enforced, you know,
where they're not enforced; or if they're enforced based on accepting some risk,
such as accepting some bad data in a way that won't break your program, that we
can identify all of those risks. So we get a comprehensive view of what's going on.
So, yeah, I wanted to say a little before that. So this work has been built up over
the last probably seven, eight years, based on what I was doing at Watson. And
when I was at Watson, for the last four years I was, as Crispin [phonetic] is
familiar, the security liaison from IBM Research to the Linux Technology
Center. And so we had some interactions there.
A lot of the examples will be in terms of Linux; the work is basically very Linux
focused. But we'll just sort of assume -- I mean, the problems are the same
across different systems. And so we'll just, you know, use Linux names to
protect the innocent or something like that, or blame the other guilty people.
So at Penn State we have a security lab, with three faculty in it. Patrick
McDaniel and I are the co-directors. Adam Smith joined the lab a couple of years
ago. Adam won what's called a PECASE award, which is an award for the best --
there's more than one, but it goes to the best set of NSF CAREER awards. So his
CAREER award was accepted, and it was one of the best, so he gets a firm
handshake from our president at some point.
We work on basically all facets of host and network security except for hardware
security, and in addition Adam does cryptography. We focus a lot on system
security. That's my main focus. And then Patrick has done work on
telecommunication security. I've looked at that mostly from a systems
perspective.
In terms of language based security, we mostly -- we aren't developing so many
new language based tools. We're building on fundamental language features
and trying to do useful things with those.
Okay. A shameless plug for my book. This came out last year. I don't
know if you're familiar -- Morrie Gasser has a book that he wrote in 1988 on how
to build a secure system. It's out of print, so you have to scrounge around to find
it. It's actually a pretty interesting book. One of the main principles in
that book on how to build systems was a thing called the reference monitor. And
we'll come back to this today, also.
And so he described how you would go about building an entire system,
including authentication and other features, based on reference monitor principles
and some others. And so my book looks specifically at just the reference
monitor principle and how well individual systems have or have not achieved the
guarantees required of a reference monitor. And if you don't know what I'm
talking about, I'll get to it at some point in the talk and explain it.
But anyway, it looks at a variety of systems and a variety of different operating
system mechanisms and how they are and are not amenable to achieving the
guarantees of a reference monitor.
Okay. So I have a lot of slides, and we have a fair bit of time. But I want to tell
you where I'm going first so you have some idea what to look for, what to ignore,
and what to focus on.
So basically I'm going to start out talking about classical information flow -- so
multilevel security and Biba -- and how this has, you know, formed the foundation for
computer security in the DOD environment for, you know, 35 years or so. And in
terms of building what people would consider, you know, bulletproof secure
systems, this has been one of the foundations for building those systems.
These policies are really an idealization, I'm going to claim, and I'll show you why
and give you an idea of where we need to go. What's going to come out of that first is
that we do depend on a lot of application-level policy enforcement. So we started
out with some applications outside the OS doing security enforcement. You
know, Multics had more than two rings; it had more than just the OS
and the rest of the system. It had several rings, and they assumed they'd have
other trusted programs -- maybe not quite as trusted, but they would still be
trusted. Whereas for many years we pushed security in the security community
further and further down into the operating system and tried to trust less and less.
But what of course is happening, and you guys see this here, is that there's a lot
of security decision making going on at the application level, and we really need
to take that into account when we're building systems at large. We can't just
ignore that.
So we're going to look at application-level security, and then we're going to look
also at adding virtual machine monitor enforcement into the picture. So now
we have virtual machine monitors and operating systems and applications. And
of course if we go further, we also have network enforcement as well.
And so ultimately we'd like to get our head around all of these things. And so I'm
going to talk about what it takes conceptually and some progress that we've
made. That's the really very recent stuff that we're still sort of
formulating, working on the paper in our heads. And then an earlier piece of
work that reconciles the security of a particular type of application
with the system that it's running on in an automated way.
Okay. So here's sort of what I just spoke about. Historically, the
security community looks at it as: hey, we have an operating system, it enforces the
policy, we don't have to depend on the applications to do anything. And what
happened in the '70s is we found that with multilevel security, in fact, we could
show that even if there's a Trojan horse running in one of these applications,
the operating system policy and its enforcement mechanisms --
modulo covert channels -- could prevent secrets in the Trojan-horse-infected
program from leaking to some other application, based on this MLS policy and
this enforcement at the system level.
And so the bias seemed to go more and more toward pushing things into the
operating system -- really focusing on, you know, making this small and
making this the full basis for enforcement in the system, and trying to take
anything away from the applications, assuming that we don't trust the
applications. Things like minimizing the TCB became maybe over-minimizing.
So what we had then for these kinds of systems, in the DOD security
community, were these two types of approaches to security, one used much
more than the other. Secrecy was a much heavier focus than integrity, and it still is.
So, the policies: multilevel secrecy basically says we're not going to allow any
leakage of secrets to unauthorized principals. Okay. And so we're going to set
things up based on clearances and access classes of objects so that if you're not
cleared to access certain objects, you won't be able to gain access to that data.
Biba is just the dual of this for integrity. It says, hey, if you have some low
integrity process, it's not going to be able to send information to a higher
integrity process, because we don't want that higher integrity process to depend
on the data that it's receiving from the low integrity process, and so Biba will try
to prevent this.
And so both of these policies are described as information flow policies, and so
they get interpreted in the community this way. So for secrecy we have some
secret entity, and it may or may not have a flow to some other entity, and if,
the way you configured the system, there's a mechanism such that the secret
entity can cause information to flow to the public entity, this would be a
violation of the MLS policy.
And it's exactly the same thing, just in reverse, for Biba integrity. So if there's a
flow from the low integrity process to the high integrity process, this would imply
a dependency. So I mean there may not actually be a leakage, right? You
know, you enter your password into a program. That program that's
receiving the password that's entered knows the secret password. Maybe you're
an attacker, you don't really know the secret password. So you enter the
password into the program, and the program tells you that was the wrong password.
Well, that's actually a leak of information. That's saying, hey, that password you
entered isn't the right one. And it narrows the space of searching that you'll have
to do for the right one -- maybe not very much if you have a reasonable size
password, but it is, from an MLS perspective, a leak of information. Similarly, you
know, you have some high integrity process. It's going to provide
services such as changing your password, for example. And when you ask to run
that program, you're providing input to that program from a process
that's -- you know, from a shell or something that's not necessarily trusted.
But we still depend on the password program to do the right thing. But both of
these would be violations of these policies. And so what you would need is some
fully assured guard process to make sure it's okay to communicate
between these two parties in both of these cases. And obviously in commercial
systems we don't have such processes.
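To make the two flow rules concrete, here is a minimal sketch, not from the talk, of how an MLS check and a Biba check each decide whether a flow is allowed; the level names and the integer ordering are purely illustrative.

```python
# A minimal sketch (not the speaker's code) of how MLS and Biba decide
# whether an information flow from a source to a destination is allowed.
# Levels are ordered integers here purely for illustration.

LEVELS = {"public": 0, "secret": 1}       # secrecy levels (MLS)
INTEGRITY = {"low": 0, "high": 1}         # integrity levels (Biba)

def mls_allows_flow(src_secrecy: str, dst_secrecy: str) -> bool:
    # MLS: information may only flow "up" in secrecy, never from
    # secret to public (no leak to uncleared principals).
    return LEVELS[src_secrecy] <= LEVELS[dst_secrecy]

def biba_allows_flow(src_integrity: str, dst_integrity: str) -> bool:
    # Biba is the dual: a higher-integrity process must not depend on
    # (receive a flow from) lower-integrity data.
    return INTEGRITY[src_integrity] >= INTEGRITY[dst_integrity]

# The two violations from the slides:
assert not mls_allows_flow("secret", "public")    # MLS violation
assert not biba_allows_flow("low", "high")        # Biba violation
```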
So what we have is a situation where, with secrecy, we've had some
success building systems for the DOD environment. Trusted Solaris has been
around for a while, which is essentially a commercial system that provides
enforcement of multilevel security in a mostly general
purpose system. And we have other systems that also enforce MLS. Whereas
we really haven't had any luck with integrity. So Biba and Clark-Wilson and
integrity models and variations on them are still things that we're toying with,
figuring out how we're going to make them work. You know, you guys have the Vista
model, the UAC model, and that gives us a partial view of Biba. It doesn't give us
the other end, where the trusted process might read bad stuff, but it prevents the
bad guy from writing up. And so we're still trying to figure out how we're going to
implement integrity in a comprehensive way that is practical.
And so what happens is we do things like just trusting the applications or just
trusting the programs to do the right thing and hope things work out. And, you
know, we all know the story there.
Okay. So what's an example of a trusted program? The simplest one that
we have in the Linux environment is something called logrotate, and all it does is age
logs. Okay? So it's a pretty simple thing. But it ages logs across the whole
system. And because there's a set of logs, the logs may include information from
processes that are processing secret data. And so the logs may contain secret
data. And so in SELinux MLS, logrotate is a trusted program. We designate it
as trusted -- trusted to handle secret data and not leak it to low secrecy
processes.
As well, the program receives this log data, which could also be from a low
integrity process -- it could be from any process on a Linux system -- and we're
depending on it not to be compromised by that log data. And so logrotate is
trusted to protect its integrity.
Okay. So it's a trusted program. And it turns out when the SELinux folks
configured their MLS system, there were 34, I think, at the time I
looked at it, which was a couple of years ago. There may be more now. There are
never fewer in SELinux. That's one thing we learned: rules never shrink.
So it's always more than whatever it was when I looked at it. So there were 34
processes that were labeled as trusted. And so what was the basis for labeling
these processes?
The basis for "trusted" was essentially that we had to trust these processes. There
wasn't any sort of analysis showing that they were really doing the right thing. But
we just have to depend on them. And so we're going to have to go back later and make
sure things are okay and, you know, keep an eye on them and this sort of thing.
But, you know, we want to get our head around this a little better.
And for integrity, you know, it's just much, much worse; there are many more -- yes?
>>: [inaudible].
>> Trent Jaeger: Sure.
>>: I mean here I don't see the Linux --
>> Trent Jaeger: Yeah, yeah. Well, the Linux kernel is clustered there, yeah.
>>: [inaudible].
>> Trent Jaeger: So Linux kernel is definitely trusted too. So, yeah, this is
probably -- I'm sure this isn't a comprehensive list, and definitely the kernel is part
of this.
>>: Is it only for [inaudible].
>> Trent Jaeger: Well, they do have a label for kernel too.
>>: You do have?
>> Trent Jaeger: Yes.
>>: So that means all the drivers?
>> Trent Jaeger: Well, it refers to user level processes. So processes running in
simpatico with the kernel are labeled as kernel.
Let's see. So then for integrity, you know, obviously there are a lot of client
processes -- e-mail clients and browsers and so forth -- that receive low integrity
data and we're expecting them to do the right thing, as well as many, many
server processes, network-facing daemons, which you're aware of, of course, from
AppArmor.
And we're expecting this all to work out. And we haven't gotten our head around
this yet. So what we end up with is this kind of
situation where we have policies being enforced in the applications as well as the
systems. And, you know, certainly the system
administrators have enough trouble configuring a system policy, so we're not
necessarily doing a whole lot of work with application policy. So currently these
are separate. And by policy -- you know, it could be a procedural thing.
We'd like a declarative policy, but many applications have procedural
mechanisms to enforce whatever security they have.
And now we have virtual machine monitors, you know, hoorah. We have another
layer, which is coarser grained, where we can maybe make better decisions about
security and isolate these bad things that aren't working so well at the higher
layer.
But what happens in many cases is that we have authority here, and we
delegate some authority to the operating system to do the right thing, and then we
delegate some authority to the application. So we refine the problem a bit, but
we don't eliminate the problem. We still have applications that are
depended upon to do the right thing, and we still need to do
something about this. And we want to get a picture or an idea in our mind about
how all of this is working together and whether it's all achieving the security we
have in mind.
And of course networking is in the picture as well. I'm just not talking about it
here specifically.
So with respect to information flow policies, we've done, as a community, two
things. One I've talked about a lot, which is marking the exceptions. We label
it trusted, and we hope for the best. We allow it to receive data we know is
dangerous, and we hope for the best.
Or we go to a policy that's not information flow aware. We just say, you know,
information flow is not working, it's too restrictive, and so we need to do
something else. And so we go to some more access-matrix-oriented policy, and
then we often have problems coming up with meaningful security goals. So
AppArmor probably had one of the few meaningful security goals, in the sense
that we want to confine network-facing daemons. That's a security goal.
Saying we want least privilege is not a security goal.
>>: [inaudible] criticisms the SE people -- SELinux people level that the security
goal wasn't [inaudible].
>> Trent Jaeger: Okay. But the targeted policy is essentially the same -- yeah,
yeah -- the targeted policy is aimed at essentially the same security goal.
Yeah, I don't know what to say. But I believe it's a security goal.
>>: The difference is that the targeted policy has a very meaningful goal for
people who have to actually maintain real machines. But it's too dirty for a
mathematical purist to get information flow out of it so it's [inaudible].
>> Trent Jaeger: Well, we try to get information flow out of SELinux-style policies,
but the targeted policy, because unconfined is --
>>: Yeah.
>> Trent Jaeger: Yeah, yeah, exactly. But we're going to try to get our head
around this a bit today. We'll see how it goes.
So what we end up with are systems where we have incomplete policies -- policies
that provide some security but don't provide full coverage of an information flow
goal; they partially cover it. And, also, we have
complex policies -- policies where we have a lot of rules and it becomes
challenging for people to figure out, did I do the right thing? So if you have
thousands of rules, you know, is there one that's wrong? This is a, you know,
nontrivial problem. There are probably not a lot of SELinux systems here,
but if you try to administer them, it's, you know, a very challenging task.
So figuring out how to administer them is what we're going to work toward. In
terms of related work -- clearly I'm not going to spend a lot of time on this -- there
are clearly a lot of applications that enforce access control. I don't think that's
news to anybody, databases being the most well known. Atomicmail is just an
e-mail client. It's an old system saying, oh, e-mail's receiving data that we care
about, so we want to enforce security there. So it's more of a seminal thing
rather than a system you currently work with. But clearly all of these programs
and many, many others enforce access control.
And then there are languages that have -- yes?
>>: I just want to cater to the audience. Microsoft IRM in the apps.
>> Trent Jaeger: IRM? What is IRM? Sorry.
>>: Intellectual rights management.
>> Trent Jaeger: Okay.
>>: It goes by a bunch of names.
>> Trent Jaeger: Okay.
>>: It's basically --
>> Trent Jaeger: Yeah, I'm DRM still.
>>: Sort of mandatory crypto e-mail so you get an e-mail that somehow says
IRM, the e-mail client won't let you print it or cut and paste it or anything.
>> Trent Jaeger: Okay.
>>: This of course is totally ownable if you crack your machine, but in principle it
didn't [inaudible].
>> Trent Jaeger: Right. Right. Yeah, I'm not really going to get quite that far. The
systems that I'm looking at, I'm assuming that the administrator
and the user have compatible goals. I'm not going to go as far as those
kinds of goals, but that kind of system is appropriate.
>>: Yeah.
>> Trent Jaeger: Yeah, definitely. So you're all familiar with, I'm sure, Java
security, and many other systems have security mechanisms for their runtimes,
and then of course there are languages that have information flow security built in
that you can connect to the system, such as Jif, and of course Perl taint tracking.
And then what we're seeing recently is work where the systems developers are trying
to get their heads a little bit around what's going on in the application. So we see
in the SELinux community they have policy servers for the applications as well. So
you can sort of bring SELinux up to the applications.
And then there's a variety of work in the research community -- some of ours, and
Ninghui Li's, and Sekar's [phonetic] at SUNY Stony Brook -- looking at, okay,
we're depending on applications to make decisions. How do we manage the
overall security of the system given that these applications are going to make
specific decisions, and how do we reason about system security there?
And then an interesting piece of work is this DIFC work from Stanford and from
MIT, where they're allowing applications to create new labels that the system will
be aware of, to delegate those labels to other processes, and then to try to get an
idea of what's going on in the system at large, even taking into
account certain things the application can do.
What this doesn't enable us to do is know what's really going on inside the
application. Does it have the same semantics for the labels as the system that it's
interacting with? Are the information flows within the application compatible
with the information flows that the system is trying to enforce? And do we really
trust the application, relative to some security goals, to do the things that it's going
to do? So, you know, the applications are making decisions about
delegation and things like that, and do we really want, just because it can
communicate with this sort of thing, do we really want it to delegate this label
access to this other process? Yes?
>>: Could you clarify the second [inaudible].
>> Trent Jaeger: This one?
>>: [inaudible] programming system [inaudible].
>> Trent Jaeger: Yes. So what I'm talking about is like the Java access control.
So in Java, if you have a -- you know, a class loader and you're running some
program in Java, it can access the underlying file system, and you can write
permissions for that particular set of code or that particular class loader
to access your file system. And so you can control code that's
been loaded into the same Java --
>>: So this thing is code access security.
>> Trent Jaeger: Yes. Thanks. So, yes, C#.
>>: [inaudible].
>> Trent Jaeger: Yes. Thank you for being here. So, yeah, a lot of
programs have mechanisms for you to describe in your program what individual
threads of execution can have access to in terms of resources from the
system -- what files they can access, and sockets, and so forth. Okay.
So we're going to look at how these layers of security, which may be deployed in
independent ways and may have independent label spaces, can be put together
in a way that means something coherent. Okay? So that's basically what
I'm saying here.
So what we're going to do is we're going to build -- we're going to go back to
information flow. We're going to start there. Because that's really a definition of
a goal. And we're going to build a model of system policies and goals and
assess whether the system consisting of applications, operating systems, VMs is
really doing the right thing with respect to the goal.
And so the hard part here is going to be identifying the goals. Because we don't
have any specific definition of goals anywhere. We have policies. And whether
the policies really reflect the goal or not is ambiguous.
Then we're going to evaluate the compliance of the policies with respect to the
goal. And evaluating compliance of policies -- whether one policy does what
another policy expects -- is not itself a new problem, but couching it in terms of
the reference monitor, so that we can ensure that the enforcement is really being
done according to the guarantees that are expected of an enforcement
mechanism -- and again, I'll talk about that in a minute -- is important and new.
And then we're going to look at the problem in a more general way and look at
generating constraints on how the policies relate to the goals, which is necessary
to evaluate general purpose systems.
Okay. So I've been working on this kind of stuff for a while. We've
looked at trusted programs, for example, and what it takes to enable
them to enforce security goals in practical systems, and being able to tell the
system, hey, you know, I'm handling this untrusted data, but I know what I'm
doing in terms of -- I know I'm expecting untrusted data. And you can check
whether you believe that I'm handling it in the right way, but this is where I'm
doing it.
So we've made Clark-Wilson integrity practical and loosened the requirement for
full formal validation in Clark-Wilson by only requiring that programs that receive
some untrusted input only do so through limited interfaces that they declare to
the system. So the operating system will only give them the untrusted input
through those interfaces, and those programs must filter it. And so you could
check whether you believe that the filters are acceptable, or things like this. But
we are explicitly defining, in a way that connects the application to the system,
where the untrusted data is coming into the application.
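As an illustration of the idea just described, here is a minimal sketch, with hypothetical names rather than any real API, of a program declaring the one interface through which untrusted input may arrive and filtering it there.

```python
# A minimal sketch of the CW-Lite-style idea described above: a trusted
# program declares the only interfaces through which it accepts untrusted
# input, and each such interface runs a filter (endorser) before the data
# is used.  All names here are hypothetical, not an actual API.

DECLARED_UNTRUSTED_INTERFACES = {"read_log_entry"}   # declared to the system

def endorse_log_entry(raw: bytes) -> str:
    # Filter/endorser: reject anything that is not printable ASCII text.
    text = raw.decode("ascii", errors="strict")
    if any(ord(c) < 0x20 and c not in "\n\t" for c in text):
        raise ValueError("rejected low-integrity input")
    return text

def read_log_entry(raw: bytes) -> str:
    # Untrusted data may only enter through a declared interface, and it
    # is endorsed before the rest of the program ever sees it.
    assert "read_log_entry" in DECLARED_UNTRUSTED_INTERFACES
    return endorse_log_entry(raw)
```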
And so what we're doing now, actually, is looking at all the places where bad data
goes into trusted programs in a similar way. So we have written a tool -- we're just
starting to collect data with it now -- where we can find all of the places where
a program receives data from some other program, data that some other program
can modify and that we consider untrusted.
And so, you know, a question is, well, how many interfaces are there on a system
that receive untrusted data? Are there 10, are there a hundred, are there a
thousand, are there a million? So we're going to get an idea of how many there
are and start to assess also whether there are commonalities in how untrusted
data is handled. Right now it looks like, you know, obviously things like the
declassifiers and endorsers are ad hoc. How you handle untrusted data depends
on what your program does.
But there should be, you know, things like type safety. You should handle
untrusted data certainly in a type safe way. And so there are some things, some
libraries probably, that we can build that can help define how you handle
untrusted data, and programmers can use those across programs.
So we're trying to get an understanding of that.
And then the other thing we're doing, I mentioned to Weidong earlier, has to do
with mediating security-sensitive operations. So if you have a program that's
supposed to enforce a policy, it's supposed to mediate all of the ways that you
can access this unsafe data. And I think actually talking to Crispin
[phonetic] probably started us down this path, when you guys were working on the
LSM. I asked you, you know, well, how did you do it, and you said, well, you know
-- whether you believed --
>>: [inaudible].
>> Trent Jaeger: Yeah, yeah. I mean, was there a principle on which it
was done? Did you know it was correct? And you said, of course, the correct
answer, which was, well, no, we didn't know it was correct. So we built
some tools to look at whether placements of reference monitor hooks mediated
all the security-sensitive operations in the system.
We've gone from finding bugs, to trying to identify in code what security-sensitive
operations are, to now having a mechanism for automatically -- modulo some
annotations -- placing mediators in code to do runtime checks, to locate
where the endorsers should be and where the declassifiers should be. And so
we have a prototype that works for Java. There aren't so many high-trust security
programs deployed in Java that we have source code for, since I don't work for
IBM anymore. But probably you guys have many in C# or something like
that which would be interesting to us. But we're now looking at C programs. So we
have the tool starting to work with C programs.
We have our first intraprocedural analysis going, you know, so we're getting there.
So I'm going to talk about two experiments, though, for the rest of the day. One is
taking a specific kind of program, which I'll call a trusted program. So this is a
program that runs on your system that you're entrusting with some authority to
process data on your system according to some security goals. And so you want
to be able to take these programs from packages. You want to be able to
download the program and verify that this program
is going to be able to enforce the policy that the system has in mind when you
deploy it, in a mostly automated way.
So you'd like to take the package, and when you install it, you want to not only
know that you've installed the files, but you want to know that, okay, this thing is
going to enforce the system policy every time it runs. So that's what we're going
for, for limited programs. And then we're going to try to generalize some of the
ideas that we found here to, you know, all programs across the virtual machine
monitor, operating system, and application layers. So the
second part will actually focus more on the virtual machine monitor and OS
rather than applications. But the principles are the same. Okay. So we're going
to dig in.
Okay. So basically we have a program, and we want to download this program,
and it's going to enforce some policy for the system, and, you know, when you
install it, is it really going to enforce the policy the system has in mind? That's the
question.
So in our context, we're going to download some Linux package onto a system
with a mandatory access control policy, in this case SELinux. So it could be, you
know, any kind of package of any kind of program downloaded onto any system
that enforces mandatory access control. And we're going to verify whether the
program and policy -- so the program is going to have its own policy
that it's going to enforce. And so is that program with that policy going to enforce
the system's policy? That's the question.
So what's interesting is that the package will contain not only files such as
executables, libraries, data, configurations for that program, but also in this case
it's going to contain a module that's going to extend the system policy. So the
system policy exists, but it doesn't know anything about this program. The
program may have additional information that it wants to label. And so it's going
to extend the system policy with that. And we want to know what the implications
of that are. And then the program also is going to have its own policy. So it
could be a Jif program, which has its own labeling, or it could have a user level
reference monitor of some kind.
So we looked at these two, but it could be a Java program with Java monitoring, or
C# or Ruby or whatever.
Okay. So what's going to happen, we hope, is that we want to
compose a program policy that enforces the system's goals. Okay. That doesn't
sound too bad. So we have a program policy. We have a system policy. So we
want to generate, in the end, a program policy that satisfies the
system security goals and, at the bottom, protects the program. So if there are
program data that have protection requirements, we want to also ensure those
within the program. We'll make this a little more concrete in a minute.
Now, why the system policy is important is, you know, clearly the
program has been written in some, you know, environment. But it doesn't know
what your data is on your system. So we're going to deploy it on the system.
Your system is going to have some policy. And we want to know that this policy
is really being enforced by a program that didn't know about that policy before it
got deployed.
And then we're going to extend the system policy. So we have an SELinux
module for the program, or some policy module, that's going to extend the system
policy, and we want to know that this program, when we deploy it, is going to be
tamper protected on that system.
So why do we want to know these things? They seem a little ad hoc. We
want to know these things because of this reference monitor concept, okay? So
this concept was identified by James Anderson's panel in the early '70s -- 1972
is when they wrote the panel document. There were about 10 people that worked
on this. And they identified these three requirements for a reference validation
mechanism -- a reference validation mechanism being a mechanism that enforces
the security policy.
So we have to have complete mediation. So, you know, every operation that's
security sensitive has to be mediated. The reference validation mechanism has
to be tamper protected. So we don't want untrusted processes messing with the
policy or the code, right? That would circumvent security. So if we have a
security-sensitive operation that's not mediated, obviously security is
circumvented. If we can mess with the reference validation mechanism, then
obviously we can't enforce anything.
And then lastly, we want to know that this thing is actually correct in some way.
So the way they stated it was that it was simple enough to be verified in some
sense.
They referred to code primarily, but we're also going to look at policy. Does the
policy that comes with this actually correspond to what we expect? So these are the
requirements. And so basically from the early '70s through the mid '80s, this was
the driving force behind secure system design. And then it was sort of a little
more on the periphery for a while, until the early part of this decade when people
like Crispin [phonetic] started adding code that satisfied these guarantees, or at
least worked toward these guarantees. I mean, it's very hard to satisfy this last
one in a general sense -- you have to have a very small system -- but they worked
toward these guarantees for a commercial system.
So certainly we want complete mediation, certainly we are aiming for tamper
proofing when we configure this thing, and we're hoping that we're correct, but,
you know, systems can be somewhat complex, and it's hard to assure that in a
formal sense. But this is what we're aiming for.
So complete mediation I'm not really going to talk about. We're going to assume
that the program already has that, and so some techniques that we've worked on,
such as the automated placement of security code, work toward
this, but we're going to assume this has been done in some way. We'll have
some signed Jif program or something like that, and we can see that it has
complete mediation.
So we are going to test tamper proofing. We want to know that the trusted
program components that are downloaded are protected from untrusted programs.
And this is -- you know, it seems straightforward in one sense,
in that we know some trusted program components, but we don't really know how
many of them need to be protected from tampering and how this works on a
particular system. And, more so, we don't know what the set of untrusted
programs is -- or maybe the converse, what programs we need to trust in order
to make a tamper-proofing guarantee for our process.
And then we want to know, in the running program, not only the permissions of the
process, but also that the program policy protects the integrity of the program data
when it's been loaded into the program.
So we're going to test this. And we're going to test the policy. We want a method
that will ensure that the program policy we're enforcing does enforce the system
policy. Okay? And we'd like to do this automatically. So we're going to build some
package. And we're going to build it somewhere -- maybe here, maybe there, you
know. And we want to then download it on some specific machine. And so we're
going to compose a system policy, for which we're going to validate that there's
tamper protection automatically on that particular machine, and we're going to
compose a program policy for that particular machine, also, that we know will
enforce the system goals. And then you can just run the program, run, run, run,
on that particular platform. And so it doesn't matter which platform.
So what we found is that both of these correspond to what are called compliance
tests. That is, there's a policy, and then there is some security goal or some
requirement. And so what we want to do is test whether the policy satisfies the
requirement. And so there are two different tests that we want to do. There's
one for tamper proofing, where we want to know that the system policy that
includes the new program prevents -- yeah, I thought it said it permits
untrusted subjects; it says does not permit -- it prevents untrusted subjects from
modifying that program. So we'll have to figure out what that means in some
concrete terms.
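One concrete reading of that tamper-proofing test, as a sketch rather than the actual tool, is: in the composed system policy, no subject outside a small trusted set may have a write-like flow into the program's labels. The subject and label names below are illustrative.

```python
# A minimal sketch of the tamper-proofing test described above: in the
# system policy extended with the package's module, no subject outside a
# small trusted set may be able to modify the program's labels.  The
# policy representation and the label names here are illustrative.

TRUSTED_SUBJECTS = {"init_t", "installer_t"}           # e.g., init and the installer
PROGRAM_LABELS = {"logrotate_exec_t", "logrotate_etc_t"}

def tamper_proof(write_flows) -> bool:
    """write_flows: iterable of (subject_label, object_label) pairs meaning
    'subject can modify object' under the composed system policy."""
    for subject, obj in write_flows:
        if obj in PROGRAM_LABELS and subject not in TRUSTED_SUBJECTS:
            return False    # an untrusted subject could tamper with the program
    return True
```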
And then for verification we want to know that the program policy enforces the
system's policy and protects the program from untrusted data that it might use.
And so these are two policies and two requirements that can be expressed all in
terms of information flow. Let me just go ahead here. So we're going to take the
program policies and represent them in information flow terms, as well as the
system policy. And then we're going to want to evaluate those policies against
security goals.
And so compliance is defined in terms of information flow this way, where the
flows of one policy are all authorized by the goal, okay. So you have a policy
and it has a set of information flows that are authorized, and the question is
whether your goal also authorizes those flows. And if there's any flow that's not
authorized by the goal, the policy is not compliant. So this is pretty easy to
test.
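A minimal sketch of that compliance test, assuming the labels have already been mapped into a common space (the label names below are illustrative):

```python
# A minimal sketch of the compliance test just described: a policy is
# compliant with a goal if every information flow the policy authorizes is
# also authorized by the goal.  Flows are (source_label, sink_label) pairs,
# with both policies assumed to be over the same (already mapped) labels.

def compliant(policy_flows: set, goal_flows: set) -> bool:
    # Any flow allowed by the policy but not by the goal is a violation.
    violations = policy_flows - goal_flows
    return not violations

# Example: the goal does not authorize log data flowing into the config label.
goal   = {("logrotate_t", "var_log_t")}
policy = {("logrotate_t", "var_log_t"), ("var_log_t", "logrotate_etc_t")}
assert not compliant(policy, goal)
```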
The tricky part is setting up the property -- coming up with the policies, right? I
mean, because testing it is easy. But you have all of this stuff, and you have
these ideas, and so the question then is, well, how do you know whether
your goal is expressing the right thing so that you can do the test, and how do
you interpret the flows from the policy that are relevant to the goal? And so
ultimately we're going to talk a bit about mapping.
So basically the idea is we have -- we have a program policy, and the program
policy may include its own labels that the system has never seen before. And
they provide these with this SELinux module. Okay. It should be allowed to do
that. And then the system has its own labels.
Actually, I have it backwards, right? So this is the system, and this is the
application. And they have their own -- or no, I guess that's the system. Yeah, it
says underscore T. And this is the application. But so the system has its own
labels, and it has some semantics for what the labels mean in terms of information
flow. And the program has its own labels, and it has its own semantics for what
information flow means there. And you can interpret them separately. And we've
done this for a long time, right?
All of these user-level programs -- databases, Java programs, C# programs,
whatever -- have their own policies, and they enforce them, and the system sort of
doesn't know what the heck's going on. It's hoping for the best. It set that bit.
You can do it. Good luck.
So now here we're going to try to figure out what the mapping should be. Okay?
So we're going to try to come up with, basically, constraints that imply how these
individual labels are mapped. And so we want to come up with constraints that
will ultimately, as much as possible, give us a full mapping, or at least give us a
mapping that indicates that the combination is safe. So we don't have to get a
complete mapping, right? In SELinux we have thousands of types. We don't
need to map every label in the application to every SELinux type.
We don't even necessarily have to map it to one specific type. We just have to
map it to types that result in compliant systems, so that we can test compliance
and it works out. And I'll talk more about that general idea later. Okay. But let's
stick -- right, I'm jumping ahead a little bit. So let's stick with this. This is more
concrete.
So the basic intuition of the mapping -- we have a paper from USENIX
Security last year that talks about this intuition, but not in as formal a form as I'll
ultimately get to. But the idea is a simple one. It just takes the
basic observation that, hey, trusted programs are programs that are going to
run on your system, and you're going to provide system data to these programs,
and you're going to depend on the trusted programs to do the right thing every
time they run. So these trusted programs are not supposed to lower the
integrity of any of the data they receive.
So from an integrity standpoint, from a Biba standpoint, the trusted programs are
higher integrity than any data they receive. So it's a little counterintuitive that
the system and the system's data are lower integrity than the application and the
application's data. But that's what's going on. And I think that's -- let me know if
you have any questions about that.
And so the idea is then that the program components must be tamper proof.
They must not be modified by anything that's untrusted. And so we're going to
define a very small number of things -- as small a number of processes as we can.
Basically things like init in the Linux environment, so the initialization, and the
installer. And that's about it. So there are eight programs we're going to identify
as trusted. And nothing else is allowed to tamper with these programs, modify
any of their files.
And then the system policy is in a sense isolated from the program. The
program is higher integrity, the flows are -- you know, the program flows. But
whatever we're supposed to do with the system data, that's sort of defined by the
system, and we should just be able to plug that in independently and tell the
program, hey, this is your policy, enforce that system policy. That's your policy
now. And, you know, make sure you protect your data from that. But just
enforce the system policy, please. Okay?
So that's the intuitive idea. It's a simple idea, but we're going to talk about
whether it works. So this is the program's view now. So the program is going to,
in a sense, have this sort of model where -- and really these are sort of all
templates. So it's going to put its stuff here in HIGH, and whatever the system
policy is, it's just going to jam it in the middle. And we have this catch-all for low
integrity stuff. We haven't really found anything from trusted programs that goes
there, but we still have it. This is an integrity graph, so these are the high integrity
items, so the program stuff can flow to the system stuff. But whatever data is on
the system outside the program data can't flow to the program. And so in this
model anything in the program could flow to the system, but that of course isn't
true, because you may have secrets in your trusted program, like SSH keys or
something like that. So there are a few exceptions that you may have as high
integrity and high secrecy, and they can't flow down. But that's the basic idea.
Okay?
And so for a program like logrotate -- a simple little program that has some
executables and some configuration files and then receives log files as input --
we're going to end up with this kind of integrity lattice for this program. And so
logrotate will have these labels for these guys, and these are higher integrity than
any of the labels for any of the log file data. But whatever the labels are for the log
file data -- if you have secret data, or we're talking mainly about integrity here --
whatever the policy is for the system flows, it's going to enforce that on behalf of
the system. You just have to plug it in. You don't have to do anything fancy to
map it together in this particular case. Okay?
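A sketch of that three-level integrity template, with illustrative label names, might look like this:

```python
# A minimal sketch of the three-level integrity template described above:
# the trusted program's own labels sit above whatever system labels get
# plugged in, with a catch-all LOW level at the bottom.  Label names are
# purely illustrative.

PROGRAM = {"logrotate_exec_t", "logrotate_etc_t"}   # HIGH: program components
SYSTEM  = {"var_log_t"}                             # plugged-in system labels
                                                    # everything else: LOW

def rank(label: str) -> int:
    if label in PROGRAM:
        return 2
    if label in SYSTEM:
        return 1
    return 0

def flow_allowed(src: str, dst: str) -> bool:
    # Biba-style: flows may only go from higher (or equal) integrity to
    # lower, so log data can never flow into the program's own files.
    return rank(src) >= rank(dst)

assert not flow_allowed("var_log_t", "logrotate_etc_t")
```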
So that's the basic idea. And we built some tools to test this out. There's a
TISSEC paper coming out on the analysis tool. It takes policies -- it used to
take Jif policies, now it can take SELinux policies for both this guy and this guy.
It basically just needs mandatory access control policies and a translator to
translate those into information flows. So there are a number of policies that
could be supported.
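The translation step mentioned here can be sketched roughly as follows; the rule format is simplified, not actual SELinux syntax. Read-like permissions become flows from object to subject, and write-like permissions become flows from subject to object.

```python
# A minimal sketch of translating mandatory access control rules into
# information-flow edges.  A rule (subject, object, permission) becomes
# object -> subject for read-like permissions and subject -> object for
# write-like ones.  Simplified rule format, not real SELinux syntax.

READ_LIKE  = {"read", "getattr", "recv"}
WRITE_LIKE = {"write", "append", "send"}

def rules_to_flows(rules):
    flows = set()
    for subject, obj, perm in rules:
        if perm in READ_LIKE:
            flows.add((obj, subject))   # data flows from object to subject
        if perm in WRITE_LIKE:
            flows.add((subject, obj))   # data flows from subject to object
    return flows

flows = rules_to_flows([("logrotate_t", "var_log_t", "read"),
                        ("logrotate_t", "var_log_t", "write")])
# -> {('var_log_t', 'logrotate_t'), ('logrotate_t', 'var_log_t')}
```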
And then we defined, in the case of the TISSEC paper, a specific mapping
manually. But what I'm talking about here today is generating these mappings
automatically, without manually specifying them. So this is a key thing. We
want to come up with mapping constraints now rather than a specific mapping.
Of course a specific mapping is a mapping constraint -- it tells you this
maps to that, that maps to that, that maps to that. But we're going to come up
with higher, more abstract constraints and do the mapping.
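As a rough illustration of constraint-driven mapping, one could search candidate assignments of program labels to system labels and keep those under which the mapped flows are all authorized by the goal. The brute-force search here is only for exposition; realistic policies would need constraint solving rather than enumeration.

```python
# A minimal sketch of generating a mapping under constraints rather than
# writing one by hand: enumerate assignments of program labels to system
# labels and keep any assignment for which every mapped program flow is
# authorized by the goal.  Brute force is only for illustration.

from itertools import product

def find_compliant_mappings(program_labels, system_labels,
                            program_flows, goal_flows):
    program_labels = list(program_labels)
    for assignment in product(system_labels, repeat=len(program_labels)):
        mapping = dict(zip(program_labels, assignment))
        mapped = {(mapping[s], mapping[d]) for s, d in program_flows}
        if mapped <= goal_flows:        # every mapped flow is authorized
            yield mapping
```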
And then we have a system that will only load programs when they pass
compliance tests, and this was published at USENIX Annual; it's called SIESTA. So
the idea is you build your program, and you generate compliance information,
and then when you give the program to SIESTA, it will only run it if it passes the
compliance tests. And of course SIESTA is configured on the system so that you
can't circumvent it and get some other program loaded that
would be able to get these permissions without going through SIESTA. So it
becomes an assured pipeline, if you will, for this.
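A sketch of such a load-time gate, reusing the tamper_proof and compliant helpers sketched earlier; the arguments are precomputed flow sets, and none of this is the actual SIESTA interface.

```python
# A minimal sketch of a SIESTA-style load gate, reusing the tamper_proof()
# and compliant() helpers sketched earlier.  The argument structure is a
# placeholder, not the real SIESTA interface.

def gate_load(composed_write_flows, composed_flows, goal_flows, install):
    """Install/run the package only if both compliance tests pass."""
    if not tamper_proof(composed_write_flows):
        raise PermissionError("untrusted subject could tamper with the program")
    if not compliant(composed_flows, goal_flows):
        raise PermissionError("program policy does not comply with the system goal")
    install()   # only reached when the composed policy passes both tests
```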
And so one question is, you know, would you want this sort of thing in, you know,
a Microsoft-style environment? Would you want to be able to have individual
programs that you can download to your system and configure relative
to system policies and deploy with that authority in an automated way?
We have some of these mechanisms. We haven't done it in a completely
general sense, in the sense that we've looked at specific programs one at a time,
but obviously you'd like to look at all of the programs and make sure all of them
are tamper proof with respect to each other, and when you add a new program
you're doing the right thing with respect to that program given the history of the
system. And so we haven't fully developed that history.
But all of the pieces are there to do that sort of thing for these trusted programs.
I should emphasize these are very specific kinds of programs; this won't work for
every kind of program. Okay? But so think about that.
Okay. So what does it take? We built this PALMS tool, and it takes about four
seconds to go through and evaluate whether the program is compliant. And we
evaluated it on eight trusted packages. We would have liked to have done more,
but we needed SELinux policy modules. And of the 34, most of them don't even
have policies explicitly defined. All the policy that they enforce is procedural. I
mean, they're trusted, so they must enforce some policy. But 26 of them don't
have -- trusted packages, I presume -- don't have, I'm sorry, don't have SELinux
policy modules. So they don't have things that they define.
I think I'm overstating a little bit. It may be that they just don't define any new
labels, which would be required of policy modules.
So in any event, we tested these eight, and we found that the policy was mostly
set up correctly for these trusted programs, but there were still a couple of cases
where any program that was loaded in etc could mess with any other program
that was loaded in etc. What we wanted was for each of the programs to be
completely separate. And if you split a couple of permissions in etc -- and for the
programs we looked at, it looked like that would be perfectly feasible; the files
that were being added were independent for those particular programs -- then
these programs would be isolated from each other. Yes?
>>: So would an example of [inaudible].
>> Trent Jaeger: Yes. Yes. Exactly. So the program in the package has a
configuration file that gets loaded into the etc directory, and because it's sort of a
pain in the neck to introduce more labels, and all of these programs that load stuff
into etc are sort of trusted anyway -- but there's, you know, no reason that, you
know, some other program should be able to modify this other -- this logrotate,
for example, configuration file. I mean, there just isn't any --
>>: Is that -- isn't that the problem of the [inaudible] though, I mean, some of it
[inaudible].
>> Trent Jaeger: Right.
>>: [inaudible].
>> Trent Jaeger: Well, you could have different labels for each of the program
files. This is the logrotate configuration file. And so it's in etc. And it, you know,
it's just a place. So it can't really matter so much where exactly it is, if your
labeling of that is fine grained enough. You know, making the labeling more fine
grained -- obviously making SELinux even more fine grained -- believe it or not,
they are trying to resist that in some cases. But here, you know, making this more
fine grained would enable you to say that this program is in fact tamper proof
except for the installer and a couple of initializations.
>>: [inaudible] label them definitely in SELinux is one of the most difficult
[inaudible].
>> Trent Jaeger: Yeah. And so not too many people have really -- well, a lot of
people have looked at it. But, you know, people came up with ideas, and it
seems to work. And as long as it doesn't cause a
specific error, then it's okay. But if this program would somehow get compromised,
then it can go through and compromise all the others, and vice versa.
>>: [inaudible].
>> Trent Jaeger: Yes.
>>: Okay. So read-write permission for the entire registry?
>> Trent Jaeger: Well, there are some programs, I think, that have some
specialized etc files, so maybe it isn't everything in etc, but it's basically the
directory and all the files in there, with a couple of exceptions probably. But that's
what it's tantamount to.
Okay. So we looked at these specific programs -- and I'm running a little behind --
and they have a natural relationship. And we use this to determine mapping
constraints that we could then test compliance on for the program, so we could
come up with an automated, soup-to-nuts way of deploying these programs to
enforce policy. And now we want to know, well, is there an
approach where we can leverage this for general purpose systems, and can we
determine whether a system in a broad sense, rather than just a single program,
is in fact compliant? So that's what we're looking at now.
And so we're doing that by looking at the virtual machine monitor policies in
Xen. So recently the Xen community introduced a reference monitor interface for
the Xen hypervisor called Xen Security Modules, and -- I guess we
had done a prototype of this called sHype, but they generalized it to cover
more security-sensitive operations. We made some assumptions about what
operations would be security sensitive in the sHype work. And they've been
more literal about what security-sensitive operations are. And so they've
identified more places in the kernel -- or, sorry, in the hypervisor.
And they have a policy model which is very similar to the SELinux policy model for
that. And our sHype stuff was also supported by this.
But we're looking at the Flask policy. So Flask is in fact a precursor to SELinux.
So you could think of it as SELinux. Same kind of policies.
So we want to determine if the VMM policy and the policies of the VMs together,
and ultimately the policies of the applications, lead to a system that enforces the
security goals you have in mind. So, you know, you have a system, and it has
some VMs, right? So we might have a privileged VM that gives you access to
hardware. And then you have some server VMs that might provide resources to
a number of applications, enabling applications to work together. And then you
have some application VMs, user VMs; they may be isolated, or maybe a few of
them may work together as peers. And you can think of a cloud environment
where you may have a few VMs that work together, or you may have an isolated
VM deployed in the cloud.
So we have this kind of system. All of the VMs have their own mandatory access
control policies, and then within them they have applications that enforce policy
and may have authority to do things that protect the system as well.
And then all of this is administered by the XSM/Flask policy. So the Flask policy
determines what VMs can talk to what other VMs. Okay? So -- yes?
>>: Is a VM part of the system [inaudible]?
>> Trent Jaeger: Yes. So the Xen hypervisor is the virtual machine monitor for the
Xen system. And so it has a policy. So it's going to do things like enforce which
VMs can use the low-level Xen mechanisms -- grant tables and event channels --
to communicate with which other VMs. And then, you know, how memory is
distributed among the VMs, and the file system, and things like that. And so that
will be controlled by the Flask policy.
And then within the VMs, the policy will say, you know, sort of the normal things
about, okay, this process can access these files, and these files of course are
backed by, you know, the XSM policy -- whatever the partition was that the
particular VM got.
>>: In this case, a VM is [inaudible] becomes kind of the OS. VM [inaudible]
application was this kernel.
>> Trent Jaeger: Yes. Yeah. So it's essentially the same kind of problem.
And this is -- I haven't shown you any of the policies before, but those are the
policies.
So we're going to look at VMM policies -- I think that's what those are. And then
we have network policies that connect communication channels between VMs
with labels. So networking is integrated with SELinux in a few different ways.
And then mandatory access control within the VMs, using the SELinux policy
within the system, which is the same kind of policy that I was just talking about
before with respect to the applications. I'm not showing any application policies
here.
And so we have a virtual machine monitor and many virtual machines, many
of them with mandatory access control policies. And we're going to look at
whether we have compliance in this context using information-flow-based
analysis.
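One way to picture the composition, as a sketch rather than the actual analysis, is to take each VM's flow graph plus the VMM-level flow graph and join them wherever a VM-internal label is backed by a VMM-level resource; working out that boundary mapping is the system-specific part.

```python
# A minimal sketch of composing the layers: each VM contributes its own
# flow graph, the VMM (XSM/Flask) contributes flows between VMs and
# resources, and a boundary map says which VM-internal labels are backed
# by which VMM-level labels.  All structures here are illustrative.

def compose_layers(vm_flow_graphs, vmm_flows, boundary_map):
    """vm_flow_graphs: dict vm_name -> set of (src_label, dst_label) pairs
    vmm_flows: set of (entity, entity) pairs at the VMM layer
    boundary_map: dict mapping (vm_name, vm_label) -> vmm_label"""
    composed = set(vmm_flows)
    for vm, flows in vm_flow_graphs.items():
        # Qualify VM-internal labels by their VM so label spaces stay separate.
        composed |= {((vm, s), (vm, d)) for s, d in flows}
    for (vm, label), vmm_label in boundary_map.items():
        # Data crossing the VM boundary is governed by both layers.
        composed.add(((vm, label), vmm_label))
        composed.add((vmm_label, (vm, label)))
    return composed
```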
So obviously this is more complex. We have more policies, and bigger
policies. So each SELinux policy we're using right now is 35 megabytes of
source. So you thought Vista was big. Yes.
>>: [inaudible].
>> Trent Jaeger: The policy was written by hand, yes. Or no, no, no. I'm sorry.
The -- this is the macro expanded policy.
>>: Oh, okay.
>> Trent Jaeger: So there's a policy that's smaller than this that was written by
hand. It's still fairly substantial.
>>: [inaudible] policy [inaudible].
>> Trent Jaeger: I think we're still at megabytes. I'm not sure exactly. But it's still --
>>: How many [inaudible].
>> Trent Jaeger: People have been working on it for a long time.
>>: Okay. Thank you.
>> Trent Jaeger: And so basically when a program that's security relevant is
introduced to a Linux system, somebody comes up with a policy for that, and
then that becomes a policy module now, so we can extend the policy easily. And
these policies -- I mean, I've been working with SELinux for many
years, and I haven't said what I really like about it, but what I like about it is
they're trying to get their head around everything that's going on in the system.
And so from a measurability standpoint, you need to know every security-relevant
decision at some level. Hopefully you don't, you know, have to assess that
manually, but you treat it like an assembly language. Here's your assembly
language of what's happening from a security perspective. And hopefully we'll
come up with higher level tools to assess what's going on, so we can work at
this higher level.
And so they work on policy design at a higher level, but it's still a fairly low level
because it's per application, and I'm sure they reuse a lot of the rules from other
applications. So if you have a networking application, you'll reuse networking
rules from some other application. So there's a fair bit of reuse going on in that
respect. But the rules have to be cut and pasted.
>>: [inaudible].
>> Trent Jaeger: They have rules that are kind of like that, yes. So they have
networking rules where you say, okay, this is going to have network access, and
then that will generate a bunch of rules for having network access. And it will
generate the same rules -- not the same, but, you know, tweaked -- across
applications.
>>: Did that mean like [inaudible].
>> Trent Jaeger: Well, these are changes at the system level and with respect to
the labels. So if you add a new file, you may not have to change the policy.
Because you can use an existing label. And you say okay, well, this is a
configuration file. So you can say it's, you know, a configuration type for the
application. And so you'll have to, you know, make sure that it's put together in such a
way so that the package ensures that it's labeled correctly when it's installed. But
there are different levels of changes to your program that -- and many of which
are robust.
>>: Where -- if I add a new function in the program as a new modular program
and someone [inaudible] does --
>> Trent Jaeger: Within the program?
>>: [inaudible].
>> Trent Jaeger: Currently, yeah.
>>: And so -- and you just said -- that's part of saying okay, make sure there's a
policy and the [inaudible] all the reference monitor [inaudible] so that the policy
can be enforced correctly.
>> Trent Jaeger: Right. That's a little different level than what we have been
talking about. We have been talking about the system policy and changes to the
program that would affect the SELinux policy would be at the system resource
level. So it would be in terms of files and networking and this sort of thing.
But if you change the program so that you would need to change how the
program enforces security, that's -- that's within the program policy and the
program hook placements. And there aren't that many programs right now that
have hooks, which is why we're working on techniques to automate them. So we
have 34 programs that need to enforce security for MLS, many more for integrity,
and, you know, the X server people have been working on reference monitoring
in the X server for a while, and that's fairly mature. And then GConf and D-Bus
and Linux also have reference monitoring explicitly. But not -- I don't think --
there might be another program I'm missing, but there aren't many more. So this
is a very deliberate manual task for adding reference monitoring to programs
specific to those programs.
So those kinds of changes, yeah, will be less robust. And so, yeah, if those affect
the system, that becomes problematic too -- if by changing the program it
won't deploy on any systems anymore, then you have to change the system
policy, too, and then you have to change the VMM policy too.
Yeah, so we don't want to go there if we can help it.
Then the other issue is that the goals are ambiguous. We don't know what they
are in general. We came up with some for this other case for the trusted
program. We're going to have to come up with some for a system in general.
How are we going to do that?
So -- let's see. So how much longer?
>>: 10 minutes.
>> Trent Jaeger: 10 minutes. Okay.
>>: 15.
>> Trent Jaeger: So I'm going to talk about graph isomorphism a little bit,
subgraph isomorphism, to be specific. And I'm not going to claim that our
problem is NP-hard, but what I want to do is ensure we understand why the
problem is what it is. And we're going to leverage some knowledge from looking
at isomorphism and helping us to understand what the problem is in a general
sense.
So in subgraph isomorphism, you have a graph and you have another graph.
And the question is whether a subgraph of this graph has the same flows as
this graph.
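Just to make the intuition concrete, here is a rough sketch -- with hypothetical names, not our actual tool -- of the naive mapping search that subgraph isomorphism suggests: find an injective mapping from goal nodes onto policy nodes that preserves every goal flow.

```python
# A minimal sketch of the naive mapping search suggested by subgraph
# isomorphism. Graphs are adjacency dicts: node -> set of nodes it flows to.
# All names here are hypothetical illustrations.

def flows_preserved(goal, policy, mapping):
    """Every goal flow between already-mapped nodes must exist in the policy."""
    for src, dsts in goal.items():
        for dst in dsts:
            if src in mapping and dst in mapping:
                if mapping[dst] not in policy.get(mapping[src], set()):
                    return False
    return True

def find_mapping(goal, policy, mapping=None):
    """Return a goal-node -> policy-node mapping that preserves flows, or None."""
    if mapping is None:
        mapping = {}
    unmapped = [n for n in goal if n not in mapping]
    if not unmapped:
        return dict(mapping)                 # every goal node has been placed
    node = unmapped[0]
    for candidate in policy:
        if candidate in mapping.values():    # keep the mapping injective
            continue
        mapping[node] = candidate
        if flows_preserved(goal, policy, mapping):
            result = find_mapping(goal, policy, mapping)
            if result is not None:
                return result
        del mapping[node]
    return None

# Example: does the goal "low flows to high" embed in this tiny policy graph?
goal = {"low": {"high"}, "high": set()}
policy = {"user_t": {"sysadm_t"}, "sysadm_t": {"kernel_t"}, "kernel_t": set()}
print(find_mapping(goal, policy))            # e.g. {'low': 'user_t', 'high': 'sysadm_t'}
```

This backtracking search is exponential in general, which is exactly why we want the structure of the system to constrain the mappings rather than searching blindly.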
And what we -- this -- you know, intuitively corresponds to what we're doing at a
certain level in the sense that hey, we want to test whether this policy is okay, so
there has to be a subgraph in here that corresponds to this. And we have to
figure out what a mapping is. Now, it turns out the mapping problem isn't exactly
the same for reasons that I'll talk about later. But the problem is figuring out, hey,
what's the mapping between these. Yes?
>>: Try to understand. In your graph is every node the same, or do they have
different --
>> Trent Jaeger: Right, the nodes have different semantics. There are different
constraints on different nodes.
>>: Okay. So when you map, you have to [inaudible].
>> Trent Jaeger: Right, right. And so the notion of how systems are built will
advise us in how we can map this. And so we won't end up with a pure, you
know, subgraph isomorphism problem. But I just want to -- I find it useful to think
about it in terms of okay, we need to come up with a mapping, we need to
constrain the possible mappings. And we need to of course look at this -- come
up with principles that we can -- that are provable that will show why this -- why
our problem is not this problem.
And looking at it at this level will help us come up with those principles. So why
does it even start to look like this problem? It's that the goals -- you know, they may
be in terms of completely different labels than the policy. Of course we want goal
labels -- goal policies to be much simpler than the existing policy. So there are
going to be many, many more policy labels in an SELinux system than there are
in a -- in our goal. At least we would certainly hope so.
And the -- and what we have with this hierarchical notion is that one policy label
such as a label of a VM such as this domU label actually represents a set of
labels initially. And so we have layering going on here that we don't have
normally. But the layering may obscure what the mapping is between this --
this policy, the underlying labels, and the goal labels.
So the good news is, as Weidong was indicating, we don't have a completely
general problem. There are only certain mappings that are legal. The good
news is that that will help us constrain the problem. Although we don't yet
understand exactly how and why. And the bad news is that the constraints aren't
specified. So we have to sort of figure them out sort of like we did with the other
program.
So we have to mine some constraints, we have to figure out what it is that tells us
what the possible constraints are. We can determine compliance. And we're
going to refine the system. So I'll tell you. We're going to take a top down view.
There's a lot of work that's been done actually in policy compliance. I'm sort of
surprised I hadn't -- you know, I've worked in policy for a long time and written a
lot of policy papers. But I hadn't really come across work in policy compliance
until I dug into it a bit more. So a lot of people have tried to test policies. But
they do what you would expect to do as a researcher, they come up with the
specific mapping between them themselves manually and they test them.
And you know, we don't really want programmers to have to come up with these
mappings for these systems if we can avoid it, so, you know, we want to make
what they specify high level as well. The goals should be high level. Whatever
mapping hints are given should also be high level, not labels on individual
variables and programs or something like that.
So what we're going to try to do is utilize the VM system configuration to infer
some security goals, fairly high level goals, and we're going to see if we can push
those through all the way to the individual applications.
So we're going to use tamper proofing as a guide. And tamper proofing is good
because tamper proofing gives us a guideline that we can get from the
configuration automatically without much information. And if there are specific
constraints on applications on user VMs and how they communicate, those will
probably have to be application derived. We'll have to get some hints about that
from the application programmer, I think. But it hopefully can be fairly high level.
And then we're going to take a top down view. When I say top down -- if you're
thinking of the VMM at the bottom and then the others at the top -- what I mean
is starting from the VMM policy, which is coarser grained, so I think of it as big
and at the top, and working down through the OS to the individual applications,
which I consider at the bottom because they're finer grained. So what I mean is
from VMMs to applications. And the result is we don't need to integrate the
policies into a single information flow graph. So we don't have to take all the
SELinux policies, all the applications policies, all the VM policies and put them all
into one graph and then analyze it like a subgraph isomorphism problem. In a
general sense, we can work from the top down and deal with the finer grain
problems where we need to, where there's authority that governs what's going
on.
Okay. So I'm going to blast through these slides. So the point of this is we have
a formal model for expressing what our system is. We can infer goals on that
model. I'm just going to blow through these slides if you don't mind. If you have
questions later, by all means ask me, and you guys have the talk, too, if you want
to look at these slower.
We set up a set of possible mappings for the individual objects, called nodes. So we
start with VMs as nodes and we're going to refine down to whatever level of
granularity is necessary to determine whether things are safe.
So when we do this conservatively, we can figure out what the possible
data is at each individual place. We then have a way of testing compliance. So
basically what we're going to do is push the labels around the graph. So each
VM generates labels of its own data initially, and they're going to be pushed
around the graph based on the flows that are there. And we're going to assess --
the key thing is identifying whether a flow between a VM and another VM is safe,
or between a node and another node -- a set of principals is really what a node is.
Whether it's safe, whether it's unsafe, so it violates the goal, or whether it's
ambiguous and we can't tell yet. That is it may send multiple labels of data to the
receiver. The receiver may be unable to handle some of them. But we don't
know which flows are going where, and we'll have to look at it more carefully.
And so what we compute, basically, is the disclosure of labels as we push them around the graph.
We introduce flow constraints. So these are constraints to restrict what data can
be sent where because not all data can be sent everywhere. And when we find
ambiguous flows, we're going to decompose the nodes. So dom0, in SELinux --
in Xen, rather -- is a privileged node, and it receives lots of data with lots of
labels, and so we need to look at that more carefully to determine whether it's
secure.
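To make that concrete, here is a minimal sketch -- hypothetical names, not our actual tool -- of pushing labels around an information flow graph to a fixed point and then classifying a flow as safe, unsafe, or ambiguous:

```python
# A minimal sketch of label propagation over an information flow graph and
# of classifying flows against a goal. All names are hypothetical.

def propagate_labels(flows, initial_labels):
    """For each node, compute the set of labels that can reach it."""
    labels = {n: set(s) for n, s in initial_labels.items()}
    changed = True
    while changed:                            # iterate to a fixed point
        changed = False
        for src, dst in flows:
            labels.setdefault(src, set())
            before = len(labels.setdefault(dst, set()))
            labels[dst] |= labels[src]
            changed = changed or len(labels[dst]) > before
    return labels

def classify_flow(src, dst, labels, accepts, aggregates):
    """SAFE if the receiver can handle every label it may get; otherwise
    AMBIGUOUS when either endpoint stands for many principals (a bad label
    might not actually reach the sensitive part), else UNSAFE."""
    bad = labels[src] - accepts.get(dst, set())
    if not bad:
        return "SAFE"
    if src in aggregates or dst in aggregates:
        return "AMBIGUOUS"
    return "UNSAFE"

# Tiny example: dom0 aggregates many principals, so a flow carrying a label
# it is not cleared for comes back as ambiguous rather than flatly unsafe.
flows = [("domU_web", "dom0"), ("domU_db", "dom0")]
initial = {"domU_web": {"web_data"}, "domU_db": {"db_data"}, "dom0": set()}
labels = propagate_labels(flows, initial)
print(classify_flow("domU_db", "dom0", labels, {"dom0": {"web_data"}}, {"dom0"}))
```

The real distinction between unsafe and ambiguous depends on the goal and on whether the node is an equivalence class of principals, which is what the decomposition step below addresses.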
And so what we end up with is this kind of approach, where we start with an
information flow graph, we deduce and apply constraints on the flows and on the
individual mappings between the nodes or sets of principals. We check
compliance. If we find safe and unsafe flows, then -- if we find the whole system
is safe, then we're done. Everything is cool. If we find unsafe flows, we need to
fix them, resolve them. If we find ambiguous flows, then we need to look in more
detail. And so then we repeat from the beginning for the ambiguous principals.
And so we dig down deeper and deeper. Yes?
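That loop, roughly, looks like the following sketch, reusing the hypothetical propagate_labels and classify_flow helpers from the earlier sketch; the decompose step is a stand-in for cracking a node open into finer-grained principals.

```python
# A rough sketch of the top-down iterative refinement loop, reusing the
# hypothetical helpers from the previous sketch. `decompose` is a caller-
# supplied stand-in that splits the named ambiguous nodes into finer-grained
# principals and returns the refined graph.

def check_system(flows, initial_labels, accepts, aggregates, decompose):
    while True:
        labels = propagate_labels(flows, initial_labels)
        verdicts = {(s, d): classify_flow(s, d, labels, accepts, aggregates)
                    for s, d in flows}
        unsafe = [f for f, v in verdicts.items() if v == "UNSAFE"]
        ambiguous = [f for f, v in verdicts.items() if v == "AMBIGUOUS"]
        if unsafe:
            return "UNSAFE", unsafe          # these flows must be resolved
        if not ambiguous:
            return "SAFE", []                # the whole system is compliant
        # Crack open the ambiguous nodes and repeat on the refined graph.
        nodes = {n for f in ambiguous for n in f if n in aggregates}
        flows, initial_labels, accepts, aggregates = decompose(
            nodes, flows, initial_labels, accepts, aggregates)
```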
>>: [inaudible].
>> Trent Jaeger: Basically -- I'll talk about it in a second. So let me show you.
So the fact that this works top down is based on the fact that nodes really
represent equivalence classes of principals and that these principals all share an
upper bound of flows and they all share a set of possible mappings.
And if we find a safe solution at some level, then it's also safe for any
decomposition. And we're working out the formal proofs for that. We have a little
sketch here. And it's also the same for unsafe. If it's unsafe at a certain level, it's
also going to be unsafe below. And so we only need to look to low level where
we can determine safety or unsafety.
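One way to write down the safe direction of that claim -- my notation, not the talk's, and assuming decomposition only narrows a node's flows and label set -- is roughly:

```latex
% Safe-side monotonicity, assuming decomposition only narrows flows and labels.
\[
\forall n' \in \mathit{decomp}(n):\;
  \mathit{flows}(n') \subseteq \mathit{flows}(n) \;\wedge\; L(n') \subseteq L(n)
  \;\Longrightarrow\;
  \bigl(\mathrm{safe}(n) \Rightarrow \mathrm{safe}(n')\bigr)
\]
```

The unsafe direction needs a little more care -- the offending flow has to survive the decomposition -- which is part of what the formal proofs have to establish.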
For resolving issues, we'll have to look at the whole graph. Resolving is a much
harder problem. But I think I have a slide -- yeah, the next slide.
So we're finding that resolution at this level, because it's an abstract level in
terms of sets of principals and flows, it works out to, hey, you can -- you can deal
with the flows reducing what's been sent or changing what's been sent, or you
can, as you were asking, decompose the principles. I'll talk about -- I'll get to
your question on the next slide, I think. And so you have these two choices, and
the individual rules in the policy that affect these are what we're going to be
gathering and showing to the user for resolution. Hopefully we can order them in
some way, because there may be a lot in a SELinux policy.
But it looks like looking at things at a higher level may help with suggesting
resolutions, that we can search at a higher level and then just show them the
rules behind that.
Okay. Yeah. So what we have -- so this is the basically the breakdown. So we
start at the VMM level, we look at the XSM/Flask policy and the network policy
that describes what flows are possible between VMMs or VMs rather, I should
say.
And so this results -- and with the flow constraints we infer and the goals we infer
what ends up happening is that VMs with authority like dom0, other privileged
VMs such as a VTPM or server VMs because they may support multiple clients,
these have ambiguity. But individual user VMs that don't have security issues,
we can determine whether they're safe or unsafe right away. And so that's a
nice thing.
And so the fact that we're looking at the ones in detail for which detailed security
decisions are made makes sense, and so that's kind of a nice thing. But for
ambiguous flows, we have to break it open. So what we do is we break it open
at the next level by looking at the SELinux network permissions. So these are the
permissions that connect the OS labels to the network.
So the idea is that the mechanism, the algorithm that I showed you before,
doesn't really tell you what the decomposition is, and we have some policy for
describing it. And so what I'm talking about now is the policy. So we're saying,
okay, let's look at the OS labels that are connected to the network and see what
kind of things are going on there. And so what we found there is that there are
almost no constraints on network communication. So everybody can
communicate with everybody over the network. And so the question is, you
know, is that -- are there or should there be more cases, can we encourage the
definition of okay, I've got this VM and it's supposed to talk, you know, it's on a
cloud and it's on a cloud with some other VMs. And so it should only talk to
those specific VMs. And so we could leverage these firewall policies in order to
constrain the communication so that we could still prove safety in environments
like the cloud, where we can't really now, because now everything is communicating
with everything.
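As a toy illustration of that kind of flow constraint -- hypothetical names and policy format, not an actual Xen or SELinux interface -- a firewall-style "who may talk to whom" policy can be turned into edge constraints before the compliance check:

```python
# A toy illustration of deriving flow constraints from a firewall-style
# policy. The policy format and names are hypothetical.

# Which VMs each VM is supposed to talk to, e.g. within one cloud tenant,
# instead of letting every VM communicate with every other VM.
allowed_peers = {
    "web_vm": {"app_vm"},
    "app_vm": {"db_vm"},
    "db_vm": set(),
}

def constrain_flows(candidate_flows, allowed_peers):
    """Keep only the VM-to-VM network flows the firewall policy permits."""
    return [(src, dst) for src, dst in candidate_flows
            if dst in allowed_peers.get(src, set())]

# Without constraints the network policy allows everything-to-everything;
# with them, only the intended paths feed the information flow analysis.
all_flows = [(s, d) for s in allowed_peers for d in allowed_peers if s != d]
print(constrain_flows(all_flows, allowed_peers))
# -> [('web_vm', 'app_vm'), ('app_vm', 'db_vm')]
```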
And then for those network facing daemons that are just there, that Crispen
[phonetic] is so familiar with, that are there receiving information from anybody and
doing things on behalf of the whole Internet, you know, what we'd like to do is
codify in our system this confinement policy that you guys did in AppArmor as the
goal that we want and then see if we can, you know, leverage that to keep
refining the problem. So, you know, codify these sorts of things in an explicit
way.
If this doesn't work, then we're going to have to look at the OS layer and see
okay, you know, what's happening at the SELinux label level and dig into these a
little bit more. We're hoping -- I've done a fair bit of work on generating trusted
computing bases for SELinux policies, so the idea would be to come up with
a trusted computing base, and also we're thinking that in a VM sense, in the cloud, you
have a very specific application you're trying to deploy on the VM. So there's
specific stuff that you really care about, and we can utilize these -- the focal
points to also help us define, okay, this is what defines the integrity. As long as
this application is protected on its system, so all the things the application
depends on and the application itself maybe form a base that we're going to try to
protect. And then, you know, we're going to try to see if we can protect them
from these guys, for example. I'm sure this isn't enough constraints. But this is
sort of a starting point for generating these kind of constraints. And this is the
direction we're going down.
So our goal is that we'll have a top down iterative refinement, and we'll come up
with techniques for generating these mapping constraints and generating flow
constraints so that we can understand what's happening in the policy at large and
that the -- the solution will be, you know, provably efficient from a top down
perspective, not the full isomorphism because of the properties of the systems at
-- in our formal model.
>>: [inaudible].
>> Trent Jaeger: We're hoping so. Yeah. So right now we're just looking at one
box at a time. But so what we'd like is to say, okay, this VM is only talking to
other VMs that communicate at this label, and then something would have to
justify that that communication is satisfying some integrity criteria, you know,
through, you know, TPMs haven't really caught on but, you know, through some
kind of integrity verification or something along those lines.
>>: [inaudible].
>> Trent Jaeger: Yeah. You're thinking on the right path. That's where we're
trying to go down. Now, we haven't built this model yet, but that's the direction
we're working down. But, yeah, that's exactly the intuition that we're trying to
convey. So that's a good point.
Okay. So I think we're basically out of time. So we have a tool for assessing
this, and it shows you what flows are ambiguous and safe, and then you can
crack open the ambiguous ones and look at what kind of flows are going on
within VMs or within the labels that are ambiguous. And it will tell you what
principals correspond to those and so forth.
We have a system where when a VM image -- the idea is the VM image is
loaded and say some daemon that loads VMs receives it and it says well, you
know, I don't know whether this is compliant with my system so I'll ask the
management VM here if it's compliant, so we'll push the necessary information
up to the daemon that receives it. It will push it into the compliance checker. It
will check whether it's compliant, if it's not compliant then, you know, it may get
pushed into the tool to fix it, which obviously will then become manual. But if it's
all compliant, if everything's cool, then you should just be able to load the thing.
And so we're also studying, you know, what defines compliance in a robust sense.
Because we're finding, you know, SELinux, macro-expanded policies are big.
There really aren't a lot of differences among them. You know, just a few rules
are different generally, and some of them are the same.
So we should, once we prove that something is compliant, be able to prove that
other things are compliant, you know, if they are in fact compliant, without
involving the policy designers.
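A bare-bones sketch of that load-time flow -- hypothetical names, not the actual system's interfaces -- might look like this:

```python
# A bare-bones sketch of the load-time compliance check: the daemon that
# receives a VM image asks a compliance checker before launching it, and
# non-compliant images get handed off for (manual) resolution. All names
# are hypothetical stand-ins.

from collections import namedtuple

Verdict = namedtuple("Verdict", ["compliant", "problem_flows"])

def check_compliance(image_flows, goal_flows):
    """Compliant if every flow the image's policy allows is permitted by the goal."""
    problems = [f for f in image_flows if f not in goal_flows]
    return Verdict(compliant=not problems, problem_flows=problems)

def on_vm_image_received(name, image_flows, goal_flows):
    verdict = check_compliance(image_flows, goal_flows)
    if verdict.compliant:
        print(f"launching {name}")                 # everything checks out
    else:
        # These would be pushed into the resolution tool for a human to fix.
        print(f"rejecting {name}: {verdict.problem_flows}")

# Example: the goal only allows web -> app and app -> db flows.
goal = {("web_vm", "app_vm"), ("app_vm", "db_vm")}
on_vm_image_received("tenant-image-1", [("web_vm", "app_vm")], goal)
on_vm_image_received("tenant-image-2", [("web_vm", "db_vm")], goal)
```

The point made above is what would make this practical: once one macro-expanded policy has been shown compliant, most other images differ by only a few rules, so the check can be largely incremental.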
The only other issue I wanted to make you aware of I thought was kind of
interesting was, you know, the notion of completeness versus soundness. So we
want a system that, you know, that we want to assess where it fits on this
completeness soundness notion. So a complete system would indicate no false
negatives and would tell us that, you know, hey, if you deploy this way, it's going
to be secure. Whereas soundness tells us, okay, there are no false positives, so
it tells us that there aren't going to be any errors that are not real errors. And so
obviously -- if you're familiar with intrusion detection systems then you know
these are issues, especially the false positives come in to play.
Now, what we have right now in the real world is a completeness issue. We
don't really have all the constraints. We say okay, we'll turn on this bit and we'll
trust this program, but we don't really say why we trust it or, you know, as I've
been talking about, what details justify that trust. So we don't really have
constraints for that that should be satisfied before you would trust it. And so we
have an incompleteness problem, and we're trying to use this kind of view to help
generate a complete enough picture that you can say something about the
system. But we can't really guarantee that you're going to collect all the
constraints, you know.
If a program receives data and it's untrusted, do we have enough constraints on
how that program handles that data correctly? You know, we'd like to have a
very precise definition that says it's handling that data correctly, but we don't
have it as a community yet. So we're lacking there. So we're going to work
toward this, and we'll be able to use more constraints as people come up with
them, but we're not going to be able to guarantee completeness on our own.
So we are going to try to focus on soundness because, you know, in these
complex policies, if we start introducing false positives as we're looking
for errors, that is really going to be the death of this, so we really have to be very
conservative about this and look at sound constraints and only refine things if we
know for sure that this is a legal refinement. And so that's going to be a
challenge for us. Because the -- you know, the obvious thing would be to start,
you know, oh, it must be that one, oh, it wasn't that one, you know, and then
backtracking. But with this kind of approach, that leads to potential
computational issues and undoes some of the guarantees we're trying to make.
So we're definitely targeting sound constraints, and so far we've been okay there.
All right? So thanks for listening to the story here. Started from the first model of
the atom, I think, and worked our way around.
So basically we started with information flow. It's an idealization, but as far as
the security goal, it's still the most useful one I think we have. At least at a
general level. And we can apply, you know, other things as constraints relative
to this and be able to get our head around the problem. But we need to account
for application level enforcement, we can't just ignore it, we have to make it part
of the picture, we have to understand what applications are doing and whether
they are there satisfying -- helping us satisfy the security goals we have in mind
in order to really be able to say comprehensive things about security. And then
we'll have to address all of the layers once we -- you know, now that we have VM
systems. So the VMMs should help us. They should provide nice coarse grain
boundaries for VMs, and we don't have to depend on the operating system to
have everything, all of the data, and keep it all straight. But there's still going to
be applications that will have authority to do things that could break this system's
security goals, so we need to be able to do it all the way from the top to the
bottom.
So thanks. I'll take any further questions, if you have them.
[applause].
>>: I'm just curious [inaudible] composition and decomposition [inaudible] higher
levels of policy [inaudible] think in terms of these different players from [inaudible]
do you have [inaudible] tools to [inaudible] these different players and so
[inaudible] essentially given these different policies? I mean [inaudible] much
more bigger composition essentially talk about [inaudible].
>> Trent Jaeger: You bring up an interesting point. Composition, composition of
policies has been a problem people have worked on in the threat community, but
they didn't -- they found that composition didn't necessarily lead to security, and
so we have to be careful about how we compose and decompose things I think.
But --
>>: [inaudible].
>> Trent Jaeger: Yeah, eventually we are going to add the network underneath, and
so then you could have, you know, multiple machines. And I guess in this case
you wouldn't want to look at the whole Internet, but you'd want to look at the
machines that are connected at least with respect to the VMs that you're
interested in. And so we should be able to add another layer that says, okay,
you can have flows from this machine to that machine or this subset of VMs on
this machine to that subset of VMs on that machine, and, you know, are we
achieving the security goals we have in mind?
So I think we should be able to --
>>: [inaudible].
>> Trent Jaeger: Yeah. Well, the problem is going the other way, I think, is
getting enough constraints going up the stack. We seem to have -- I mean
network policies are a bit loose, but we seem to have more people doing network
policies and more experience with network policies in terms of fire walls than we
do with application policies and how applications handle data correctly. And so --
so we haven't tried to get our whole head around it yet as a community. And,
you know, each of the layers is sort of handled by different people with different
expertise, so bringing them together is what we're hoping for. But I think the
harder problems will be higher at the application layer.
So we should be able to do the networking. Knock on wood. Yes?
>>: So [inaudible] and then hope for the best sort of, is that, you know, these
components that are going to ensure the -- you know, access control, they will do
their job right in the future. So is that when a VM is started, you do checks and
say okay, is this compliant, then you let it go. So like it's the load time versus
runtime kind of integrity [inaudible].
>> Trent Jaeger: Yeah, we are working on the enforcement and where the
placement of the reference monitor and that sort of thing. I didn't talk about it here.
So that -- so that would affect --
>>: [inaudible].
>> Trent Jaeger: And clearly, as much as you can prove at load time or at
compile time, you know, the better off you will be in terms of work. So you'd like
to minimize the runtime checking. But I'm sure there will still be runtime
checking.
So we're working with some people at Georgia Tech who do VM introspection
and trying to see how that will integrate into this sort of thing, to look at the
runtime behavior of the system. So they've been looking at it for intrusion
detection. And we're trying to look at, you know, proving these kinds of things, you
know, for the cases where we're not sure -- can we get runtime checks for those? We
haven't done a whole lot on that front yet. But we're just gathering up information
for them now.
>> Weidong Cui: All right. Thank you very much.
>> Trent Jaeger: Okay. Thanks.
[applause]