>> Aman Kansal: Hello, everyone. It's my pleasure to welcome Nupur Kothari from University
of Southern California. She'll be talking about simplifying the task of building robust and
efficient networked systems. Her research interests are in systems and networking. And she has
also interned here previously. If you're familiar with the Joulemeter project, she
worked on that, and she's collaborating with some of you on other projects as well.
So over to Nupur.
>> Nupur Kothari: Thanks, Aman. I'm really excited to be here and talk about my contributions
to simplifying the task of building robust and efficient networked systems. So let's just
get started.
It's a well-known fact that networked systems can be quite complex. Their functionality is
distributed over multiple nodes and they also have to deal with lots of issues like heterogeneity,
network dynamics and failures, as well as application requirements like privacy, et cetera.
Therefore, building such robust and efficient networked systems can be quite hard.
In addition, with the emergence of new domains like datacenters, smartphones and sensor
networks, while these domains have brought in new applications they've also brought in new
requirements, which makes this task of building such networked systems only more challenging.
So the goal of my research is to develop practical tools to simplify the task of programming
robust and efficient networked systems.
So I -- in my research I've addressed various programming challenges faced by networked
systems developers. Let's first take a look at some of the challenges that I've addressed.
So one of the most important programming challenges that a programmer faces while building a
network system is that its functionality needs to be distributed over multiple nodes.
For example, when a programmer is writing a smartphone application, she may have to try and
partition the functionality of this application over the phone and this cluster of servers in order to
improve efficiency.
Not only does the programmer have to partition this application, she then also has to worry about
requirements like communication, coordination, et cetera.
These and other application and domain-specific requirements form another challenge for
networked systems programmers.
Here are some examples of various requirements that a programmer may be expected to handle.
One important requirement is that of energy efficiency, which is crucial in the domains of sensor
networks, smartphones, and datacenters.
Also a lot of times there may be a disparity between the functionality that the programmer had
intended and the actual functionality of the code that they wrote.
So, for example, the programmer may think of the functionality of a piece of code in terms of a
very high-level representation, while the code that they write may be
very low level; hence it may be difficult to try and detect such gaps between intent and actual
functionality.
Lastly, there may be potential vulnerabilities left behind in networked systems code. These
vulnerabilities can be really dangerous because then in a networked system, if you have
malicious nodes, these nodes can exploit these vulnerabilities, not only to improve their own
performance but also to do worse things like maybe crash other nodes or crash the entire system,
et cetera.
In my research, I've addressed these four challenges for various domains. My approach to
addressing these programming challenges has been to develop and adapt program analysis
techniques as well as build networked system infrastructures for specific domains. And I've
done so by leveraging domain knowledge from these domains.
Most of my prior work has been in the area of sensor networks. I've built Pleiades, which is a
programming framework to develop sensor network applications. FSMGen is a tool to extract
high-level functionality from sensor network programs. And Hermes is a runtime interposition
framework that allows the programmer to inspect the program state at a deployed sensor
node.
I've taken the lessons that I've learned from building all these systems for sensor networks and
am currently in the process of applying them to build solutions for other domains.
I'm working on GuFi, which is a tool to detect vulnerabilities in code. And I'm also working on
MobiProg, which is a programming framework for cloud-enabled cell phone applications.
So first I'm going to talk mainly about my prior work in the domain of sensor networks. I'm
going to focus here on the work on Pleiades and FSMGen. After that I'm going to spend some
time discussing my work, GuFi, which is -- which detects vulnerabilities in network protocol
implementations.
So let's start off with sensor networks.
So what are sensor networks? Sensor networks are made up of devices which are small and
battery powered, and generally we refer to them as motes. They sense the environment around
them using various kinds of sensors, and they also talk to each other wirelessly.
One important property that most sensor network devices share is that they are constrained in
terms of memory, computational power, energy, as well as communication bandwidth.
There are a variety of rich applications of sensor networks, ranging from volcanic activity
monitoring to microhabitat monitoring to aquatic sensing and datacenter monitoring.
In the case of sensor networks, unfortunately the resource-constrained nature of most sensor
network devices makes these programming challenges that I described earlier even harder to
address. Firstly, most sensor network languages and operating systems promote an event-driven
execution model.
So one example of a sensor networking language and operating system is the nesC/TinyOS
framework, which is also event driven.
Because of this event-driven model, what happens is that the code for these sensor network
applications becomes more complicated, which makes these challenges harder to address.
Also, debugging is really hard in sensor networks. Because of the resource-constrained nature of
these devices, the regular mechanisms for debugging, like maybe logging, et cetera, become too
resource intensive, hence people have to come up with very low-level ways of debugging sensor
networks.
Testing before deploying is not good enough for sensor networks because sensor networks have
deep interaction with the environment and, hence, many bugs are not exposed while
testing before deployment in a controlled environment.
So -- yeah.
>> [inaudible] sensor networks special compared to, say, other embedded systems? These
problems that you're alluding to, aren't they common to all embedded systems?
>> Nupur Kothari: Yes. I think most of these problems are common to most embedded systems.
I think the one thing here for sensor networks of course is that most of them communicate with
each other wirelessly. And that adds another dimension, because now communication is also a
big constraint.
So I've tried to address these challenges in my work. I've built Pleiades, which is a
programming framework that distributes functionality over multiple nodes and that also handles
many application- and domain-specific requirements.
Secondly, I've built FSMGen, which is a tool to extract high-level finite state machines from
TinyOS programs. These finite state machines can help in detecting potential vulnerabilities in
the code and can also help in determining what the differences are between actual functionality
versus intended functionality.
First I'm going to talk about Pleiades. So the conventional style of sensor network
programming has been that programmers write node-level programs. NesC is one of the most
popular languages used in sensor networks. So these node-level programs are written from the
point of view of a single node. They are then compiled into a mote binary and then installed
over all the nodes.
In Pleiades what we propose is a centralized approach to programming where the programmer
writes a central program which can address the entire network as a single entity. This program
then only needs to specify the application behavior. It is compiled using the Pleiades compiler to a
node-level program which, along with the Pleiades runtime, is then compiled to a mote binary.
The advantage of this approach is twofold. Firstly, because we are specifying a centralized
program, the programmer doesn't have to worry about distributing functionality over multiple
nodes, and this is handled by Pleiades.
Secondly, the various domain-specific requirements of energy efficiency, concurrency,
consistency and failure recovery are offloaded to the compiler and runtime. So, again, the
programmer doesn't have to worry about handling these requirements.
So at the time we built Pleiades there were a few other sensor networking languages out there
which were trying to do similar things. However, Pleiades is different in that most of the previous
languages had a very restricted programming model or did not worry about requirements like
consistency and failure recovery, et cetera.
After Pleiades a few other languages have come out which tried to -- which have similar goals to
Pleiades. However, these promote nontraditional programming models like data [inaudible]
programming or functional programming or declarative programming, while Pleiades promotes a
very simple, imperative, C-style programming model.
So let's take a quick example of what a Pleiades program can look like. So this is a simple
example in Pleiades which attempts to compute the maximum temperature over a network of
nodes. As is obvious from the code, Pleiades is a simple dialect of C. Just by looking at this
example, although, admittedly, this is a toy example, we can see that it's -- the readability is
definitely improved by programming in a centralized fashion. And this is a piece of code that
someone sitting at a desktop would have written.
Now, what Pleiades provides is the concept of a node-local variable, where by
annotating a variable as node-local the programmer effectively says that one instance of this
variable exists at every node in the network.
All the other variables are known as centralized variables where only a single instance of such a
variable exists across the network. Pleiades also provides an @ operator which allows one to
access node-local variables from various nodes. So here temp@n basically means give me the
value of the variable temp at the node n.
In order to obtain efficiency, Pleiades allows for the programmer to specify concurrency using a
cfor primitive. A cfor is basically a concurrent-for loop. Yeah.
>> I'm confused. So all [inaudible] procedure local variable, I would have [inaudible].
>> Nupur Kothari: It is a procedure local variable, but it is a central variable in the sense that
only one copy of this variable exists across the entire network. So basically what that means is if
this procedure were to migrate from -- to be migrated from node to node, then this variable
would -- yes, exactly.
>> So it's not like there'll be multiple copies of main running on different nodes.
>> Nupur Kothari: No.
>> It's not like that.
>> Nupur Kothari: No.
>> I see, I see. Okay.
>> Nupur Kothari: So it's a central program. You only have one program executing. And the
way we get concurrency is by using this concurrent-for primitive. Yeah.
>> What's the -- when you have [inaudible].
>> Nupur Kothari: What do we have?
>> [inaudible]
>> Nupur Kothari: Oh, I'm sorry. It should have been N equal to null. Sorry. That's a small
edit.
>> And this get available nodes, does it have some semantic meaning? Is it a library function
that's --
>> Nupur Kothari: Yes, it's a library --
>> It's a function of a node, right?
>> Nupur Kothari: Exactly. So Pleiades also provides a notion of a node where node is a
network node. And get available nodes is a runtime -- it's a functionality provided by the
runtime, where the runtime will figure out what the available nodes are and provide a list of
those nodes.
Okay. So the property that this concurrent-for loop guarantees is serializability.
>> [inaudible] where does this mainly execute? Does it execute on every node? Does it execute
on just one node at a time?
>> Nupur Kothari: I'll come to that. But basically what it does is you move -- you move the
execution of this main function. But that's just the next thing I'm going to talk about.
Okay. So a concurrent-for loop basically means that this for-loop corresponds to some
sequential execution of the loop iterations, although this execution may not necessarily be in
order.
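The max-temperature example discussed above might look roughly like this. This is an illustrative sketch in the Pleiades-style C dialect described in the talk, not the actual code from the slide; get_available_nodes and the @ operator come from the discussion, while next_node and the exact cfor syntax are my assumptions:

```
// Pleiades-style sketch: compute the max temperature over the network.
// 'nodelocal' marks per-node state; '@' reads it from a specific node.
nodelocal int temp;            // one instance exists at every node

void main() {
    int max = INT_MIN;         // central variable: one copy network-wide
    node n = NULL;
    cfor (n = get_available_nodes(); n != NULL; n = next_node(n)) {
        // each iteration may run at a different node; the runtime
        // guarantees the iterations appear to execute serially
        if (temp@n > max)
            max = temp@n;      // writes to 'max' are serialized
    }
}
```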
So now that we've seen what a central program looks like, the main questions to ask are how do
the compiler and runtime actually efficiently execute the centralized program over a network of
nodes, and, secondly, how do we handle concurrent-fors while guaranteeing serializability.
So I think this is the question which --
>> [inaudible] semantics of concurrent-for is that each iteration of the loop executes atomically?
>> Nupur Kothari: It looks like it has executed atomically.
>> Even in the presence of -- even if the different iterations are accessing shared variables
[inaudible]?
>> Nupur Kothari: Yes. Yes. So it's a -- I know it's a very -- it's strict consistency that we're
providing. Yeah.
>> Okay.
>> All right. I'm wondering how big can a -- maybe I'm jumping ahead 50 slides here, but how
big can a network scale? It seems like there has to be some -- since you're programming in a
centralized style, it seems like the network would have to be small enough to be able to apply
centralized constraints, let's say do centralized coordination to guarantee some of these
constraints. I'm wondering how big can the network scale before this style starts to break down.
>> Nupur Kothari: That's a good question. So I think -- although we are programming in a
centralized fashion, we have -- with the concept of concurrent-for loops we can actually run
pieces of code concurrently on each node.
So as far as that is concerned, you know, scalability is not an issue. The problem comes when
we are trying to guarantee serializability. There, yes, it is possible that beyond a certain scale it
may be hard to guarantee serializability, but then we can come up with other relaxed consistency
mechanisms that can be guaranteed.
>> I guess it seems like if you relax the consistency mechanisms, it might change a lot
of the semantics of your language, though. In other words, if the language says, let's say, that the
temperature acquisition is happening in rounds, then what we want to do is have everybody in
the network sample for one round and then do the comparison and then do the next round. Once
the network gets let's say multihop, then just being able to guarantee that these rounds are
synchronous then imposes a huge overhead.
And so I'm wondering -- let me ask you a different question. How big have you actually
implemented these networks yet?
>> Nupur Kothari: We've done up to 50 nodes.
>> Okay. And all single hop or multihop?
>> Nupur Kothari: Multihop.
Okay. So going back to the challenges faced by the compiler and runtime. The first challenge is
how do we efficiently execute a centralized program over a network of nodes, and the second
challenge is how do we handle or provide concurrency for cfors while guaranteeing
serializability. Lastly, how do we detect and recover from failures.
So I'm going to talk in this -- right now I'm going to talk about how the compiler and runtime
handled the first two challenges. First let's see how is a centralized program executed in an
efficient fashion over a network of nodes.
So here I take another even simpler example which only consists of a sequence of instructions.
And let's try to see how the Pleiades compiler and runtime would execute this program.
Now, there are two extreme possibilities here. The first possibility is we migrate -- we move the
data to the computation. That is, we execute this program at a central location, and then we
move the data to that location.
The other possibility is we move the computation to the data. That is, every time we access a
new variable residing at a different node, we migrate the flow of control to that node. Both of
these strategies will result in high communication overhead and latency.
So the approach we adopt is actually a mixture of both which -- in which we say that we do
control flow migration as well as data movement.
So what the Pleiades compiler does is that it statically partitions the code into smaller pieces of
code called nodecuts. The property that these nodecuts satisfy is that the location of variables
accessed within a nodecut should be known beforehand.
Now, the reason why these nodecuts have this property is that before a nodecut executes,
the runtime knows the location of all the variables within the nodecut, so it can try to
position the nodecut at an appropriate node to improve energy efficiency and also prefetch all
the variables required by the nodecut to improve latency.
>> Do you use a subset of C which allows [inaudible]?
>> Nupur Kothari: We -- yeah, we don't -- we don't really deal with [inaudible].
>> You don't use pointers, right? Okay. That's fine.
>> Nupur Kothari: So -- yeah.
>> So what subset of the variables can you actually [inaudible]? Because you [inaudible] at any
given time any node might -- might want to access environment, right? So for what fraction of
the variables or for what fraction of time for each variable can you actually assume that you
[inaudible]?
>> Nupur Kothari: So the location of the variable -- so because we are writing a centralized
program, it looks very sequential in nature, right? So I'm talking about the sequential parts.
And the way we partition the code into nodecuts is: for a
nodecut, all the variables which are accessed within that nodecut, we should know where they
reside beforehand.
Because -- so here, for example, you can see val@n3, right? So what that means is that I should
know what the value of the variable n3 is before I enter the nodecut so that I know where this particular variable resides.
Because the assumption is that these variables are tied down to the nodes at which -- to which
they belong.
>> But the same variables should also be accessed by different nodes, right [inaudible] is owned
by a single node?
>> Nupur Kothari: Node-local variables, yes. The centralized variables we don't worry so much
about the location of because normally they will reside with the [inaudible] control. Yeah.
>> Should you also take the size of the variables into account when doing [inaudible]? Or, I
mean, if you were to end up moving [inaudible] data [inaudible] --
>> Nupur Kothari: That's a good point. Yeah, we don't do that for now. We assume that all the
variables are of the same size. But that's a good point. Yeah.
>> How do you handle failures?
>> Nupur Kothari: So I just said I'm not going to talk about that --
>> Oh, sorry.
>> Nupur Kothari: That's fine. But -- I'm sorry. I'm kidding. But basically what -- the way we
handle failures is we implement some -- a [inaudible] mechanism.
>> Okay.
>> Nupur Kothari: So whenever we detect a failure, we try to [inaudible] the computation that
existed.
>> Okay.
>> Nupur Kothari: Okay. So moving on. So this is how we -- the compiler basically partitions
the code into these nodecuts.
Now, the runtime, when deciding to execute each nodecut, tries to figure out what is a good node
to run the nodecut at, so that the data is close by and the cost of moving the
computation to that node is not too high.
So in this example it can be node n1. So the runtime migrates the flow of control to node
n1. Before executing the nodecut, it also fetches -- prefetches variables from nearby nodes that
are required for the execution of the nodecut.
And then once it's done executing the nodecut, it then decides where to execute the next nodecut
and so on and so forth.
The two questions here are, firstly, how does the compiler partition this code into nodecuts, and,
secondly, how does the runtime know where to execute each nodecut in order to improve
efficiency.
So let's first see how the compiler generates nodecuts. So this is the control flow graph for the
simple max example that I talked about before.
Now, what the compiler does is it uses the property that the location of the variables accessed
within a nodecut should be known beforehand. So, for example, in the CFG node 4 we see that
the variable temp is accessed at the network node n, while in CFG node 2 we see that this
network node n is actually defined or given a value.
So in order for us to be able to figure out where temp resides in CFG node 4, nodes 2 and 4 cannot be
in the same nodecut. So the compiler introduces a split between nodes 2 and 4.
So using this strategy and a couple of other heuristics that the compiler uses to maximize
concurrency, what we try to do is we developed an algorithm to minimize the number of edges
cutting across nodecuts.
Intuitively, we want to minimize the number of edges cutting across nodecuts because
each edge effectively corresponds to a migration of the flow of control, and each such
migration incurs communication cost.
So for this example the Pleiades compiler generates the following nodecuts. Now that we
know how the compiler generates the nodecuts, let's see how the runtime actually figures out
where to run each nodecut.
So like I mentioned before, the runtime finds -- tries to find the lowest communication cost node
to execute the nodecut. For this, again, it used -- utilizes the property that the location of
variables is known beforehand in a nodecut and assuming that it has topological information, it
can actually try and figure out exactly where to run the nodecut in order to minimize
communication cost.
And this communication cost consists not only of the cost of prefetching variables but also the
cost of migrating the flow of control.
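The placement decision described here can be sketched in a few lines. This is a minimal Python illustration assuming a simple hop-count cost model, not the actual Pleiades runtime logic; the function and variable names are invented:

```python
# Pick the node with the lowest total communication cost for a nodecut:
# the cost of migrating control there plus the cost of prefetching each
# required variable from the node where it resides.
def pick_node(current, var_locations, hops):
    """hops[(a, b)] is the hop distance between nodes a and b."""
    def cost(candidate):
        migrate = hops[(current, candidate)]
        prefetch = sum(hops[(loc, candidate)] for loc in var_locations)
        return migrate + prefetch

    candidates = {current} | set(var_locations)
    return min(candidates, key=cost)

# Toy 3-node line topology: n1 -- n2 -- n3
hops = {}
for a in ("n1", "n2", "n3"):
    for b in ("n1", "n2", "n3"):
        hops[(a, b)] = abs(int(a[1]) - int(b[1]))

# The nodecut needs two variables living on n1; control is currently at n3.
best = pick_node("n3", ["n1", "n1"], hops)
print(best)  # n1: prefetching twice from n1 outweighs migrating there once
```

The same cost function covers both extremes from the talk: pure data movement (never migrate) and pure control migration fall out as special cases of minimizing this sum.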
So next let's see how does Pleiades handle cfors and guarantee serializability.
>> Sorry. It seems like the topology needs to come into account there somehow.
>> Nupur Kothari: Yeah.
>> Are you going to talk about that?
>> Nupur Kothari: I was not going to talk about it. But basically so we assume either that you
have a topology or we don't even need a topology, we just need some cost metric by which we
can say that executing something at this node is actually costlier than executing the
nodecut at some other node.
So we don't really need like the entire network topology, but just knowing about its neighbors and
nearby nodes should be good enough.
>> So that's -- that's not at runtime. I'm assuming not.
>> Nupur Kothari: That's not at runtime, yes.
>> And so I guess at runtime it then has, let's say, a path discovery mechanism or something and it's
updating its costs based on how far away the nodes are?
>> Nupur Kothari: Yes. Yes. So for -- I mean, for the experiments that we conducted, we
actually used static topology. And so for that, of course, the answer is easy. But if you wanted
to do it for a dynamic topology, then you would have to have a path updating mechanism by
which you figure out how many hops away each node was, et cetera, and what the cost was.
So -- okay. So talking about serializability. So serializability is a concept that's often used in
databases. And in our scenario what it means is that the outcome of a cfor execution should be
equal to the outcome of its iterations executed sequentially, although not necessarily in the same
order.
So the challenge here is how do we guarantee serializability while still providing some
concurrency. And our approach to do this has been inspired by the strong strict two phase
locking approach. Yeah.
>> And you assume that you don't have nested cfors?
>> Nupur Kothari: We do assume we have -- we can have nested cfors.
>> You cannot.
>> Nupur Kothari: We can.
>> You can. And what is the semantics in that case?
>> Nupur Kothari: So the semantics in that case is that each level of nesting is serializable. And
then for -- yeah. So each level of nesting is serializable with each other.
>> So it's like -- so if you -- let's say the algorithm calls [inaudible] iteration, then you have
[inaudible] iterations and you are going to get [inaudible] serializability of those M times N
blocks.
>> Nupur Kothari: No. We're going to get the serializability of the N iterations within the
[inaudible] cfor loop. And then once those have been -- those have completed, then we are going
to get the serializability of the outer --
>> So it's like a [inaudible], I guess --
>> Nupur Kothari: Yes. Exactly.
>> Okay.
>> Nupur Kothari: Exactly. Yeah. And so the way we do that is by using the strong strict two-phase
locking approach where we can have multiple readers or a single writer.
Also, in order to be able to deal with nested cfors, we developed a hierarchical
version of this approach.
What this approach buys us is, firstly, only write operations to variables require serial access. So
if cfor iterations are only accessing -- are only reading variables, then they can proceed
concurrently.
This approach also guarantees us the property of strictness, which allows us to build an efficient
deadlock detection and recovery mechanism.
Also, all of this locking, deadlock detection, and recovery code has been generated completely by
the compiler and does not require any input from the programmer.
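The multiple-readers/single-writer discipline, with locks held until an iteration completes (strictness), can be illustrated with a small lock-table sketch. This is plain illustrative Python, not the nesC code the Pleiades compiler generates; all names here are invented:

```python
# Strict two-phase locking sketch: each cfor iteration acquires locks as it
# touches variables and releases them all only when it finishes (strictness).
class LockTable:
    def __init__(self):
        self.readers = {}   # var -> set of iteration ids holding read locks
        self.writer = {}    # var -> iteration id holding the write lock

    def read_lock(self, var, it):
        # Readers may share a variable unless another iteration is writing it.
        if self.writer.get(var) not in (None, it):
            return False
        self.readers.setdefault(var, set()).add(it)
        return True

    def write_lock(self, var, it):
        # A writer needs exclusive access: no other readers or writer.
        if self.writer.get(var) not in (None, it):
            return False
        if self.readers.get(var, set()) - {it}:
            return False
        self.writer[var] = it
        return True

    def release_all(self, it):
        # Strictness: everything is released at once, at iteration end.
        for holders in self.readers.values():
            holders.discard(it)
        self.writer = {v: i for v, i in self.writer.items() if i != it}

locks = LockTable()
assert locks.read_lock("temp", 1)       # iteration 1 reads temp
assert locks.read_lock("temp", 2)       # iteration 2 may read it concurrently
assert not locks.write_lock("temp", 3)  # but iteration 3 cannot write it yet
locks.release_all(1)
locks.release_all(2)
assert locks.write_lock("temp", 3)      # now the write lock is granted
```

A request that returns False corresponds to an iteration blocking, which is where the deadlock detection and recovery mentioned above would kick in.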
So moving on to the implementation. We've implemented the Pleiades compiler as an extension
to CIL, and we've implemented our nodecut algorithm over the CIL dataflow analysis framework.
The Pleiades compiler generates nesC code, and we've evaluated it on TelosB motes.
We've implemented a couple of different applications using Pleiades, two of them being street
parking and pursuit-evasion. So next I'll talk about the performance of Pleiades with respect to
these applications.
So street parking is a simple application where we envision that a sensor node is placed at each
parking spot on a street and it basically maintains a record of whether the parking spot is empty
or full. So now when a new car comes in, it can ask the closest sensor in order to find it an
empty parking spot, and then the sensor network will find an empty parking spot and direct the
car towards that.
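In the Pleiades style described earlier, the core of the street-parking logic might be sketched as follows. This is illustrative pseudocode only; the real application code is not shown in the talk, and names like occupied and assign_spot are my assumptions:

```
nodelocal int occupied;          // 1 if this spot's sensor sees a car
node assign_spot() {
    node n, best = NULL;
    cfor (n = get_available_nodes(); n != NULL; n = next_node(n)) {
        // Without serializable iterations, two concurrent requests could
        // both see the same spot as free and claim it -- the incorrect
        // behavior observed in the no-locking run below.
        if (occupied@n == 0 && best == NULL) {
            occupied@n = 1;      // reserve the spot
            best = n;
        }
    }
    return best;                 // direct the car to this spot
}
```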
So here we've implemented a street parking application over Pleiades. On the X axis we have
the request ID of the various cars coming in and asking for new parking spots. And on the Y
axis we have the latency of satisfying each request.
So we implemented a basic version of the street parking application with Pleiades runtime,
which had no locking. And for this we found that the application does not execute correctly and
the last few requests cannot be satisfied if we don't have locking.
On the other hand, for Pleiades with locking, we found that the application executes correctly
and is able to satisfy all the requests.
Now, we also implemented another version of the street parking application with induced
deadlock. So we induced a deadlock in the application. And we ran it for a version of the
Pleiades runtime, which did not have deadlock detection and recovery. In that case we found
that after the first request itself the application got deadlocked and then crashed. Yeah.
>> [inaudible]
>> Nupur Kothari: Yes. Basically what happens is that two spots are assigned to the same car.
And so what happens is then, you know, at the end when new cars come in, all the spots show
that they're busy when they really are not.
So we also -- so we tested this version of the street parking application with the induced
deadlocks for Pleiades with deadlock detection and recovery. And I forgot to mention that in the
case of correct execution with locking versus the case of incorrect execution without locking, the
difference in latencies is not much, leading us to believe that the locking overhead is small.
So moving on to the case with induced deadlocks, we found that we actually -- the application
performs correctly, managing to satisfy each request. Again, in this case, the difference
between Pleiades without deadlock detection and recovery and Pleiades with deadlock detection
and recovery is not much, hence the deadlock overhead is also small compared to the benefits it
gives us.
An interesting thing to note here is that the latency is a little high to satisfy each request. And
that's actually an implementation artifact, not really a result of the design of Pleiades. It's
basically because we were using a very unoptimized transport layer, and if we replace that
transport layer by a better one, then these numbers should improve. Yeah.
>> So let's say I program this application using my [inaudible] programming model which is,
say, nesC. Then at some level all the complexity you're hiding inside your abstraction would
have to be dealt with by the programmer, right?
>> Nupur Kothari: Yes.
>> But in that case most likely the program -- if the programmer managed to construct a correct
program, it could be more efficient.
>> Nupur Kothari: Yes.
>> Right? So did you actually do a performance comparison? Meaning, what is the hit you are
getting in performance because you move to this higher level of abstraction?
>> Nupur Kothari: So the problem is that for street parking we actually found it very difficult to
write a nesC program. Because the amount of consistency that is required is very high. So
we've done this comparison instead for a different application, which I'll talk about
next, which is pursuit-evasion.
So pursuit-evasion is an application where we have two robots, one is a pursuer and the other is
an evader. And the sensor -- the goal of the sensor network is to try and locate the evader so that
the pursuer can then catch the evader.
And we implemented a version of this in Pleiades, and we compared it to a hand coded version
in nesC that others have written, not us, but someone else.
And what we found was that, yeah, in lines of code Pleiades obviously does really well, but,
surprisingly, for the other metrics as well on which we compared it, Pleiades does
well. And it's respectable. Latency is a little high, but, then again, that's due to the unoptimized
transport layer that we're using. So that's that. But for everything else, Pleiades does
respectably well.
And so we found that Pleiades can actually successfully reduce code complexity while providing
acceptable performance and efficiency.
Also, all these requirements of consistency, synchronization, and failure recovery are handled by
Pleiades, and the programmer doesn't need to intervene in those.
So now that I've talked about Pleiades -- yeah.
>> What about power?
>> Nupur Kothari: What about it? Power?
>> Yeah. Go back a slide.
>> Nupur Kothari: Okay.
>> I mean, power is the dominant issue on --
>> Nupur Kothari: Yeah, so we try to do this in an energy-efficient fashion.
>> What?
>> Nupur Kothari: We do try do this in an energy-efficient fashion.
>> Well, I guess my question is so what's the difference --
>> Nupur Kothari: The comparison?
>> Yeah. I mean, this guy just looked at the message overhead and said [inaudible].
>> Nupur Kothari: Yeah, yeah.
>> [inaudible] so it's --
>> Nupur Kothari: Yeah, I think message overhead is probably a good approximation.
>> So you anticipate it will be 50 percent more power?
>> Nupur Kothari: Yeah.
>> So you mentioned failure recovery. I'm sorry to go back to this, but so this is the rollback
mechanism that you were talking about before?
>> Nupur Kothari: Yeah.
>> I guess I -- I wanted to push a little bit more on that, because it seems like failure is such a
common case.
>> Nupur Kothari: Right.
>> And the semantics are so different across applications it's not something that you can just sort
of push under the rug. For example, if a node just dies, then you can't execute that loop anymore
because you just won't be able to protect it. So do you just forget about it and then when it goes
back in do you deal with it again? I mean, it seems like failure --
>> Nupur Kothari: Right. So we have two things here. The first thing is the node executing an
iteration, cfor iteration dies. In that case what happens is that this is detected and the
computation of that node is rolled back and executed at another node.
The other possibility is the node where some data is residing dies, which is worse because now
we don't have access to that data. In that case we just have to declare that iteration, that
particular iteration to have failed. Because there's no way to recover from such a failure.
>> So if a node dies, does that mean the system makes no further progress?
>> Nupur Kothari: So if a node dies and it doesn't have any data that is required by the system,
the system continues to make progress. In case the node dies and it has some data that is
required, then what happens is that particular iteration does not complete.
And so the cfor realizes that one of the iterations has failed because of a node dying, and so it
decides whether it wants to move on and continue making progress or to declare failure at that
point.
>> I get -- you showed a loop in your toy example. It said we want to compute the maximum
temperature.
>> Nupur Kothari: Right.
>> So we collect temperature from every node and then find the max. So if one of the nodes
dies, it will never succeed in -- every single time you go through that loop is it still going to try to
get data from that node and then fail and then the loop fails?
>> Nupur Kothari: No, we just maintain a list of nodes that have died.
>> All right. So --
>> Nupur Kothari: So basically every time the loop -- the first time the loop [inaudible] and we
figure out that the node has died, what we do is that we just abandon that particular iteration and
we continue the computation. And so at the end what will happen is the maximum value that
will be generated will be for the rest of the nodes, not for the node that just died.
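The failure semantics just described -- abandon an iteration whose node has died, keep computing over the rest -- can be sketched as follows. This is illustrative Python, not Pleiades's actual cfor syntax, and `cfor_max_temperature` is an invented name:

```python
# Hypothetical sketch of the cfor failure semantics described above:
# iterations on dead nodes are abandoned, and the maximum reflects only
# the readings from surviving nodes.

def cfor_max_temperature(readings, dead):
    """readings: node id -> temperature; dead: set of failed node ids."""
    failed_iterations = []
    best = None
    for node_id, temp in readings.items():
        if node_id in dead:                  # runtime detects the dead node
            failed_iterations.append(node_id)
            continue                         # abandon just this iteration
        if best is None or temp > best:
            best = temp
    return best, failed_iterations

# Node 3 has died: its reading is ignored, the loop still completes.
best, failed = cfor_max_temperature({1: 20, 2: 25, 3: 99, 4: 22}, dead={3})
```

So the loop completes and the resulting maximum covers only the subset of live nodes, matching the semantics discussed above.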
>> Okay. So basically -- so in this case -- is the semantics of your programming language
that in the case of a node failing it's as if that iteration did not happen at all? What I'm
asking is: the idea that trying to compute the max of N numbers
[inaudible] compute a max with some subset of the -- that is the semantics? That is the expected
semantics of the program?
>> Nupur Kothari: Yes. If the node dies, then that is what even a sensor network programmer
would do, which is if a node dies, you don't have access to the data at the node, you would just
ignore it.
>> I mean, what if I wanted to write a sensor network program where you wanted to detect that a
node has died and in that case send a message to some, you know, centralized control station
saying, oh, [inaudible] send somebody over there, do some fixing or something. How would I
program that?
>> Nupur Kothari: So actually in this we do have a mechanism by which we can detect that a
node has died. And so one could imagine exposing that at the programming level, saying that,
okay, a node has died in this particular iteration, and the user can then handle that failure in
whatever way they see fit.
>> But your current programming model does not expose that.
>> Nupur Kothari: No.
>> Okay.
>> Nupur Kothari: But definitely it's a possibility. Okay.
So I'm going to move on, and next I'll talk about FSMGen, which is a tool to extract high-level
finite state machines from TinyOS programs.
So the motivation behind this is that most programmers, when they write event-driven code, they
tend to think in terms of finite state machines because finite state machines actually match up
really well with the structure of event-driven code.
However, the code that they generate can be really complicated, and so it would be nice if we could
have a high-level representation of this code against which programmers could match what they
had thought the functionality to be, and then look for where the actual functionality differs
from the intended functionality.
Also, having such a high-level representation would help in locating potential vulnerabilities in
the code.
So what we've done in FSMGen is given a TinyOS program, we've built a tool that from this
[inaudible] program can extract a compact, human-readable finite -- high-level finite state
machine.
So finite state machines have -- the concept of finite state machines has been used before in
sensor networks. However, finite state machines have been used more as -- people have
designed languages based on finite state machines to program sensor networks, and people have
also built finite state machine specifications for various interfaces in sensor networking
languages.
Program analysis has also been used as a tool for sensor networks, however, for specific tasks
like memory safety, et cetera.
Lastly, there have been tools to derive finite state machines from programs in regular languages.
However, these finite state machines were intended more as intermediate finite state machines
that were fed into model checkers or other verification steps.
So I believe that FSMGen is unique in its goal of trying to find a human-understandable,
compact finite state machine given an event-driven TinyOS program.
So what are the challenges in building a tool like this? The first challenge is how do we
statically obtain information about the execution of a program. Secondly, how do we distill this
information to form a finite state machine. And, lastly, how do we ensure that the inferred finite
state machine is compact and human-understandable.
So we address these challenges by taking certain techniques from the program analysis and
programming languages communities and adapting and modifying them for
event-driven programs.
We use symbolic execution to obtain information statically about the execution of a program and
then use predicate abstraction to take this information and generate a finite state machine.
Lastly, we've built a variant of the Myhill-Nerode FSM minimization algorithm to ensure that the
inferred finite state machines are compact and user-understandable.
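To give a flavor of the minimization step, here is the textbook Myhill-Nerode-style partition refinement for a complete deterministic machine -- a generic sketch, not FSMGen's actual variant:

```python
# Textbook partition refinement (Myhill-Nerode style, NOT FSMGen's
# variant): states are merged when no input sequence distinguishes them.

def minimize(states, alphabet, delta, accepting):
    """delta maps (state, symbol) -> state for a complete DFA.
    Returns the list of equivalence classes (sets of states)."""
    partition = [p for p in (set(accepting), set(states) - set(accepting)) if p]
    changed = True
    while changed:
        changed = False
        new_partition = []
        for block in partition:
            groups = {}
            for s in block:
                # Signature: which block each symbol sends s to.
                key = tuple(
                    next(i for i, b in enumerate(partition) if delta[(s, a)] in b)
                    for a in alphabet
                )
                groups.setdefault(key, set()).add(s)
            if len(groups) > 1:
                changed = True
            new_partition.extend(groups.values())
        partition = new_partition
    return partition

# Demo: states 1 and 2 behave identically on every input, so they merge.
_delta = {(0, "a"): 1, (1, "a"): 2, (2, "a"): 2}
classes = minimize([0, 1, 2], ["a"], _delta, accepting={1, 2})
```

The refinement starts from the accepting/non-accepting split and keeps subdividing blocks until the partition is stable; each final block becomes one state of the compact machine.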
So let's first take an example of what a finite state machine would look like in TinyOS nesC
code. Now, so here I've [inaudible] all the nesC and TinyOS-specific details so that the example
is a little more readable. But this is an example. This is a piece of code that a lot of sensor
network programmers would actually be very familiar with.
Like I mentioned before, debugging is really hard in sensor networks. So a lot of people, what
they do is that for debugging they use this very simple mechanism of checking program state
within their code and turning LEDs on and off depending on what the program state is.
So they do that in the code. And then when the program is running on a sensor node, they just sit
back and observe the sequence of LEDs to figure out what errors are happening on -- what
execution path the program is taking.
What this example does is something similar. It says: if I receive a message and the
value of the received message is less than 5, then
everything is okay and I turn the red LED off. But if the value is not less than 5, then I turn the
red LED on to maybe indicate a problem -- you know, that something is
wrong.
So let's now try and see what a finite state machine for this example would look like manually.
So the finite state machine for this would have three states. One is the initial state, where we don't
know what the state of the red LED is. The other two are states where we know that the red LED
is either on or off.
So if this message received event occurred and the value of message was less than 5, then the
state machine would transition from the initial state to the state where the red LED was turned
off.
On the other hand, if the value of the message being less than 5 was false, then the state machine would
transition to the state where the red LED is turned on.
For now I'm not going to look at scenarios where you can have multiple occurrences of the same
message received event handler but assume that this event handler can only occur once. So this
is the simple state machine that is represented by this snippet of code.
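The hand-derived machine is small enough to write down directly. As an illustrative Python sketch (names like `led_fsm_step` are invented for this example):

```python
# Toy encoding of the hand-derived three-state machine: the state is what
# we know about the red LED, and the single message-received event
# branches on the message value.

def led_fsm_step(state, msg_value):
    if state == "initial":
        return "red_led_off" if msg_value < 5 else "red_led_on"
    return state  # the event handler fires only once in this model

# Both transitions out of the initial state.
transitions = (led_fsm_step("initial", 3), led_fsm_step("initial", 7))
```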
Now I'm going to talk about FSMGen, how it works, and what finite state machine it would
generate for this simple snippet of code.
>> Can I ask a question?
>> Nupur Kothari: Yeah.
>> Your goal is to generate a finite state machine for an individual event handler?
>> Nupur Kothari: No, for the entire [inaudible].
>> So which can have lots of event handlers.
>> Nupur Kothari: Yes.
>> So you will -- the model you generate is going to capture the causality between an event
handler executing and then generating further events that are going to trigger other event
handlers.
>> Nupur Kothari: Yes.
>> Okay. Good.
>> Nupur Kothari: Okay. So -- although, I will not talk about that in detail too much.
So next let's take a look at an overview of the FSMGen system. The main module in the
FSMGen system is the FSM Generator, which takes as input a TinyOS program and a set of
interesting modules and events whose functionality within the
TinyOS program the user wants to study in detail.
This FSM Generator uses a symbolic execution framework and predicate abstraction module to
generate a finite state machine which is then minimized by the FSM minimizer to give us the
final output, which is a user-readable, compact finite state machine.
So now going back to the debug example that I presented, let's try and see how each individual
component works for that example.
So first we use symbolic execution to statically extract information about the execution of a
program. So symbolic execution is a simple program analysis technique where rather than
providing concrete values to program inputs like let's say 1, 2, or 10, we provide symbolic values
which can represent any arbitrary values.
Using these symbolic values we then track all possible paths of execution through the program.
At each point we maintain a mapping from variables to the actual symbolic values as well as
what were the assumptions that needed to be made about these symbolic values to reach that
particular program point.
So in that debug example, what this would mean is that for the
message-received event handler, a symbol beta would be assigned to the input message,
and then we would track the execution of this event handler using beta as the value of the
message.
At the end of the execution, we could have two potential symbolic program states: one in which
beta less than 5 is true, which means the red LED was turned off and red LED is set
to false; and the second being the state where beta less than 5 is false, which means the red
LED was turned on and red LED is set to true.
So this is how symbolic execution takes a program and generates all the possible symbolic states
at the end of execution.
Now, in predicate abstraction we take the symbolic state and convert it into a state in the finite
state machine. So in this scenario, a finite state machine state is basically an evaluation of a set of
predicates. This set of predicates is actually derived from the program code itself, and in this
case it ends up being the two predicates: message less than 5, and red LED.
So predicate abstraction derives what the value of these two predicates is by looking at the
symbolic state, which in this case turns out to be true and false. So this is how predicate
abstraction works.
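Putting the two stages side by side for the debug example, a toy sketch (illustrative Python, not FSMGen's implementation; the symbolic states are hard-coded here rather than derived from real code):

```python
# Toy pipeline for the debug example: symbolic execution yields
# (assumptions, program state) pairs, and predicate abstraction
# collapses each pair into a valuation of the predicates
# "beta < 5" and "red_led".

def symbolic_states():
    # The input message is the symbol "beta"; both branches are explored.
    return [
        ({"beta < 5": True},  {"red_led": False}),   # branch: value < 5
        ({"beta < 5": False}, {"red_led": True}),    # branch: value >= 5
    ]

def abstract(sym_state):
    assumptions, store = sym_state
    # An FSM state is just the truth values of the chosen predicates.
    return (assumptions["beta < 5"], store["red_led"])

fsm_states = [abstract(s) for s in symbolic_states()]
```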
All of this is put together by the FSM Generator. What the FSM Generator does is that first it
symbolically executes the main function of the program to get what the initial state of this state
machine should be, which in this case turns out that both of these predicates are unknown.
It then keeps track of what are the events enabled. So here in this case the only event that we're
looking at is the message received event. And so it symbolically executes that and then adds the
resulting finite state machine states to the state machine, where on the edges it puts not
just the name of the event but also the assumptions that needed to be made in order to end up in
that particular state.
So this is how the FSM Generator generates the finite state machine. It continues this process, of
course, until no new states or edges are added. And throughout, it keeps track
of which events are enabled and when they can be triggered.
So it's interesting to note that this finite state machine is actually the same as the finite state
machine that we had manually derived. For this case, because it is such a simple example, we
didn't have to worry about minimizing the finite state machine.
So because this was such a simple example, there were a lot of things that -- things that looked
really simple. However, to get it to work for actual real TinyOS applications, there were a
number of challenges that we had to handle.
So, firstly, we had to adapt symbolic execution for event-driven programs. This involved
handling and tracking various events -- when they occur, what they are triggered by -- and also
dealing with asynchronous function calls.
In TinyOS we have a concept of asynchronous function calls called tasks where you post a task
and then you forget about it and then it executes whenever event handlers are finished executing.
So we have to deal with those.
We also have to deal with this concept of split phase function calls which are basically pairs of
calls and events in TinyOS where a call somehow triggers the ensuing event. An example is the
send and the send done pair where the function call send basically asks the radio to send out a
message which then triggers a send done event from hardware which indicates that the message
was sent out correctly.
So tracking the functionality of this can be quite tricky. So we deal with this as well.
Also, we have to try and reduce computational complexity without compromising on accuracy.
In order to do that, we have a concept of interesting modules, where the user can
actually specify which modules he wants to focus on or know the details of the
functionality of, so that we can abstract away the lower-level details of the other modules.
Also, we built a very coarse-grained execution model for TinyOS, which while reducing
computational complexity still provides good accuracy.
Lastly, we have to deal with minimizing finite state machines for many applications.
So we've implemented FSMGen over the CIL framework, and we've used the CVC3 constraint
solver in our implementation. We tested FSMGen for a number of different TinyOS applications
and system components, including version 1 of TinyOS as well as version 2.
We found that of all the finite state machines that we generated, the maximum size was 16 states.
And this was for a very complicated protocol, a time synchronization protocol called FTSP.
We were able to highlight two inconsistencies in various components, Surge and
MultiHopEngine, using the generated finite state machines. Next I'll show what was the finite
state machine generated for Surge and what was the inconsistency that we found.
So Surge is a simple data collection application which periodically polls the sensor and sends the
value -- routes the value to the base station. And we manually verified that the functionality of this
finite state machine is actually the same as that of the TinyOS component.
An interesting thing to note here is that once the execution transitions from state 2 to state 4, it
seems like one would be in an infinite loop between states 4 and 6. These are basically error
states indicating that this doesn't seem right. So I went back and looked at the code and found
that the state machine is correct.
The reason people haven't complained about this while using Surge
is because this transition between 2 to 4 depends on the way an underlying interface is
implemented in TinyOS. So in practice it is never taken. However, if that underlying
implementation were to change, then this edge definitely could be taken and this would result in
a bug.
>> So your model generation was able to detect this edge because you model that underlying
interface in a very general way?
>> Nupur Kothari: Yes. Because we abstracted away the details of that.
>> [inaudible]
>> Nupur Kothari: Yes. And so we were able to get this edge. However, in reality, this edge is
not taken because of the way that interface was implemented.
So up until now I've talked about Pleiades, which is a centralized framework for building sensor
network applications and which handles a lot of these application- and domain-specific
requirements and manages to reduce code complexity while providing acceptable performance.
I also talked about FSMGen -- yeah.
>> [inaudible] you have an error because you [inaudible] in a very general way. Does the
opposite error happen, that you miss certain things because your interface model is too simple
and it misses [inaudible].
>> Nupur Kothari: It's definitely possible. We haven't seen that, but it is definitely possible. In
that case, the -- if -- it seems that the error would lie within that implementation of that interface.
And so the programmer would then have to change the focus to that interface. If that was
specified as an interesting module, then that would show up in the finite state machine.
So the way -- this abstracting away of details is sort of controlled by the programmer when -- or
the user when they say that, okay, you know, these are the modules I'm interested in looking at
the functionality of. So if they want to look at that functionality as well, then it would show up.
So, okay, so I talked about how FSMGen derives compact finite state machines, and we find that these
finite state machines can sometimes also point to inconsistencies in code.
Next I'm going to talk about GuFi, which is some ongoing work that I've been involved in with
people here at MSR and which I'm really excited about. And let's move on and see what GuFi is.
So GuFi, or Gullibility Finder, is a tool that for a given network protocol implementation
modifies incoming messages at a node to try and repeatedly drive the execution of the
implementation down to a particular interesting statement.
So this interesting statement for now we assume is specified by the user of this tool. So an
example could be: in the TCP implementation, the user might specify that all statements that
increase the congestion window at the TCP sender are interesting statements. And GuFi would
try to find mechanisms of modifying incoming messages at the TCP sender to try and get the
implementation to execute these interesting statements.
So the goals of GuFi are, firstly, to figure out how to modify these incoming messages to trigger execution
of these interesting statements -- and whether it is even possible to modify messages in such a
fashion. And, secondly, to see whether this manipulation can be used by malicious or selfish participants to
improve performance.
So, intuitively, if these interesting statements are statements that allocate some form of resources
to a particular participant in the protocol, what would happen is if you repeatedly can trigger
these statements, then the resources allocated to that particular participant would increase and
then that node would actually benefit. Yeah.
>> I have a question about the problem statement in the first bubble. It particularly -- no, no. Go
ahead. Go forward. Why do you use the word repeatedly there? I mean, you just want to find
one modification of the input message that it can reach particular statement once, right?
>> Nupur Kothari: No, we want to be able to do it every time for each incoming message, or for
a large chunk of the incoming messages. Because the idea is we're looking for these sort of
resource attacks where just -- they're very subtle in the sense if you modify one message, you
may only get a little bit of a benefit. But if you can do it repeatedly, you'll get a lot of benefit.
So we want to be able to do this repeatedly.
>> [inaudible] then lies in finding the interesting statement, not about how to [inaudible] the first
part of the question seems to be -- and correct me if I'm wrong, I'm not an expert in this, it seems
like it's a test case generation problem, right, which is there's so many solutions to [inaudible] it's
the leap from the first statement to the next statement that's a challenge, which is these
interesting statements result in some sort of interesting behavior. And how do you find that
given a protocol spec.
>> Nupur Kothari: Right. But I think that even the first part is also challenging because it's not
just about finding one case that works but being able to do it repeatedly. That is interesting to us.
And, yes, I mean, finding interesting statements of course leads us to, you know, discovering
these mechanisms, but hopefully the number of interesting statements should be small enough
because we're looking only at resource allocation statements.
So effectively, while it is still challenging to find these
statements, you're moving the complexity of finding an actual attack -- or the
mechanism for an attack -- to the tool. And we want to do this repeatedly.
>> [inaudible] then you can launch a denial of service attacks or something like that because you
can create so much memory consumption or something [inaudible].
>> Nupur Kothari: Right. Not just a -- yeah, not just a denial of service attack, but even more
subtle attacks where let's say the user is -- the malicious node is selfish and wants to improve its
own performance. So if that's the only thing that interests it, it doesn't care about what happens
at the receiver. That's something that's very hard to detect because you're not really hurting the
receiver in any way. What you're doing is you're extracting more performance than is fairly your
share.
So, yeah, so, I mean, the second question of course is how this manipulation can be used
by malicious and selfish participants to improve performance.
So we built an initial prototype of this tool GuFi and we used it on TCP. And I'm going to
explain how GuFi works using the TCP example.
We used a user-level implementation of TCP called Daytona. And in this case our goal is to see
whether the TCP sender can be manipulated in this fashion. So we want to analyze the code at
the TCP sender. And the interesting statement that we provide to it is any statement that
decreases the number of outstanding packets of the sender.
So intuitively what this means is if I decrease the number of outstanding packets at the sender,
the sender will send out more packets towards me because it thinks that I've received all the
packets that it's already sent out. And so if -- what this will do is it will give me a better
throughput than I would have gotten otherwise.
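A toy model of why shrinking the sender's count of outstanding packets pays off -- illustrative Python, not Daytona's code; the window behavior and per-round delivery are heavily simplified assumptions:

```python
# Simplified sender: outstanding = highest_sent - highest_acked, and the
# sender transmits whenever outstanding < window. A receiver that ACKs
# data it hasn't actually received (optimistic ACKing) keeps outstanding
# low, so the sender pushes packets out faster.

def packets_sent(rounds, window, optimistic):
    highest_sent = 0
    highest_acked = 0
    highest_received = 0
    for _ in range(rounds):
        # Sender transmits until the window of outstanding packets is full.
        while highest_sent - highest_acked < window:
            highest_sent += 1
        # Only one packet actually reaches the receiver per round.
        highest_received = min(highest_sent, highest_received + 1)
        # Honest receiver ACKs delivered data; optimistic receiver ACKs
        # everything the sender has put on the wire.
        highest_acked = highest_sent if optimistic else highest_received
    return highest_sent

honest = packets_sent(rounds=10, window=4, optimistic=False)
greedy = packets_sent(rounds=10, window=4, optimistic=True)
```

In this model the honest receiver is limited by what actually arrives, while the optimistic receiver keeps the sender's window open and gets far more packets sent in the same number of rounds -- the same intuition behind optimistic ACKing.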
So for this particular example let's see how GuFi works. So GuFi has two parts to it. The first is
the program analysis part, and the second is the testing phase. In the program analysis phase,
GuFi uses symbolic execution to take as input the TCP code as well as what the interesting
statements are and generate a list of potential paths that can lead to interesting statements.
Also, what it does is it generates a module which tries to emulate a malicious TCP receiver.
In the testing phase we run the original TCP implementation at the sender and the receiver.
What we then do is we assume -- so here the assumption of course is that the sender is the honest
guy and we're trying to see how the receiver can manipulate the sender. So we then execute the
module which emulates a malicious TCP receiver in the middle.
So in this setup, whenever the TCP receiver sends out an acknowledgment message to the
sender, it is intercepted by this module. This module then takes global state from the TCP sender
and also takes the list of potential paths that we have generated that lead to interesting
statements and figures out how to modify this ACK packet in order to trigger one of these
interesting statements. It then sends back this modified ACK to the TCP sender.
And we compare the performance for the receiver in this setup with the regular setup where
both the sender and the receiver are honest.
And, oh, here I forgot to mention: for now we also restrict which particular fields of
the message can be modified by the malicious receiver.
So for this -- we actually performed this comparison using GuFi for Daytona and we found that
the malicious receiver actually does pretty well. So on the X axis we have the time and on the Y
axis we have the sequence number of the packets sent out at any given time.
We can see that for a particular time the malicious receiver succeeds in getting the sender to send
out a lot more packets than the honest receiver. When we looked at exactly how GuFi modified
the TCP ACK packet we found that what it does is that it modifies the ACK sequence number in
the ACK packet, so it increases the sequence number that has been ACKed in the ACK packet.
So this is actually the optimistic ACKing attack which was introduced by Stefan Savage and
others in 1999. So GuFi effectively by being provided with the protocol implementation and an
interesting statement can actually provide us with the mechanism of a particular attack.
>> [inaudible] can only slow down the network, it doesn't have any sort of a safety -- it doesn't
produce a safety violation in what the TCP protocol is supposed to do, it's only going to lead to a
performance problem. Is that right?
>> Nupur Kothari: Yes. But, I mean, it can lead to [inaudible] performance problems. In fact,
yeah --
>> I was just trying to understand [inaudible].
>> [inaudible]
>> Nupur Kothari: It violates fairness. It can cause denial of service attacks. In fact, this
particular attack has been documented in the US-CERT database of gullibilities -- or
vulnerabilities as being an attack that can cause network failures, et cetera. So yeah.
So GuFi is still ongoing work. I'm still working on it. And I'm really excited about it. I think
GuFi can prove to be a valuable tool for designers to explore implementations for gullibility by
simply providing a hopefully very small set of interesting statements that govern the
allocation of resources like air time or the number of packets that are outstanding, et cetera.
I believe GuFi can be used to detect other protocol gullibilities in TCP as well as 802.11. I'm
really excited to see if it can actually find attacks which have -- which were not known earlier.
Also, GuFi can be extended to deal with more complicated attacks: multistep
attacks, where at each step the attacker wants to do something very specific. We can also
have packet injection attacks where rather than simply modifying the incoming packets we
actually inject new packets into the system. And, lastly, we can have multiple attackers trying to
collude together to cause -- to attack a single node.
So now that I've talked about GuFi, let me talk about some of my -- yeah.
>> I know that [inaudible] team has been doing some work where they look at input files --
>> Nupur Kothari: Yes.
>> -- and they do static analysis.
>> Nupur Kothari: Yes.
>> Other than that fact, is the main difference that it's network packets that are the input rather
than --
>> Nupur Kothari: Well, so that is one difference. We're studying a particular aspect of the
implementation. They actually -- from what I know, they use binaries. They analyze -- they
don't analyze actual source code. And the other thing is they're looking for very specific kinds
of --
>> But if they can do it with binaries --
>> Nupur Kothari: That's true.
>> -- source code, isn't that better?
>> Nupur Kothari: But we're focusing on a very -- so TCP can be a really big protocol. And
they do it -- so we're focusing on a very small aspect of that. And I think the other thing is that
they're not looking for these sort of subtle attacks, they're looking for attacks which can crash the
machines, et cetera, like at the particular node.
So in the future, so, like I mentioned already, I'm working on MobiProg, which is a tool to build
cloud-enabled smartphone applications. And the challenge here is to figure out how to
dynamically partition a smartphone application between the phone and the cloud so as to
minimize energy consumption and delay while still providing guarantees for privacy, et cetera.
And in the future I'm interested in continuing to work in the area of smartphones and looking at other
kinds of smartphone applications where you have multiple smartphones interacting with each
other, like multiplayer games or collaborative sensing applications.
Another area of interest is building tools to find vulnerabilities in third-party smartphone
applications. So this is becoming a real problem even for current users, because a lot of
smartphone applications have come out which are malicious in that they try to steal private data
from smartphone users, or they simply drain the smartphone's battery. And being able to
detect such applications using program analysis or runtime techniques would be nice.
>> How do you do that? That's fascinating. How could they [inaudible] smartphone
[inaudible]?
>> Nupur Kothari: You just run some computation which takes up a lot of memory, and that's it.
>> [inaudible]
>> Even download some random application.
>> So I assume you're not talking about iPhone apps, or are you?
>> Nupur Kothari: There have been iPhone apps which also do that, right? But now of course I
think it's become stricter and they are being thrown out of the App Store. But, yeah, I think on
Android, Windows Mobile, you could have a lot of such applications.
>> [inaudible]
[laughter]
>> Nupur Kothari: Yeah, actually there's also the sleep cycle monitoring app which basically -- I
mean, the fact that they drain your battery is unfortunately a side effect of their functionality. So
you're warned to actually keep your phone connected while you're using the application to ensure
that your phone doesn't die. But it is possible that people may do this maliciously, or it may just
be a bug in their code.
>> [inaudible] power you also listed privacy and security, so for the iPhone apps that you would
put in those two categories?
>> Nupur Kothari: So as far as the iPhone goes, I did hear of one case where a game developer
was sued because they supposedly were downloading their users' phone numbers and storing
them, which was violating privacy concerns.
So there are still scenarios. I think on the iPhone, because of the strict restrictions they place, it's
harder. But, on the flip side, for developers who are building applications, developing for the
iPhone is very hard because you're not provided with the rich interface that, let's say, Windows Mobile
or other platforms provide you with.
>> Which explains why they have so few iPhone applications.
[laughter]
>> And conversely why there are so many [inaudible].
>> Nupur Kothari: Okay. So moving on. I'm also interested in building tools to develop
efficient datacenters. So while I was at MSR in summer 2008 I worked on a runtime profiling
tool that actually tried to profile the energy consumption of individual processes on machines.
And a lot of people have been using such tools to actually do energy-aware scheduling of tasks
for datacenters.
So what I'm interested in is seeing whether program analysis can help in this -- in some fashion.
So to conclude, I talked about why building networked systems is difficult and listed some of the
challenges that programmers face. And then I talked about the tools that I've built to address
these challenges for different domains, which include the tools I've built for sensor networks as
well as the tools I'm currently working on for protocol implementations and smartphones.
In the future I'd like to continue to work in these directions as well as explore other domains
which pose interesting challenges. Thanks.
[applause]
>> I had a question on your GuFi work.
>> Nupur Kothari: Okay.
>> So you made it sound as if you built a system and then lo and behold you got an attack out of
it and then you suddenly discovered that, oh, somebody actually wrote a paper a few years ago
which was exactly [inaudible]. Is that the real story, or were you actually trying to find that
attack?
>> Nupur Kothari: So we built the system for TCP. At that point we did not have any attacks in
mind. Of course we had to look at what are the attacks possible for the implementation that we
were using. But that helped us in identifying what the interesting statements could be. But we
did not -- that did not change anything else. GuFi [inaudible] as it is for everything else.
>> So you were not aware of that specific attack by Savage when you actually started GuFi off
trying to look for attack which is trying to find this -- reducing this -- whatever -- the counter
thing?
>> Nupur Kothari: Oh, yeah, yeah.
>> Okay.
>> Can I ask a snide question, an admittedly very snide question, to follow up on
that question. [inaudible] attacks you have found so far has been [inaudible]. Even the Wi-Fi
attacks --
>> Nupur Kothari: Coincidence. Coincidence.
>> So I have one more question about the GuFi work. It's very interesting. So you mentioned
that the thing in the middle, which is -- takes as input the current packet, the list of paths, and the
current state of the sender or the receiver, one of them, right?
>> Nupur Kothari: Right, right.
>> And then tries to figure out how to [inaudible] the packet. So this state component that is fed
as input, is it easy to get a handle on all the state that could be pertinent to the TCP stack at a
particular machine?
>> Nupur Kothari: Right. So actually what we do is we don't -- what we do is we look at the list
of potential paths that we've generated to figure out what is the state that is useful to us. So we
don't take all the state from the TCP stack. We only take the state that helps us decide which
path to take or that is crucial in getting us to hit the interesting statements.
>> [inaudible] just need that state from there [inaudible].
>> Nupur Kothari: Exactly.
>> Oh, that's very nice.
>> Nupur Kothari: Yeah. So, I mean, one thing I guess I didn't talk about or I sort of glossed
over was the fact that this malicious receiver is actually running at the TCP sender in our testing
phase. Because we found that it was much easier to import global state and do other things at the
sender itself.
>> But, I mean, [inaudible] it is nicely encapsulated in the TCP, right? So you could just
[inaudible] look around for anything special.
>> Nupur Kothari: Well, but the reason is we don't want to -- I guess because even in the TCP
structure there's so much state, right, we don't want to read all of it, we only want to read the
particulars that are relevant to us.
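The idea described above, reading only the protocol state that can influence which path is taken, can be sketched roughly as follows. The path representation, variable names, and TCP fields are invented for illustration; GuFi's actual implementation over the kernel TCP stack is more involved:

```python
# Hypothetical sketch of path-guided state slicing: given the execution
# paths of interest, collect only the state variables that appear in
# their branch conditions, and read just those fields from the full
# protocol state. All names here are illustrative, not GuFi's.

def relevant_variables(paths):
    """Union of state variables referenced by any branch condition
    along any path of interest."""
    needed = set()
    for path in paths:
        for condition_vars in path:  # each step lists the vars its branch tests
            needed.update(condition_vars)
    return needed

def slice_state(full_state, paths):
    """Extract only the fields of the protocol state that can
    influence which path is taken."""
    needed = relevant_variables(paths)
    return {k: v for k, v in full_state.items() if k in needed}

# Full TCP-like state; most fields are irrelevant to the chosen paths.
full_state = {
    "snd_cwnd": 10, "snd_ssthresh": 64, "dup_acks": 2,
    "rcv_wnd": 65535, "rtt_est": 120, "mss": 1460,
}
# Two paths; each step names the variables its branch condition reads.
paths = [
    [{"dup_acks"}, {"snd_cwnd", "snd_ssthresh"}],
    [{"snd_cwnd"}],
]
print(slice_state(full_state, paths))
# Only dup_acks, snd_cwnd, and snd_ssthresh survive the slice.
```

Reading just these fields keeps the amount of imported state small even when the full protocol structure is large, which matches the motivation given in the answer above.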
>> And how big is that code in the TCP sender that you [inaudible]?
>> Nupur Kothari: I don't know how many lines of code. But basically it's the kernel TCP just
[inaudible].
>> I actually have no idea. So it's -- are we talking like --
>> Thousands and thousands and thousands of lines.
>> Like hundreds of thousands, you mean?
>> I don't know about hundreds of thousands. The problem is that [inaudible] is what it is and
there are a large number of [inaudible] and the Windows TCP stack supports a large number of
unstandardized variations, and that's what it is. I don't know [inaudible].
>> Nupur Kothari: So we actually used a very simple -- I mean, we used only one of the
congestion control mechanisms that are supported. But the code was still huge. And with
symbolic execution we're actually facing scalability issues with TCP.
>> [inaudible]
>> [inaudible] protocol stack was correctly implementing [inaudible] even the very basic
[inaudible] took them something like three years of CPU execution time or something crazy like
that.
>> Aman Kansal: Okay. Thanks.