>> Nachi Nagappan: Okay, thank you for coming and for people who are watching
online. We are very happy to have with us Abhik Roychoudhury. Abhik is
actually an associate professor --.
>>: [indiscernible].
>>: Sorry?
>>: [indiscernible].
>> Nachi Nagappan: Abhik is an associate professor at the National University
of Singapore and he earned his PhD from Stony Brook. He has actually been
there for almost 10 years now. He has very diverse interests ranging from
embedded systems, to software testing and analysis, symbolic execution, etc.
So he is actually going to talk about some of his work today.
>> Abhik Roychoudhury: Okay, so should I begin?
>> Nachi Nagappan: Yeah.
>> Abhik Roychoudhury: Okay, okay. So this talk will be on the role of
symbolic execution in software testing, debugging and repair. Essentially
even though I will talk about some specific techniques I will try to go into
the role that symbolic execution is playing in each of these techniques. So
of course first of all I must mention that the growth of symbolic execution
that we have seen in the software engineering community and used in different
software processes is driven a lot by the maturity of SMT solver technology.
So without the growth in the SMT solver technology many of these techniques
would not be possible, and the purpose of this talk is really to see whether
symbolic execution can be used in newer ways in software engineering. So of
course there is a process-centric view, which is where symbolic execution can
be used beyond testing and verification, where it has conventionally been
used.
So for example I will try to talk about how it has been used in debugging and
repair where conventionally symbolic techniques were not used, but also I
think there is a conceptual view, and that is that symbolic execution has
been used quite a bit for guiding search in a large search space. And whether
one can go beyond that and try to use symbolic techniques for inferring
intended behavior. And really, what is the secret recipe that makes symbolic
execution drive both of these roles, that is, guiding the search in a huge
search space as well as inferring the intended behavior of programs? Because
of course, as you might imagine, if you can infer the intended behavior of a
program then activities like debugging become much more tractable.
So really speaking the perspective that we are taking for symbolic execution
in this talk is that conventionally it has been used quite a bit. When I say
conventionally, it is conventionally now, but sometime back it was not so
conventional in test generation, model checking and so on, where it is
exploiting the similarities in the search space. And also at the same time
symbolic execution can be used as some sort of a psychiatrist trying to
uncover what went wrong. And this is really the role that is being played by
symbolic execution when we use them in debugging, summarization, semantics
extraction, program repair and so on.
So when I talk about debugging in this talk I should very clearly clarify the
difference between bug hunting and debugging. Of course in many of the
papers we use model checking for finding bugs. This is also mentioned as
debugging, but that is really bug hunting where you have a certain property
in mind and a program and you produce counter example. And the counter
example essentially says that under these input conditions this property is
violated. Whereas what we are mentioning here as debugging is that there is
a certain input, there is a certain output and we are actually trying to find
out what went wrong in this particular test case. So this is really the
debugging problem that we are looking at.
So here certain property is not given to us. What is the property that was
violated? This is what we are trying to find, but on the other hand we have
a failing trace. So this is the debugging problem that we are looking at
here.
>>: Can you explain why the property or the part of the code that --.
>> Abhik Roychoudhury: We want the part of the code which is the likely
suspect for the violation of the expected output.
>>: So then a property is much more general than the piece of the code.
>> Abhik Roychoudhury: That’s right, that’s right. So we really won’t be
inferring the property in debugging, that’s right.
So I think these are the things that I believe I can skip. So there is, of
course, symbolic execution, you just replace the input with a certain
[indiscernible] and then you have the concrete output as well as the symbolic
output. So the symbolic outputs are given in terms of [indiscernible] and of
course in symbolic execution we have quite a bit of usage of path conditions,
where in this case, given the input equal to 5, corresponding to each line
number that is executed you can keep track of an assignment store and a
certain path condition. And the path condition really captures the set of
all inputs which go through this path.
Okay, so at any point of time you can keep track of the branches that you are
going through, the branch conditions, and for any branch condition that you
are conjoining you can look up the symbolic store. And this is how you
can compute the path condition automatically. So in our usage of symbolic
execution we will be having a fair amount of usage of path conditions and the
usage of these path conditions has been fairly extensively popularized in
this directed automated random testing method where you continue the search
for failing inputs. In particular inputs that don’t go through the same
path.
So for example the aim here is really to get path coverage. So for each path
you capture the path condition and you negate a little bit of it. So for
example if you start with this bold path, this is the path condition and you
negate a little bit of it, you get another path condition and you try to get
a solution for this. You negate a little bit more, you get another path
condition, and you get a solution for this and so on. So the aim is really
to make sure that you don’t have inputs which go through the same path again,
and again and again.
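The negate-and-resolve loop just described can be sketched in a few lines. Everything here is a hypothetical stand-in: the two-branch program is invented, and a brute-force search over a small integer domain plays the role of the SMT solver a real implementation would query.

```python
# A minimal sketch of the DART-style path exploration loop described above.
# The two-branch program and the brute-force "solver" are hypothetical
# stand-ins; a real implementation would query an SMT solver instead.

def branches(x, y):
    """Return the path of input (x, y) as a tuple of branch outcomes."""
    return (x > y, x + y > 10)

def solve_prefix(prefix, domain=range(-20, 21)):
    """Toy solver: find any input whose path starts with `prefix`."""
    for x in domain:
        for y in domain:
            if branches(x, y)[:len(prefix)] == prefix:
                return (x, y)
    return None

tests, explored = [], set()
stack = [(0, 0)]                 # start from an arbitrary seed input
while stack:
    inp = stack.pop()
    path = branches(*inp)
    if path in explored:
        continue                 # this path already has a test input
    explored.add(path)
    tests.append(inp)
    for i in range(len(path)):   # negate the i-th branch, keep the prefix
        new = solve_prefix(path[:i] + (not path[i],))
        if new is not None:
            stack.append(new)

print(len(tests))                # prints 4: one test per feasible path
```

The key invariant is the one from the talk: no two generated tests go through the same path.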
So this is fairly well known at this stage, a fairly well known usage of
symbolic execution in testing. So when you use dynamic symbolic execution
for testing in this way, there is some sort of an implicit assumption that
the inputs that execute a certain path are similar. So if
you test one of them there is no need to test the others. So essentially we
are using similarity to skip over parts of a large search space and the
dynamic symbolic execution is essentially a tool which helps us achieve this
goal.
So the first thing that I will mention today that is a usage of symbolic
execution, again in test generation, is whether one can look at
coarser-grained notions of similarity, which are used for test generation.
So this is the
slide that shows the different roles played by symbolic execution in testing,
debugging and repair. In testing really the goal is always that we don’t
need to test for similar inputs, and there is the possibility of looking for
similarity beyond paths, that is, possibly combining paths into one single
partition of paths.
In debugging, of course, given a failing input we are trying to find out
similar inputs which pass. So we will see that one can do logical
comparisons to detect the deviations between these failing inputs, as well as
the similar inputs that pass. And finally in repair there is the question of
finding similar inputs which show the same error and therefore you can rescue
them all and this forms the search space of all possible repairs. And here
also symbolic execution we will see is used to capture the intended behavior
of the program. Okay, so this is how I will try to set the stage for the
usage of symbolic execution in testing, debugging, as well as repair, and at
all points of time try to go back and see what role symbolic execution is
playing in each of these techniques.
So let me start with an example technique from testing. So this program
actually has 8 paths; not surprisingly, with the 3 branches there are 8
paths, but we can see that they can be partitioned into just these 3
partitions based on the symbolic outputs. So if you take the input/output
relationship that has captured the output as a symbolic expression in terms
of the inputs then these are the conditions under which you have different
symbolic outputs that this is the condition under which the output is 2.
Here the output is X, okay, the X, Y and Z are the inputs and here the
output is Y.
So this is another notion of similarity which one could use, which probably
goes beyond the directed automated random testing. And it would be nice if
we could compute these path partitions automatically. Of course I am just
defining these path partitions here; I haven’t said how to compute them.
So if you could compute these path partitions automatically this is the kind
of summary that we will generate for the program. That under these
conditions the output is 2, under these conditions the output is X, under
these conditions the output is Y and so on.
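Since the slide's program is not reproduced in the transcript, here is a hypothetical analogue with the same shape: three branches give eight paths, but grouping by symbolic output yields only the three partitions mentioned (output 2, output x, output y), and the first branch turns out to be irrelevant to the output.

```python
# Hypothetical analogue of the slide's example: 3 branches give 8 paths,
# but only 3 partitions when grouped by symbolic output.

def prog(x, y, z):
    t = 0
    if z > 0:        # branch 1: t is never used afterward, so this branch
        t = z        # is irrelevant to the output (slicing removes it)
    if x > y:        # branch 2
        out = x      # symbolic output: x
    else:
        out = y      # symbolic output: y
    if out < 0:      # branch 3
        out = 2      # symbolic output: 2
    return out

def partition_label(x, y, z):
    """The summary the technique would compute: condition -> symbolic output."""
    if max(x, y) < 0:
        return "out = 2"
    return "out = x" if x > y else "out = y"

# Enumerate a small input space: eight paths collapse into three partitions.
labels = {partition_label(x, y, z)
          for x in range(-3, 4) for y in range(-3, 4) for z in (-1, 1)}
print(sorted(labels))
```

Here `partition_label` is written by hand to show the target of the analysis; the point of the technique is to derive exactly this partitioning automatically.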
And it turns out that it is possible to compute these path partitions
automatically if you do a bit of dependency analysis over and above symbolic
execution. So instead of computing the path conditions on the paths we
compute the path conditions over these so-called relevant slices, which is
essentially an extension of backward dynamic slicing. So the dynamic slice
is a closure of control and data dependencies, the dynamic control and data
dependencies. The data dependencies are just the use-definition chains and
the control dependencies point to the closest enclosing branch.
So over and above these control and data dependencies, which capture the
effect of all the statements which affect the output by getting executed, we
also have the so-called potential dependencies, which capture the effect of
statements which affect the output by not getting executed. So for example
here I have this example; this is actually a potential dependence because of
this branch. Because this branch got evaluated in a certain direction this
statement did not get executed, and this affected the output. So if you also
capture the effects of these statements which did not get executed and
thereby affected the output, and combine it with dynamic slicing, and you
compute the path conditions over these extensions of dynamic slices, you can
really try to uncover the transformation function underlying the program.
Okay, so this is clearly much beyond the directed automated random testing
because you compute these partitions.
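A tiny hypothetical example of a potential dependence, the kind of dependence just described, where a statement influences the output precisely by not being executed:

```python
# Hypothetical illustration of potential dependence: in the run f(-5), the
# branch goes the False way, so the assignment under it does NOT execute --
# and that non-execution is exactly why the output is 0.

def f(a):
    out = 0          # this definition reaches the output only because...
    if a > 0:        # ...this branch evaluated False, meaning that...
        out = 1      # ...this assignment was skipped: the output has a
    return out       # potential dependence on the branch

print(f(5), f(-5))   # prints 1 0
```

An ordinary dynamic slice of the run `f(-5)` would miss the branch, because nothing inside it executed; adding potential dependencies pulls the branch condition back into the slice.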
So I will not go into the details of these properties, but these properties
are important for the search to be complete. So here we compute these
relevant slice conditions, that is, the path conditions of these relevant
slices over the different paths. And the property that we have is that,
given a path, if an input t' satisfies the relevant slice condition of t
then its relevant slice condition is the same. So really these are the
inputs which compute the same symbolic output. So now the search will
proceed exactly as in directed automated random testing. You start with a
random input, you go through its path, you compute the relevant slice over
that path, you compute the relevant slice condition over that relevant
slice, and this captures a whole group of inputs which have the same
symbolic output.
And now you have to modify this relevant slice condition a little bit so that
you can move into another group of inputs which have a different symbolic
output.
>>: So if you took the straight line program and translated it to a bounded
statement then the relevant slice condition is [indiscernible]? I mean it’s
an AllSAT with respect to a projection criterion, which is the output.
>> Abhik Roychoudhury: Yeah, so --.
>>: [indiscernible].
>> Abhik Roychoudhury: The relevant slice condition will capture all the
statements which actually affect the output. So I will essentially --.
>>: [indiscernible].
>> Abhik Roychoudhury: Yeah, so this is all because of the inherent
parallelism in the program. So there might be a number of statements and
their conditions which appear in the path condition, but do not affect this
particular output.
>>: Yeah, but the question is really, suppose you translate your program
directly to [indiscernible], what does the relevant slice condition mean?
>> Abhik Roychoudhury: Well, it’s --.
>>: [indiscernible].
>> Abhik Roychoudhury: The relevant slice condition is a relatively dynamic
concept. It’s not a static concept where, given the conversion of the
program into logic, I would be able to get a notion for the relevant slice
condition. So given a path, if I start with a random input, given the path
for that input, it is a projection of that path and the logical
representation of that projection.
>>: Okay.
>>: So is it a [indiscernible] of the input symbols?
>> Abhik Roychoudhury: Absolutely, absolutely. So just like the path
condition it is a formula in terms of the input symbols. So you think of the
path condition and it’s like many of the conditions in the path condition are
filtered away. And this is because of the inherent parallelism in the
program. They were probably not relevant for the computation of the output.
>>: But you don’t know, based on a single path, you don’t know which
conditions are important and which are not, correct?
>> Abhik Roychoudhury: Based on the single path I can know by doing this
dependency calculation over the path.
>>: So that [indiscernible], the dependency analysis?
>> Abhik Roychoudhury: The dependency analysis will look at the trace, yes.
>>: But it seems like much of the time is not really even the output that
matters for symbolic execution kinds of problems. It’s whether, oh I don’t
know, you get some sort of a, you know, [indiscernible] or a [indiscernible]
that might affect the output, or if you say you can save a lot of memory or
if there is some sort of a secondary condition that is captured by
[indiscernible].
>> Abhik Roychoudhury: That is true, that is true. So here I am doing the
path exploration and the path partitioning based on the output. So it could
be that the definition of what you want to test is not captured in the
output; you are just trying to look for crashes and so on. That will not be
captured by this path exploration, that’s correct.
>>: But is there a way to do that? So suppose I don’t have a [indiscernible]
check that goes and [indiscernible] my allocated data structures and does so.
What do you think, is there a way to do something around that?
>> Abhik Roychoudhury: Yeah, of course, you can actually make it observable.
You can define that as the output and then once you do the path partitioning
based on that input/output relationship it will capture the different
partitions, absolutely. So if you are looking specifically for a buffer
overrun you can define certain output conditions which capture the buffer
overrun. And you can do the path exploration and the path partitioning based
on those output conditions, absolutely. But we did not do that specifically
in this work here. When I say output, these are really the output variables
of the program.
>>: So if I take this relevant slice condition, it’s an abstraction of the
path condition?
>> Abhik Roychoudhury: Absolutely, absolutely, so it throws away all of the
things in the path condition which are really not relevant for the production
of this output, yeah.
So there are technical details here which I am skipping. So we actually
prove that the path exploration based on these relevant slice conditions
will really explore all the symbolic outputs. That means there is a
completeness in the search, which was relatively difficult to prove, and the
reason for this is actually the property that is shown below. That is, if
you have a certain path condition and you negate, say, one of these
constraints, then you can be sure that the path condition of the input
satisfying this negated formula contains this formula as a prefix. But in
the relevant slice condition, because it is a slice, some of these branches
might actually disappear when you compute the relevant slice condition. And
as a result we have to do some reordering, but this is more of a technical
detail for the sake of proving the completeness of the search.
Once you do this, the validation is really as expected. These are all the
programs, and the relevant slice condition was much, much smaller than the
path condition, and the paths explored through the relevant slice conditions
are also much, much fewer, because of course many paths are now getting
grouped into one relevant slice condition. So this is one usage that we
looked --.
Yeah, sorry there is a question here.
>>: So I just assume that, I mean I don’t know what these programs are,
number one, and number two is, does this assume that you can actually run
them in this model, if you will, end to end?
>> Abhik Roychoudhury: Sorry, say that again.
>>: Does this assume that you can actually run them to the end in this model
and capture everything that you need to capture?
>> Abhik Roychoudhury: That’s right.
>>: So what are these programs?
>> Abhik Roychoudhury: So --.
>>: [indiscernible].
>> Abhik Roychoudhury: What are these programs? These are all from the, I
think the SIR, what we call the Software-artifact Infrastructure Repository.
>>: So, [indiscernible].
>>: [indiscernible].
>> Abhik Roychoudhury: So I think these are not particularly big programs.
I think most of them are less than or equal to a thousand lines of code,
yeah.
>>: So you can basically explore them [indiscernible]?
>> Abhik Roychoudhury: Yeah, yeah. But I guess the issue is that even if
the program has, say, a hundred lines of code, if you just migrate from one
group of paths to another group of paths you have to make sure that the
search does not diverge and you will actually compute all the symbolic
outputs, compute partitions for each of the possible symbolic outputs. That
is what we showed in the completeness of the search.
Sorry, there is a question here.
>>: How would you relate this to summaries? So if you create a summary for a
basic block could you uncover some of the related functionality?
>> Abhik Roychoudhury: Absolutely, so this is absolutely a summarization
procedure even though we used it for test generation. This is absolutely a
summarization procedure and it could benefit if the basic block summaries
were available.
>>: Yeah, but the [indiscernible] are different?
>> Abhik Roychoudhury: Yeah, so here you could think of it as me trying to
uncover the transformation function underlying the program somehow. And as
you might imagine there could be unboundedly many symbolic outputs,
unboundedly many symbolic outputs also, not just concrete outputs, but
symbolic outputs. As a result of which there could be unboundedly many path
partitions. So definitely I cannot show that the search in this way will
terminate. But if there are finitely many symbolic outputs then I will have
exactly that many path partitions.
>>: Oh, I see.
>> Abhik Roychoudhury: And that many test cases.
>>: So are you inlining all the procedures or are you doing them
compositionally here?
>> Abhik Roychoudhury: Oh no, no, I am not inlining anything. It’s
[indiscernible].
>>: So [indiscernible] or do you use the summaries in your --?
>> Abhik Roychoudhury: This is the total number of paths that were explored.
>>: So when you do [indiscernible] if there is a call do you use the summary
of that call or do you inline that call?
>> Abhik Roychoudhury: No, that would be a way of improving the search. In
this case these are really the traces. You see that I am working with traces
here. I start with a random input, I look at its trace, I do the slicing
over that, then I do symbolic execution over that slice. Actually the
slicing and the symbolic execution are going hand in hand. That produces a
formula, I do some operations on the formula, negate it, produce another
formula, and then that moves onto the next path partition.
>>: Right, but [indiscernible]?
>> Abhik Roychoudhury: Yes, it is. It’s not for a given procedure.
>>: [indiscernible]?
>> Abhik Roychoudhury: That’s right --.
>>: [indiscernible]?
>> Abhik Roychoudhury: You could say it that way, but if I actually did put
in the summaries of some of the lower level procedures then it would be
faster, but then of course it would be less accurate because then I am just
treating the whole summary of the underlying procedure.
>>: And what is the output for [indiscernible]? I mean, the programs have
implicit output?
>> Abhik Roychoudhury: Yeah, the programs have their output and I am just
using that output.
>>: So the heap is not considered part of the output?
>> Abhik Roychoudhury: This was one of the questions. Yeah, we could make
the heap one of the outputs; we also did that. Yeah, that’s a good
question. Okay, so maybe I will move on.
Right, so I will mention to you one way we have used symbolic execution for
debugging, where conventionally symbolic execution was not used. And this is
in this domain of regression debugging. And here you will see that the role
of symbolic execution is very different. So in regression debugging the
problem is very simple, you have a test input, it is given to an old stable
program and a new buggy program. And earlier it used to pass, and it now
fails and you want to know why. So here this statement is a little bit
informal, but the assumption here is that when you went from the old stable
to the new buggy program you probably tried to add some new functionality.
The intended purpose of the old functionality is as before.
Okay, so therefore this should not have failed, but now it fails and you want
to know why. So this would be the very first thing that one could try,
because in the debugging literature conventionally there has been a lot of
usage of trace comparison. That is, the test input is fed into the old
stable program and the new buggy program and you produce paths. And whether
you can directly compare the paths -- of course you cannot directly compare
the paths, because these are really two different programs and they might
have completely different algorithms, data structures and so on. So it
really wouldn’t be feasible to directly compare the paths.
But, the question is whether you can generate some sort of a new input,
sorry, you could generate some sort of a new input using this evolution
information. And then you can compare the behavior of this new input with
the buggy input to find out what went wrong. Okay, so this is what we are
trying to do in the debugging problem. And this is a schematic to give the
intuition of the debugging method: here you have the buggy input, say these
are the partitions, this is just a schematic, these are the partitions, and
assume that all inputs in a bin are similar. Now what is similar? This is
up to us to define what is similar.
And in this case say the red input is the buggy input because of this error
that has been introduced. It has somehow shifted from one partition to
another. What we are looking for is something like this blue input, which
was similar in behavior to the red input but is now different in behavior.
This blue input has not changed its behavior at all, okay. It is the red
input which has changed its behavior because of the error that was
introduced, and we are trying to look for a blue input like this one, which
is now different in behavior. So this is really the intuition. Sorry, you
seem to
have a question.
>>: It seems like the un-stated assumption here is that your properties of
interest are not data dependent. They are sort of symbolic formula dependent,
which is to say that it’s possible to have exactly the same formula that will
in practice yield very different outcomes, especially when interacting with
systems you are not capturing like databases. A traditional example, you
have a SQL injection, which is the result of [indiscernible]. The formula is
always the same it’s [indiscernible], [indiscernible] and so forth. Yet for
some devious inputs it produces a very different outcome and [indiscernible].
You see what I mean?
>> Abhik Roychoudhury: Yeah.
>>: But there is this issue of data dependency versus computation dependency
that you are not talking about here.
>> Abhik Roychoudhury: Yeah, that’s correct. So I did not state exactly what
these partitions mean. Somehow there is an un-stated assumption here that
the error is visible in the control flow; these partitions actually capture
the control flow in some way. We did some other work where we can go past
this assumption, but I am not presenting that technique today. But in this
particular technique that is the case.
Okay, so in this first work that we did on debugging using symbolic
techniques there is this test input, which is fed into the old stable program
and the new buggy program and we do a concrete and symbolic execution,
compute the path condition of the test input in both the programs and then
essentially we will compute solutions for this formula F and not F prime,
because these are exactly like the blue inputs, which are similar to the
buggy input in the old program and different from the buggy input in the new
program.
And this gives us an alternative input. And actually there is really no need
for us to do the trace comparison. We can just look at satisfiable
sub-formulas from this F and not F prime, and that would give us all of
these deviations. The symbolic execution is actually happening at the level
of the assembly code, so this will give us the bug report at the assembly
level and then we reverse translate it back to the source level, okay.
And the reason there is no need to do any of the trace comparison for the
solutions of this F and not F prime is that of course there can be many,
many solutions of these, and we don’t want to explore all of them. But note
that F prime is a path condition, which is a conjunction like this one. So
of course when you look at the negation of this, not F prime, you can
consider all possible deviations of it. So it could deviate at the first
branch, the second branch, the third branch and so on. So if the path
condition has M constraints you only really need to look at most M
equivalence classes. And for each of those deviations you really know
which line would go into the bug report. So it is that particular branch
that will go into the bug report.
So if the path condition has M constraints you really have to look at M of
those formulas, check which of them are satisfiable and anyone that is
satisfiable you just put the corresponding branch in the bug report. So this
is how the technique works.
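The deviation enumeration just described can be made concrete. The two path conditions below are hypothetical (over a single input x), and a brute-force check over a small domain stands in for the SMT solver:

```python
# Sketch of the F-and-not-F' step: enumerate at most M deviations of the new
# program's path condition F' and keep the satisfiable ones. The constraints
# are hypothetical; a brute-force check replaces the SMT solver.

F  = [lambda x: x > 0, lambda x: x < 100]  # old program's path condition
Fp = [lambda x: x > 0, lambda x: x < 10]   # new (buggy) program's path condition

def sat(constraints, domain=range(-50, 151)):
    """Toy satisfiability check: return a witness input, or None."""
    for x in domain:
        if all(c(x) for c in constraints):
            return x
    return None

bug_report = []
for i in range(len(Fp)):                   # i-th deviation of not-F':
    deviation = Fp[:i] + [lambda x, c=Fp[i]: not c(x)]
    if sat(F + deviation) is not None:     # F and (prefix of F') and not c_i
        bug_report.append(i)               # branch i goes into the bug report

print(bug_report)   # prints [1]: only deviating at the second branch is satisfiable
```

Each satisfiable deviation names exactly one branch of the new program, which is why at most M solver queries suffice instead of enumerating all solutions of F and not F'.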
>>: So it seems that any branch that could avoid the error would be reported,
right? Any branch that [indiscernible]. So you are not [indiscernible],
right you can just [indiscernible].
>> Abhik Roychoudhury: Yeah, so --.
>>: The new path might be somewhere, plus [indiscernible].
>> Abhik Roychoudhury: Yeah, that’s actually a good point. We had some other
strategies for prioritizing among these ones that are satisfiable: whether
the inputs that we generate by solving those deviations are also successful,
because those could also be failing inputs. And those ones we don’t want to
consider in the bug report.
So now if we have this picture we can also employ this coarser-grained
similarity that I mentioned to you, that is, to look at these path
partitions instead of paths. So we can just use these relevant slice
conditions instead of the path conditions and the whole technique carries
over for free.
And in which case as you might imagine, the results become much, much better.
In fact, of course I am showing you these are much bigger programs than the
ones that we showed for the test generation. I think the NanoXML, Jlex,
Jtopas, I don’t have the exact sizes, but these are much, much bigger
programs.
And you can see the size, both in terms of the size of the path conditions
and relevant slice conditions; in this case we have the time for doing the
debugging, which is substantially cut down, and also the lines of code,
because each line of code is really coming from each of these deviations
that we record. So each line of code is a branch condition that is given by
each of these deviations. So many of the deviations just get filtered out
because of the slicing.
Okay, and of course many of the other results that we had in the original
work on Darwin, which was by the way built on BitBlaze, the BitBlaze tool,
for which we are thankful because we had the tool available to us much
earlier, were also replicated. And since this technique is completely based
on semantic analysis, one can use it on different implementations of the
same protocol as well, so for example different web servers and so on. We
also used a different variation of this method to find errors in embedded
Linux by validating it against Linux, the BusyBox. These are exactly the
errors that are reported in the [indiscernible] paper. They did the test
generation and we did the root causing of all of these errors.
>>: [indiscernible].
>> Abhik Roychoudhury: Yes?
>>: [indiscernible]?
>> Abhik Roychoudhury: Yes.
>>: So here did you intersect the relevant path conditions with
[indiscernible] to only focus on the --?
>> Abhik Roychoudhury: No, no, this is just the relevant slice condition as
produced.
>>: [indiscernible]?
>> Abhik Roychoudhury: Yeah, yeah.
>>: The parameters that you report, you also consider whether they are
present in the diff?
>> Abhik Roychoudhury: No, no, no, actually we don’t do that. And that is
really one of the observations behind building these techniques, because the
error might not be in the diff. The error might very well be in the original
program itself, but it might not have been manifested. And it is now getting
manifested. So we are never doing the intersection with the diff.
Right, so finally I will show you one usage of symbolic execution in
program repair, and this is something that we have actually worked on
recently. So here when I say program repair, the correctness specification
is given by a test suite. So when we repair the program the purpose is to
make sure that it passes all the test cases in this test suite. And the
repair strategy, I will go through the repair strategy in a minute. And the
usage of symbolic execution here is to group together all of these
executions through which a failing execution could be rescued. So in some
sense this is a newer notion of similarity. This result was just
[indiscernible] this year.
So let me show you the problem with an example. This is a very simple
example, just a simple procedure without any loops, and suppose the error is
in line number four, where we have bias equal to down sep. We would like to
have this corrected to bias equal to up sep plus 100. And suppose these are
all the test cases and 2 of the test cases in fact failed. So the purpose
is to generate a new expression for bias, [cough] excuse me, so that these
test cases actually pass.
So the first step of this method is the fault localization. Here we are
using statistical fault localization method, which actually takes in the
program and the test cases and tries to find a line which somehow appears
more in the failing test cases and less in the passing test cases. There are
many, many metrics for that. We are using one particular metric that is used
by this Tarantula tool. And so what we need to know for understanding the
method is that, supposing this method has pointed you to one particular
line, in this case line number 4, how you can correct that expression, bias
equal to down sep, to bias equal to up sep plus 100, automatically.
Okay, so this is what the repair method is doing. And of course you can try
doing this one line at a time. If it doesn't work you can go to the next
line that is given by the statistical fault localization method.
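The Tarantula suspiciousness metric mentioned here can be sketched in a few lines. The coverage sets below are invented for illustration, not taken from the talk's subject programs.

```python
def tarantula(passed_cov, failed_cov, line):
    """Tarantula suspiciousness of a line: the fraction of failing runs
    covering it, normalised against the fraction of passing runs covering it."""
    p = sum(line in cov for cov in passed_cov) / len(passed_cov)
    f = sum(line in cov for cov in failed_cov) / len(failed_cov)
    return 0.0 if p + f == 0 else f / (p + f)

# toy coverage: each entry is the set of lines executed by one test
passing = [{1, 2, 3}, {1, 2, 5}, {1, 3, 5}]
failing = [{1, 2, 4}, {1, 4, 5}]

# ranked bug report: most suspicious line first
ranked = sorted({1, 2, 3, 4, 5},
                key=lambda l: tarantula(passing, failing, l),
                reverse=True)
print(ranked[0])  # line 4 is covered only by failing runs, so it ranks first
```

A line covered only by failing tests gets suspiciousness 1.0 and heads the ranked report, which is exactly the starting point the repair method described above assumes.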
>>: Can you [indiscernible] for each of these examples or is [indiscernible]?
>> Abhik Roychoudhury: There is no specification. So my only goal is to make
sure that the repaired program passes all the test cases. The specification
is given --.
>>: [indiscernible]?
>> Abhik Roychoudhury: That’s right, so for each test we have the input and
the expected output, yeah.
So the first step is to go to this line and find the suspect, and the second
step is to find out through symbolic execution what it should have been.
Say your aim is to correct bias equal to down sep: you just replace the
right-hand side with X and go on doing symbolic execution from there. So
you are doing concrete execution up to that particular line which you want
to correct, and then subsequently you do symbolic execution from there.
Okay, so you can see here that I replaced the right-hand side with X, and
the path condition is true at this point in time. You can then go forward,
and in this case you develop one particular path condition, X greater than
110, and in the other one X less than or equal to 110, and so on. It turns
out that this one leads to a passing execution and this one leads to a
failing execution.
Okay, so just to make this a bit more concrete: at this point in time I put
this as a function of all the variables which are live. So I am trying to
synthesize a function of this form, which takes in all these variables that
are live. And I know the values of each of these variables, because I am
trying to rescue this particular test case. So if these are the values that
were actually fed by this particular test case, I want to synthesize a
function which satisfies this formula that was obtained from the path
condition that we got.
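The per-test constraint on the unknown X can be sketched as follows. The branch `bias > 110` and the live variables mirror the slide's example, but the concrete test values are made up for illustration.

```python
# Sketch: symbolic execution of the rest of the program, with the buggy
# right-hand side replaced by the unknown X, yields one path condition per
# test.  The expected output of the test picks which path X must take.

def constraint_for_test(expected_output):
    """Toy program: output is 1 on the X > 110 path, 0 on the X <= 110 path.
    Return the path condition on X that makes the test pass."""
    if expected_output == 1:
        return lambda x: x > 110
    return lambda x: x <= 110

# (live-variable values at the buggy line, expected output) per test;
# the values are hypothetical, not the tool's actual test suite
tests = [({"up_sep": 11, "down_sep": 50}, 1),
         ({"up_sep": 0, "down_sep": 50}, 0)]
constraints = [(env, constraint_for_test(out)) for env, out in tests]

# the intended fix bias = up_sep + 100 satisfies every accumulated constraint
fix = lambda env: env["up_sep"] + 100
print(all(c(fix(env)) for env, c in constraints))  # True
```

The buggy expression bias = down sep would violate the first constraint (50 is not greater than 110), which is why that test fails before repair.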
Okay, [coughing] excuse me. So corresponding to each test case we will
accumulate one of these constraints, and I am trying to find a particular
function F which satisfies all of these constraints, and we can fix the set
of operators that appear in F. This will be done by program synthesis. So
there are a number of methods by which you can generate the fixes: you can
either search over the space of expressions, or, as in this case, use
program synthesis, and finally the generated fixes are of this form.
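Once each test has contributed a constraint on F, the simplest fix generator just enumerates a small space of candidate expressions, trying simpler shapes before richer ones. This is a hedged sketch; the candidate grammar, the variable name up_sep and the 110 threshold are illustrative, not the tool's actual search space.

```python
def synthesize(tests, check):
    """tests: list of (live-variable env, expected output).
    check(value, expected) -> bool encodes the per-test path constraint.
    Returns the first candidate expression satisfying every test, trying
    constant functions before arithmetic ones (a tiny 'layered' search)."""
    candidates = (
        [("const %d" % c, lambda env, c=c: c) for c in range(0, 201, 10)]
        + [("up_sep + %d" % c, lambda env, c=c: env["up_sep"] + c)
           for c in range(0, 201, 10)]
    )
    for name, f in candidates:
        if all(check(f(env), exp) for env, exp in tests):
            return name
    return None

# toy constraint from the slide: the program outputs 1 iff the value > 110
check = lambda v, exp: (v > 110) == (exp == 1)
tests = [({"up_sep": 11}, 1), ({"up_sep": 0}, 0)]
print(synthesize(tests, check))  # no single constant can pass both tests,
                                 # so the search falls through to "up_sep + 100"
```

No constant satisfies both constraints (one test needs a value above 110, the other at most 110), so the search moves to the next layer and finds up_sep + 100, the repair from the example.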
So let me go through this in a little more detail. The first step is
producing the ranked bug report, where we are using just an off-the-shelf
statistical fault localization tool. For the symbolic execution we are
trying to group all the paths which can avoid the error. So if you have
this line which you want to fix, you just replace it with a certain variable
X and then you do symbolic execution with X as the only unknown. So this is
actually fairly scalable, because the symbolic execution has only this
particular variable as symbolic.
Okay, and X is nothing but a function over all the live variables. So this
is really the function that you are trying to synthesize. So at some point
in time you can just add the constraint that X equals the function over
these concrete values, and this will implicitly generate constraints over
this function. And this is the function that you want to synthesize.
So this is what we call the repair constraint, and you get a repair
constraint of this form. Of course we don't feed this off to the SMT solver
just like that. We will select some primitive components that can be used
in this function, and this is done through some sort of layered search. So
initially we can see whether there can just be constants, a constant
function; if not, maybe arithmetic operators; if not, logical operators --.
Sorry, there is a question.
>>: So there is a set of benchmarks that are connected [indiscernible].
>> Abhik Roychoudhury: Okay.
>>: So the way that you would describe it if you were [indiscernible].
>> Abhik Roychoudhury: So I am actually, we are actually using an extension
of the program synthesis method based on input/output pairs which have been
developed here.
>>: Okay, yeah, but are you aware that there is a website that
[indiscernible]?
>> Abhik Roychoudhury: I am actually not aware of that one.
So, but the technique that we are using, this synthesis method, is an
extension of the input/output pair-based synthesis method, which has been
developed --.
>>: Yes, yes, yeah.
>> Abhik Roychoudhury: So instead of directly feeding this formula off to
the SMT solver, we are looking for a program that uses a given set of
primitive components. And the components we search over are found through
some sort of layered search. And of course the questions are where to place
each component and what the parameters are for each component. So the way
this is done is essentially that we define location variables for each
component and then we get constraints on the location variables. In the
simple example that I have here, the only component is just a plus, and
these are the location variables: the location variable of the input to the
program, the location variable of the output of the program, the location
variable for the output of the plus, for the first input of the plus, the
second input of the plus and so on.
So just an assignment of the location variables gives me the program, and
that gives me the expression to synthesize, okay. So it will generate a
constraint system on these location variables, which will be solved to get
an assignment to these location variables; that gives me the program to
generate, and that also gives me the expression that is to be synthesized.
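The location-variable encoding can be sketched with the single plus component from the example. The real tool solves these constraints with an SMT solver (along with well-formedness constraints such as acyclicity); the brute-force search and the I/O examples below are stand-ins for illustration.

```python
from itertools import product

def run_program(loc_a, loc_b, inputs):
    """Slots 0..len(inputs)-1 hold the program inputs; the '+' component
    occupies the next slot and reads its operands from the slots named by
    its two location variables loc_a and loc_b."""
    slots = list(inputs)
    slots.append(slots[loc_a] + slots[loc_b])
    return slots[-1]  # the program output is wired to the last slot

def synthesize(io_pairs, n_inputs):
    """Search over assignments of the '+' component's location variables
    for one consistent with every input/output example."""
    for loc_a, loc_b in product(range(n_inputs), repeat=2):
        if all(run_program(loc_a, loc_b, ins) == out
               for ins, out in io_pairs):
            return loc_a, loc_b
    return None

# hypothetical examples for f(x, c) where the second input is the constant
# 100: f(11, 100) = 111 and f(0, 100) = 100, i.e. the fix up_sep + 100
print(synthesize([((11, 100), 111), ((0, 100), 100)], n_inputs=2))
```

The returned assignment wires the plus to the two input slots, which read back as the expression "input0 + input1", exactly the sense in which an assignment of location variables gives the synthesized program.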
So actually these are the subjects that we used. The first set of subjects
are from SIR, the Software-artifact Infrastructure Repository that we use in
the software engineering community, and also some others from the GNU
CoreUtils; these are utilities like MKDIR, MKFILE, COPY and so on. And
here, in repair, this is really the first work that uses any kind of
semantic analysis for repair. This I can say with confidence.
In the past there has been a lot of work that you have probably heard of,
the GenProg tool, which does repair by genetic programming. So we also
tried these benchmarks, or subject programs, against the GenProg tool, and
both of these, the GNU CoreUtils subjects, were repaired by GenProg as well
as by the SemFix tool. And surprisingly, even though our technique uses
symbolic execution, it was faster, because the symbolic execution is used in
a fairly scalable fashion with very few symbolic variables.
>>: Does it matter?
>> Abhik Roychoudhury: Does it matter? Yeah, let us see whether it matters
or not, okay. So I think I will convince you that it does matter with this
slide. You can see here that the correctness criterion is given by the
number of test cases. So of course, if I add more and more test cases my
correctness criterion becomes more stringent, right? So whatever was a
valid repair earlier may not be a valid repair anymore.
So now you can see the graph for the repairs produced by the GenProg tool
and the repairs that we are producing, which is the blue line. The repairs
that we are producing are much more stable, and the reason is that ours is
actually based on the analysis of the paths. So later on, if you add test
cases which go through paths that were already tried out earlier, then of
course the repair that was valid earlier will continue to be valid.
Okay, so it is not surprising that ours is much more stable. Whereas here
you can see the one that is by GenProg; this is produced by genetic
programming, so effectively it is trying to get the repaired expression from
somewhere else in the program by moving it to the bug [indiscernible] site.
And the repairs that are produced are not valid as you add more and more
test cases. And these test cases were not cherry-picked by us; each of
these programs comes with a predefined set of test suites, which were
produced by some sort of coverage criterion. So first we take 10 of those
randomly, then add 10 more randomly, 10 more randomly and so on, just to do
the experiment in a proper way.
Yeah?
>>: So are you prone to overfitting, as for most of the GenProg
[indiscernible]?
>> Abhik Roychoudhury: Um, overfitting? I am not sure.
>>: Otherwise, the size of the model. I mean, you can basically generate a
program that would do perfectly on 50 or so, 45 or so, test cases, but that
program will be really hideous.
>> Abhik Roychoudhury: Okay.
>>: And it would need special-casing for every one of those test cases.
>> Abhik Roychoudhury: Yeah, but of course you could always repair the
program like that if you knew the test cases: if this, then this. And of
course we are not doing that; there would be no way to do that. The
technique doesn't know what test cases are going to be fed. It cannot
possibly know that.
>>: The constraints you handle, when you are trying to synthesize the F
function, you are not doing it on a single failing case, right? You are
accumulating all the failing data, right?
>> Abhik Roychoudhury: Absolutely, yeah. So the repair constraint actually
generates constraints on the F function for each of the tests, and of course
doing it for each of these tests might be a bit cumbersome, so there are
some optimizations for doing that.
>>: I mean, you are saying that if you increase the set of tests this F
becomes unsatisfiable and you cannot --. Like, if you threw in hundreds of
tests, the synthesizer might not be able to give you what you need it to?
>> Abhik Roychoudhury: Yeah, so as I said, all of these were done on a time
budget. So it does not mean that the repair that was produced is not a
valid repair, and it does not mean that if more time was given to the
GenProg tool it would not be able to produce one; we don't know. Of course,
at some point in time you have to put a time budget here.
Right, and in fact this is another slide through which you can see whether
it matters. These are all the different classes of errors, you see: edits
in constants, arithmetic errors, comparison errors, logic errors,
code-missing errors and so on. And this is the comparison between the
GenProg tool and this semantic-analysis-based fixing. You can see that the
GenProg tool doesn't seem to be doing so well in some of the categories,
whereas in some other categories it is sort of okay.
Okay, yeah.
>>: So I thought genetic programming was used for analog circuit repair and
synthesis. [indiscernible].
>> Abhik Roychoudhury: Uh-huh.
>>: So do you use any [indiscernible]?
>> Abhik Roychoudhury: I have not explored that.
>>: I mean codes are [indiscernible].
>> Abhik Roychoudhury: Okay, yeah I really have not explored that, yeah,
yeah, but certainly it would be feasible I think. But, I have not explored
that at all. That’s an interesting connection. I have not even thought
about that, yeah.
>>: Because that was my killer app in my [indiscernible]. It's kind of
50-processor supercomputing, [indiscernible]. But GenProg is taking this
for software?
>> Abhik Roychoudhury: Yeah, yeah, it is still for software repair, for
program repair.
>>: But it tries to borrow repair chunks from the program itself. It's just
not necessarily in the best way. But --.
>> Abhik Roychoudhury: Yeah, essentially. I mean, I am oversimplifying a
little bit, but essentially it is trying to copy an expression from
somewhere else in the program.
>>: So they have just never found it then?
>> Abhik Roychoudhury: Yeah, it does these mutation and crossover
operations. Of course with mutation you can change a little bit, but it
becomes very difficult to synthesize a real fix with those kinds of
operations.
>>: Does that work on MAXSAT [indiscernible]? Did you look into that?
>> Abhik Roychoudhury: Yeah, we did look into that. So there is the
MAXSAT-based debugging.
>>: Right, but [indiscernible].
>> Abhik Roychoudhury: Well, I guess they did not look into repair that
much. Yeah, it's more the debugging that is essentially --. In fact, in a
later manifestation of this work we are trying to use certain variations of
MAXSAT for repair. In this case, as you saw, we are essentially just fixing
one line, but one could probably try to fix several lines together at the
same time, lines that are correlated, and whether MAXSAT or such techniques
could be used there, that's a possibility.
Okay, and just to very quickly go through: even though it is a one-line fix
that we are generating, because the lines can be fairly complicated it is
also possible to synthesize missing code, or you can have [indiscernible]
within the fix that you are generating, and thereby have operations like
this: if the capability is less than the altered value, then 2 to 3 ports
valid and so on. So this is a fairly complicated fix that has been
generated.
So just stepping back: in terms of repair, I don't think it will be
completely automated. So there is the issue of building programming
environments and putting the human in the loop. And I think in repair,
program synthesis is likely to play a useful role. Is debugging required?
That is a little bit questionable to me. As the first step of this repair
we put the statistical fault localization, which goes to the line, and then
I try to repair that line. Maybe that is not needed, so one could think of
repair methods which do testing and repair hand in hand and avoid the
statistical fault localization altogether. That means that through symbolic
execution you try to generate test cases, and as a test case fails you try
to repair the failures that you have found.
So this is probably an answer to the question that you mentioned just now.
One could find the location to fix via this symbolic reasoning as well as
MAXSAT, so I think there is a possibility. We have also been trying this
out, but it is not very scalable, so there are difficulties with this. And
of course there is the issue of generating suggestions instead of repairs.
We have tried to look into the usage of this for generating repairs of
access control errors and so on, which is a different story.
So just coming back to the theme of this talk, which is the perspective
played by symbolic execution: in test generation, model checking and
verification, symbolic execution is being used essentially to guide the
search; that is, through symbolic execution one path plays the role of many,
many concrete executions. Whereas in debugging, repair, summarization and
program semantics extraction, symbolic execution is somehow playing the role
of uncovering what went wrong as well as discovering the intended behavior.
All of these techniques are built in such a way that they are exploiting
similarities in large parts of the search space: in the case of debugging,
for example, among the deviations; in the case of repair, trying to find out
all the test cases which need to be rescued together, and so on. So I think
there is a fair amount of space to cover in these two different roles of
symbolic execution: one is the discovering of the intended program behavior
and the other is the guiding of the search.
And just very briefly, let me mention one possible take-away from today's
discussion: symbolic execution tries to infer the intended behavior. We
also did one other piece of work where we directly tried to capture the
intended behavior through change contract specifications. So there has been
a fair amount of work on program contracts, where you try to give
input/output specifications of programs. Here we tried to look into change
contracts, where you describe the intended behavior of the changes. So for
example, the contract at the bottom is saying that when these conditions are
satisfied, and there are no input conditions, whenever the previous output
satisfies this condition the new output should satisfy this condition.
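The output/output flavor of a change contract can be illustrated with a small sketch. Both program versions and the relation below are made up for illustration; they are not the examples from the talk.

```python
def old_version(n):
    return n * 2

def new_version(n):
    # the intended change: same behavior, but clamp the output at 100
    return min(n * 2, 100)

def change_contract(old_out, new_out):
    """Change contract relating the two versions' outputs, with no input
    condition: outputs at or below 100 must be preserved unchanged, and
    outputs above 100 must be clamped to exactly 100."""
    return new_out == old_out if old_out <= 100 else new_out == 100

# check the contract over a range of inputs
print(all(change_contract(old_version(n), new_version(n))
          for n in range(200)))  # True
```

Note that the contract never mentions the input directly; it only relates the previous output to the new output, which is the extra flexibility mentioned in the talk.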
So really, you can give output/output specifications between the outputs of
the two program versions, rather than input/output specifications, which is
more flexible I guess. And let me just conclude by giving you these
references, as well as thanking all my co-authors, and I will be happy to
take any more questions.
>>: So without input summaries --?
>> Abhik Roychoudhury: Right, right, right, yeah.
>>: So what I wanted to say, I guess [indiscernible], in program
verification there is a much nicer encoding into logic, and that sort of
reduces the [indiscernible]. So you might be able to do a lot of these
things if you just [indiscernible] encoding into logic.
>> Abhik Roychoudhury: That is true. Actually, a lot of the techniques that
we did use dynamic symbolic execution, which is symbolic execution along
specific paths. At least one of the initial observations to me was that,
specifically for reasoning about regressions, where you have a past version
of the program, or for cases like embedded software, where you have a
reference implementation of the program, you can use symbolic execution over
this reference implementation or the past version to get some hint of the
intended behavior. And for software engineering activities like debugging,
if you have a hint of the intended behavior it is a huge plus. And this is
really the problem when you don't have formal specifications in the program.
If I had lots of formal specifications for each of my methods, then things
like debugging really wouldn't be much to worry about.
So symbolic execution definitely was helpful in that regard: in the absence
of these formal specifications, it somehow gets a hint of the formal
specification. So that is one of the credits that I am giving to symbolic
execution, apart from the known usages of symbolic execution like guiding
search and so on; those could probably be done by other methods as well, as
you already mentioned. But for finding the intended behavior, I probably
couldn't do that through verification methods, because verification methods
require a certain property to verify.
Yeah?
>>: What if it’s about specifying the program as a property to previous
[indiscernible].
>> Abhik Roychoudhury: That could be, yeah.
>>: All I am saying is that you can encode any of these things into logic
directly.
>> Abhik Roychoudhury: Sure, sure, that could be, that could be.
>> Nachi Nagappan: I think, in the interest of time, all the remaining
questions are to be continued.
>> Abhik Roychoudhury: Okay, thank you.
[Clapping]