>> Nachi Nagappan: Okay, thank you for coming and for people who are watching online. We are very happy to have with us Abhik Roychoudhury. Abhik is actually an associate professor --. >>: [indiscernible]. >>: Sorry? >>: [indiscernible]. >> Nachi Nagappan: Abhik is an associate professor at the National University of Singapore and he earned his PhD from Stony Brook. He has actually been there for almost 10 years now. He has very diverse interests ranging from embedded systems, to software testing and analysis, symbolic execution, etc. So he is actually going to talk about some of his work today. >> Abhik Roychoudhury: Okay, so should I begin? >> Nachi Nagappan: Yeah. >> Abhik Roychoudhury: Okay, okay. So this talk will be on the role of symbolic execution in software testing, debugging and repair. Essentially, even though I will talk about some specific techniques, I will try to go into the role that symbolic execution is playing in each of these techniques. So of course first of all I must mention that the growth of symbolic execution that we have seen in the software engineering community, and its use in different software processes, is driven a lot by the maturity of SMT solver technology. So without the growth in SMT solver technology many of these techniques would not be possible, and the purpose of this talk is really to see whether symbolic execution can be used in newer ways in software engineering. So of course there is a process-centric view, which is where symbolic execution can be used beyond testing and verification, where it has conventionally been used. So for example I will try to talk about how it has been used in debugging and repair, where conventionally symbolic techniques were not used. But also I think there is a conceptual view, and that is that symbolic execution has quite a bit been used for guiding search in a large search space, and whether one can go beyond and try to use symbolic techniques for inferring intended behavior.
And really what is the secret recipe that makes symbolic execution drive both of these roles, that is guiding the search in a huge search space, as well as inferring the intended behavior of programs? Because of course, as you might imagine, if you can infer the intended behavior of a program then activities like debugging become much more tractable. So really speaking, the perspective that we are taking on symbolic execution in this talk is that conventionally it has been used quite a bit. When I say conventionally, it is conventional now, but sometime back it was not so conventional, in test generation, model checking and so on, where it is exploiting the similarities in the search space. And also at the same time symbolic execution can be used as some sort of a psychiatrist trying to uncover what went wrong. And this is really the role that is being played by symbolic execution when we use it in debugging, summarization, semantics extraction, program repair and so on. So when I talk about debugging in this talk I should very clearly clarify the difference between bug hunting and debugging. Of course in many of the papers we use model checking for finding bugs. This is also mentioned as debugging, but that is really bug hunting, where you have a certain property in mind and a program and you produce a counterexample. And the counterexample essentially says that under these input conditions this property is violated. Whereas what we are mentioning here as debugging is that there is a certain input, there is a certain output and we are actually trying to find out what went wrong in this particular test case. So this is really the debugging problem that we are looking at. So here a certain property is not given to us. What is the property that was violated? This is what we are trying to find, but on the other hand we have a failing trace. So this is the debugging problem that we are looking at here.
>>: Can you explain why the property or the part of the code that --. >> Abhik Roychoudhury: We want the part of the code which is the likely suspect for the violation of the expected output. >>: So then a property is much more general than the piece of the code. >> Abhik Roychoudhury: That's right, that's right. So we really won't be inferring the property in debugging, that's right. So I think these are the things that I believe I can skip. So there is, of course, symbolic execution, where you just replace the input with a certain [indiscernible] and then you have the concrete output as well as the symbolic output. So the symbolic outputs are given in terms of [indiscernible], and of course in symbolic execution we have quite a bit of usage of path conditions, where in this case, given the input equal to 5, corresponding to each line number that is executed you can keep track of an assignment store and a certain path condition. And the path condition really captures the set of all inputs which go through this path. Okay, so at any point of time you can keep track of the branches that you are going through, the branch conditions, and for any branch conditions that you are conjoining you can look up the symbolic store. And this is how you can compute the path condition automatically. So in our usage of symbolic execution we will be having a fair amount of usage of path conditions, and the usage of these path conditions has been fairly extensively popularized in the directed automated random testing method, where you continue the search for failing inputs, in particular inputs that don't go through the same path. So for example the aim here is really to get path coverage. So for each path you capture the path condition and you negate a little bit of it. So for example if you start with this bold path, this is the path condition, and you negate a little bit of it, you get another path condition and you try to get a solution for this.
You negate a little bit more, you get another path condition, and you get a solution for this and so on. So the aim is really to make sure that you don't have inputs which go through the same path again, and again and again. So this is fairly well known at this stage, a fairly well known usage of symbolic execution in testing. So when you use dynamic symbolic execution for testing in this way, there is some sort of implicit assumption that the inputs that execute a certain path are similar. So if you test one of them there is no need to test the others. So essentially we are using similarity to skip over parts of a large search space, and dynamic symbolic execution is essentially a tool which helps us achieve this goal. So the first thing that I will mention today, which is a usage of symbolic execution again in test generation, is whether one can look at coarser-grained notions of similarity, which are used for test generation. So this is the slide that shows the different roles played by symbolic execution in testing, debugging and repair. In testing really the goal is always that we don't need to test similar inputs, and there is the possibility of looking for similarity beyond paths, that is, possibly combining paths into one single partition of paths. In debugging, of course, given a failing input we are trying to find similar inputs which pass. So we will see that one can do logical comparisons to detect the deviations between these failing inputs and the similar inputs that pass. And finally in repair there is the question of finding similar inputs which show the same error, so that you can rescue them all, and this forms the search space of all possible repairs. And here also, we will see, symbolic execution is used to capture the intended behavior of the program.
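The negate-and-solve loop of directed automated random testing described above can be sketched as follows. This is a toy illustration, not any particular tool: the instrumented program, its three branch predicates, and the brute-force `solve` routine (standing in for an SMT solver) are all invented for the example.

```python
# Toy instrumented program: three branch predicates over one integer
# input.  A real tool records these during concrete execution.
BRANCHES = [
    lambda x: x > 10,
    lambda x: x % 2 == 0,
    lambda x: x < 100,
]

def path_of(x):
    """A path is the tuple of branch outcomes taken on input x."""
    return tuple(b(x) for b in BRANCHES)

def solve(constraints, domain=range(-50, 200)):
    """Brute-force stand-in for the SMT solver: find an input meeting
    the given (branch index, required outcome) constraints."""
    for x in domain:
        if all(BRANCHES[i](x) == want for i, want in constraints):
            return x
    return None

def explore(seed):
    """DART-style search: negate one branch of each covered path,
    keeping the prefix before it, and solve for an input that drives
    execution down a new path."""
    worklist, covered, tests = [seed], set(), []
    while worklist:
        x = worklist.pop()
        p = path_of(x)
        if p in covered:
            continue
        covered.add(p)
        tests.append(x)
        for i in range(len(p)):
            prefix = [(j, p[j]) for j in range(i)] + [(i, not p[i])]
            y = solve(prefix)
            if y is not None and path_of(y) not in covered:
                worklist.append(y)
    return tests

tests = explore(0)
# Two of the 2**3 = 8 branch combinations are infeasible here
# (x <= 10 forces x < 100), so the search ends with one test per
# feasible path: 6 tests, no two of them taking the same path.
```

Because each solved input fixes the prefix of branch outcomes and flips exactly one branch, the search never revisits a covered path, which is exactly the "don't test inputs which go through the same path" goal described above.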
Okay, so this is how I will try to set the stage for the usage of symbolic execution in testing, debugging, as well as repair, and at all points of time try to go back and see what role symbolic execution is playing in each of these techniques. So let me start with an example technique from testing. So this program actually has 8 paths, not surprisingly with the 3 branches there are 8 paths, but we can see that they can be partitioned into just these 3 partitions based on the symbolic outputs. So if you take the input/output relationship that captures the output as a symbolic expression in terms of the inputs, then these are the conditions under which you have different symbolic outputs: this is the condition under which the output is 2, here the output is X, okay, X, Y and Z are the inputs, and here the output is Y. So this is another notion of similarity which one could use, which probably goes beyond the directed automated random testing. And it would be nice if we could compute these path partitions automatically. Of course I am just defining these path partitions here. I haven't said how to compute them. So if you could compute these path partitions automatically this is the kind of summary that we would generate for the program: that under these conditions the output is 2, under these conditions the output is X, under these conditions the output is Y and so on. And it turns out that it is possible to compute these path partitions automatically if you do a bit of dependency analysis over and above symbolic execution. So instead of computing the path conditions on the paths we compute the path conditions over these so-called relevant slices, which are essentially an extension of backward dynamic slicing. So a dynamic slice is a closure of dynamic control and data dependencies.
The data dependencies are just the use-definition chains and the control dependencies point to the closest enclosing branch. So over and above these control and data dependencies, which capture the effect of all the statements which affect the output by getting executed, we also have the so-called potential dependencies, which capture the effect of statements which affect the output by not getting executed. So for example here I have this example, this is actually a potential dependence because of this branch: because this branch got evaluated in a certain direction this statement did not get executed and this affected the output. So if you also capture the effects of these statements which did not get executed and thereby affected the output, and combine it with dynamic slicing, and you compute the path conditions over these extensions of dynamic slices, you can really try to uncover the transformation function underlying the program. Okay, so this is clearly much beyond the directed automated random testing because you compute these partitions. So I will not go into the details of these properties, but these properties are important for the search to be complete. So here we compute these relevant slice conditions, that is the path conditions of these relevant slices over the different paths. And the property that we have is that, given a path, if an input t' satisfies the relevant slice condition of t, then its relevant slice condition is the same. So really these are the inputs which compute the same symbolic output. So now the search will proceed exactly as in directed automated random testing. You start with a random input, you go through its path, you compute the relevant slice over that path, you compute the relevant slice condition over that relevant slice, and this captures a whole group of inputs which have the same symbolic output.
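To make this concrete, here is a toy sketch, with an invented three-branch program, of grouping paths by symbolic output and projecting each path condition onto its relevant branches. In the real technique the relevant branches come from dynamic data, control, and potential dependencies computed over the trace; here a hypothetical per-branch write-set stands in for that analysis.

```python
from itertools import product

# Hypothetical program over inputs x, y, z:
#   out = 2
#   if x > 0:  out = x       # branch 1
#   if y > x:  out = y       # branch 2
#   if z > 0:  log()         # branch 3: never touches out
BRANCHES = ["x > 0", "y > x", "z > 0"]
# Stand-in for the dependency analysis: the variables that each
# branch's guarded statements may write.
MAY_WRITE = {"x > 0": {"out"}, "y > x": {"out"}, "z > 0": {"log"}}

def sym_exec(outcomes):
    """Symbolically execute one path; return (symbolic out, conjuncts)."""
    out, conjuncts = "2", []
    for br, taken in zip(BRANCHES, outcomes):
        conjuncts.append(br if taken else f"!({br})")
        if taken and br == "x > 0":
            out = "x"
        if taken and br == "y > x":
            out = "y"
    return out, conjuncts

def relevant_slice_condition(conjuncts, target="out"):
    """Keep only conjuncts whose branch can affect the target, either
    by executing or by NOT executing its guarded statements."""
    strip = lambda c: c[2:-1] if c.startswith("!(") else c
    return [c for c in conjuncts if target in MAY_WRITE[strip(c)]]

# All 2**3 = 8 paths collapse into 3 partitions keyed by symbolic output.
partitions = {}
for outcomes in product([True, False], repeat=3):
    out, conjuncts = sym_exec(outcomes)
    partitions.setdefault(out, []).append(relevant_slice_condition(conjuncts))

for out, rscs in sorted(partitions.items()):
    print(out, "<-", len(rscs), "paths, e.g. rsc =", rscs[0])
```

The `z > 0` conjunct never survives into a relevant slice condition, so negating pieces of the two remaining conjuncts explores output partitions rather than all eight paths. (The write-set projection is a coarse static stand-in; the real analysis is dynamic and can prune further, e.g. a branch whose assignment is later overwritten.)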
And now you have to modify this relevant slice condition a little bit so that you can move into another group of inputs which have a different symbolic output. >>: So if you took the straight-line program and translated it to a bounded statement, then the relevant slice condition is [indiscernible]? I mean, it's an all set with respect to the projection criterion, which is the output. >> Abhik Roychoudhury: Yeah, so --. >>: [indiscernible]. >> Abhik Roychoudhury: The relevant slice condition will capture all the statements which actually affect the output. So I will essentially --. >>: [indiscernible]. >> Abhik Roychoudhury: Yeah, so this is all because of the inherent parallelism in the program. So there might be a number of statements and their conditions which appear in the path condition, but do not affect this particular output. >>: Yeah, but the question is really suppose you translate your program directly to [indiscernible], what does the relevant slice condition need? >> Abhik Roychoudhury: Well, it's --. >>: [indiscernible]. >> Abhik Roychoudhury: The relevant slice condition is a relatively dynamic concept. It's not a static concept where, given the conversion of the program into logic, I would be able to get a notion of the relevant slice condition. So given a path, if I start with a random input, given a path for that input, it is a projection of that path and the logical representation of that projection. >>: Okay. >>: So is it a [indiscernible] of the input symbols? >> Abhik Roychoudhury: Absolutely, absolutely. So just like the path condition it is a formula in terms of the input symbols. So you think of the path condition, and it's like many of the conditions in the path condition are filtered away. And this is because of the inherent parallelism in the program. They were probably not relevant for the computation of the output. >>: But you don't know, based on a single path, you don't know which conditions are important and which are not, correct?
>> Abhik Roychoudhury: Based on the single path I can know by doing this dependency calculation over the path. >>: So that [indiscernible], the dependency analysis? >> Abhik Roychoudhury: The dependency analysis will look at the trace, yes. >>: But it seems like much of the time it is not really even the output that matters for symbolic execution kinds of problems. It's whether, oh I don't know, you get some sort of a, you know, [indiscernible] or a [indiscernible] that might affect the output, or if you say you can save a lot of memory or if there is some sort of a secondary condition that is captured by [indiscernible]. >> Abhik Roychoudhury: That is true, that is true. So here I am doing the path exploration and the path partitioning based on the output. So it could be that the definition of what you want to test is not captured in the output, you are just trying to look for crashes and so on. That will not be captured by this path exploration, that's correct. >>: But is there a way to do that? So suppose I don't have a [indiscernible] check that goes and [indiscernible] my allocated data structures and does so. What do you think, is there a way to do something around that? >> Abhik Roychoudhury: Yeah, of course, you can actually make it observable. You can define that as the output and then once you do the path partitioning based on that input/output relationship it will capture the different partitions, absolutely. So if you are looking specifically at a buffer overrun you can define certain output conditions which capture the buffer overrun. And you can do the path exploration and the path partitioning based on those output conditions, absolutely. But we did not do that specifically in this work here. When I say output, these are really the output variables of the program. >>: So if I look at this relevant slice condition, it's an abstraction of the path condition?
>> Abhik Roychoudhury: Absolutely, absolutely, so it throws away all of the things in the path condition which are really not relevant for the production of this output, yeah. So there are technical details here which I am skipping. So we actually prove that the path exploration based on these relevant slice conditions will really explore all the symbolic outputs. That means there is a completeness in the search, which was relatively difficult to prove, and the reason for this is actually the property that is shown below. That is, if you have a certain path condition and you negate part of it, say one of these constraints, then you can be sure that the path condition of the input obtained from this negated formula contains this formula as a prefix. But in the relevant slice condition, since it's a slice, some of these branches might actually disappear when you compute the relevant slice condition. And as a result we have to do some reordering, but this is more of a technical detail for the sake of proving the completeness of the search. Once you do this the validation is really as expected. These are all the programs, and the relevant slice condition was much, much smaller than the path condition, and the paths explored through the relevant slice conditions are also much, much fewer, because of course many paths are getting grouped into one relevant slice condition. So this is one usage that we looked --. Yeah, sorry there is a question here. >>: So I just want to ask: I mean, I don't know what these programs are, number one, and number two, does this assume that you can actually run them in this model, if you will, end to end? >> Abhik Roychoudhury: Sorry, say that again. >>: Does this assume that you can actually run them to the end in this model and capture everything that you need to capture? >> Abhik Roychoudhury: That's right. >>: So what are these programs?
>> Abhik Roychoudhury: So --. >>: [indiscernible]. >> Abhik Roychoudhury: What are these programs? These are all from, I think, the SIR, what we call the Software-artifact Infrastructure Repository. >>: So, [indiscernible]. >>: [indiscernible]. >> Abhik Roychoudhury: So I think these are not particularly big programs. I think most of them are less than or equal to a thousand lines of code, yeah. >>: So you can basically explore them [indiscernible]? >> Abhik Roychoudhury: Yeah, yeah. But I guess the issue is that even if the program has, say, a hundred lines of code, if you just migrate from one group of paths to another group of paths you have to make sure that the search does not diverge and you will actually compute all the symbolic outputs, compute partitions for each of the possible symbolic outputs. That is what we showed in the completeness of the search. Sorry, there is a question here. >>: How would you relate this to summaries? So if you create a summary for a basic block could you uncover some of the related functionality? >> Abhik Roychoudhury: Absolutely, so this is absolutely a summarization procedure even though we used it for test generation. This is absolutely a summarization procedure and it could benefit if the basic block summaries were available. >>: Yeah, but the [indiscernible] are different? >> Abhik Roychoudhury: Yeah, so here you could think of it as that I am trying to uncover the transformation function underlying the program somehow. And as you might imagine, there could be unboundedly many symbolic outputs also, not just concrete outputs, but symbolic outputs. As a result of which there could be unboundedly many path partitions. So definitely I cannot show that the search in this way will terminate. But if there are finitely many symbolic outputs, then I will have exactly that many path partitions. >>: Oh, I see. >> Abhik Roychoudhury: And that many test cases.
>>: So are you inlining all the procedures or are you doing them compositionally here? >> Abhik Roychoudhury: Oh no, no, I am not inlining anything. It's [indiscernible]. >>: So [indiscernible] or do you use the summaries in your --? >> Abhik Roychoudhury: This is the total number of paths that were explored. >>: So when you do [indiscernible], if there is a call do you use the summary of that call or do you inline that call? >> Abhik Roychoudhury: No, that would be a way of improving the search. In this case these are really the traces. You see that I am working with traces here. I start with a random input, I look at its trace, I do the slicing over that, then I do symbolic execution over that slice. Actually the slicing and the symbolic execution are going hand in hand. That produces a formula, I do some operations on the formula, negate it, produce another formula, and then that moves onto the next path partition. >>: Right, but [indiscernible]? >> Abhik Roychoudhury: Yes, it is. >>: [indiscernible]? It's not for a given procedure. >> Abhik Roychoudhury: That's right --. >>: [indiscernible]? >> Abhik Roychoudhury: You could say it that way, but if I actually did put in the summaries of some of the lower level procedures then it would be faster, but then of course it would be less accurate, because then I am just taking the whole summary of the underlying procedure. >>: And what is the output for [indiscernible]? I mean, the programs have implicit output? >> Abhik Roychoudhury: Yeah, the programs have their output and I am just using that output. >>: So the heap is not considered part of the output? >> Abhik Roychoudhury: This was one of the questions. Yeah, we could make the heap one of the outputs; we also did that. Yeah, that's a good question. Okay, so maybe I will move on. Right, so I will mention to you one way we have used symbolic execution for debugging, where conventionally symbolic execution was not used.
And this is in this domain of regression debugging. And here you will see that the role of symbolic execution is very different. So in regression debugging the problem is very simple: you have a test input, it is given to an old stable program and a new buggy program. And earlier it used to pass, and now it fails, and you want to know why. So here this statement is a little bit informal, but the assumption here is that when you went from the old stable program to the new buggy program you probably tried to add some new functionality. The intended purpose of the old functionality is as before. Okay, so therefore this should not have failed, but now it fails and you want to know why. So this would be the very first thing that one could try, because in the debugging literature conventionally there has been a lot of usage of trace comparison. That is, the test input is fed into the old stable program and the new buggy program and you produce paths. And whether you can directly compare the paths -- of course you cannot directly compare the paths, because these are really two different programs and they might have completely different algorithms, data structures and so on. So it really wouldn't be feasible to directly compare the paths. But the question is whether you can generate some sort of new input using this evolution information. And then you can compare the behavior of this new input with the buggy input to find out what went wrong. Okay, so this is what we are trying to do in the debugging problem. And this is a schematic to give the intuition of the debugging method: here you have the buggy input, say these are the partitions, see this is just a schematic, these are the partitions, and assume that all inputs in a bin are similar. Now what is similar? This is up to us to define what is similar. And in this case say the red input is the buggy input, because of this error that has been introduced.
It has somehow shifted from one partition to another. What we are looking for is something like this blue input, which was similar in behavior to the red input, but now is different in behavior. But this blue input has not changed its behavior at all, okay. It is the red input which has changed its behavior because of the error that was introduced, and we are trying to look for a blue input like this one, which is now different in behavior. So this is really the intuition. Sorry, you seem to have a question. >>: It seems like the unstated assumption here is that your properties of interest are not data dependent. They are sort of symbolic formula dependent, which is to say that it's possible to have exactly the same formula that will in practice yield very different outcomes, especially when interacting with systems you are not capturing, like databases. A traditional example, you have a SQL injection, which is the result of [indiscernible]. The formula is always the same, it's [indiscernible], [indiscernible] and so forth. Yet for some devious inputs it produces a very different outcome and [indiscernible]. You see what I mean? >> Abhik Roychoudhury: Yeah. >>: But there is this issue of data dependency versus computation dependency that you are not talking about here. >> Abhik Roychoudhury: Yeah, that's correct. So I did not state exactly what these partitions mean. Somehow there is an unstated assumption here that the error is visible in the control flow, and that these partitions actually capture the control flow in some way. We did some other work where we can go past this assumption, but I am not presenting that technique today. But in this particular technique that is the case.
Okay, so in this first work that we did on debugging using symbolic techniques, there is this test input, which is fed into the old stable program and the new buggy program, and we do a concrete and symbolic execution, compute the path condition of the test input in both the programs, and then essentially we will compute solutions for this formula F and not F prime, because these are exactly like the blue inputs, which are similar to the buggy input in the old program and different from the buggy input in the new program. And this gives us an alternative input. And actually there is really no need for us to do the trace comparison. We can just look at satisfiable subformulas of this F and not F prime. All of this symbolic execution is actually happening at the level of the assembly code, so this gives us the bug report at the assembly level, and then we reverse translate it back to the source level, okay. And the reason there is no need to do any trace comparison for the solutions of this F and not F prime is that of course there can be many, many solutions of these, and we don't want to explore all of them. But note that F prime is a path condition, which is a conjunction like this one. So of course when you look at the negation, not F prime, you can consider all possible deviations of it: it could deviate at the first branch, the second branch, the third branch and so on. So if the path condition has M constraints you really only need to look at at most M [indiscernible] classes. And for each of those deviations you really know which line would go into the bug report: it is that particular branch that will go into the bug report. So if the path condition has M constraints you really have to look at M of those formulas, check which of them are satisfiable, and for any one that is satisfiable you just put the corresponding branch in the bug report. So this is how the technique works.
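The deviation enumeration just described can be sketched with toy data. The constraints, line numbers, and the brute-force satisfiability check (standing in for an SMT solver over assembly-level path conditions) are all invented for illustration:

```python
# Path condition of the failing input in the NEW program, as labelled
# predicates over a single integer input n (toy stand-ins for the
# symbolic branch constraints).
F_NEW = [
    ("line 12: n > 0",      lambda n: n > 0),
    ("line 20: n % 4 == 0", lambda n: n % 4 == 0),   # the buggy branch
    ("line 31: n < 64",     lambda n: n < 64),
]
# Path condition F of the same input in the OLD (stable) program.
F_OLD = [lambda n: n > 0, lambda n: n % 2 == 0, lambda n: n < 64]

def sat(preds, domain=range(-128, 128)):
    """Brute-force satisfiability check standing in for an SMT solver."""
    return next((n for n in domain if all(p(n) for p in preds)), None)

def bug_report():
    """Enumerate the M deviations of not-F': an input that satisfies F,
    follows the new path up to branch i, and then deviates at branch i
    pinpoints branch i as a suspect."""
    report = []
    for i, (loc, ci) in enumerate(F_NEW):
        prefix = [c for _, c in F_NEW[:i]]
        negated = lambda n, ci=ci: not ci(n)
        witness = sat(F_OLD + prefix + [negated])
        if witness is not None:
            report.append((loc, witness))
    return report

print(bug_report())
```

Here only the deviation at the second branch is satisfiable (e.g. n = 2 is even but not divisible by 4), so the bug report contains exactly that branch, together with a witness input that follows the old behavior.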
>>: So it seems that any branch that could avoid the error would be reported, right? Any branch that [indiscernible]. So you are not [indiscernible], right, you can just [indiscernible]. >> Abhik Roychoudhury: Yeah, so --. >>: The new path might be somewhere, plus [indiscernible]. >> Abhik Roychoudhury: Yeah, that's actually a good point. We had some other strategies for prioritizing among these ones that are satisfiable. We check whether the inputs that we generate by solving those deviations are also successful, because those could also be failing inputs, and those we don't want to consider in the bug report. So now, if we have this picture, we can also employ this coarser-grained similarity that I mentioned to you, that is to look at these path partitions instead of paths. So we can just use these relevant slice conditions instead of the path conditions, and the whole technique just carries over for free. And in that case, as you might imagine, the results become much, much better. In fact, I am showing you these are much bigger programs than the ones that we showed for the test generation. I think the NanoXML, Jlex, Jtopas, I don't have the exact sizes, but these are much, much bigger programs. And you can see the sizes, both of the path conditions and of the relevant slice conditions, and in this case the time for doing the debugging, which is substantially cut down. And also the lines of code, because each line of code is really coming from each of these deviations that we record. So each line of code is a branch condition that is given by each of these deviations. So many of the deviations just get filtered out because of the slicing. Okay, and many of the other results that we had in the original work on Darwin, which by the way was built on the BitBlaze tool, for which we are thankful because we had the tool available to us much earlier, were also replicated.
And since this technique is completely based on semantic analysis, one can use it on different implementations of the same protocol also, so for example different web servers and so on. We also used a different variation of this method for finding errors in embedded Linux, the BusyBox, by validating it against Linux. These are exactly the errors that are reported in the [indiscernible] paper. They did the test generation and we did the root causing of all of these errors. >>: [indiscernible]. >> Abhik Roychoudhury: Yes? >>: [indiscernible]? >> Abhik Roychoudhury: Yes. >>: So here did you intersect the relevant path conditions with [indiscernible] to only focus on the --? >> Abhik Roychoudhury: No, no, this is just the relevant slice condition as produced. >>: [indiscernible]? >> Abhik Roychoudhury: Yeah, yeah. >>: The parameters that you report, do you also consider whether they are present in the diff? >> Abhik Roychoudhury: No, no, no, actually we don't do that. And that is really one of the observations behind building these techniques, because the error might not be in the diff. The error might very well be in the original program itself, but it might not have been manifested, and it is now getting manifested. So we are never doing the intersection with the diff. Right, so finally I will show you one usage of symbolic execution in program repair, and this is something that we have actually worked on recently. So here when I say program repair, the correctness specification is given by a test suite. So when we repair the program the purpose is to make sure that it passes all the test cases in this test suite. And I will go through the repair strategy in a minute. The usage of symbolic execution here is to group together all of these executions through which a failing execution could be rescued. So in some sense this is a newer notion of similarity. This result was just [indiscernible] this year.
So let me show you the problem with an example. This is a very simple example. This is just a simple procedure without any loops, and suppose the error is in line number four, where we have bias equal to down sep. We would like to have this corrected to bias equal to up sep plus 100. And suppose these are all the test cases and 2 of the test cases in fact failed. So the purpose is to generate a new expression for bias, [cough] excuse me, so that these test cases actually pass. So the first step of this method is the fault localization. Here we are using a statistical fault localization method, which actually takes in the program and the test cases and tries to find a line which somehow appears more in the failing test cases and less in the passing test cases. There are many, many metrics for that. We are using one particular metric that is used by this Tarantula tool. And so what we need to know for understanding the method is, supposing this method has pointed you to one particular line, in this case line number 4, how you can correct that expression, from bias equal to down sep to bias equal to up sep plus 100, automatically. Okay, so this is what the repair method is doing. And of course you can try doing this for one line at a time. If it doesn't work you can try to go to the next line that is given by the statistical fault localization method. >>: Can you [indiscernible] for each of these examples or is [indiscernible]? >> Abhik Roychoudhury: There is no specification. So my only goal is to make sure that the repaired program passes all the test cases. The specification is given --. >>: [indiscernible]? >> Abhik Roychoudhury: That's right, so for each test we have the input and the expected output, yeah. So the first step is to go to this line, find the suspect, and the second step is to find out through symbolic execution what it should have been, say if your aim is to correct the bias equal to down sep.
So you just replace bias with X and go on doing symbolic execution from there. So you are doing concrete execution up to that particular line which you want to correct, and then subsequently you go on and do symbolic execution from there. Okay, so you can see here that I replaced the right-hand side with X and the path condition is true at this point in time. You can then go forward, and in this case you develop one particular path condition, X greater than 110, and in the other one X less than or equal to 110, and so on. It turns out that this one leads to a passing execution and this one leads to a failing execution. Okay, so just to make this a bit more concrete: at this point in time I put this as a function of all the variables which are live. Okay, this is really a function of all the variables that are live, and I am trying to synthesize a function of this form, which takes in all these variables that are live. And I know the values of each of these variables because I am trying to rescue this particular test case. So if these are the values that were actually fed by this particular test case, I want to synthesize a function which satisfies this formula that was obtained from the path condition that we got. Okay, [coughing] excuse me. So corresponding to each test case we will accumulate each of these constraints, and I am trying to find a particular function F which satisfies all of these constraints, and we can fix the set of operators that appear in F. This will be done by program synthesis. So there are a number of methods with which you can generate the fixes; either you can search over the space of expressions, or, as in this case, you can use program synthesis, and finally the generated fix is of this form. So I can go through this in a little bit more detail. The first step is producing the ranked bug report, where we are using just an off-the-shelf statistical fault localization tool.
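The accumulation of per-test constraints on the unknown function F can be sketched like this. The variable names (up sep, down sep) and the threshold 110 echo the tcas-style example in the talk, but the concrete test values here are hypothetical, and the symbolic execution that would actually derive each predicate is stubbed out as a plain Python lambda.

```python
# Each entry pairs the concrete values of the live variables at the buggy
# line (known because we run the test concretely up to that line) with the
# predicate that the unknown value X must satisfy, as obtained from the
# path condition of a passing continuation.
repair_constraints = [
    ({"up_sep": 11, "down_sep": 110}, lambda x: x > 110),   # failing test 1
    ({"up_sep": 0,  "down_sep": 100}, lambda x: x <= 110),  # hypothetical test 2
]

def satisfies_all(f):
    """Does candidate repair expression f (a function of the live
    variables) satisfy every accumulated constraint?"""
    return all(pred(f(env)) for env, pred in repair_constraints)

# Candidate right-hand sides for the bias assignment:
buggy     = lambda env: env["down_sep"]        # original: bias = down sep
candidate = lambda env: env["up_sep"] + 100    # repair:   bias = up sep + 100
print(satisfies_all(buggy), satisfies_all(candidate))  # → False True
```

The synthesis step then has to find some F for which `satisfies_all` holds; in the actual work this is done with constraint solving rather than by checking hand-written candidates.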
For the symbolic execution, we are trying to group all the paths which can avoid the error. So if you have this line which you want to fix, you just replace it with a certain variable X and then you do symbolic execution with X as the only unknown. So this would actually be fairly scalable, because the only symbolic variable is this particular one. Okay, and X is nothing but a function over all the live variables. So this is really the function that you are trying to synthesize. So at some point in time you can just add this constraint, that X equals the function over these concrete values, and this would implicitly generate constraints over this function. And this is the function that you want to synthesize. So this is what we call the repair constraint, and you get a repair constraint of this form. Of course we don't feed this off to the SMT solver just like that. We will select some primitive components that can be used in this function, and this is done through some sort of a layered search. So initially we can see whether there can just be constants, a constant function; if not, maybe arithmetic operators; if not, logical operators --. Sorry, there is a question. >>: So there is a set of benchmarks that are connected [indiscernible]. >> Abhik Roychoudhury: Okay. >>: So the way that you would describe it if you were [indiscernible]. >> Abhik Roychoudhury: So I am actually, we are actually using an extension of the program synthesis method based on input/output pairs which has been developed here. >>: Okay, yeah, but are you aware that there is a website that [indiscernible]? >> Abhik Roychoudhury: I am actually not aware of that one. But the technique that we are using, this synthesis method, is an extension of the input/output-pair-based synthesis method, which has been developed --. >> Yes, yes, yeah.
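The layered search over primitive components described above can be illustrated with a toy enumerator: try constant expressions first, and only widen to arithmetic combinations of the live variables when the simpler layer fails. The live variables, test constraints, and operator set below are all made up for illustration.

```python
from itertools import product

# Illustrative repair constraints: the synthesized expression, evaluated on
# each test's live-variable values, must produce the indicated result.
live_vars = ["a", "b"]
tests = [({"a": 3, "b": 5}, lambda x: x == 8),
         ({"a": 1, "b": 2}, lambda x: x == 3)]

def fits(expr):
    return all(pred(expr(env)) for env, pred in tests)

def layer_constants():
    """Layer 1: constant functions only."""
    for c in range(-10, 11):
        yield (str(c), lambda env, c=c: c)

def layer_arith():
    """Layer 2: one arithmetic operator over the live variables."""
    ops = {"+": lambda p, q: p + q, "-": lambda p, q: p - q}
    for v1, v2, (name, op) in product(live_vars, live_vars, ops.items()):
        yield (f"{v1} {name} {v2}",
               lambda env, v1=v1, v2=v2, op=op: op(env[v1], env[v2]))

def layered_search():
    for layer in (layer_constants, layer_arith):  # widen only when needed
        for text, expr in layer():
            if fits(expr):
                return text
    return None

print(layered_search())  # no constant fits both tests, so layer 2 finds "a + b"
```

Keeping the component set as small as possible keeps the synthesized fix simple and the constraint system small, which is the point of layering.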
>> Abhik Roychoudhury: So instead of directly feeding this formula off to the SMT solver, we are looking for a program that uses a given set of primitive components. And the components we are searching over are chosen through some sort of a layered search. And of course the questions are where to place each component and what the parameters are for each component. So the way this is done is essentially we define location variables for each component and then we get constraints on the location variables. In the simple example that I have here, the only component is just a plus, and these are the location variables: the location variable of the input to the program, the location variable of the output of the program, the location variable for the output of the plus, of the first input of the plus, the second input of the plus, and so on. So just an assignment of the location variables gives me the program, and that gives me the expression to synthesize, okay. So it will generate constraint systems on these location variables, which will be solved to get an assignment to these location variables; that gives me the program to generate and that also gives me the expression that is to be synthesized. So these are the subjects that we used. The first set of subjects is from SIR, the Software-artifact Infrastructure Repository that we use in the software engineering community, and also some others from the GNU Coreutils; these are the utilities like mkdir, mkfifo, cp and so on. And here in repair, this is really the first work that uses any kind of semantic analysis for repair; this I can say with confidence. In the past there was a lot of work that you have probably heard of, the GenProg tool, which does repair by doing genetic programming. So we also tried running these benchmarks, or subject programs, against the GenProg tool, and for the GNU Coreutils these were repaired both by GenProg as well as by our SemFix tool.
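The location-variable encoding sketched on the slide can be made concrete with a brute-force toy. With one "+" component, an assignment of location variables fixes which values feed the component and which value the program returns; in the actual approach these constraints go to an SMT solver rather than being enumerated, and everything below (the examples, the single component) is a hypothetical miniature.

```python
from itertools import product

def synthesize(inputs_len, examples):
    """Find location variables for a straight-line program with one '+'
    component that matches all (inputs, output) examples.

    Lines 0..inputs_len-1 hold the program inputs; line inputs_len holds
    the output of the '+' component."""
    n = inputs_len
    # Location variables: the two inputs of '+' (must come from earlier
    # lines) and the line chosen as the program's output.
    for l_in1, l_in2, l_out in product(range(n), range(n), range(n + 1)):
        def run(inp):
            lines = list(inp)
            lines.append(lines[l_in1] + lines[l_in2])  # the '+' component
            return lines[l_out]
        if all(run(inp) == out for inp, out in examples):
            return l_in1, l_in2, l_out   # this assignment IS the program
    return None

# Synthesize f(a, b) = a + b from two input/output pairs:
locs = synthesize(2, [((3, 5), 8), ((1, 2), 3)])
print(locs)  # (0, 1, 2): plus reads lines 0 and 1, program returns line 2
```

Reading the returned assignment back as a program is exactly the step the speaker describes: an assignment of location variables gives you the expression to be synthesized.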
And surprisingly, even though our technique uses symbolic execution, it was faster, because the symbolic execution is used in a fairly scalable fashion, with very few variables. >>: Does it matter? >> Abhik Roychoudhury: Does it matter? Yeah, you will see whether it matters or not, okay. I think I will convince you that it does matter with this slide. You can see here the correctness criterion is given by the number of test cases. So of course if I add more and more test cases, my correctness criterion becomes more stringent, right. So whatever was a valid repair earlier may not be a valid repair anymore. So now you can see the graph for the repairs produced by the GenProg tool and the repairs that we are producing, which is the blue line. The repairs that we are producing are much more stable, and the reason is that ours is actually based on the analysis of the paths. So later on, if you add test cases which go through paths that were already tried out earlier, then of course the repair that was valid earlier will continue to be valid. Okay, so it is not surprising that ours is much more stable, whereas here you can see the one that is by GenProg. This is produced by genetic programming, so effectively it is trying to get the repaired expression from somewhere else in the program by moving it to the bug [indiscernible] site. And the repairs that are produced are not valid as you add more and more test cases. And these test cases were not cherry-picked by us; actually each of these programs comes with a predefined set of test suites, which were produced by some sort of coverage criterion. So first we take 10 of those randomly, then add 10 more randomly, 10 more randomly and so on, just to do the experiment in a proper way. Yeah? >>: So are you prone to overfitting for most of the GenProg [indiscernible]? >> Abhik Roychoudhury: Um, overfitting, um? I am not sure. >>: Otherwise, the size of the model.
I mean you can basically generate a program that would do perfectly on 50 or so, 45 or so test cases, but that program will be really hideous. >> Abhik Roychoudhury: Okay. >>: And need special-casing for every one of those test cases. >> Abhik Roychoudhury: Yeah, but of course you can always repair the program like that if you knew the test cases: if this, then this. And of course we are not doing that. There would be no way to do that; the technique doesn't know what test cases are going to be fed. It cannot possibly know that. >>: When you are trying to synthesize the F function, you are not doing it on a single failing case, right? You are accumulating all the failing data, right? >> Abhik Roychoudhury: Absolutely, yeah. So the repair constraint actually generates constraints on the F function for each of the tests, and of course doing it for each of these tests might be a bit cumbersome, so there are some optimizations for doing that. >>: I mean you are saying that if you increase the set of tests, this F becomes unsatisfiable and you cannot, like if you threw in 100s of tests, the synthesizer might not be able to give you what you need it to? >> Abhik Roychoudhury: Yeah, so as I said, all of these were done on a time budget. So it does not mean that the repair that was produced is not a valid repair, and it does not mean that if more time was given to the GenProg tool it would not be able to produce that; we don't know. Of course at some point in time you have to put a time budget here. Right, and in fact this is another slide through which you can see whether it matters. These are all the different classes of errors: edits in constants, arithmetic errors, comparison errors, logic errors, code missing errors and so on. And this is the comparison between the GenProg tool and this semantic analysis based fixing. You can see that the GenProg tool doesn't seem to be doing so well in some of the categories.
Whereas in some other categories it is sort of okay. Okay, yeah. >>: So I thought genetic programming was used for analog circuits and repair synthesis. [indiscernible]. >> Abhik Roychoudhury: Uh-huh. >>: So do you use any [indiscernible]? >> Abhik Roychoudhury: I have not explored that. >>: I mean codes are [indiscernible]. >> Abhik Roychoudhury: Okay, yeah, I really have not explored that, yeah, yeah, but certainly it would be feasible I think. But I have not explored that at all. That's an interesting connection; I have not even thought about that, yeah. >>: Because that was my killer app in my [indiscernible]. It's kind of 50 proc, super computing, [indiscernible]. But GenProg is taking for software? >> Abhik Roychoudhury: Yeah, yeah, it is still for software repair, for program repair. >>: But it tries to borrow repair chunks from the program itself. It's just not necessarily in the best way. But --. >> Abhik Roychoudhury: Yeah, essentially. I mean I am oversimplifying a little bit, but essentially it is trying to copy an expression from somewhere else in the program. >>: So they have just never found it then? >> Abhik Roychoudhury: Yeah, it does these mutation and crossover operations. Of course in the mutation you can change a little bit, but it becomes very difficult to synthesize a real fix with those kinds of operations. >>: Does that work on MAXSAT, [indiscernible]? Did you look into that? >> Abhik Roychoudhury: Yeah, we did look into that. So there is the MAXSAT-based debugging. >>: Right, but [indiscernible]. >> Abhik Roychoudhury: Well, I guess they did not look into repair that much. Yeah, it's more the debugging that is essentially --. In fact in a later manifestation of this work we are trying to use certain variations of MAXSAT for repair. In this case, as you saw, we are essentially just fixing one line, but one could probably try to fix several lines together at the same time.
Lines that are correlated, and whether MAXSAT or such techniques could be used, that's a possibility. Okay, and just to very quickly go through: even though it is just a one-line fix that we are generating, it is also possible, because the lines can be fairly complicated, to synthesize missing code, or you can have [indiscernible] within the fix that you are generating, and thereby have a fix like the one shown here. So this is a fairly complicated fix that has been generated. So just stepping back in terms of repair, I don't think it will be completely automated. So there is the issue of building programming environments and putting the human in the loop. And I think in repair, program synthesis is likely to play a useful role. Is debugging required? This is a little bit questionable to me. As the first step of this repair we first run the statistical fault localization, which goes to the line, and then I try to repair that line. Maybe that is not needed, so one could think of repair methods which do the testing and repair hand in hand and avoid the statistical fault localization altogether. That means through some symbolic execution you try to generate test cases, and as a test case actually fails you try to repair the failures that you have found. So this is probably an answer to the question that you mentioned just now: one could find the location to fix via this symbolic reasoning as well as MAXSAT, so I think there is a possibility. We have also been trying this out, but this is not very scalable, so there are difficulties with this. And of course there is the issue of suggestions, generating suggestions instead of repairs. We have tried to look into the usage of this for generating repairs of access control errors and so on, which is a different story.
So just coming back to the theme of this talk, which is the perspective on symbolic execution: in test generation, model checking and verification, symbolic execution is being used essentially to guide the search. That is, through symbolic execution you play the role of many, many concrete executions. Whereas in debugging, repair, summarization and program semantics extraction, somehow symbolic execution is playing the role of uncovering what went wrong as well as discovering the intended behavior. All of these techniques are built in such a way that they are exploiting similarities in large parts of the search space: in the case of debugging, for example, among the deviations; in the case of repair, trying to find all the test cases which need to be rescued together; and so on. So I think there is a fair amount of space to cover in these two different roles of symbolic execution. One is the discovering of the intended program behavior and the other is the guiding of the search. And just very briefly, let me mention one possible take from today's discussion: symbolic execution tries to infer the intended behavior. We also did one other work where we directly tried to capture the intended behavior through change contract specifications. So there has been a fair amount of work on program contracts, where you try to give input/output specifications of programs. Here we tried to look into change contracts, where you describe the intended behavior of the changes. So for example the contract at the bottom is saying that when these conditions are satisfied, and there are no input conditions, whenever the previous output satisfies this condition the new output should satisfy this condition. So you can give output/output specifications between the outputs of the two program versions, rather than input/output specifications, which is more flexible I guess.
And let me just conclude by giving you these references as well as thanking all my co-authors, and I will be happy to take any more questions. >>: So without input summaries --? >> Abhik Roychoudhury: Right, right, right, yeah. >>: So what I wanted to say, I guess [indiscernible], in program verification there is a much nicer encoding into logic and that sort of reduces the [indiscernible]. So you might be able to do a lot of these things if you just [indiscernible] encoding into logic. >> Abhik Roychoudhury: That is true. Actually a lot of the techniques that we did use dynamic symbolic execution, yeah, which is symbolic execution along specific paths. At least one of the initial observations for me was that, specifically for reasoning about regressions, where you have a past version of the program, or for reasoning about cases like embedded software, where you have a reference implementation of the program, you can use symbolic execution over this reference implementation or the past version to get some hint of the intended behavior. And for software engineering activities like debugging, if you have a hint of the intended behavior it is a huge plus. And this is really the problem when you don't have formal specifications for the program. If I had lots of formal specifications for each of my methods, then things like debugging really wouldn't be much to worry about. So symbolic execution definitely was helpful in that regard: in the absence of these formal specifications, somehow getting a hint of the formal specification. So that is one of the credits that I am giving to symbolic execution, apart from the known usages like guiding search and so on; those probably could be done by other methods, as you already mentioned. But the finding of the intended behavior, I probably couldn't do that through verification methods, because the verification methods require a certain property to verify. Yeah?
>>: What if it's about specifying the program as a property relative to the previous [indiscernible]? >> Abhik Roychoudhury: That could be, yeah. >>: All I am saying is that you can encode any of these things into logic directly. >> Abhik Roychoudhury: Sure, sure, that could be, that could be. >> Nachi Nagappan: I think, for time, all the remaining questions are to be continued. >> Abhik Roychoudhury: Okay, thank you. [Clapping]