>> Sumit Gulwani: Thanks everybody for coming today. I'm very happy to welcome our visitor,
Florian Zuleger who is visiting us from the Technical University of Vienna. He's done a fair bit of
work on bounds analysis, termination analysis and more recently some work on providing
feedback for programming education. I think he did an internship here in 2009, so it's good to
have him back and talking. I'll let you take it from there.
>> Florian Zuleger: Okay great. Thanks for coming to my talk. It's actually going to be two talks
about 25 minutes each, so I'll try to keep to that time. The first one is going to be on static analysis for termination and computing bounds, and the second talk will be about a recent project on online education where we try to give feedback on performance. Okay. So what is the problem we want to solve? We want to compute bounds. Bounds come in various forms. We want, for example, to compute bounds on how often a single statement is visited inside some function, and you can ask the same question for multiple statements: how often are they visited together. Or you can ask how often a loop is iterated. These questions are closely related to computing the complexity of a function. By bound we mean a symbolic expression: we want to compute a bound in terms of the inputs to the function. I'm not making this claim precise, but from a practical point of view I claim that all of these questions are closely related and can be reduced to one another. Why should you care? There are two main
motivations for me. One is verification. This is simply a more interesting problem than termination analysis alone, and sometimes you want to verify that certain resource requirements are met, for example for memory or bandwidth. But it's not just verification; I also have the feeling that computing bounds is a helpful technique for debugging, because you can use bound techniques as a static profiler. There have been papers on profiling recently where they want to compute, as a function of the inputs, how long it takes to execute a program, and this would be a static technique which complements classical profilers. Also, you can think of these techniques as helping you to explore unknown code. More from a technical point of
view, previous works on termination and bound analysis have been fairly complicated using
heavy machinery such as abstract interpretation, computer algebra tools and software model
checkers. I've been working on this for quite a while and studying a lot of actual code and I
believe that you don't always need to have this heavyweight machinery, so a goal of this project
is to come up with a simpler static analysis that is more scalable and more predictable. So I will
jump right into some code examples and give you a feeling for the kinds of bounds we want to
compute. Here is the absolute easiest case: you have just two nested for-loops, so the complexity is quadratic, and on the left you have the same program written as a while loop; here you could apply simple pattern matching techniques to get this information. Now I will continue with loops that look simple but where such techniques will not work. The easiest case is that you have inner loops which use the same counter: in the left loop you have the same counter i, so the overall complexity will be linear, and on the right you have a counter of the inner loop which is not reset in the outer loop, so the bound you get is also linear.
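To make the two patterns concrete, here is a small sketch in C#; this is a reconstruction of the kind of loops being described, not the code on the slides, and the Nondet helper stands in for the abstracted program logic:

```csharp
using System;

class LoopPatterns
{
    static readonly Random Rng = new Random();
    static bool Nondet() => Rng.Next(2) == 0; // stands in for abstracted program logic

    // Inner and outer loop share the same counter i: i only ever goes up,
    // so the total number of iterations is linear in n.
    static void SharedCounter(int n)
    {
        int i = 0;
        while (i < n)
        {
            while (i < n && Nondet()) i++; // inner loop uses the same counter i
            i++;
        }
    }

    // The inner counter j is never reset by the outer loop, so the inner
    // loop runs at most n times in total: the bound is again linear.
    static void NotReset(int n)
    {
        int j = 0;
        for (int i = 0; i < n; i++)
            while (j < n && Nondet()) j++; // j keeps its value across outer iterations
    }

    static void Main() { SharedCounter(10); NotReset(10); }
}
```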
Things get more interesting if you have a loop that uses two counters. Here you see a loop which has a counter i and a counter j, and on one loop path you decrement the one counter, on the other loop path the other counter. You see a question mark here: the question mark arises from program logic that is not relevant for the termination of the loop, so we assume it has been abstracted away, and the question mark stands for non-determinism. To make it more complicated, you can have such a loop where in the one branch you reset the counter of the other path. This somehow mimics a nested loop, but written in this form it's only a single loop, and you get a quadratic loop bound. What is kind of new in this project is that we consider counter increments instead of only counter resets. In the slides before, you have seen that we take this variable j and reset it to n; here we are interested in what happens if we simply increment that counter on the second branch, and that leads to an overall bound of 2n for executing the loop. You can see this by asking yourself: in each iteration, either I decrement i or I decrement j, and if I increment j, I decrement i at the same time. This can only happen n times, so the variable j is incremented at most n times, and since it is 0 at the beginning, you get a bound of 2n for this loop.
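A minimal sketch of this motivating loop, with my own guards added so it actually runs; the nondeterministic branch choice mirrors the question mark on the slide:

```csharp
using System;

class TwoCounterLoop
{
    static readonly Random Rng = new Random();
    static bool Nondet() => Rng.Next(2) == 0; // the '?' from the slide

    static void Main()
    {
        int n = 10, i = n, j = 0, iterations = 0;
        while (i > 0 || j > 0)
        {
            if (j > 0 && (i == 0 || Nondet()))
                j--;          // one branch decrements j
            else
            {
                i--;          // the other branch decrements i...
                j++;          // ...and increments j at the same time
            }
            iterations++;
        }
        // j starts at 0 and is incremented at most n times, so the loop
        // executes at most n + n = 2n times.
        Console.WriteLine($"{iterations} <= {2 * n}");
    }
}
```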
I want you to understand this example, because it is really the motivating example for the techniques to come. Here we have the same pattern that you saw in the example before. You have an inner loop with this counter j, and the counter j can be incremented in the outer loop, but j can only be incremented if i is decremented at the same time. You can again ask the question: how often can I increment j? Only as often as I decrement i, and because that can only happen n times, j can only be incremented n times, so the bound of the inner loop is n. These loops actually occur in practice; we found them in parsing routines, to give you a feeling for where this happens.
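Here is a sketch of that nested pattern (again mine, not the slide code): the inner loop's total work is bounded by the number of times the outer loop increments j, which is at most n:

```csharp
using System;

class AmortizedNestedLoop
{
    static readonly Random Rng = new Random();
    static bool Nondet() => Rng.Next(2) == 0;

    static void Main()
    {
        int n = 10, j = 0, innerTotal = 0;
        for (int i = n; i > 0; i--)
        {
            if (Nondet()) j++;   // j is incremented only when i is decremented
            while (j > 0 && Nondet())
            {
                j--;             // the inner loop consumes what the outer loop produced
                innerTotal++;
            }
        }
        // j starts at 0 and is incremented at most n times, so the inner
        // loop body executes at most n times across all outer iterations.
        Console.WriteLine($"{innerTotal} <= {n}");
    }
}
```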
Why is this an interesting problem, or what is the real problem we are studying here? It's amortized analysis; this is really amortized complexity analysis. Let me recall what is meant by amortized analysis. Tarjan introduced this notion with the example of a stack that has two operations: you either push an element onto the stack, or you have a pop-many operation that removes a number of elements from the stack. Pushing an element onto the stack is simply a constant-time operation, and pop-many is linear because it can remove several elements. If you have n operations and you do a naive analysis, you get a complexity of n squared, but if you take into account only how many elements you actually put on the stack and use potential functions, as Tarjan did, then n operations will have linear complexity.
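As a small illustration of Tarjan's stack example (my own code, under the assumption of random push and pop-many operations):

```csharp
using System;
using System.Collections.Generic;

class StackAmortized
{
    static void Main()
    {
        var stack = new Stack<int>();
        var rng = new Random();
        int n = 1000, popWork = 0;
        for (int op = 0; op < n; op++)
        {
            if (rng.Next(2) == 0)
            {
                stack.Push(op);                    // push: constant time
            }
            else
            {
                int k = rng.Next(stack.Count + 1); // pop-many: remove k elements
                for (int t = 0; t < k; t++)
                {
                    stack.Pop();
                    popWork++;                     // each unit of pop work removes one pushed element
                }
            }
        }
        // An element can only be popped after it was pushed, so the total
        // pop work is bounded by the number of pushes: O(n) for n operations,
        // i.e., amortized constant per operation.
        Console.WriteLine($"pop work {popWork} <= pushes <= {n}");
    }
}
```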
So what you have seen is really the program which corresponds to the stack example: you can interpret j++ as pushing an element onto the stack, and you can interpret the inner loop as removing some number of elements from the stack. It's really the stack example that this pattern comes from, and computing that bound is a kind of amortized reasoning. That is the problem we want to solve. Now I'm going to tell you about the backend we are using for our analysis, which is vector addition systems. For vector addition systems we generate lexicographic ranking functions, which show termination, and from these lexicographic ranking functions we then compute bounds. What are vector addition systems? You can see a vector addition system down on the right. There we have states; they correspond to the program locations in the program. And then we have transitions. The transitions relate the variables to the primed variables that correspond to the next state, and this transition should be read as: the new value of i is at most the old value of i, and the new value of j is at most the old value plus 1. So what does the lossy mean? The lossy means that we have inequalities instead of equalities. The abstraction is that you can always lose value, but you cannot gain more than stated. The important thing about…
>>: [indiscernible] both of those or [indiscernible]
>> Florian Zuleger: Okay. In the standard notion of vector addition systems you have equality here, and the lossy version is something weaker, where you replace all the equalities by inequalities.
>>: Okay.
>> Florian Zuleger: This is the general form. The transitions always look like this: here you can only have a constant, with a plus or a minus, and variables take values over the natural numbers, which is very important. Okay. I'm going to tell you later a little bit about how we get vector addition systems, but for now I want to appeal to your intuition that we can get vector addition systems out of these programs, and they will look as follows. On the left you see the example from the beginning. We have two paths going from the loop header back to the loop header. On one path we have that j is incremented and i is decremented, and because we don't have equality but only less-than-or-equal, we can abstract them like this; this will be a sound abstraction. On top I have additionally added the initial values. This is the vector addition system we want to get for this program. Now we look at the next program, where we additionally have the reset, and we are abstracting this reset as follows. Why is this a sound abstraction? Here you lose some precision by using inequality instead of equality. Because we know that in a vector addition system variables are always non-negative (that is why we take values over the natural numbers), I can have j' ≤ n + j: I approximate the equality j' = n by an inequality, and I can add the + j because j is always non-negative. This is the reason why we need lossiness. You might have noticed that this is in fact not a plain vector addition system, because I have a variable n here, but we are using an extension of vector addition systems where we require that n is a constant throughout the program; it's not allowed to change. Here we have another program where we increment j on the second branch, so we can simply model it like this. Then the third program is the most interesting one: this is the vector addition system that I showed you before, plus the initial values, so let's understand why it corresponds to the program on the left. We have the loop header of the outer loop and the loop header of the inner loop. There is one branch, the if branch, where we go back without visiting the inner loop, and on the second branch we visit the inner loop; this transition is the inner loop. (Actually these arrows shouldn't be there.) Then we simply go back. This is the abstraction for this program.
The first step which we do during the analysis is that we create lexicographic ranking functions. All three programs which you see down there have the lexicographic ranking function (i, j), so I don't need lexicographic ranking functions in full generality; let me just repeat very shortly what this means. This is the lexicographic ranking function for this case. It means that on each transition either the variable i is decreased, or the variable j is decreased and i is not increased. These are the two conditions which I need, and they ensure termination: either i gets smaller, or i is not increased but at least j decreases, so these conditions ensure that we will terminate. You can see that this holds for the systems here: on the transition on the right you see that i decreases, and on the second transition you see that i is not increased but j is decreased. On the first transition we might increase j, which is allowed; we will shortly use the information by how much j is increased to compute bounds, but the ranking function already ensures that we will terminate. To compute lexicographic ranking functions we do something very simple, which we found to be sufficient for our analysis in practice: a simple greedy algorithm. We look for a variable that decreases on one transition and is not increased on any other transition. This would be the variable i, for example. We take this variable and make it the first component of the lexicographic ranking function, and then we remove this transition: on that transition i is decreasing and on the other transition it is not increasing, so we can make it the first component. Then we simply repeat. Now we have one transition left over, so we pick the variable j, append it to the lexicographic ranking function, and remove the transition. There is no transition left and we are done.
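A sketch of this greedy construction as I understand it from the talk; transitions are represented simply as maps from variable names to the increment c in x' ≤ x + c, and all names are my own:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class GreedyRanking
{
    // Each transition maps a variable to its increment c in x' <= x + c.
    static List<string> LexRanking(List<Dictionary<string, int>> transitions)
    {
        var ranking = new List<string>();
        var remaining = new List<Dictionary<string, int>>(transitions);
        while (remaining.Count > 0)
        {
            bool progress = false;
            foreach (var t in remaining)
            {
                // Find a variable that t decreases and that no other
                // remaining transition increases.
                string v = t.Where(kv => kv.Value < 0).Select(kv => kv.Key)
                            .FirstOrDefault(name => remaining.All(
                                u => u == t || u.GetValueOrDefault(name, 0) <= 0));
                if (v != null)
                {
                    ranking.Add(v);      // becomes the next lexicographic component
                    remaining.Remove(t); // t is now accounted for
                    progress = true;
                    break;
                }
            }
            if (!progress) throw new Exception("no lexicographic ranking found");
        }
        return ranking;
    }

    static void Main()
    {
        // i' <= i - 1, j' <= j + 1  and  j' <= j - 1 from the running example.
        var t1 = new Dictionary<string, int> { ["i"] = -1, ["j"] = +1 };
        var t2 = new Dictionary<string, int> { ["j"] = -1 };
        var ts = new List<Dictionary<string, int>> { t1, t2 };
        Console.WriteLine(string.Join(", ", LexRanking(ts))); // prints: i, j
    }
}
```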
Now, how can we compute bounds for these lossy vector addition systems? The idea for computing bounds is stated by these two expressions. The important property of lexicographic ranking functions is that for each transition we have one component in the lexicographic ranking function, and we will use this information for computing bounds. The transition t1, which I marked in blue, the left one, is associated with the component i, since i is the first component of the lexicographic ranking function. If we want to compute how often this transition can be executed, we simply have to take the initial value of i. Why is that? We know that on this transition i is decreased and no other transition can increase i, so this means we can execute it at most initial-value-of-i often. Now for the second transition: how often can it be executed? It can of course be executed as often as the initial value of j allows. But the first transition might have increased the variable j, so we have to add how often j can be increased and by how much; this is what we are doing with the second expression. We add how often it is increased, times by how much. These are the update expressions. The good thing is that we already know how often these increases can happen, because the increase sits on a transition whose bound we have already computed. This process can be repeated, so this is actually a recursive formula, and it can be extended to any number of transitions. The nice thing here, which I like, is that you are relying on the termination proof to ensure that this process terminates, because the termination proof gives you an order on the transitions, which ensures that there is no [indiscernible] here.
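Continuing that sketch, a possible rendering of the recursive bound formula (my reconstruction, not the paper's exact algorithm): transitions are processed in ranking order, so the bounds of all transitions that can increase the current component are already known:

```csharp
using System;
using System.Collections.Generic;

class BoundComputation
{
    // bounds[k] = initial(v_k) + sum over already-bounded transitions m of
    //             bounds[m] * (how much transition m increases v_k).
    // Assumes transitions[k] is the transition associated with ranking[k].
    static List<long> Bounds(List<Dictionary<string, int>> transitions,
                             List<string> ranking,
                             Dictionary<string, long> initial)
    {
        var bounds = new List<long>();
        for (int k = 0; k < transitions.Count; k++)
        {
            string v = ranking[k];
            long b = initial.GetValueOrDefault(v, 0);
            for (int m = 0; m < k; m++)
            {
                int inc = transitions[m].GetValueOrDefault(v, 0);
                if (inc > 0) b += bounds[m] * inc; // each run of m raises v by inc
            }
            bounds.Add(b);
        }
        return bounds;
    }

    static void Main()
    {
        var t1 = new Dictionary<string, int> { ["i"] = -1, ["j"] = +1 };
        var t2 = new Dictionary<string, int> { ["j"] = -1 };
        var ts = new List<Dictionary<string, int>> { t1, t2 };
        var ranking = new List<string> { "i", "j" };
        var initial = new Dictionary<string, long> { ["i"] = 10, ["j"] = 0 };
        // bound(t1) = 10, bound(t2) = 0 + 10 * 1 = 10: total 2n for n = 10.
        Console.WriteLine(string.Join(", ", Bounds(ts, ranking, initial)));
    }
}
```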
>>: Can I ask you a quick question?
>> Florian Zuleger: Yes.
>>: It seems to me that in the previous slide you introduced a heuristic for selecting the
lexicographic order that you pick.
>> Florian Zuleger: So it's a greedy algorithm; we just take whatever qualifies first.
>>: And in this case the actual bounds you get are dependent on [indiscernible], so in theory there could be multiple orders which lead to multiple different bounds, some of which might be better than others. Is there any idea, I mean, does it seem to make a difference in practice?
>> Florian Zuleger: That's a great observation, but in practice it's rarely the case that you have more than one choice, so this is why it works. I looked for this because I wanted to know: there are very few examples where you have multiple choices, and then you can take the minimum of the bounds.
>>: I mean in those cases do you see a large variation or is it like a constant factor difference?
>> Florian Zuleger: I have found really just three or four cases in the whole benchmark where you even have different variables: you have two counters which somehow represent the same data, and then you get this. Okay. Here are the update expressions, which are marked here in red for your convenience. Now we can simply apply this formula, and you get the different bounds that you have seen before. You can see on the left that you get a linear bound because j is not increased. Here you get a linear bound because j is only increased by one at a time and the initial value is 0. And here you get [indiscernible] bounds by this formula. Okay. I told you what we do for loops that don't have inner loops.
What remains is to deal with inner loops. This is the last example: how can we handle loops at different locations? Handling loops at different locations is very difficult, and previous approaches, my own included, have summarized the inner loop: we computed a relation as a summary of the inner loop, and this is already very hard in general, and we even needed several kinds of summaries of inner loops, so I wanted to do something different. My PhD student had the idea of simply merging inner and outer loops. When he told me this the first time, I thought: you cannot do this; what is the semantics of this? You can see what has happened: we have taken this inner loop and simply attached it up there, so we have the same transition here, and for this left box we have just concatenated the path back. Because of this very simple abstract model of vector addition systems, it turns out that you can do this: the only thing a transition can do is add, and addition is commutative, and for that reason we have a soundness theorem that establishes that the bounds we get are sound. For this example we then get this lexicographic ranking function, which has three components, and then we can compute the bound of 2n; this is how we do the analysis. If you think about it a little bit, you can interpret this lexicographic ranking function as a kind of multidimensional potential function, to connect back to the amortized analysis. This merging idea is a new contribution of our work.
>>: Can I ask you one more question?
>> Florian Zuleger: Sure.
>>: Can you explain this ranking? So you have i, i, j. So what does it mean that i appears twice?
Or am I missing something obvious?
>> Florian Zuleger: Sure. During our ranking function computation we create one component of the lexicographic ranking function per transition, so when we pick the variable i we only remove one transition; later we may remove the other transition with i again. This seems a bit unnecessary, but we make use of it during the bound computation, so that we have one component for each transition. You could probably do something smarter, but it's not needed; during the bound computation we use the fact that you have the same variable. It's not so important technically.
>>: So basically in the comparison it's less-than all the time, but you just have it for an intermediate bound?
>> Florian Zuleger: Yeah.
>>: Okay.
>> Florian Zuleger: Just to make the formulation of the algorithm easy. And you can use that fact during the bound computation to get better results.
>>: [indiscernible] example [indiscernible] function of just [indiscernible] because we don't
even need j, right?
>> Florian Zuleger: The thing is that there is an error, as I told you before: on this uppermost transition it should only be i' ≤ i, not i' ≤ i - 1. Sorry about that. Okay.
>>: What allows you to do this [indiscernible] when, in general, [indiscernible] loops?
>> Florian Zuleger: We have a soundness claim for [indiscernible] vector addition systems and for our bound algorithm. In general, you have lots of variables and arithmetic operations, so if you just take two different loops which are in different places in the program and you merge them, you will increase the possible executions. It's not so clear that what you are doing then is sound, so it's really a property of this simple abstraction.
>>: [indiscernible] increase the set of [indiscernible]
>> Florian Zuleger: It does increase it a little bit, but the property we are really establishing is that the combination of the abstraction and the way we compute bounds is sound. I can tell you more afterwards if you are interested. What is important to say is: what are the variables of the vector addition system? In all of the examples you have seen, there was a one-to-one correspondence between the variables in the program and the variables in the vector addition system. But in general, you shouldn't think of these variables as program variables, but rather as norms on the program state, that is, any expression on the state which gives you a number. You could use, for example, the height of a tree, or the length of a list, or any arithmetic expression over those entities. Any expression that you think is useful, you can use. What we're doing in our implementation is that we use expressions which show that one particular path in a loop makes them smaller, so that this path cannot be repeated forever. We simply take expressions from the loop conditions and check whether they get smaller. This is what we call a local ranking function, and it is an easy heuristic to take expressions from the program's conditions. Then, to make sure that they are natural numbers, we take the maximum of the expression and 0.
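As a tiny illustration (my own) of such a norm: from the loop condition i < n one can take the expression n - i and clamp it at 0 so it ranges over the natural numbers; the analysis then checks that some loop path decreases it:

```csharp
using System;

class Norms
{
    // Candidate local ranking function extracted from the condition "i < n".
    static long Norm(long n, long i) => Math.Max(n - i, 0);

    static void Main()
    {
        long n = 5, i = 0;
        while (i < n)
        {
            long before = Norm(n, i);
            i++;                                            // the loop path under test
            Console.WriteLine($"{before} -> {Norm(n, i)}"); // strictly decreases
        }
    }
}
```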
I don't have time to tell you how we then really get the abstracted transitions, but we don't do anything fancy. This is the main idea behind our tool, which we call Loopus; it's implemented on top of LLVM, and it uses Z3 for deciding whether paths are feasible and for the abstraction. We have evaluated our tool on the cBench benchmark; it's a compiler benchmark, but the interesting thing for us is that it contains standard Linux programs like [indiscernible] and JPEG algorithms. It's around 200,000 lines of code and around 4000 loops, and we can compute bounds for 78 percent of the loops. We make some optimistic assumptions for data structures, so we've got [indiscernible], but we have a fair trust in all [indiscernible]s. We need less than 40 minutes of analysis time, and only a few of the loops actually need more than 10 seconds of analysis time; our earlier tool needed 13 hours, and in our opinion this improvement is really due to the abstraction for inner loops, because the relational summaries are very costly. Okay.
>>: How do you [indiscernible]
>> Florian Zuleger: This is a question which we still cannot answer satisfactorily. What we plan to do at some point is the following: this benchmark comes with test cases, so we could run the test cases, count the number of loop iterations, and then compare. But at the moment I can't say more than that we looked at a sample of them and checked what we could.
>>: [indiscernible] bounds, those that were most [indiscernible]
>> Florian Zuleger: I have a slide on this, so let me look at that. It's mostly due to the fact that we cannot find a local ranking function for even one path in the loop. We sampled 50 loops. Some of the failures are due to things we don't model, like bitwise operations, [indiscernible] function calls and inlining, and sometimes we don't model unsigned integers. Then there is the class where termination is conditional or cannot be proven; out of these 50 there are 27 cases where you could combine our analysis with stronger [indiscernible] techniques for generating invariants. We consider this a problem that is [indiscernible] to what we are doing, because there is no general reason why our techniques would not work if you had the knowledge from stronger arithmetic invariants.
>>: [indiscernible] or can you actually do it [indiscernible]
>> Florian Zuleger: Sorry?
>>: When you compute the bound [indiscernible]
>> Florian Zuleger: Yeah.
>>: Do you consider procedure [indiscernible] inside the loop, or do you abstract them in some other way?
>> Florian Zuleger: Yeah, we abstract them. Either we inline them or we abstract them. If there is a function which is handling a global variable, then we might even be [indiscernible].
>>: Are you saying that [indiscernible] essentially use the expressions for invariants in the
rankings?
>> Florian Zuleger: That depends. Sometimes you need to know that an expression is bounded from below, and then you would use an invariant for this expression. Or, for example, if you have x = x + n, then you would want to know that n is positive, so that you know that x is going up; that's another case where you need invariants. Okay. Do you have some other questions?
>>: I guess I have one. I like the potential function setup you have, because of a case that I think [indiscernible]: I have a loop that is copying some collection into a list, and the list has a case where it says, if I am at capacity, do a resize. The outer loop looks linear, and it is in practice, but if you just do pattern matching on the two loops you get a quadratic bound. Would the potential function you were talking about allow you to handle this adequately? Is that correct?
>> Florian Zuleger: The case you're talking about does not fall into this abstraction, because the resize upper bound is not constant, but as an extension of this work we will be able to handle what you are suggesting.
>>: [indiscernible]
>> Florian Zuleger: I think what you are talking about is that the standard vector in the STL has this property: if you reach the limit, you double the size of the container and copy all elements, and that is expensive.
>>: But the amortized cost of adding an element is…
>> Florian Zuleger: Exactly. Is linear.
>>: But it's constant.
>> Florian Zuleger: Okay. It's constant.
>>: A single add could be linear if you happen to add on the resize, you know, when it hits capacity.
>> Florian Zuleger: Yeah. This is a case we have been working on.
>>: Only the first level [indiscernible] times. The second level [indiscernible]
>>: Yeah.
>> Florian Zuleger: Okay. I'll continue with the second talk. This talk is about online education. I guess you are familiar with the Pex4Fun project. Pex4Fun helps you to complete a programming assignment by giving you feedback. In Pex4Fun and related research projects, the main focus has been on functional correctness. We set out to study performance problems in these programming assignments, and we assume functionally correct programs as a starting point. In order to motivate the techniques, I will first tell you about our observations from the Pex4Fun data. If you have a given problem that students have to solve, there is typically a small number of different strategies for solving it. By strategy I mean the global, high-level insight the student uses to solve the problem. These strategies require different feedback; I'm going to give you an example on the next slide. However, the same strategy can have myriads of different implementations. By implementation I mean the low-level choices, like the programming constructs they use, and this is the part that's not relevant for feedback. As an example, I spent a lot of time studying solutions to the anagram problem.
In the anagram problem, the student has to write a program that, given two strings, determines if the two strings are anagrams. One way of doing this is to count for each of the letters how often it occurs and to compare these numbers. You can do this efficiently, basically in linear time. What the student ended up doing here is going over all of the characters in the string and then counting, again and again, how often each character occurs, which basically gives you a quadratic solution. The feedback you want to give the student is that he should calculate the number of characters in a preprocessing phase. A different strategy for solving this problem is sorting the two strings and then simply comparing them. Depending on the implementation this may be quadratic, but even if it's n log n, the point really is that this is inefficient and you want to tell the student about it. It turns out that sorting is very popular with the students; they understood sorting, so they apply it again and again whenever they feel there is an opportunity. Here the feedback would be that instead of sorting, they should compare the numbers of characters. That slide I want to skip.
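A sketch of the three strategies just described, as I would reconstruct them in C# (not actual student code; it assumes lowercase ASCII strings):

```csharp
using System;
using System.Linq;

class Anagrams
{
    // Quadratic strategy: recount a character's occurrences at every position.
    static bool QuadraticCount(string s, string t) =>
        s.Length == t.Length &&
        s.All(c => s.Count(x => x == c) == t.Count(x => x == c));

    // Linear strategy: count each letter once in a preprocessing phase.
    static bool LinearCount(string s, string t)
    {
        if (s.Length != t.Length) return false;
        var counts = new int[26];                // assumes lowercase 'a'..'z'
        foreach (char c in s) counts[c - 'a']++;
        foreach (char c in t) counts[c - 'a']--;
        return counts.All(c => c == 0);
    }

    // Sorting strategy: n log n at best, and unnecessary for this problem.
    static bool Sorting(string s, string t) =>
        string.Concat(s.OrderBy(c => c)) == string.Concat(t.OrderBy(c => c));

    static void Main() =>
        Console.WriteLine($"{QuadraticCount("listen", "silent")} " +
                          $"{LinearCount("listen", "silent")} " +
                          $"{Sorting("listen", "silent")}");
}
```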
Regarding different implementations: in C# you have the possibility of using Linq expressions, and the program you see on the right has exactly the same logic as the program on the left; it's simply written as a Linq statement, but it looks quite different. Also for sorting, you have the possibility to use built-in library functions, on the left, or you can implement your sorting algorithm yourself, and there are tons of different sorting algorithms, so it can look quite different. Now, coming back to the problem statement from before: how can we distinguish the different strategies while ignoring the different implementations?
>>: What would correspond to [indiscernible] in the last examples?
>> Florian Zuleger: The implementation detail is how you implement that way of counting. The strategy is that you count the number of characters, and you do it again and again; that is the strategy. However you implement it is basically an implementation detail. The same for sorting: the strategy is that you are sorting, and no matter how you sort, that's an implementation of that strategy.
>>: A little subtle there, because here on the left let's say you actually had an n log n sort and on the right you have a [indiscernible] search. Are you actually going to want to say there's a performance difference between these two, or do you just want to say this is the sorting-family category of solutions?
>> Florian Zuleger: That's a fair question. These things are kind of fuzzy, and it always depends on the goal of the teacher. With our techniques you are able to distinguish them if you want, or to not distinguish them and put them in one bucket if you want; the teacher should have the flexibility to do either. The problem statement is that we want to distinguish these strategies while ignoring the different implementations. Since you have seen the previous talk, you can guess how I started out: trying to use these performance-analysis techniques in this setting. But, as motivated by the examples you just saw, they may not be sufficient for what we want to achieve, because different strategies can have the same complexity, and then you will not be able to distinguish them by performance analysis; you can have two quadratic strategies. This also applies to dynamic approaches that just measure the performance, so this also might not help you. Another approach that is very popular in education and pursued by many people is machine learning; machine learning relies on recognizing syntactic patterns in the solution, and we feel (we have not tried it) that for this domain there are too many programming constructs and you will not be able to distinguish them. We then came up with something different: what we ended up doing is to specify strategies by means of the key values which appear during the computation. This is a dynamic approach: we execute programs and observe what is going on during the execution.
For these two examples, which implement the counting strategy, what we want to observe is the fact that they count: they go over each character, count how often it occurs, and do this more often than necessary. You see here the locations where the counting is happening, and in these two sequences you can see the line numbers where values occur, and in black the values; for the string s you get these value sequences. By executing the programs and only observing the value sequences, we get rid of all these implementation details and can just observe what is going on. This key-value idea is quite flexible. It allows us, for example, to see that the student is sorting, in a very simple fashion: we just have to observe at some point in time a sorted string during the execution, so the key value here would simply be the sorted string.
What do we do? I will now sketch the methodology of our approach. The student provides an implementation, which is what you've just seen. Then there is a teacher: the teacher writes a specification, and the teacher provides the test inputs to execute the program. We then execute the implementation and the specification, we get traces, and we compare these traces with this notion of trace embedding, which I am going to explain in a minute. How does the teacher specify the trace that he wants? The teacher simply has access to a special print statement which we call observe, and when the program is executed, the observe statement will simply append the value to the trace. This way the teacher is able to specify exactly the sequence of key values that he feels is important for the matching. For the student's program, we instrument the code before executing it, and we instrument it in such a way that we record basically every sub-expression: during the computation, when an expression is evaluated, we record the values of its sub-expressions and append them to the sequence. So we not only get the key values which we want to observe, but also additional values which appear during the execution of the program.
teacher and trace b we got from the student and now we want to match them. The central
notion of this approach is the notion of trace embedding. The notion of trace embedding, so
this determines whether this is a match or not. What we do is we try to make trace a a
subsequence of the trace b but for this we require that there is a mapping from the blue
program locations to the red program locations that is injective. Here if you map 3 to 2, if you
map 6 to 12 and if you map 11 to 3 then the specification trace is really a sub trace of this trace
b. Intuitively, what does this mean? This notion gives you that the same values appear in the
same order, so this comes from the subsequence requirement. Then the second requirement
that you have this injective mapping so this gives you that the same values appear on matching
locations. If values are generated at location 6 so all these aba are generated location six, then
aba must be generated at a location to which 6 is mapped, so at 12. This is a very strong
requirement. Why didn't we just simply go with a subsequence? This notion is inspired by the
notion of a simulation relation. In a simulation relation you associate locations and then when
you get there you want the same values so this is a dynamic version a simulation relation. We
did this because if you don't have this requirement you can match some garbage. If you have a
loop counter simply as it's going computing from 1 to n and if it's, you can find any value in
there and so sometimes recited imaginings that were not intended, but this a strong notion
works really well. The last thing I want to mention that's injective, injectivity does not allow
you to map two locations to the same locations. This is sometimes needed so we have an
extension. I will come to that later but it's rarely the case. Most of the time you are going to,
you really want to inject a [indiscernible]. How can we decide this? It turns out that deciding
this trace embedding notion is NP complete the so we have an algorithm that is fast and works
fast in practice. I will skip the algorithm. How do we imagine this technology to be used? We
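Since the algorithm is skipped in the talk, here is only a naive backtracking check of the embedding notion itself, to make the definition concrete (this is not the tool's algorithm):

```csharp
using System;
using System.Collections.Generic;

class TraceEmbedding
{
    // Does spec embed into impl? We search for a monotone matching of spec
    // events onto impl events with equal values, together with an injective,
    // consistent map from spec locations to impl locations.
    static bool Embeds((int Loc, string Val)[] spec, (int Loc, string Val)[] impl,
                       int si = 0, int ii = 0,
                       Dictionary<int, int> map = null, HashSet<int> used = null)
    {
        map ??= new Dictionary<int, int>();
        used ??= new HashSet<int>();
        if (si == spec.Length) return true;              // every spec event matched
        var (sl, sv) = spec[si];
        for (int k = ii; k < impl.Length; k++)
        {
            var (il, iv) = impl[k];
            if (sv != iv) continue;                      // values must agree
            bool fresh = !map.ContainsKey(sl);
            if (fresh && used.Contains(il)) continue;    // injectivity
            if (!fresh && map[sl] != il) continue;       // same location every time
            if (fresh) { map[sl] = il; used.Add(il); }
            if (Embeds(spec, impl, si + 1, k + 1, map, used)) return true;
            if (fresh) { map.Remove(sl); used.Remove(il); } // backtrack
        }
        return false;
    }

    static void Main()
    {
        var spec = new[] { (3, "a"), (6, "aba"), (3, "b") };
        var impl = new[] { (2, "a"), (12, "aba"), (2, "b"), (9, "x") };
        Console.WriteLine(Embeds(spec, impl)); // True via 3 -> 2, 6 -> 12
    }
}
```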
How do we imagine this technology to be used? We have the student, who writes an implementation, and we have the teacher, who maintains a set of specifications. There are specifications for inefficient implementations and specifications for efficient implementations; I will come to them in a minute. For now: when the student submits an implementation, it is matched against the set of specifications. If there is a match, the student is provided the feedback, and if it matches an efficient specification, the student is informed that he has done a good job. If there is no match, the teacher is notified that there is no feedback at the moment, and then the teacher can provide a new specification. This is how it could work in a [indiscernible], for example.
Now, coming back to the specifications, one interesting detail is the subsequence requirement. The subsequence requirement allows us to do partial matchings. In these two counting implementations, the implementation on the left returns false as soon as it detects that there is a different number of characters. On the right you have this All statement; I actually do not know for sure if it exits early or not, but let's assume that it goes over everything and compares all the character counts. So you now have two versions, one that breaks early and one that doesn't, and by having the subsequence requirement with a specification which is also partial, we match both of them. This is quite convenient: we can handle these two cases in one specification.
>>: [indiscernible] specification that's inefficient, but the most efficient of the [indiscernible]
>> Florian Zuleger: Yes. You can also do some other things but we should discuss this off-line.
In contrast to this, for efficient specifications we want to be more careful. For efficient specifications we have a coverage criterion, and the coverage criterion says that every loop should have a location that is matched; this makes sure that there is no loop without a matched location, so no loop escapes us. The second requirement, which can be specified by the teacher, is that if you add the modifier 'full' to the observe statement, then for this particular location you don't want a sub-trace but trace equality. By this you make sure that there is no additional stuff going on; you could use this, as in this case, if you want to ensure that everything is matched. Because this full modifier and this coverage criterion really require you to match every loop, in order to make it more convenient we provide a special cover statement which allows you to match loops with a specified number of iterations. This is an efficient specification, so let's understand what is going on. In the efficient specification you count the number of characters for both strings. You set up an array of size 26, one entry for every letter, and then you go over the string, and for each character that you see, you increment the corresponding counter. In the code the student writes, he has to do that for the second string as well, so he has to count the occurrences for the second string, and he can do this in the same loop or write an extra loop. But there is also a different implementation: for the second string, instead of incrementing, you can use the same array and simply decrement, and then check whether everything is 0 at the end. Because of these different ways, it's very convenient to observe the counting only for one string and then say the second loop may or may not be there. This specification allows you to match three consecutive loops, each with up to 26 iterations.
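The specification language's syntax isn't shown in the transcript, so the following is only a guess at its flavor, with stub Observe/Cover helpers standing in for the real constructs:

```csharp
using System;
using System.Collections.Generic;

class EfficientSpecSketch
{
    static readonly List<object> Trace = new();
    static void Observe(object value) => Trace.Add(value); // stub: record a key value
    static void Cover(int maxIterations) { }               // stub: permit a loop of up to maxIterations

    // Hypothetical teacher specification for the efficient counting strategy.
    static void Spec(string s, string t)
    {
        var counts = new int[26];        // one slot per lowercase letter
        foreach (char c in s)
        {
            counts[c - 'a']++;
            Observe(counts[c - 'a']);    // key values: the running counts for s
        }
        Cover(26);  // the work on t may be one loop (decrement variant) or
        Cover(26);  // two (increment plus comparison), so allow up to three
        Cover(26);  // consecutive loops of up to 26 iterations each
    }

    static void Main()
    {
        Spec("listen", "silent");
        Console.WriteLine(string.Join(" ", Trace));
    }
}
```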
>>: If you go over to the second string do you actually have to go not to 26 but [indiscernible]
>> Florian Zuleger: Yes. You are right, I made a mistake there. Instead of 26, as [indiscernible] is 26, you should add t.Length, and then it would work.
Now I want to come to some extensions: you have seen the core idea, and we provide some additional constructs. One challenge is how to deal with library functions. Suppose we again have the counting strategy of the student, and now he has implemented the counting by using the Split function; what should we do about it? For library functions you could go into the implementation of the library function, but it might not always be available, and you might not want to do that. What we do instead is this: if we see that a library function is called, we make the call itself a value. We record that this library function has been called and with which arguments. We extend our language to allow the teacher to specify that he wants to observe function calls, and with which arguments; so here, observing this function [indiscernible] on that string, you see in the sequence a value recording that the function has been called. This is how we deal with library functions.
how we deal with library functions. Other challenges symmetry and minor implementation
differences, so in this anagram problem you have like s and t but the student can simply switch
the role of s and t so that might cause you to right a second specification. What we can do is
that we provide a nondeterministic choice, so the teacher can write a specification where
nondeterministic way he swaps s and t in the beginning. The nondeterministic variables are
only decided once, though if you have a nondeterministic choices these will correspond 2n
similar specifications. This is just syntactic sugar and it allows you to conveniently specify
[indiscernible] specifications in one complex specification. Then there are other extensions.
One of them is one too many matching, so that you allow one location matches to several
locations and only one. And so the other thing is that we allow a teacher to specify how to
convert data, so for example, if you have a string, what some students do is they convert it to
an area of characters and so then we want to allow the teacher to specify that we want to
compare that. We also allow threads, so by thread I mean that you can specify what values are
independent of order, so in the definition I gave you and this definition who works almost all of
the time. We insist that the order is really the same, but for this case where you can swap s
and t because of the strings, so the order if you do s first-order t first doesn't really matter. You
could swap the order and another interesting extension is that if you want to observe iteration
over data abstractions and you have an iteration over a set, the order of iteration is
underspecified, so the set can choose which elements it should give in which order. But the
correctness of the program doesn't, cannot rely on this so we simply determine the order in
which we see the elements. We have implemented this approach so that it is implemented in
We have implemented this approach in C#, and the specifications and student implementations are also in C#. We studied three assignments from the Pex4Fun platform, with over 2000 correct implementations for them, and then we created our own course with 24 assignments, but we only managed to get 50 students to do it; we forced them as part of a course requirement. Our observation is that there is indeed a large number of inefficient implementations. For the anagram problem, 90 percent of the implementations are inefficient, and there are also 26 of these assignments which have more than one inefficient strategy, so you want to be able to distinguish between these strategies. Our tool is fairly precise. For the moment we have only evaluated it ourselves, so that is probably to be expected, but at least the approach is precise: we didn't get any false negatives, so everything that should match did match, but in five cases an implementation matched a specification which it shouldn't have matched. Specifications are fairly easy to write: they are typically the same size as an average implementation, and at most they get three times larger. We rely on inputs provided by the teacher, and it's nice that you really only need 1 to 2 inputs, and you don't need too many nondeterministic choices. The performance is really fast, which is important if you want to use it in an interactive setting as in education. The last slide is on the teacher effort, and it's really about how fast you are able to give feedback. In these charts you see on the left the number of specifications written by the teacher and how many implementations they match. The setting is that you are sitting in front of all of these student submissions and you start writing specifications. The teacher randomly gets one of these implementations and writes a specification for it, and that already matches like 200 or 300 solutions; then he gets the next unmatched implementation, writes a specification for that, and then you already match thousands of implementations. You see that with more and more specifications, and the refinements the teacher makes to the specifications, you saturate soon; most of the implementations received feedback very fast. On the right you can see the corresponding time; overall, the teacher had to look at only 3 percent of the implementations to write specifications covering all of them. We are hoping that we will be able to do a real case study. That concludes the second talk.
>> Sumit Gulwani: Great. [applause]. I know you had questions throughout but if anybody has
more questions…
>>: When the teacher writes a specification, how do we know that we only match the solutions that [indiscernible]
>> Florian Zuleger: This is an issue. For that, our notion of trace embedding is fairly restrictive, as I tried to motivate; it's really strong, and the teacher will have to take care there. The other thing is the setting: the students could have a button so they could say 'wrong specification' or 'the answer doesn't make sense'. That will help if the students would get a worse grade; if they get a better grade, they probably wouldn't complain. Those are my answers to that.
>> Sumit Gulwani: Okay. Thank you very much. [applause].
>> Florian Zuleger: Thanks.