>> Sumit Gulwani: Thanks everybody for coming today. I'm very happy to welcome our visitor,
Florian Zuleger who is visiting us from the Technical University of Vienna. He's done a fair bit of
work on bounds analysis, termination analysis and more recently some work on providing
feedback for programming education. I think he did an internship here in 2009, so it's good to
have him back and talking. I'll let you take it from there.
>> Florian Zuleger: Okay great. Thanks for coming to my talk. It's actually going to be two talks
about 25 minutes each, so I'll try to keep to that time. The first one is going to be on static analysis for termination and computing bounds, and the second talk will be about a recent project on online education where we try to give feedback on performance. Okay. So what is the problem we want to solve? We want to compute bounds. Bounds come in various forms. We want, for example, to compute bounds on how often a single statement is visited inside some function, and you can ask the same question for multiple statements: how often are they visited together. Or you can ask how often a loop is iterated. These questions are closely related to computing the complexity of a function. By bound we mean a symbolic expression: we want to compute a bound in terms of the inputs to the function. I'm not making this claim precise, but from a practical point of view I claim that all of these questions are closely related and can be reduced to one another. Why should you care? There are two main
motivations for me. One is verification. This is simply a more interesting problem than termination analysis alone, and sometimes you want to verify that certain resource requirements are met, for example for memory or bandwidth. But it's not just verification; I also have the feeling that computing bounds is a helpful technique for debugging, because you can use bound techniques as a static profiler. There have been papers on profiling recently where they want to compute, as a function of the inputs, how long it takes to execute a program, and this would be a static technique which complements classical profilers. Also, you can think of these techniques as helping you to explore unknown code. More from a technical point of
view, previous works on termination and bound analysis have been fairly complicated using
heavy machinery such as abstract interpretation, computer algebra tools and software model
checkers. I've been working on this for quite a while and studying a lot of actual code and I
believe that you don't always need to have this heavyweight machinery, so a goal of this project
is to come up with a simpler static analysis that is more scalable and more predictable. So I will
jump right into some code examples and give you a feeling for the kinds of bounds we want to
compute. Here is the absolute easiest case: you have just two nested for-loops, so the complexity is quadratic, and on the left you have the same program written as a while loop; here you could apply simple pattern matching techniques to get this information. Now I will continue with loops that look simple but where such techniques will not work. The easiest case is that you have inner loops which use the same counter: in the left loop you have the same counter i, so the overall complexity will be linear, and on the right you have a counter of the inner loop which is not reset in the outer loop, so the bound you get is also linear.
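To make the two patterns concrete, here is a small sketch in C#; this is a reconstruction of the kind of loops being described, not the code on the slides, and the Nondet helper stands in for the abstracted program logic:

```csharp
using System;

class LoopPatterns
{
    static readonly Random Rng = new Random();
    static bool Nondet() => Rng.Next(2) == 0; // stands in for abstracted program logic

    // Inner and outer loop share the same counter i: i only ever goes up,
    // so the total number of iterations is linear in n.
    static void SharedCounter(int n)
    {
        int i = 0;
        while (i < n)
        {
            while (i < n && Nondet()) i++; // inner loop uses the same counter i
            i++;
        }
    }

    // The inner counter j is never reset by the outer loop, so the inner
    // loop runs at most n times in total: the bound is again linear.
    static void NotReset(int n)
    {
        int j = 0;
        for (int i = 0; i < n; i++)
            while (j < n && Nondet()) j++; // j keeps its value across outer iterations
    }

    static void Main() { SharedCounter(10); NotReset(10); }
}
```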
Things get more interesting if you have a loop that uses two counters. Here you see a loop which has a counter i and a counter j, and on one loop path you decrement the one counter, on the other loop path the other counter. You see a question mark here: the question mark arises from program logic that is not relevant for the termination of the loop, so we assume it has been abstracted away, and the question mark stands for non-determinism. To make it more complicated, you can have such a loop where in the one branch you reset the counter of the other path. This somehow mimics a nested loop, but written in this form it's only a single loop, and you get a quadratic loop bound. What is kind of new in this project is that we consider counter increments instead of only counter resets. In the slides before, you have seen that we take this variable j and reset it to n; here we are interested in what happens if we simply increment that counter on the second branch, and that leads to an overall bound of 2n for executing the loop. You can see this by asking yourself: in each iteration, either I decrement i or I decrement j, and if I increment j, I decrement i at the same time. This can only happen n times, so the variable j is incremented at most n times, and since it is 0 at the beginning, you get a bound of 2n for this loop.
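A minimal sketch of this motivating loop, with my own guards added so it actually runs; the nondeterministic branch choice mirrors the question mark on the slide:

```csharp
using System;

class TwoCounterLoop
{
    static readonly Random Rng = new Random();
    static bool Nondet() => Rng.Next(2) == 0; // the '?' from the slide

    static void Main()
    {
        int n = 10, i = n, j = 0, iterations = 0;
        while (i > 0 || j > 0)
        {
            if (j > 0 && (i == 0 || Nondet()))
                j--;          // one branch decrements j
            else
            {
                i--;          // the other branch decrements i...
                j++;          // ...and increments j at the same time
            }
            iterations++;
        }
        // j starts at 0 and is incremented at most n times, so the loop
        // executes at most n + n = 2n times.
        Console.WriteLine($"{iterations} <= {2 * n}");
    }
}
```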
I want you to understand this example, because it is really the motivating example for the techniques to come. Here we have the same pattern that you saw in the example before. You have an inner loop with this counter j, and the counter j can be incremented in the outer loop, but j can only be incremented if i is decremented at the same time. You can again ask the question: how often can I increment j? Only as often as I decrement i, and because that can only happen n times, j can only be incremented n times, so the bound of the inner loop is n. These loops actually occur in practice; we found them in parsing routines, to give you a feeling for where this happens.
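Here is a sketch of that nested pattern (again mine, not the slide code): the inner loop's total work is bounded by the number of times the outer loop increments j, which is at most n:

```csharp
using System;

class AmortizedNestedLoop
{
    static readonly Random Rng = new Random();
    static bool Nondet() => Rng.Next(2) == 0;

    static void Main()
    {
        int n = 10, j = 0, innerTotal = 0;
        for (int i = n; i > 0; i--)
        {
            if (Nondet()) j++;   // j is incremented only when i is decremented
            while (j > 0 && Nondet())
            {
                j--;             // the inner loop consumes what the outer loop produced
                innerTotal++;
            }
        }
        // j starts at 0 and is incremented at most n times, so the inner
        // loop body executes at most n times across all outer iterations.
        Console.WriteLine($"{innerTotal} <= {n}");
    }
}
```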
Why is this an interesting problem, or what is the real problem we are studying here? It's amortized analysis; this is really amortized complexity analysis. Let me recall what is meant by amortized analysis. Tarjan introduced this notion with the example of a stack that has two operations: you either push an element onto the stack, or you have a pop-many operation that removes a number of elements from the stack. Pushing an element onto the stack is simply a constant-time operation, and pop-many is linear because it can remove several elements. If you have n operations and you do a naive analysis, you get a complexity of n squared, but if you take into account only how many elements you actually put on the stack and use potential functions, as Tarjan did, then n operations will have linear complexity.
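As a small illustration of Tarjan's stack example (my own code, under the assumption of random push and pop-many operations):

```csharp
using System;
using System.Collections.Generic;

class StackAmortized
{
    static void Main()
    {
        var stack = new Stack<int>();
        var rng = new Random();
        int n = 1000, popWork = 0;
        for (int op = 0; op < n; op++)
        {
            if (rng.Next(2) == 0)
            {
                stack.Push(op);                    // push: constant time
            }
            else
            {
                int k = rng.Next(stack.Count + 1); // pop-many: remove k elements
                for (int t = 0; t < k; t++)
                {
                    stack.Pop();
                    popWork++;                     // each unit of pop work removes one pushed element
                }
            }
        }
        // An element can only be popped after it was pushed, so the total
        // pop work is bounded by the number of pushes: O(n) for n operations,
        // i.e., amortized constant per operation.
        Console.WriteLine($"pop work {popWork} <= pushes <= {n}");
    }
}
```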
So what you have seen is really the program which corresponds to the stack example: you can interpret j++ as pushing an element onto the stack, and you can interpret the inner loop as removing some number of elements from the stack. It's really the stack example that this pattern comes from, and computing that bound is a kind of amortized reasoning. That is the problem we want to solve. Now I'm going to tell you about the backend we are using for our analysis, which is vector addition systems. For vector addition systems we generate lexicographic ranking functions, which show termination, and from these lexicographic ranking functions we then compute bounds. What are vector addition systems? You can see a vector addition system down on the right. There we have states; they correspond to the program locations in the program. And then we have transitions. The transitions relate the variables to the primed variables that correspond to the next state, and this transition should be read as: the new value of i is at most the old value of i, and the new value of j is at most the old value plus 1. So what does the lossy mean? The lossy means that we have inequalities instead of equalities. The abstraction is that you can always lose value, but you cannot gain more than stated. The important thing about…
>>: [indiscernible] both of those or [indiscernible]
>> Florian Zuleger: Okay. In the standard notion of vector addition systems you have equality here, and the lossy version is something weaker, where you replace all the equalities by inequalities.
>>: Okay.
>> Florian Zuleger: This is the general form. The transitions always look like this: here you can only have a constant, with a plus or a minus, and variables take values over the natural numbers, which is very important. Okay. I'm going to tell you later a little bit about how we get vector addition systems, but for now I want to appeal to your intuition that we can get vector addition systems out of these programs, and they will look as follows. On the left you see the example from the beginning. We have two paths going from the loop header back to the loop header. On one path we have that j is incremented and i is decremented, and because we don't have equality but only less-than-or-equal, we can abstract them like this; this will be a sound abstraction. On top I have additionally added the initial values. This is the vector addition system we want to get for this program. Now we look at the next program, where we additionally have the reset, and we are abstracting this reset as follows. Why is this a sound abstraction? Here you lose some precision by using inequality instead of equality. Because we know that in a vector addition system variables are always non-negative (that is why we take values over the natural numbers), I can have j' ≤ n + j: I approximate the equality j' = n by an inequality, and I can add the + j because j is always non-negative. This is the reason why we need lossiness. You might have noticed that this is in fact not a plain vector addition system, because I have a variable n here, but we are using an extension of vector addition systems where we require that n is a constant throughout the program; it's not allowed to change. Here we have another program where we increment j on the second branch, so we can simply model it like this. Then the third program is the most interesting one: this is the vector addition system that I showed you before, plus the initial values, so let's understand why it corresponds to the program on the left. We have the loop header of the outer loop and the loop header of the inner loop. There is one branch, the if branch, where we go back without visiting the inner loop, and on the second branch we visit the inner loop; this transition is the inner loop. (Actually these arrows shouldn't be there.) Then we simply go back. This is the abstraction for this program.
The first step which we do during the analysis is that we create lexicographic ranking functions. All three programs which you see down there have the lexicographic ranking function (i, j), so I don't need lexicographic ranking functions in full generality; let me just repeat very shortly what this means. This is the lexicographic ranking function for this case. It means that on each transition either the variable i is decreased, or the variable j is decreased and i is not increased. These are the two conditions which I need, and they ensure termination: either i gets smaller, or i is not increased but at least j decreases, so these conditions ensure that we will terminate. You can see that this holds for the systems here: on the transition on the right you see that i decreases, and on the second transition you see that i is not increased but j is decreased. On the first transition we might increase j, which is allowed; we will shortly use the information by how much j is increased to compute bounds, but the ranking function already ensures that we will terminate. To compute lexicographic ranking functions we do something very simple, which we found to be sufficient for our analysis in practice: a simple greedy algorithm. We look for a variable that decreases on one transition and is not increased on any other transition. This would be the variable i, for example. We take this variable and make it the first component of the lexicographic ranking function, and then we remove this transition: on that transition i is decreasing and on the other transition it is not increasing, so we can make it the first component. Then we simply repeat. Now we have one transition left over, so we pick the variable j, append it to the lexicographic ranking function, and remove the transition. There is no transition left and we are done.
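A sketch of this greedy construction as I understand it from the talk; transitions are represented simply as maps from variable names to the increment c in x' ≤ x + c, and all names are my own:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class GreedyRanking
{
    // Each transition maps a variable to its increment c in x' <= x + c.
    static List<string> LexRanking(List<Dictionary<string, int>> transitions)
    {
        var ranking = new List<string>();
        var remaining = new List<Dictionary<string, int>>(transitions);
        while (remaining.Count > 0)
        {
            bool progress = false;
            foreach (var t in remaining)
            {
                // Find a variable that t decreases and that no other
                // remaining transition increases.
                string v = t.Where(kv => kv.Value < 0).Select(kv => kv.Key)
                            .FirstOrDefault(name => remaining.All(
                                u => u == t || u.GetValueOrDefault(name, 0) <= 0));
                if (v != null)
                {
                    ranking.Add(v);      // becomes the next lexicographic component
                    remaining.Remove(t); // t is now accounted for
                    progress = true;
                    break;
                }
            }
            if (!progress) throw new Exception("no lexicographic ranking found");
        }
        return ranking;
    }

    static void Main()
    {
        // i' <= i - 1, j' <= j + 1  and  j' <= j - 1 from the running example.
        var t1 = new Dictionary<string, int> { ["i"] = -1, ["j"] = +1 };
        var t2 = new Dictionary<string, int> { ["j"] = -1 };
        var ts = new List<Dictionary<string, int>> { t1, t2 };
        Console.WriteLine(string.Join(", ", LexRanking(ts))); // prints: i, j
    }
}
```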
Now, how can we compute bounds for these lossy vector addition systems? The idea for computing bounds is stated by these two expressions. The important property of lexicographic ranking functions is that for each transition we have one component in the lexicographic ranking function, and we will use this information for computing bounds. The transition t1, which I marked in blue, the left one, is associated with the component i, since i is the first component of the lexicographic ranking function. If we want to compute how often this transition can be executed, we simply have to take the initial value of i. Why is that? We know that on this transition i is decreased and no other transition can increase i, so this means we can execute it at most initial-value-of-i often. Now for the second transition: how often can it be executed? It can of course be executed as often as the initial value of j allows. But the first transition might have increased the variable j, so we have to add how often j can be increased and by how much; this is what we are doing with the second expression. We add how often it is increased, times by how much. These are the update expressions. The good thing is that we already know how often these increases can happen, because the increase sits on a transition whose bound we have already computed. This process can be repeated, so this is actually a recursive formula, and it can be extended to any number of transitions. The nice thing here, which I like, is that you are relying on the termination proof to ensure that this process terminates, because the termination proof gives you an order on the transitions, which ensures that there is no [indiscernible] here.
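Continuing that sketch, a possible rendering of the recursive bound formula (my reconstruction, not the paper's exact algorithm): transitions are processed in ranking order, so the bounds of all transitions that can increase the current component are already known:

```csharp
using System;
using System.Collections.Generic;

class BoundComputation
{
    // bounds[k] = initial(v_k) + sum over already-bounded transitions m of
    //             bounds[m] * (how much transition m increases v_k).
    // Assumes transitions[k] is the transition associated with ranking[k].
    static List<long> Bounds(List<Dictionary<string, int>> transitions,
                             List<string> ranking,
                             Dictionary<string, long> initial)
    {
        var bounds = new List<long>();
        for (int k = 0; k < transitions.Count; k++)
        {
            string v = ranking[k];
            long b = initial.GetValueOrDefault(v, 0);
            for (int m = 0; m < k; m++)
            {
                int inc = transitions[m].GetValueOrDefault(v, 0);
                if (inc > 0) b += bounds[m] * inc; // each run of m raises v by inc
            }
            bounds.Add(b);
        }
        return bounds;
    }

    static void Main()
    {
        var t1 = new Dictionary<string, int> { ["i"] = -1, ["j"] = +1 };
        var t2 = new Dictionary<string, int> { ["j"] = -1 };
        var ts = new List<Dictionary<string, int>> { t1, t2 };
        var ranking = new List<string> { "i", "j" };
        var initial = new Dictionary<string, long> { ["i"] = 10, ["j"] = 0 };
        // bound(t1) = 10, bound(t2) = 0 + 10 * 1 = 10: total 2n for n = 10.
        Console.WriteLine(string.Join(", ", Bounds(ts, ranking, initial)));
    }
}
```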
>>: Can I ask you a quick question?
>> Florian Zuleger: Yes.
>>: It seems to me that in the previous slide you introduced a heuristic for selecting the
lexicographic order that you pick.
>> Florian Zuleger: So it's a greedy algorithm; we just take whatever qualifies first.
>>: And in this case the actual bounds you get are dependent on [indiscernible], so in theory there could be multiple orders which lead to multiple different bounds, some of which might be better than others. Is there any idea, I mean, does it seem to make a difference in practice?
>> Florian Zuleger: That's a great observation, but in practice it's rarely the case that you have more than one choice, so this is why it works. I looked for this because I wanted to know: there are very few examples where you have multiple choices, and then you can take the minimum of the bounds.
>>: I mean in those cases do you see a large variation or is it like a constant factor difference?
>> Florian Zuleger: I have found really just three or four cases in the whole benchmark where you even have different variables: you have two counters which somehow represent the same data, and then you get this. Okay. Here are the update expressions, which are marked here in red for your convenience. Now we can simply apply this formula, and you get the different bounds that you have seen before. You can see on the left that you get a linear bound because j is not increased. Here you get a linear bound because j is only increased by one at a time and the initial value is 0. And here you get [indiscernible] bounds by this formula. Okay. I told you what we do for loops that don't have inner loops.
What remains is to deal with inner loops. This is the last example: how can we handle loops at different locations? Handling loops at different locations is very difficult, and previous approaches, my own included, have summarized the inner loop: we computed a relation as a summary of the inner loop, and this is already very hard in general, and we even needed several kinds of summaries of inner loops, so I wanted to do something different. My PhD student had the idea of simply merging inner and outer loops. When he told me this the first time, I thought: you cannot do this; what is the semantics of this? You can see what has happened: we have taken this inner loop and simply attached it up there, so we have the same transition here, and for this left box we have just concatenated the path back. Because of this very simple abstract model of vector addition systems, it turns out that you can do this: the only thing a transition can do is add, and addition is commutative, and for that reason we have a soundness theorem that establishes that the bounds we get are sound. For this example we then get this lexicographic ranking function, which has three components, and then we can compute the bound of 2n; this is how we do the analysis. If you think about it a little bit, you can interpret this lexicographic ranking function as a kind of multidimensional potential function, to connect back to the amortized analysis. This merging idea is a new contribution of our work.
>>: Can I ask you one more question?
>> Florian Zuleger: Sure.
>>: Can you explain this ranking? So you have i, i, j. So what does it mean that i appears twice?
Or am I missing something obvious?
>> Florian Zuleger: Sure. During our ranking function computation we create one component of the lexicographic ranking function per transition, so when we pick the variable i we only remove one transition; later we may remove the other transition with i again. This seems a bit unnecessary, but we make use of it during the bound computation, so that we have one component for each transition. You could probably do something smarter, but it's not needed; during the bound computation we use the fact that you have the same variable. It's not so important technically.
>>: So basically in the comparison it's less-than all the time, but you just have it for an intermediate bound?
>> Florian Zuleger: Yeah.
>>: Okay.
>> Florian Zuleger: Just to make the formulation of the algorithm easy. And you can use that fact during the bound computation to get better results.
>>: [indiscernible] example [indiscernible] function of just [indiscernible] because we don't
even need j, right?
>> Florian Zuleger: The thing is that there is an error, as I told you before: on this uppermost transition it should only be i' ≤ i, not i' ≤ i - 1. Sorry about that. Okay.
>>: What allows you to do this [indiscernible] when, in general, [indiscernible] loops?
>> Florian Zuleger: We have a soundness claim for [indiscernible] vector addition systems and for our bound algorithm. In general, you have lots of variables and arithmetic operations, so if you just take two different loops which are in different places in the program and you merge them, you will increase the possible executions. It's not so clear that what you are doing then is sound, so it's really a property of this simple abstraction.
>>: [indiscernible] increase the set of [indiscernible]
>> Florian Zuleger: It does increase it a little bit, but the property we are really establishing is that the combination of the abstraction and the way we compute bounds is sound. I can tell you more afterwards if you are interested. What is important to say is: what are the variables of the vector addition system? In all of the examples you have seen, there was a one-to-one correspondence between the variables in the program and the variables in the vector addition system. But in general, you shouldn't think of these variables as program variables, but rather as norms on the program state, that is, any expression on the state which gives you a number. You could use, for example, the height of a tree, or the length of a list, or any arithmetic expression over those entities. Any expression that you think is useful, you can use. What we're doing in our implementation is that we use expressions which show that one particular path in a loop makes them smaller, so that this path cannot be repeated forever. We simply take expressions from the loop conditions and check whether they get smaller. This is what we call a local ranking function, and it is an easy heuristic to take expressions from the program's conditions. Then, to make sure that they are natural numbers, we take the maximum of the expression and 0.
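As a tiny illustration (my own) of such a norm: from the loop condition i < n one can take the expression n - i and clamp it at 0 so it ranges over the natural numbers; the analysis then checks that some loop path decreases it:

```csharp
using System;

class Norms
{
    // Candidate local ranking function extracted from the condition "i < n".
    static long Norm(long n, long i) => Math.Max(n - i, 0);

    static void Main()
    {
        long n = 5, i = 0;
        while (i < n)
        {
            long before = Norm(n, i);
            i++;                                            // the loop path under test
            Console.WriteLine($"{before} -> {Norm(n, i)}"); // strictly decreases
        }
    }
}
```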
I don't have time to tell you how we then really get the abstracted transitions, but we don't do anything fancy. This is the main idea behind our tool, which we call Loopus; it's implemented on top of LLVM, and it uses Z3 for deciding whether paths are feasible and for the abstraction. We have evaluated our tool on the cBench benchmark; it's a compiler benchmark, but the interesting thing for us is that it contains standard Linux programs like [indiscernible] and JPEG algorithms. It's around 200,000 lines of code and around 4000 loops, and we can compute bounds for 78 percent of the loops. We make some optimistic assumptions for data structures, so we've got [indiscernible], but we have a fair trust in all [indiscernible]s. We need less than 40 minutes of analysis time, and only a few of the loops actually need more than 10 seconds of analysis time; our earlier tool needed 13 hours, and in our opinion this improvement is really due to the abstraction for inner loops, because the relational summaries are very costly. Okay.
>>: How do you [indiscernible]
>> Florian Zuleger: This is a question which we still cannot answer satisfactorily. What we plan to do at some point is the following: this benchmark comes with test cases, so we could run the test cases, count the number of loop iterations, and then compare. But at the moment I can't say more than that we looked at a sample of them and checked what we could.
>>: [indiscernible] bounds, those that were most [indiscernible]
>> Florian Zuleger: I have a slide on this, so let me look at that. It's mostly due to the fact that we cannot find a local ranking function for even one path in the loop. We sampled 50 loops. Some of the failures are due to things we don't model, like bitwise operations, [indiscernible] function calls and inlining, and sometimes we don't model unsigned integers. Then there is the class where termination is conditional or cannot be proven; out of these 50 there are 27 cases where you could combine our analysis with stronger [indiscernible] techniques for generating invariants. We consider this a problem that is [indiscernible] to what we are doing, because there is no general reason why our techniques would not work if you had the knowledge from stronger arithmetic invariants.
>>: [indiscernible] or can you actually do it [indiscernible]
>> Florian Zuleger: Sorry?
>>: When you compute the bound [indiscernible]
>> Florian Zuleger: Yeah.
>>: Do you consider procedure [indiscernible] inside the loop, or do you abstract them in some other way?
>> Florian Zuleger: Yeah, we abstract them. Either we inline them or we abstract them. If there is a function which is handling a global variable, then we might even be [indiscernible].
>>: Are you saying that [indiscernible] essentially use the expressions for invariants in the
rankings?
>> Florian Zuleger: That depends. Sometimes you need to know that an expression is bounded from below, and then you would use an invariant for this expression. Or, for example, if you have x = x + n, then you would want to know that n is positive, so that you know that x is going up; that's another case where you need invariants. Okay. Do you have some other questions?
>>: I guess I have one. I like the potential function setup you have, because of a case that I think [indiscernible]: I have a loop that is copying some collection into a list, and the list has a case where it says, if I am at capacity, do a resize. The outer loop looks linear, and it is in practice, but if you just do pattern matching on the two loops you get a quadratic bound. Would the potential function you were talking about allow you to handle this adequately? Is that correct?
>> Florian Zuleger: The case you're talking about does not fall into this abstraction, because the resize upper bound is not constant, but as an extension of this work we will be able to handle what you are suggesting.
>>: [indiscernible]
>> Florian Zuleger: I think what you are talking about is that the standard vector in the STL has this property: if you reach the limit, you double the size of the container and copy all elements, and that is expensive.
>>: But the amortized cost of adding an element is…
>> Florian Zuleger: Exactly. Is linear.
>>: But it's constant.
>> Florian Zuleger: Okay. It's constant.
>>: A single add could be linear if you happen to add on the resize, you know, when it hits capacity.
>> Florian Zuleger: Yeah. This is a case we have been working on.
>>: Only the first level [indiscernible] times. The second level [indiscernible]
>>: Yeah.
>> Florian Zuleger: Okay. I'll continue with the second talk. This talk is about online education. I guess you are familiar with the Pex4Fun project. Pex4Fun helps you to complete a programming assignment by giving you feedback. In Pex4Fun and related research projects, the main focus has been on functional correctness. We set out to study performance problems in these programming assignments, and we assume functionally correct programs as a starting point. In order to motivate the techniques, I will first tell you about our observations from the Pex4Fun data. If you have a given problem that students have to solve, there is typically a small number of different strategies for solving it. By strategy I mean the global, high-level insight the student uses to solve the problem. These strategies require different feedback; I'm going to give you an example on the next slide. However, the same strategy can have myriads of different implementations. By implementation I mean the low-level choices, like the programming constructs they use, and this is the part that's not relevant for feedback. As an example, I spent a lot of time studying solutions to the anagram problem.
In the anagram problem, the student has to write a program that, given two strings, determines if the two strings are anagrams. One way of doing this is to count for each of the letters how often it occurs and to compare these numbers. You can do this efficiently, basically in linear time. What the student ended up doing here is going over all of the characters in the string and then counting, again and again, how often each character occurs, which basically gives you a quadratic solution. The feedback you want to give the student is that he should calculate the number of characters in a preprocessing phase. A different strategy for solving this problem is sorting the two strings and then simply comparing them. Depending on the implementation this may be quadratic, but even if it's n log n, the point really is that this is inefficient and you want to tell the student about it. It turns out that sorting is very popular with the students; they understood sorting, so they apply it again and again whenever they feel there is an opportunity. Here the feedback would be that instead of sorting, they should compare the numbers of characters. That slide I want to skip.
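A sketch of the three strategies just described, as I would reconstruct them in C# (not actual student code; it assumes lowercase ASCII strings):

```csharp
using System;
using System.Linq;

class Anagrams
{
    // Quadratic strategy: recount a character's occurrences at every position.
    static bool QuadraticCount(string s, string t) =>
        s.Length == t.Length &&
        s.All(c => s.Count(x => x == c) == t.Count(x => x == c));

    // Linear strategy: count each letter once in a preprocessing phase.
    static bool LinearCount(string s, string t)
    {
        if (s.Length != t.Length) return false;
        var counts = new int[26];                // assumes lowercase 'a'..'z'
        foreach (char c in s) counts[c - 'a']++;
        foreach (char c in t) counts[c - 'a']--;
        return counts.All(c => c == 0);
    }

    // Sorting strategy: n log n at best, and unnecessary for this problem.
    static bool Sorting(string s, string t) =>
        string.Concat(s.OrderBy(c => c)) == string.Concat(t.OrderBy(c => c));

    static void Main() =>
        Console.WriteLine($"{QuadraticCount("listen", "silent")} " +
                          $"{LinearCount("listen", "silent")} " +
                          $"{Sorting("listen", "silent")}");
}
```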
Regarding different implementations: in C# you have the possibility of using Linq expressions, and the program you see on the right has exactly the same logic as the program on the left; it's simply written as a Linq statement, but it looks quite different. Also for sorting, you have the possibility to use built-in library functions, on the left, or you can implement your sorting algorithm yourself, and there are tons of different sorting algorithms, so it can look quite different. Now, coming back to the problem statement from before: how can we distinguish the different strategies while ignoring the different implementations?
>>: What would correspond to [indiscernible] in the last examples?
>> Florian Zuleger: The implementation detail is how you implement that way of counting. The strategy is that you count the number of characters, and you do it again and again; that is the strategy. However you implement it is basically an implementation detail. The same for sorting: the strategy is that you are sorting, and no matter how you sort, that's an implementation of that strategy.
>>: A little subtle there, because here on the left let's say you actually had an n log n sort and on the right you have a [indiscernible] search. Are you actually going to want to say there's a performance difference between these two, or do you just want to say this is the sorting-family category of solutions?
>> Florian Zuleger: That's a fair question. These things are kind of fuzzy, and it always depends on the goal of the teacher. With our techniques you are able to distinguish them if you want, or to not distinguish them and put them in one bucket if you want; the teacher should have the flexibility to do either. The problem statement is that we want to distinguish these strategies while ignoring the different implementations. Since you have seen the previous talk, you can guess how I started out: trying to use these performance-analysis techniques in this setting. But, as motivated by the examples you just saw, they may not be sufficient for what we want to achieve, because different strategies can have the same complexity, and then you will not be able to distinguish them by performance analysis; you can have two quadratic strategies. This also applies to dynamic approaches that just measure the performance, so this also might not help you. Another approach that is very popular in education and pursued by many people is machine learning; machine learning relies on recognizing syntactic patterns in the solution, and we feel (we have not tried it) that for this domain there are too many programming constructs and you will not be able to distinguish them. We then came up with something different: what we ended up doing is to specify strategies by means of the key values which appear during the computation. This is a dynamic approach: we execute programs and observe what is going on during the execution.
For these two examples, which implement the counting strategy, what we want to observe is the fact that they count: they go over each character, count how often it occurs, and do this more often than necessary. You see here the locations where the counting is happening, and in these two sequences you can see the line numbers where values occur, and in black the values; for the string s you get these value sequences. By executing the programs and only observing the value sequences, we get rid of all these implementation details and can just observe what is going on. This key-value idea is quite flexible. It allows us, for example, to see that the student is sorting, in a very simple fashion: we just have to observe at some point in time a sorted string during the execution, so the key value here would simply be the sorted string.
What do we do? I will now sketch the methodology of our approach. The student provides an implementation, which is what you've just seen. Then there is a teacher: the teacher writes a specification, and the teacher provides the test inputs to execute the program. We then execute the implementation and the specification, we get traces, and we compare these traces with this notion of trace embedding, which I am going to explain in a minute. How does the teacher specify the trace that he wants? The teacher simply has access to a special print statement which we call observe, and when the program is executed, the observe statement will simply append the value to the trace. This way the teacher is able to specify exactly the sequence of key values that he feels is important for the matching. For the student's program, we instrument the code before executing it, and we instrument it in such a way that we record basically every sub-expression: during the computation, when an expression is evaluated, we record the values of its sub-expressions and append them to the sequence. So we not only get the key values which we want to observe, but also additional values which appear during the execution of the program.
teacher and trace b we got from the student and now we want to match them. The central
notion of this approach is the notion of trace embedding. The notion of trace embedding, so
this determines whether this is a match or not. What we do is we try to make trace a a
subsequence of the trace b but for this we require that there is a mapping from the blue
program locations to the red program locations that is injective. Here if you map 3 to 2, if you
map 6 to 12 and if you map 11 to 3 then the specification trace is really a sub trace of this trace
b. Intuitively, what does this mean? This notion gives you that the same values appear in the
same order, so this comes from the subsequence requirement. Then the second requirement
that you have this injective mapping so this gives you that the same values appear on matching
locations. If values are generated at location 6 so all these aba are generated location six, then
aba must be generated at a location to which 6 is mapped, so at 12. This is a very strong
requirement. Why didn't we just simply go with a subsequence? This notion is inspired by the
notion of a simulation relation. In a simulation relation you associate locations and then when
you get there you want the same values so this is a dynamic version a simulation relation. We
did this because if you don't have this requirement you can match some garbage. If you have a
loop counter simply as it's going computing from 1 to n and if it's, you can find any value in
there and so sometimes recited imaginings that were not intended, but this a strong notion
works really well. The last thing I want to mention that's injective, injectivity does not allow
you to map two locations to the same locations. This is sometimes needed so we have an
extension. I will come to that later but it's rarely the case. Most of the time you are going to,
you really want to inject a [indiscernible]. How can we decide this? It turns out that deciding
this trace embedding notion is NP complete the so we have an algorithm that is fast and works
fast in practice. I will skip the algorithm. How do we imagine this technology to be used? We
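Since the algorithm is skipped in the talk, here is only a naive backtracking check of the embedding notion itself, to make the definition concrete (this is not the tool's algorithm):

```csharp
using System;
using System.Collections.Generic;

class TraceEmbedding
{
    // Does spec embed into impl? We search for a monotone matching of spec
    // events onto impl events with equal values, together with an injective,
    // consistent map from spec locations to impl locations.
    static bool Embeds((int Loc, string Val)[] spec, (int Loc, string Val)[] impl,
                       int si = 0, int ii = 0,
                       Dictionary<int, int> map = null, HashSet<int> used = null)
    {
        map ??= new Dictionary<int, int>();
        used ??= new HashSet<int>();
        if (si == spec.Length) return true;              // every spec event matched
        var (sl, sv) = spec[si];
        for (int k = ii; k < impl.Length; k++)
        {
            var (il, iv) = impl[k];
            if (sv != iv) continue;                      // values must agree
            bool fresh = !map.ContainsKey(sl);
            if (fresh && used.Contains(il)) continue;    // injectivity
            if (!fresh && map[sl] != il) continue;       // same location every time
            if (fresh) { map[sl] = il; used.Add(il); }
            if (Embeds(spec, impl, si + 1, k + 1, map, used)) return true;
            if (fresh) { map.Remove(sl); used.Remove(il); } // backtrack
        }
        return false;
    }

    static void Main()
    {
        var spec = new[] { (3, "a"), (6, "aba"), (3, "b") };
        var impl = new[] { (2, "a"), (12, "aba"), (2, "b"), (9, "x") };
        Console.WriteLine(Embeds(spec, impl)); // True via 3 -> 2, 6 -> 12
    }
}
```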
How do we imagine this technology to be used? We have the student, who writes an implementation, and we have the teacher, who maintains a set of specifications. There are specifications for inefficient implementations and specifications for efficient implementations; I will come to them in a minute. For now: when the student submits an implementation, it is matched against the set of specifications. If there is a match, the student is provided the feedback, and if it matches an efficient specification, the student is informed that he has done a good job. If there is no match, the teacher is notified that there is no feedback at the moment, and then the teacher can provide a new specification. This is how it could work in a [indiscernible], for example.
Now, coming back to the specifications, one interesting detail is the subsequence requirement. The subsequence requirement allows us to do partial matchings. In these two counting implementations, the implementation on the left returns false as soon as it detects that there is a different number of characters. On the right you have this All statement; I actually do not know for sure if it exits early or not, but let's assume that it goes over everything and compares all the character counts. So you now have two versions, one that breaks early and one that doesn't, and by having the subsequence requirement with a specification which is also partial, we match both of them. This is quite convenient: we can handle these two cases in one specification.
>>: [indiscernible] specification that's inefficient, but the most efficient of the [indiscernible]
>> Florian Zuleger: Yes. You can also do some other things but we should discuss this off-line.
In contrast to this, for efficient specifications we want to be more careful. For efficient specifications we have a coverage criterion, and the coverage criterion says that every loop should have a location that is matched; this makes sure that there is no loop without a matched location, so no loop escapes us. The second requirement, which can be specified by the teacher, is that if you add the modifier 'full' to the observe statement, then for this particular location you don't want a sub-trace but trace equality. By this you make sure that there is no additional stuff going on; you could use this, as in this case, if you want to ensure that everything is matched. Because this full modifier and this coverage criterion really require you to match every loop, in order to make it more convenient we provide a special cover statement which allows you to match loops with a specified number of iterations. This is an efficient specification, so let's understand what is going on. In the efficient specification you count the number of characters for both strings. You set up an array of size 26, one entry for every letter, and then you go over the string, and for each character that you see, you increment the corresponding counter. In the code the student writes, he has to do that for the second string as well, so he has to count the occurrences for the second string, and he can do this in the same loop or write an extra loop. But there is also a different implementation: for the second string, instead of incrementing, you can use the same array and simply decrement, and then check whether everything is 0 at the end. Because of these different ways, it's very convenient to observe the counting only for one string and then say the second loop may or may not be there. This specification allows you to match three consecutive loops, each with up to 26 iterations.
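The specification language's syntax isn't shown in the transcript, so the following is only a guess at its flavor, with stub Observe/Cover helpers standing in for the real constructs:

```csharp
using System;
using System.Collections.Generic;

class EfficientSpecSketch
{
    static readonly List<object> Trace = new();
    static void Observe(object value) => Trace.Add(value); // stub: record a key value
    static void Cover(int maxIterations) { }               // stub: permit a loop of up to maxIterations

    // Hypothetical teacher specification for the efficient counting strategy.
    static void Spec(string s, string t)
    {
        var counts = new int[26];        // one slot per lowercase letter
        foreach (char c in s)
        {
            counts[c - 'a']++;
            Observe(counts[c - 'a']);    // key values: the running counts for s
        }
        Cover(26);  // the work on t may be one loop (decrement variant) or
        Cover(26);  // two (increment plus comparison), so allow up to three
        Cover(26);  // consecutive loops of up to 26 iterations each
    }

    static void Main()
    {
        Spec("listen", "silent");
        Console.WriteLine(string.Join(" ", Trace));
    }
}
```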
>>: If you go over to the second string do you actually have to go not to 26 but [indiscernible]
>> Florian Zuleger: Yes. You are right, I made a mistake there. Instead of 26, as [indiscernible] is 26, you should add t.Length, and then it would work.
Now I want to come to some extensions: you have seen the core idea, and we provide some additional constructs. One challenge is how to deal with library functions. Suppose we again have the counting strategy of the student, and now he has implemented the counting by using the Split function; what should we do about it? For library functions you could go into the implementation of the library function, but it might not always be available, and you might not want to do that. What we do instead is this: if we see that a library function is called, we make the call itself a value. We record that this library function has been called and with which arguments. We extend our language to allow the teacher to specify that he wants to observe function calls, and with which arguments; so here, observing this function [indiscernible] on that string, you see in the sequence a value recording that the function has been called. This is how we deal with library functions.
how we deal with library functions. Other challenges symmetry and minor implementation
differences, so in this anagram problem you have like s and t but the student can simply switch
the role of s and t so that might cause you to right a second specification. What we can do is
that we provide a nondeterministic choice, so the teacher can write a specification where
nondeterministic way he swaps s and t in the beginning. The nondeterministic variables are
only decided once, though if you have a nondeterministic choices these will correspond 2n
similar specifications. This is just syntactic sugar and it allows you to conveniently specify
[indiscernible] specifications in one complex specification. Then there are other extensions.
One of them is one too many matching, so that you allow one location matches to several
locations and only one. And so the other thing is that we allow a teacher to specify how to
convert data, so for example, if you have a string, what some students do is they convert it to
an area of characters and so then we want to allow the teacher to specify that we want to
compare that. We also allow threads, so by thread I mean that you can specify what values are
independent of order, so in the definition I gave you and this definition who works almost all of
the time. We insist that the order is really the same, but for this case where you can swap s
and t because of the strings, so the order if you do s first-order t first doesn't really matter. You
could swap the order and another interesting extension is that if you want to observe iteration
over data abstractions and you have an iteration over a set, the order of iteration is
underspecified, so the set can choose which elements it should give in which order. But the
correctness of the program doesn't, cannot rely on this so we simply determine the order in
which we see the elements. We have implemented this approach so that it is implemented in
We have implemented this approach in C#, and the specifications and student implementations are also in C#. We studied three assignments from the Pex4Fun platform, with over 2000 correct implementations for them, and then we created our own course with 24 assignments, but we only managed to get 50 students to do it; we forced them as part of a course requirement. Our observation is that there is indeed a large number of inefficient implementations. For the anagram problem, 90 percent of the implementations are inefficient, and there are also 26 of these assignments which have more than one inefficient strategy, so you want to be able to distinguish between these strategies. Our tool is fairly precise. For the moment we have only evaluated it ourselves, so that is probably to be expected, but at least the approach is precise: we didn't get any false negatives, so everything that should match did match, but in five cases an implementation matched a specification which it shouldn't have matched. Specifications are fairly easy to write: they are typically the same size as an average implementation, and at most they get three times larger. We rely on inputs provided by the teacher, and it's nice that you really only need 1 to 2 inputs, and you don't need too many nondeterministic choices. The performance is really fast, which is important if you want to use it in an interactive setting as in education. The last slide is on the teacher effort, and it's really about how fast you are able to give feedback. In these charts you see on the left the number of specifications written by the teacher and how many implementations they match. The setting is that you are sitting in front of all of these student submissions and you start writing specifications. The teacher randomly gets one of these implementations and writes a specification for it, and that already matches like 200 or 300 solutions; then he gets the next unmatched implementation, writes a specification for that, and then you already match thousands of implementations. You see that with more and more specifications, and the refinements the teacher makes to the specifications, you saturate soon; most of the implementations received feedback very fast. On the right you can see the corresponding time; overall, the teacher had to look at only 3 percent of the implementations to write specifications covering all of them. We are hoping that we will be able to do a real case study. That concludes the second talk.
>> Sumit Gulwani: Great. [applause]. I know you had questions throughout but if anybody has
more questions…
>>: When the teacher writes a specification, how do we know that we only match the solutions that [indiscernible]
>> Florian Zuleger: This is an issue. For that, our notion of trace embedding is fairly restrictive, as I tried to motivate; it's really strong, and the teacher will have to take care there. The other thing is the setting: the students could have a button so they could say 'wrong specification' or 'the answer doesn't make sense'. That will help if the students would get a worse grade; if they get a better grade, they probably wouldn't complain. Those are my answers to that.
>> Sumit Gulwani: Okay. Thank you very much. [applause].
>> Florian Zuleger: Thanks.