>> Rustan Leino: Good afternoon, everyone. I'm Rustan Leino and it's my pleasure to introduce Nadia Polikarpova, who has a long history of doing interesting things in verification. She was a Ph.D. student in Bertrand Meyer's group at ETH and wrote a thesis on semantic collaboration, which extended the sort of thing we had tried to do in the Spec# project and the VCC project here in the RiSE group, and extended it in ways we could not have imagined at the time, to let you write specifications of programs. She has looked at specifications in different languages and written a fully verified collections library. She did an internship here with Michal Moskal working on security properties of the TPM, which she did in the context of the VCC verifier. She has participated in several verification competitions, and there was one where I thought she was the clear winner, yet somehow no winner was ever announced. I don't know what happened to that contest. It was very strange, but in my eyes, she was the winner.
>> Nadia Polikarpova: Thank you.
>> Rustan Leino: Yes. And she's worked on a tool called Boogaloo, which executes Boogie programs. If you know what Boogie programs are, with their nondeterminism and partial commands, you'll be wondering how this is done, and you're welcome to ask her afterwards. And many of you, for example those on the Ironclad project or who have used Dafny, will be familiar with and will love the calc statement that is in there for writing proofs; that's also Nadia's work. Today, though, she's not going to talk about any of that. She's going to talk about synthesis. And let me end by announcing that if you're watching this talk online, I'll be monitoring questions, so if you have questions you can type those in as well. So with no further ado, welcome Nadia.
>> Nadia Polikarpova: Thank you very much, Rustan. I'm currently doing a postdoc at MIT with Armando Solar-Lezama; I'm working on synthesis because verification was too easy for me. I'm going to talk today about something that we've been doing with Armando over the past half a year or so: program synthesis from refinement types. It's very much work in progress still, so there are some things that maybe are not perfectly working yet, but I will be really curious to hear what you have to say about this, and I'll be happy about any feedback. As we all know, developing programs is hard. And developing correct programs that always do what they're intended to do is even harder. The goal of program synthesis is to help us with that by providing a way to describe programs that is more high level, more concise, or more intuitive than what mainstream programming languages can offer us today. The program synthesizer will then transform this description into something that is still efficiently executable. But since the difference between those two descriptions is quite large, there is no algorithm that can simply do this compilation, and usually some kind of search is involved in synthesis. So at a high level, most program synthesizers look like this: there will be some component that can explore the space of candidate programs, and there will be another component that can check that a candidate actually matches the description the user provided and then give some kind of feedback to the explorer, and this goes on and on until we find something that actually matches. All right.
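[Editor's note: a minimal sketch, in Haskell, of the generic explore-and-check loop just described; all names here are illustrative and nothing is specific to the tool discussed later in the talk.]

    -- An explorer proposes candidate programs, a verifier checks each one
    -- against the user's description, and rejection is the feedback that
    -- drives further search.
    synthesize :: (prog -> Bool)  -- verifier: does this candidate match the spec?
               -> [prog]          -- explorer: a (possibly lazy, infinite) stream of candidates
               -> Maybe prog
    synthesize matches = go
      where
        go []       = Nothing
        go (p : ps)
          | matches p = Just p    -- found a program that matches the description
          | otherwise = go ps     -- feedback: reject and keep exploring

    -- This is of course just Data.List.find in disguise; real synthesizers differ
    -- in how the candidate stream is generated and how much the verifier's
    -- feedback prunes it.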
One important point is that if we want the synthesis procedure to be completely automatic, then of course verification has to be completely automatic as well. The whole field of program synthesis is quite large, and I'm going to focus on one specific area, automatic synthesis of recursive functional programs, which has been very popular in recent years. In that area, people mostly use three different kinds of input specifications and corresponding verification procedures, because of course the verification procedure and the input language have to match on some level. One kind of specification language people use is simply input/output examples. And how can we verify a program against a set of input/output examples? We can just execute it, so we use good old testing. Some tools that were successful with this kind of specification are Escher, from Sumit Gulwani and Aws Albarghouthi; Myth, from Steve Zdancewic's group at Penn; and lambda squared, from Rice and UT Austin. Then another kind of synthesis tool for functional programs uses bounded checking as the way to verify candidate programs. What I mean by bounded checking is that you have some kind of executable assertion, or perhaps an unoptimized program that is used as the specification for generating an optimized program, and the techniques used here are usually SAT-based bounded checking. Sketch is a popular tool that works on this principle, and there is an extension of Sketch called SynthRec that actually does automatic synthesis of functional programs. And finally, this goes all the way to the other end of the spectrum, where as a specification we use fully formal specifications, perhaps written in some kind of rich logic with quantifiers or recursive predicates, and as a way to verify programs against those specifications we can use deductive verification; Leon, from Viktor Kuncak's group, is one example of such a tool. All those techniques have their advantages, disadvantages, and tradeoffs, so here they are pretty much ordered from the least formal, requiring the least expertise from the user, to the most formal. Of course they have their disadvantages as well. If you look at the two techniques on the left, they only provide guarantees of correctness for a finite number of inputs, and as a result they might not work as well for more complex programs. For input/output examples, the consequence is that for a more complex program the user might have to provide a lot of inputs and outputs and think about a lot of corner cases. For example, I know that for Myth, even a program as simple as dropping a certain number of elements from a list requires 13 input/output pairs. And for the tools that do exhaustive bounded checking, well, if a certain bound is not enough to guarantee that the program is really correct, the checking has to go up to a higher bound, and then it gets slow; it doesn't scale to checking the programs on many, many inputs. On the other hand, deductive verification doesn't have this problem of scaling to many inputs, but it does have the problem that it is rarely fully automatic, because for deductive verification you know that sometimes you need to give hints as to how to instantiate quantifiers, or even worse, you have to provide invariants. So wouldn't it be just great if we had something that gives us unbounded verification but fully automatically? Of course.
That would be great. And if it works for a big enough class of programs, that would be what we want. So in this work, we decided to try a different kind of input language for synthesis, and a different kind of verification procedure, which is based on types and type checking. It's not a new thing in the functional programming community to use types to specify computations. For example, if you are a Haskell programmer, you're probably familiar with the tool called Hoogle. What Hoogle does, I can show you right now: it's a website where you can search for functions from the standard Haskell library using their type. For example, suppose I'm thinking, what was this function called that takes an integer and a value of any type A and produces a list of As whose length is the first argument? And I don't remember its name. Hoogle will tell me: the first result it returns is replicate, and this is exactly what I wanted, some number of copies of the value. But of course, while the Haskell type system is good enough to do a search in the standard library, it's not rich enough to describe a goal for synthesis. So we need a more expressive type system for that, and we decided to use refinement types. The term refinement types is used a lot in different contexts, so I'm going to quickly explain what we mean by refinement types in this work. If you're familiar with the work of Ranjit Jhala and his group on liquid types, then this is exactly what we mean. And if you're not familiar, I'll just go through the basics of this kind of refinement types. We call them decidable refinement types. Basically, a refinement type is a conventional type, think an ML or Haskell type, decorated with a predicate that restricts the range of values that the type has. For example, this one here describes the type of natural numbers. The difference between these and general dependent types, and the reason we call them decidable refinement types, is that the predicates are drawn from some kind of logic that is efficiently decidable by SMT solvers. This is an important fact, because it makes our verification decidable. So this judgment over here would say that the variable N has this type of natural numbers. And this particular type we're going to abbreviate as Nat later. Okay. So what we can express with those -- yes?
>>: Can you explain [indiscernible]? Is that kind of a computation class?
>> Nadia Polikarpova: So basically the whole approach is parameterized by what kind of logic you want to use in your refinements. It's not fixed, but the logic that people usually use would be linear integer arithmetic with uninterpreted functions and arrays, which is a class that is well explored and can express a lot of stuff. But you can plug in other logics there, as long as you can decide them efficiently. Thank you. Okay. What can we express with those? Not only can we talk about restricted base types like integers, we can also give refinement types to functions. For example, this type over here describes the function max, the maximum of two integers, as a function that takes two unrestricted integers, X and Y, and returns a value that is greater than or equal to both of its arguments. As you can see, in function types we can give names to the arguments so that we can use them later in the type of the result. This is what makes them dependent function types, and this is what lets us express pre- and postconditions; a small sketch of these two types follows.
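[Editor's note: to make the two examples concrete for readers of the transcript, here is a sketch in plain Haskell, with the refinement types written as comments in roughly the notation used in the talk (v stands for the value being described); the Haskell itself carries no refinements.]

    -- The type of natural numbers, abbreviated Nat:
    --   {Int | v >= 0}
    --
    -- The dependent function type of max: both arguments are named so the
    -- result refinement can mention them (a pre/postcondition in type form):
    --   max :: x: Int -> y: Int -> {Int | v >= x && v >= y}

    max' :: Int -> Int -> Int
    max' x y = if x >= y then x else y   -- one implementation that has that type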
And finally, we also have algebraic data types, which are polymorphic. For example, this one here says that Xs is a list, and each element of this list is a natural number; Nat again is just an abbreviation for the refined integer type. The definition of list is just what you would expect in a regular type system, but as you can see, we can instantiate its type parameter with a refined type to express the non-trivial property that every element of the list is greater than or equal to zero. And not only can we express such universal properties of data structures, we can also talk, for example, about the length of the list, using a construct called a measure. You can think of this measure length as a function defined inductively on lists, but it's syntactically restricted in such a way that you have exactly one equation of length for every list constructor. In terms of verification, what happens with this definition is that there is a syntactic transformation that takes the equations of length and appends them as refinements to the type of each constructor of list, and at that point you can treat length completely as an uninterpreted function. So you can forget about the definition, and everything we know about length is that the length of Nil is zero and the length of a Cons can be calculated from its arguments in this way. Okay. Using measures, we can express recursively defined properties of algebraic data types. And the cool thing about this is that the type system actually works for us to instantiate and generalize those properties completely automatically. Whenever we construct a list, the type system, just by using the type of, for example, the Cons constructor, will generalize the property that all the elements are natural numbers. And whenever we match on a list, deconstructing it, we get those properties back completely automatically, without needing any kind of heuristics to instantiate quantifiers. This is why refinement types have been so successful in verification of non-trivial properties with very little to no manual input, doing these things completely automatically. So they've been used in verification. How can we use them in synthesis? Well, let's try to use a refinement type to specify a synthesis goal. Remember the function replicate that I showed you in Hoogle. How could we specify such a function? Well, I want to say: give me a function that takes a natural number and a value of any type beta, and returns a list of values that are equal to this second argument and whose length is equal to the first argument. Okay. This is basically the complete specification of replicate. And in order for the synthesis to work, we also have to provide some kind of components that can be used as computation primitives. In this case, we provide it with, obviously, the list data type, and also we give our synthesis procedure the increment and decrement functions over integers, for which we also have to provide their refinement types. But, by the way, we don't even need their implementations. The goal is now to find a function that has this type and is allowed to use those components. Okay? So let me show you how this works with our prototype implementation real quick. Oh, no, first, there's something I forgot to mention: look at this type of the list elements here; the definitions and the goal are sketched below.
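[Editor's note: a sketch of the list definition, the length measure, and the replicate goal just described, with the refinement notation kept in comments over plain Haskell; the component names inc and dec are as in the talk.]

    -- The ordinary polymorphic list, and the length measure defined once per constructor:
    data List a = Nil | Cons a (List a)

    len :: List a -> Int
    len Nil         = 0
    len (Cons _ xs) = 1 + len xs

    -- After the syntactic transformation, the constructors get refined types,
    -- and len is treated as an uninterpreted function from then on:
    --   Nil  :: {List a | len v == 0}
    --   Cons :: x: a -> xs: List a -> {List a | len v == len xs + 1}
    --
    -- A list of natural numbers is then just  List {Int | v >= 0},  i.e.  List Nat.
    --
    -- Synthesis goal for replicate, plus the integer components it may use:
    --   replicate :: n: Nat -> x: b -> {List {e: b | e == x} | len v == n}
    --   inc :: x: Int -> {Int | v == x + 1}
    --   dec :: x: Int -> {Int | v == x - 1}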
Surprisingly, if we replace the element type with just beta, this specification is as good as the previous one. This may be surprising at first, but really, if you think about it, since this type parameter can be instantiated with any refined type, what the specification is really saying is that whatever property X happens to have, every element of the list must have the same property, including the property of being equal to that particular value. So this actually shows how expressive polymorphic refinement types really are, because they let us abstract over refinements. So let me show you what our prototype implementation would do given this as input. I have prepared it here for replicate. It takes a split second to generate this implementation here, which basically says it will be a recursive function that the algorithm decided to name F2, which takes N and Y as arguments, and it synthesized this branching here: if the length parameter is less than or equal to zero, which basically means equal to zero because its type is natural, then it returns Nil, and otherwise it conses Y onto a recursive call of the same function F2 with the same Y argument but the first argument decremented. Which is what you would expect. But note that the algorithm was able to infer this condition here, which is pretty nice; I'll tell you later how it's done. Okay. Let me show you another example that is a little bit more involved. This is insertion into a sorted list. We want to synthesize a function that takes an X of any type beta and an increasing list of betas, and produces an increasing list whose set of elements is the union of the set of elements of Xs and the singleton set containing X. What is an increasing list? How can we define a sorted list using refinement types? Well, it's actually very easy. We say that an increasing list of alphas is either an empty list, or, to make an increasing list of alphas, we cons some alpha onto an increasing list of elements that are greater than or equal to the head. That's very easy. Here we assume that the comparisons are generic, so they're defined on any alpha whatsoever. All right. And on top of that, just as we added the length measure in the previous example, we can add a different measure that returns the set of elements of the list, defined in exactly the same way. With this definition in hand, we can specify this insert function. Okay. For this example it takes our tool slightly longer, because it's a more complex example, but still, in just over one second, we can synthesize this implementation, which is again a recursive function with two arguments. It matches on the list and says: if the list is empty, just return the singleton list of X; otherwise, compare X to the head of the list, and if X is less than or equal to the head, just cons X onto that whole list again, and otherwise cons the head Y onto the recursive call of the same insert function. This is again the implementation that you would expect; both synthesized programs are sketched below. It doesn't seem like much at first, but actually, to verify such an implementation you need some non-trivial reasoning, if you think about it. Because to verify that this branch, consing Y onto insert X Ys, actually produces a sorted list, what you need to know is that all the elements of the list returned by the recursive call are greater than or equal to Y.
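[Editor's note: a sketch of both results in plain Haskell, transcribed from the description above; the refinement-typed definitions are kept as comments, and the tool's actual surface syntax differs.]

    data List a = Nil | Cons a (List a)   -- as in the earlier sketch

    -- replicate, as synthesized (the tool named the function F2 and inferred
    -- the branch condition n <= 0 itself); the synthesized code uses the dec component.
    replicate' :: Int -> a -> List a
    replicate' n y = if n <= 0 then Nil else Cons y (replicate' (n - 1) y)

    -- Sorted lists: every tail element is at least the head. In refinement notation:
    --   data IncList a = INil | ICons (x: a) (IncList {e: a | x <= e})
    -- and an elems measure, defined per constructor like len, returns the set of elements.
    -- Goal:  insert :: x: b -> xs: IncList b -> {IncList b | elems v == elems xs + {x}}
    data IncList a = INil | ICons a (IncList a)

    -- insert, as synthesized; the last branch is the one that needs the
    -- strengthened instantiation discussed next.
    insert :: Ord a => a -> IncList a -> IncList a
    insert x INil = ICons x INil
    insert x (ICons y ys)
      | x <= y    = ICons x (ICons y ys)
      | otherwise = ICons y (insert x ys)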
And basically, this means that you need to know that if you insert something that is greater than or equal to Y into a list whose elements are all greater than or equal to Y, you get a list where everything is greater than or equal to Y. Which basically means that you have to strengthen the specification of the insert function itself in order for it to provide you with this property, and that requires some kind of specification discovery. In the language of refinement types, what it really means is that we have to figure out a refined instantiation for this beta type here, to say that in this particular call to insert, its type won't be just an IncList of betas; it will be an IncList of -- oh, this should be beta, sorry, not int -- of values that are greater than or equal to Y. So how do we do this? How do we automatically discover predicates like this, and in general, how do we do this type checking of refinement types completely automatically, which is what we need for synthesis? By the way, since this non-trivial reasoning is involved, this is actually the first automatically generated implementation of insert into IncList that is fully, unboundedly verified; Leon can generate this as well, but it cannot verify it, because it cannot discover these kinds of properties. All right. So how do we do this type checking? Well, the thing is, there is a technique that can infer this kind of refinements completely automatically, and do refinement type checking completely automatically, and that is liquid type inference, again from Ranjit Jhala's group at UCSD. This technique relies on a combination of conventional Hindley-Milner style type inference to infer the shapes of the refinement types, meaning the conventional types that underlie the refinement types, and then it uses predicate abstraction to infer the refinements. So can we just use this as the verification procedure for synthesis and be done with it? That would be nice, but unfortunately this method is fundamentally a whole-program analysis. What do I mean by that? Well, let's see how liquid type inference would do on this example, where we want to check that the expression -- if X is less than zero, then a singleton list with minus one, otherwise a singleton list with one -- has the type list of naturals. Obviously it does not have this type; there is a type error here. But how would liquid type inference do here? Well, the thing is, liquid type inference is meant for real type inference, in a context where no user type annotations are provided. So it doesn't even assume that the type of the top-level expression is known. It tries to discover the type of every expression from the types of its sub-expressions, completely from scratch. So say it doesn't know this type. Let's look at all the sub-expressions of this expression: we know what the types of one and minus one are, and we know that those list expressions are all of type list, but we don't know the instantiation of the generic parameter. So what liquid type inference will do is first invoke Hindley-Milner inference, which infers the shape of the types, so it will know that each is a list of integers, but we don't know what the refinements are yet. Then it inserts predicate unknowns in all the places in those inferred shapes where a refinement is missing. And then it uses predicate abstraction to reconstruct those refinements in a completely bottom-up style.
So, for example, it would construct the strongest refinement that is allowed by the sub-expressions. Here, since the nil is not restricted by anything, its strongest type is list of false. Then, to discover the type of this cons, we have to take some kind of least upper bound of those two, and we get the list of minus ones. Here, in the same way, we get a list of ones, and then at the top level we take the least upper bound of those, and let's say in our language we can only express this type as list of true. And at this point, we see that list of true is not a subtype of list of Nats, and we discover that there is a type error. But you can see that we had to analyze the whole program before we could discover this type error. And there are really two problems here. The first problem is that the type information is not propagated top down, because we don't even assume that there is any type information at the top. The second problem is that there are really those two stages: there's this Hindley-Milner shape inference, which is known to be a global approach -- it generates all the unification constraints and then solves all of them for the whole program -- and only after that first phase is done for the whole program can we start inferring the refinements. This kind of whole-program type inference might work completely fine in the setting of verification, but it's really a terrible idea for synthesis, and let me give you a little analogy to show you why. Let's say you have a combination lock. Verification is like when you're pretty sure you know the combination and you just want to double-check that you're not wrong; in that situation you're not really hindered by the fact that the lock will only tell you whether the combination is correct once you get all the numbers right, which is like global verification. But synthesis is like lock-picking. If you really don't know the combination and you want to determine it, then it would be really great if that lock could tell you, for every digit, whether that digit is correct or not. You would be able to pick that lock much faster. So we need this kind of magic-lock technology, a modular verification technology, to enable scalable synthesis. Okay. How can we modify this global, bottom-up liquid type inference to make it modular and enable scalable synthesis? Well, first of all, we have to make use of the fact that we actually have this top-level type available, and try to propagate this type information top down. In this case, let's say we know this must be a list of Nats, and we have those sub-expressions, so we easily propagate this information down to both branches of the if and basically say: well, if the whole thing is a list of Nats, then the then branch must be a list of Nats under the assumption of the if condition, and the other one also must be a list of Nats under the assumption of the negation of the condition. Something like this. Unfortunately, we cannot propagate type information all the way top down to the leaves, because it's not possible to propagate it through function applications: the type of a function application doesn't uniquely determine the type of the function and the type of the argument. So at this point we sort of have to switch direction and go bottom up for a while, until those directions meet. And this is really the idea behind bidirectional type checking, which was introduced by Pierce and Turner in the year 2000.
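[Editor's note: a sketch of the ill-typed example and the two checking orders just contrasted, with the refinement reasoning in comments over an ordinary Haskell version of the expression, which of course compiles fine without refinements.]

    -- Intended type: a list of naturals, i.e.  List {Int | v >= 0}
    bad :: Int -> [Int]
    bad x = if x < 0 then [-1] else [1]

    -- Bottom-up, whole-program liquid inference builds the strongest refinements first:
    --   []        :: List {v | False}
    --   [-1]      :: List {v | v == -1}
    --   [1]       :: List {v | v == 1}
    --   the if    :: List {v | True}        -- least upper bound of the two branches
    -- and only at the very top notices that List {True} is not a subtype of List Nat.
    --
    -- Top-down (bidirectional) checking instead pushes the goal into each branch:
    --   under x < 0:  [-1] must have type List Nat   -- fails immediately,
    -- so the error is found before the second branch is even examined.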
And we will be using this idea here. So let's say we got down to here going top down, and then we start bottom up, and at this point those two directions meet, and this is where we can do our type check, and it will be much more local. At this point we do some shape inference, and then we discover that there is a type error before even looking at the second branch of the if. So we really made this type checking much more local and much more modular. Okay. So basically, this is our proposal for synthesis from refinement types. It's just like before, except instead of the whole-program liquid type inference we use this new technique, which we call modular refinement type reconstruction, and it combines the ideas from bidirectional type checking, as I just told you. It still uses the same kinds of techniques motivated by predicate abstraction to discover the predicates, and one technical challenge that it really has to address is that now we cannot do a phase of shape inference for the whole program before we start inferring the refinements, so we needed to find a way to interleave shape inference and refinement inference. And this turned out to be possible. One other thing that we do differently from liquid type inference is that, since we're doing things mostly top down, we are actually inferring the weakest refinements instead of the strongest refinements. And doing that allows us to use exactly the same mechanism that we use for inferring types to infer the branch conditions in conditionals, which is what you saw in the first example. Because, if you think about it, we have a mechanism for predicate discovery; why not use it for branch conditions? Okay. So at this point, putting it all together, the whole enumeration and verification parts of our approach, I can just show you the first example again, the replicate example, but really step by step, how the whole search works. On this slide, what we have is the current goal type, which is what we want to synthesize, the currently available components, which is the environment that we can use, and the current program that is the output of the synthesis. The first thing our tool will do is look at this goal type and see, well, it's a function type, so we know the output will be a function, and it's really easy to deal with that: to synthesize a function is really just to synthesize its body given its arguments. So it will add the arguments of the function, N and X, to the environment, into the set of available components, and it will also give this function a name, because it wants to make this function recursive in its first argument but not the second, since the first argument is of type Nat, which has a predefined well-founded order. So the tool is allowed to recurse on this argument, but not on the second one, which has a type we don't know anything about. And to enable recursion, what the tool does is basically add, as another component to the environment, the same function, which will basically be used for recursive calls; the resulting state is sketched below.
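[Editor's note: a sketch of the derivation state at this point, in the comment notation used earlier; the argument name m for the recursive component is illustrative, since the tool picks its own fresh names, which is what the later question about M refers to.]

    -- The goal type is a function, so the output is a lambda; synthesizing it
    -- means synthesizing its body. Sketch of the state:
    --
    --   environment:  n :: Nat,  x :: b,  Nil, Cons, inc, dec,
    --                 f :: m: {Nat | m < n} -> y: b -> {List b | len v == m}
    --                    -- the function itself, restricted to strictly smaller
    --                    -- first arguments so that every recursive call terminates
    --   goal:         {List b | len v == n}
    --
    -- In plain Haskell the partial program at this point is just a skeleton:
    replicateSkeleton :: Int -> a -> [a]
    replicateSkeleton n x = body
      where body = undefined   -- the hole the tool still has to fill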
But note that its type is slightly different: instead of the first argument being just of type Nat, it's of a type that is between zero and strictly less than N. So basically our tool weakens the type of this function in such a way that it can only be called on arguments that are strictly smaller than the one we were originally called with, which guarantees that all the recursive calls terminate. And by the way, if you're in verification, you're probably used to being able to ignore termination arguments for a while and say you're only verifying partial correctness. But in synthesis, you really cannot ignore termination at all, because non-terminating programs are always shorter than terminating ones, so you will always get garbage if you don't take care of termination. Okay. At this point, our target type is this: we just need a list of betas that is of length N. The tool will first try a bunch of simpler expressions that are just function applications, but it will not succeed. So at some point it decides to introduce a conditional, but the condition is still unknown, and it's represented by this predicate unknown U1. At this point the tool focuses on synthesizing the first branch. For the first branch, it starts enumerating function applications from simplest to more complex, and the simplest expression that can go into this first branch and has the right type shape, a list of betas, is Nil, the empty list. So it tries the value Nil for the first branch, and then it uses predicate abstraction to infer the weakest condition under which this is an appropriate implementation of the function. At this point, as you can see, we want to use this as an assumption here, as sort of a path condition. And it will infer that under the condition that N is less than or equal to zero, the type of Nil is actually a subtype of what we want.
>>: [Indiscernible] why doesn't it say N equals zero there?
>> Nadia Polikarpova: Oh, that's a very good question. I kind of avoided the question of how we actually infer those predicates, but what liquid types do, and what we do as well, is we are given a set of atomic predicates, or rather atomic predicate templates, and all the predicates we infer are conjunctions of those atomic predicates. So here, I assume that the atomic predicates we're given are variable less than or equal to zero, variable greater than or equal to zero, and variable not equal to zero. From those we can make various kinds of inequalities and equalities, but we always infer the weakest one that fits. So here, since this one is weaker than equality, this is the one we get. But if you add equality as an atomic predicate, then you might as well get equality; that's just a matter of luck then, because those two are incomparable syntactically. But semantically, we actually do semantic checks on them as well, to cut the search space, so you will actually get this one anyway. All right. So now we are done with the first branch, and of course the task now is to synthesize the second branch under the assumption of the negated condition. So now we add the negation of N less than or equal to zero to the assumptions, and we have to synthesize again an expression that has this type.
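[Editor's note: returning briefly to the question above about how the guard is chosen, here is a toy, brute-force model of picking the weakest conjunction of atomic qualifiers under which the Nil branch checks. Real liquid-type tools do this with an SMT solver over the actual qualifier set; everything here is illustrative. The walkthrough then continues with the second branch.]

    import Data.List (subsequences, minimumBy)
    import Data.Ord (comparing)

    -- Atomic qualifiers over the variable n, as in the talk.
    qualifiers :: [(String, Int -> Bool)]
    qualifiers = [("n <= 0", (<= 0)), ("n >= 0", (>= 0)), ("n /= 0", (/= 0))]

    -- For the candidate body Nil, the goal "len result == n" holds exactly when n == 0.
    goal :: Int -> Bool
    goal n = n == 0

    -- A conjunction of qualifiers is adequate if it implies the goal on every
    -- natural number we test (n has type Nat, so we only sample naturals).
    adequate :: [(String, Int -> Bool)] -> Bool
    adequate conj = and [ not (holds n) || goal n | n <- [0 .. 50] ]
      where holds n = all (\(_, q) -> q n) conj

    -- Among adequate non-empty conjunctions, prefer the fewest conjuncts,
    -- a crude stand-in for "weakest"; this prints ["n <= 0"].
    main :: IO ()
    main = print (map fst best)
      where
        best = minimumBy (comparing length)
                         (filter adequate (filter (not . null) (subsequences qualifiers)))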
So again, the tool will start trying function applications, starting from the simplest ones. Nil cannot be made to satisfy this restriction on the length in this case, because we know that N is greater than zero and the length of Nil is zero. So Nil doesn't fit, and we try something a bit more complex: maybe there will be a cons, and for a cons we have to synthesize the arguments now. Again, for each argument we start trying simpler expressions first. So at some point we arrive at this cons of X, and as the second argument to cons we decide to use the recursive call, and at this point we have to synthesize the arguments for this call. What I want to draw your attention to is that when we are synthesizing the first argument of F, we are actually really lucky here, because the precondition on this first argument is really strong, and it will be used to filter the candidates for this first argument very locally. So, for example, if we were trying N here, then even before synthesizing the second argument and going all the way up to the type of the whole else branch, we would know that N is not a suitable candidate, because a suitable argument has to be less than N. So N is not suitable, inc N is definitely not suitable, and at this point we know that we have to choose dec N, locally. Okay. At some point this gives us the desired result, we don't have any holes in our program anymore, and this is done. So as you can see, the enumeration part of our synthesis procedure is at this point really basic: it does explicit enumeration from simpler expressions to more complex ones. What we really put some thought into is the verification part. We tried to make it as modular and automatic as possible, and already this combination lets us synthesize some interesting programs. But we're hoping that if we also make the enumeration part smarter at some point, then we will get even better results. So this prototype --
>>: Can you go back one slide? So in this specification of F, [indiscernible] what is M there? It says [indiscernible].
>> Nadia Polikarpova: Yeah. So we renamed the arguments here because N and X are already taken. We just picked fresh names for the arguments because we don't want to repeat them.
>>: So M is a given constant.
>> Nadia Polikarpova: So basically, M was initially the argument that was given here. And this M would be added to the environment, so we take the M from that type and add it to the environment. And that M is also used in the type of --
>>: So the system automatically inferred that it should be less than M?
>> Nadia Polikarpova: This is just because M is the name of the first argument of the function in the outermost call. So we are inferring the body of F called with M, and we know that if we want to make a recursive call from there, then the first argument has to be less than this M. Yeah, I agree, it's not very clear here.
>>: So is the assumption that every time the function makes a recursive [indiscernible], it will decrease somehow?
>> Nadia Polikarpova: So basically, this method is also parametric in what particular order you choose to make your recursive calls terminate.
What our tool uses at the moment is that it chooses the first argument it can recurse on and just uses that one, but it would also be possible -- it actually has a switch -- to use, for example, the lexicographic tuple of all recursable arguments.
>>: [Indiscernible]? For example, if I want to recurse on a list, what kind of thing would come up there?
>> Nadia Polikarpova: So for data types, what we do at the moment is basically that you're allowed to specify the measure that will be used to compare them. For example, if you define a length measure on lists, length maps lists to integers, and integers already have a predefined order in our system, so you can just say: compare lists by length; or in another instance you can say: compare them by elements. This is one of the choices; we could of course also use structural recursion, that would also be possible, but we thought this would be more flexible. More questions? Okay. So yeah. The tool that I showed you is called Synquid, from synthesis and liquid, and it's available on Bitbucket. As I said, it's still a work in progress; it hasn't really been released yet, but hopefully it will be soon, and you're welcome to try it. On my last slide, I present my vision for where this project might be going. I showed you Hoogle in the beginning, but wouldn't it be cool if we had something like Hoogle that uses refinement types and can do more precise search over the documentation, but also, if the function that you're looking for does not exist, it could synthesize that function using all of those functions from the base library as components? I call this Hoogle plus. So, for example, you give Hoogle plus something like: I want a function that takes some value X and a list of Xs, and produces an integer value that is equal to the number of occurrences of X in Xs. Here I'm basically just using another measure on lists, which returns a bag, a multi-set, of elements, that I call bag, and then I say: it's the multiplicity of X in that multi-set that I want. Because there's no primitive function in Haskell that returns the number of occurrences of an element in a list, this query would require doing a little synthesis, and then maybe what it returned would be something like this. Yeah. But that --
>>: [Indiscernible] lock key? [Laughter]
>> Nadia Polikarpova: Yeah. Questions? Maybe [indiscernible] friends on Hoogle or circles.
>>: So I'm trying to think how the technique would work if the specification was in the form of examples. You mentioned that one problem with examples is that you might require too many of those. But one way to avoid this is to say that you're looking for a small program, or the smallest program you can find, which matches the examples. And then I think [indiscernible] would do the trick. So if I'm synthesizing, say, copying the number of [indiscernible], my feeling is that the program that would be synthesized is in fact the shortest program consistent with that example.
>> Nadia Polikarpova: Right. But for example, one of the tools that I mentioned, Myth, is using exactly this heuristic: they are looking for the shortest program. But as I said, they still need 13 examples to specify drop, and their paper even says that it was sometimes not trivial to come up with those examples; it's really an interactive process where you think you have specified everything, but then the tool always comes up with some corner cases.
And one of the reasons why they need so many examples is that they have this property of trace completeness: when they're synthesizing a recursive function, because you don't have any specification for the function, whenever the synthesized implementation uses a recursive call, you need another example that specifies this recursive call. So basically, if you're specifying length of a list and you give an example for a list of length four, you also have to give examples for lengths two and one and zero. And this is how, I think, those sets of examples get larger.
>>: [Indiscernible] limited by the technique they're using? I can imagine that if I want to specify a drop function, I give a long list and specify a single input and output, where [indiscernible] won't be present, and the simplest function would be [indiscernible]. [Indiscernible], we only needed two examples.
>> Nadia Polikarpova: Mm-hmm. Okay. Yeah. Maybe that's a limitation of their technique, but I think what would be really cool is to combine those. And I think it's not even that difficult, because examples are refinement types in some way. So bringing examples into this framework would be great, because of course the disadvantage of refinement types is that you cannot express everything you want, since it's still a decidable logic, and the combination of those things and examples would be a really great idea.
>>: Even your technique will probably have this issue if the overall specification for the function is not strong enough, or inductive enough, to prove its correctness, and you need to actually refine it, to strengthen it, as well.
>> Nadia Polikarpova: Yeah. This cannot help if, of course, your specification is not strong enough, but we think that it's even an advantage in some cases to be able to provide a partial specification, because one other problem with examples is that you basically have to know what the output of the program is, and that might not be trivial all the time. For example, say you want to specify insertion into a red-black tree: you basically have to know how red-black trees work to be able to specify the output, whereas with specifications, it's much easier to say, here's the invariant of a red-black tree and here's what I want in terms of the set of elements, and then you go figure it out. So this really, yeah, it's right over here.
>>: So there are tools like ACL2 and Isabelle that construct proofs of inductive sorts of things. You mentioned that insertion into sorted lists was the first synthesized implementation that is actually verified as well. What would something like ACL2 do? Can it construct terms that are executable programs?
>> Nadia Polikarpova: This is a very good question. Probably; I mean, I don't really know the answer to this question. It's probably possible, but of course there's a lot of research in the area of proof synthesis that is kind of separated from program synthesis, even though we know theoretically that it's the same thing. So I think there's a lot of potential in bringing that older work on proof synthesis more into program synthesis and seeing what those things can do. Yeah, when I said this is the first verified synthesized implementation, I was really comparing with those program synthesis tools that I was considering.
>>: But it would be a good comparison to look at those other tools. [Indiscernible].
>> Nadia Polikarpova: So yeah, I mean, with any query, of course, there are limitations, but what I learned from, let's say, Ranjit Jhala and his group is that people are finding more and more creative ways of arranging those types to express properties that you wouldn't previously think would be expressible. And of course, the type system that we use here only has the features I showed, but their research on type checking actually went further than that, and they have more features that we hope we can add later. They have things like abstract refinements, for example, where you can parameterize your type not just by a type, as in polymorphic types, but also by a predicate, so you can easily specify things like, let's say, filter using those abstract predicates. And so I think this kind of refinement types is really this surprising combination: they're surprisingly expressive and still decidable, and I thought it was really worth exploring for synthesis, but of course there are limitations to this.
>>: One great thing about refinement types is that you're able to locally prune the search space. So is there a comparison for when these were not there? What would the time be? Say you're just trying it end to end and you don't have these perfect types for each variable.
>> Nadia Polikarpova: Right. So --
>>: How much is the gain, you said?
>> Nadia Polikarpova: I mean, I cannot really compare with a different kind of specification, but what we have data on in the paper, from our preliminary experiments, is basically that we did the synthesis this way, with local checking, and then we disabled local checking and just did all the checking at the top level. And we saw, well, what you would expect: for very small examples there's no difference, but on the bigger examples there was a big difference. So maybe I can even bring it up. Oh, yeah. Right here. For example, examples like append, deletion from a list, and both of the functions on sorted lists that we tried to synthesize timed out -- I think the timeout was like two or three minutes -- so it could not synthesize them with the whole-program analysis, but with the modular analysis it takes a few seconds. And yeah, it's what you would expect: this kind of whole-program analysis doesn't really scale as we go to more complex programs.
>> Rustan Leino: So thank you all for coming and for your questions. Nadia is going to be here all week, so if you would like to chat with her one on one, please let me know. So thank you, Nadia.
[Applause]