
>> Nikolai Tillmann: So welcome everyone. This is the last session of the workshop and I would like to start by saying a big thank you to Judith who organized the workshop. >> Judith Bishop: Oh wow thank you. [Applause] >> Nikolai Tillmann: It’s not quite over yet, but I really enjoyed the community here, everyone is working on a related subject and I think that’s really great to bring together this set of researchers to see how we can improve education and learning how to program. >> Judith Bishop: Thank you Nikolai. I would never have got here without your inspiration of course. Let’s leave that there. So our first speaker here is Alex. >> Alex Orso: Okay, yea thanks also from me to Judith for inviting me and for organizing this. It was very interesting. So what I did was to ask people around the room, I was trying to summarize what the major themes, issues and so on are. So several people responded and thank you to all the people who responded. What I am going to try to do is just kind of summarize what the information was and use that to classify 3 different things. So, one is themes that emerged from the workshop. The other one is open issues, and open issues is the one with the biggest number of entries, and then opportunities. So in terms of themes the main one that seems to be the overarching theme from most of the talks and discussion is feedback. So it’s all about basically the kind of feedback that you want to give to the students, or to whoever is using the system. For example one particular theme or sub theme in this context was closeness. So this idea: How do you tell the students or whoever is playing the game that they are close to the solution? How do you indicate how close they are? How do you direct them towards the right solution? Some people talked about providing the exact delta with respect to the solution or possibly an abstracted version of that. 
We learned this idea of doing it based on the number of counted inputs for the right or wrong solutions. There were many, many ways of doing this that were discussed and I think that’s a really interesting theme. In terms of open issues, and this is probably of interest, especially for you guys or developers, the minds behind Code Hunt, so one is kind of this overarching questions like the elephant in the room, which is the approach [indiscernible] good? So how do we know that people actually learn by doing Code Hunt more than by traditional means? And we discussed a possible study which would be very interesting to try and see if you had a kind of student population, you split it and half of them are learning through Code Hunt or similar means and half of them are learning through traditional methods. What will be the difference in their learning? >>: Do you have a study on the traditional method? >> Alex Orso: Well that’s an interesting point. I teach online classes and now there are all these discussion of: How do we know that online classes are working? And you think: Well how do you know about the traditional classes? But, in the traditional classes everybody is assuming that they work, right. So it’s a very valid point and in fact I think it would be good to just test all these methods and see what works and what doesn’t work including also the traditional one, but it’s a very, very valid point. Nevertheless I think a study of that kind would be very interesting, especially if the results are good. It would give you a lot of fuel for these kinds of approaches. Another open issue that was mentioned by several people was how to make puzzle writing easier and I think it probably resonates with you as well. In particular: How do you make it easy to provide good guidance to the player, which also goes back to the idea of feedback? So how do you give them something that is useful so that they will learn how to get to the same resolution? 
>>: Do you mean [indiscernible] or do you mean test cases? >> Alex Orso: Well it’s however you do it, whether you do it by applying the [indiscernible] or by providing hints. What’s the best way to tell the students they are getting there? It also really goes with feedback, like you’re getting closer, but you might want to look at this or use the right input. It might be easier to find the patterns and so on. So that’s definitely an issue on this side. Now you guys showed us how to do it by basically tricking the system, like for example putting in there specific conditions if you want something to be considered. It would be nice if there were more of an abstract and general way of doing that and also maybe a more understood way. Sometimes I think when [indiscernible] and I were trying to improve the puzzles it was hard to figure out what would help and what would not. So you think you did a great job, you try it and it’s like, “Oh no, that’s not really good.” So I think that’s an open issue that would be worth investigating. The other thing that I heard from a couple of people is the fact that knowledge of symbolic execution seems to be a little too necessary for whoever is developing the problems. So basically you have to know what the machinery is underneath in order to do the right thing. It would be nice if that could be kind of abstracted away a little bit so that you don’t have to understand symbolic execution. This is not a problem for many of the people in this room, but might be a problem if you want to kind of generalize the approach. So sometimes the results might be surprising because you don’t know how the symbolic executor works. For example the fact that you have different tests every time. It is like, “How come you have different tests each time?” Well it’s because it depends really on how the paths are explored. So that’s kind of trying to decouple the underlying technology from the puzzle creation. 
Then of course anybody should feel free, I’m trying to summarize what I heard and what my impressions are as well, but feel free to add and jump in. Another issue is the problem creation: So how do you choose interesting problems and what feedback systems do you use? This is really what is good and what is not good. There are tons of studies on whether [indiscernible] good examples to use, but it’s not really clear what will work and what will not in terms of teaching the students. One of the points that was made here is that there seem to be good problems for different purposes. So some problems that are good for a given purpose might not be good for a different one. So for example there might be something that is great for teaching a specific topic to the students, but if instead what you want to do is engage new students and get them interested in development maybe you have to use something completely different where they don’t learn anything, but have a lot of fun. It’s not clear that these two things necessarily go together. So it would be good to have some sort of classification: What problems are good for learning? What problems are good for engaging, and maybe other categories as well? Right now everything seems to be seen together as a whole. So let’s see, primitives, oh yea that’s something that was really interesting for me. The primitives for the designer, that for example are kind of providing this higher level language in which you can kind of add some of the information to the example filters. It’s like now we have to do this assume to avoid the null inputs. It would be very nice if you could just say, “I don’t want to see any null input.” At that point you just remove the generation of those. “I don’t want to see any overflow.” So, you don’t have to worry about how that kind of happens. In the generation you have to use the assume. You would just specify these at a higher level. 
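The higher-level primitives asked for here ("I don't want to see any null input", "I don't want to see any overflow") can be pictured as declarative filters that the puzzle designer lists once, instead of scattering assume statements through the puzzle code. The following is only a minimal Python sketch under stated assumptions: the names `no_null`, `no_overflow`, and `filter_inputs` are hypothetical and are not part of Code Hunt, and a real symbolic executor would presumably apply such constraints while exploring paths rather than filtering candidates afterwards.

```python
from typing import Callable, List, Optional

INT32_MAX = 2**31 - 1
INT32_MIN = -2**31

# Hypothetical type: a filter returns True when a candidate input is allowed.
InputFilter = Callable[[Optional[List[int]]], bool]


def no_null(xs: Optional[List[int]]) -> bool:
    """Designer directive: never show a null (None) input."""
    return xs is not None


def no_overflow(xs: Optional[List[int]]) -> bool:
    """Designer directive: reject inputs whose sum leaves the 32-bit signed range."""
    return xs is None or INT32_MIN <= sum(xs) <= INT32_MAX


def filter_inputs(candidates, filters: List[InputFilter]):
    """Keep only the candidates that satisfy every declared filter."""
    return [c for c in candidates if all(f(c) for f in filters)]


# Candidate inputs as a test generator might propose them (illustrative values).
candidates = [None, [], [1, 2, 3], [INT32_MAX, 1]]
allowed = filter_inputs(candidates, [no_null, no_overflow])
```

The point of the sketch is the shape of the interface: the designer names the unwanted classes of input and does not need to know how the generator avoids them.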
Something else is like specific inputs and you might say, “Oh, I really want to see 25 as an input because I think it reveals some interesting aspect of the problem.” And again you can do it by adding a branch, but that requires that you understand how the system works. So it would be nice if you could specify that or a set of inputs. Something that was brought up by William, which I think is very relevant, is non-determinism. So it would be nice to be able to make some non-deterministic choices in the specs. Enforcing variety is something that actually I found very difficult. In many cases when you develop the problems you end up getting the same values for the inputs and that is not very interesting. It doesn’t provide much information to the student. So we ended up kind of putting a lot of assumes on differences between different elements of the array and it gets very complicated. It would be nice if you could just kind of give some directive: I want to see some sort of variety in the inputs that are generated and maybe explore why they are part of the domain of the solution. It might reveal more interesting facts to the students. Also hints may be a possible way of saying, “If this happens or if you follow this specific path, I definitely want to mention this to the students.” It’s something that in Richard’s work was done automatically. Maybe you also want to provide the developers or designers a way of kind of providing specific [indiscernible], because you know that it would help the students. Another issue that was brought up and discussed quite extensively is, and just tell me if I’m talking too much. >> Judith Bishop: Not at all. >> Alex Orso: What is the best metric to track the problem difficulty? So how do you decide whether a problem is difficult or not? Is it the number of tries or is it time? And I agree with you that time is not a good metric because you don’t know what the students are really doing unless it’s a controlled environment. 
On the other hand there is the number of tries. So we listened to different people that have different strategies. In some cases people just liked the feedback so they are like, “I will submit 300 slightly different solutions because that’s going to help me get a very fast way to the solution.” So there should be a way, maybe a smart way, of figuring out what is the amount of effort put into the solution. So something that was mentioned I think was for example the fact that small deltas maybe [indiscernible]. Frequent deltas shouldn’t be [indiscernible] exploratory mode. So maybe there is a way of quantifying that in a more rigorous way. Some issues that were not mentioned by anybody, but we listed them yesterday so I extracted a couple that were not covered by the rest. One I think received kind of a lot of interest and it was the idea of preserving tests including the order. That seems to be really a necessary feature because as you mentioned some people write down the tests that they see. I ended up doing the same when I was trying to solve the puzzle. So it would be very nice if you could say, “Okay freeze this test. This is one I want to see again.” It seems like something that should also be relatively easy. I mean everything is easy to do right given time. >>: Yea, maybe to just jump in there, but I want to listen to the entire list of entries. >> Alex Orso: Yea, I’m almost done. >>: I think in general one can distinguish two kinds of feedback. There are those that I would consider controversial in the sense that sure you would like to see more values, more diverse values in an ordered way immediately, but I think it would to a large extent destroy the fun of the game. So there are some things that I think are controversial and maybe we need to study to see what the impact is on the education aspect, versus how quickly people solve it versus whether it’s fun and engaging. There are some other aspects which I think everyone agrees with. 
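The earlier point about quantifying effort, where 300 slightly different submissions should not count the same as a handful of substantial rewrites, can be made concrete by weighting each try by how much the code changed since the previous one. The following Python sketch is only an illustration under that assumption: `effort_score` is a hypothetical name, the weighting is not a metric that Code Hunt is known to use, and character-level similarity from the standard `difflib` module stands in for whatever notion of "delta" a real study would pick.

```python
from difflib import SequenceMatcher
from typing import List


def effort_score(attempts: List[str]) -> float:
    """Sum, over consecutive attempts, of how much the code changed.

    Each step contributes 1 - similarity, so a tiny tweak adds almost
    nothing and a complete rewrite adds close to 1.
    """
    score = 0.0
    for prev, curr in zip(attempts, attempts[1:]):
        similarity = SequenceMatcher(None, prev, curr).ratio()
        score += 1.0 - similarity
    return score


# Exploratory mode: several tries, each differing by one character.
tweaks = ["return x + 1", "return x + 2", "return x + 3"]

# Fewer tries, but a substantially different second attempt.
rewrites = ["return x + 1", "return x * x - 7 if x > 3 else 0"]

tweak_effort = effort_score(tweaks)
rewrite_effort = effort_score(rewrites)
```

Under this weighting the two-attempt rewrite scores higher than the three-attempt tweak sequence, which is the behavior the discussion asks for; whether it tracks real difficulty would still need to be checked against data.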
It’s an interesting challenge how to come up with good hints or how to design puzzles. Those are definitely hard problems and we don’t have any answer. So just in general when looking at all this feedback something to keep in mind is there are some things where it might go either way and then there are clearly research deficiencies or big opportunities where we don’t know any answer. But keep going, keep going. >> Alex Orso: All right. In general it’s clear why it’s not there, but just to mention it again is user friendliness of the environment. But, as you say it’s for internal use so it’s perfectly fine, but if you want to have a broader adoption of course people will want more features like completion and better feedback when you write the code and so on and so forth. >>: So educators will tell you that they don’t want completion. >> Alex Orso: Oh really. >>: Right because they want their students to actually learn. >> Alex Orso: Oh, no I was thinking for whoever is designing the problem. >>: For the teacher. >> Alex Orso: Not for the students, for the teacher. >>: In fact some educators say, “I love the fact that there is no completion.” >> Alex Orso: Yea, but for whoever is designing the problem it will save you time if you don’t have to remember the exact syntax of the language. >> Judith Bishop: But we have discussed whether we would be able to get the [indiscernible] editor into Code Hunt. That would give more. >>: Yea, it’s a possibility, but what [indiscernible] said it’s not clear that we really want to. >> Judith Bishop: Right. >>: I mean again it’s like having more values in an ordered way you want it, but it would harm the value. >>: So that one is probably more doable than having a complete IDE experience with completion right now, because we would have to compile the code in the browser and things and we don’t have a compiler for C#. >>: Well in fact they have one. In [indiscernible] we do have completion for C#. 
>>: We do have completion, but we haven’t migrated it. >>: Yea and for a reason. >> Alex Orso: Because you can also think of it in terms of the pedagogical aspect. If you are doing a very simple coding assignment maybe you don’t want completion. If you get to a point where the key thing is really figuring out the algorithm then at that point you want to have completion because you don’t want them to stumble on the syntax. >> Judith Bishop: So we’ve got hung up on one of the smallest things out here. >>: In fact out of all of these challenges, whether an IDE with code completion or not is better for education, I wonder if somebody already studied that, because you don’t need Code Hunt for that. Just split up your classroom into two and give some programming assignment. We should do our homework and look that up. >> Alex Orso: Okay. >>: Is it happening or not? >> Judith Bishop: Well what did the students think? I know you had some thoughts about that didn’t you at one point? >>: On auto completion? >> Judith Bishop: Yea, whether it was a good thing or not a good thing. >>: I think the purpose of Code Hunt is not to know the syntax or the API of C# or Java, whatever you are doing. So auto completion would definitely –. >>: But you are under time pressure. You have to produce something that’s syntactically correct, compiles and –. >> Judith Bishop: So he’s saying it would be a good thing to have it. >>: What I’m thinking is that it’s the purpose of learning those things so why not support it in this IDE? >>: It’s tricky. I mean coloring is easy. >>: We have done it so it’s quite possible. We had another interesting experience when we first rolled out Code Hunt in a contest that was done in China. So at least until recently there was no Windows Azure Data Center in mainland China, but maybe that has changed by now. The closest one would have been in Hong Kong, but it still has to go through the grid fiber. 
So we had feedback basically that there is a huge latency for people being in China and if a lot of data is being sent around in the browser it’s a bad experience. So that’s another thing, it might even get unfair depending on whether you sit behind a slow connection or a fast connection. So that’s another thing to consider, just throwing it in. >>: But code completion would be probably inside JavaScript right. >>: So that means you have to download the JavaScript. When the editor loads too slowly we actually use a text area and we don’t use any kind of fancy editor. That means all these players would be unfairly at a disadvantage because they don’t have access to colors and latency is a concern. >>: Well it doesn’t all happen in the browser, but then you have a round trip every time. I mean we would need a C# compiler or a Java compiler that runs in your browser. I’m not sure if that even exists. >>: So what I’m thinking and I just did some right now, I know I’m not a C# developer, I’m usually Java or Ruby, but always [indiscernible] upper case for the properties. >>: We will retrain you. >>: Yea, apparently, but that was kind of annoying because Control-S, what’s going on? Oh, yea and right, upper case. >>: In Java it is lower case. >>: Right, but I used C# now because of obvious reasons. >>: Did you have more? >> Alex Orso: Yea, let’s see that’s the last one I have for issues. So now: Opportunities, which is what it’s all about. So what is next in terms of opportunities? So I think several people, including myself, are very interested in what we can mine from the set of solutions, which can be seen from many different perspectives. So one thing could be how do people get to the solution? Where do they make a mistake? How can you learn typical mistakes? How can you use the feedback to then make the assignments better and to teach better to the students? What if you have like a student that submits solutions that are correct, but bad and then they get better? 
How do they do that process? Can you learn how to improve it? Maybe you can then speed that part up. So there’s really partial solutions verses complete solutions. There’s really a lot that I think can be mined and I don’t think we have a clear understanding of all the opportunities there. So the fact that you guys are making the data available will be great. Including for program synthesis, for people working on program synthesis can you then use that to sort of simulate the way in which a human being synthesizes code and maybe have a better program synthesis approach that is more similar to the one that humans use. So that in my opinion is really the best opportunity here. We have this unprecedented set of data and we can use it to learn how people learn. So that’s pretty good. Something else that I think was mentioned and definitely seems to be interesting is AB testing. So it will be interesting, you can do a lot of AB testing, because you can have different hint systems, different input generation system techniques and just if you get enough players you would be able to really explore what works, what doesn’t work and what works for what kind of population. >>: So what I was mentioning is that you can do it even in your own classroom where you actually add a comment that gives the specification for exact sizes. Then you can look at your dashboard and see what’s happening between classes. >> Alex Orso: Yea, in fact I think this is something that is true in general for anything that is online and with enough participants. You can just try different things for different parts as long as you can separate in a fair way the students assuming that it’s a class that’s taking full credit. >>: I think Billy’s point is that while it’s not entirely trivial to plug in a new hint system. 
What basically anyone who goes home after this event can do is upload two different zones, which differ in maybe descriptions or other ways and compare how students do in a pretty nice automated fashion. >> Alex Orso: Yea or even just what we were doing to improve the different problems where you can say you can try different strategies and see how that works. The other thing that’s probably more for me than for anybody else, but I don’t teach introductory classes. I normally teach software engineering classes, which means that we are dealing with more complex code and I would be really interested to see whether you can kind of take this to that level. And of course it’s not going to be exactly the same because one of the beauties of Code Hunt is that it’s self contained, it’s relatively small, you see the whole program, but maybe there is a way in which you can kind of push in the constant so you can have design, development and you can kind of use this same gaming approach for the high level abstraction for larger system. >>: I think you’re going to challenge your best students with this. I’m sure the students will tell you it can get really hard. >> Alex Orso: Yea, it’s difficult, I know. >>: I don’t think the range is limited to intro. In this structure that’s a couple of arrays and then you can start having enough pointers that it get’s really tricky. >> Alex Orso: No, no I understand, but the point is like if you are teaching a software engineering class and you have these kinds of issues people will jump on you, because they will say, “Well I’m not here to learn programming. I want to learn how to design a system.” >>: So Tao Kyle had some interesting ideas on how to leverage the system to teach design patterns. I mean we start by showing you that you fill in this one function, but really you can write 64K of code with classes. So in other words you can look at the challenge of how can we use this framework to teach balance? 
Tao had some ideas, but I think [indiscernible] and there is not much more to discover. >> Judith Bishop: And patterns are only one part of software engineering as well. >>: But you’re right that it has a scope. >> Alex Orso: Yea, it has a scope. >>: It has a scope. >> Alex Orso: But, it would be interesting to see if maybe there is a variation of this and you can have maybe a broader scope. And maybe it can be put together, because then you can leverage some of the –. The problem with these kinds of assignments is that, for example in one of my assignments that I just brought a couple of days ago, you develop a small Android app that does something and there is not a correct solution and there is not an easy test that you can run. So how can you use a system like this in that context? I don’t know, is it possible? >>: No. >> Alex Orso: Ah, I don’t know. I think the clustering thing is something you can still do. Like say you get all the solutions and then you cluster them to identify patterns of development that might be good or bad and then you classify them and give feedback to the students. So it’s not going to be the same thing where you submit your app and it tells you good/bad and then you submit another one, but maybe if you kind of abstract the feedback on other aspects of the code. >>: With your Android app there’s something really hard in the core of it. Then you have the whole [indiscernible] about doing Android and getting all the SDKs and stuff. But, at the heart of it there is a hard nut to crack. You could think that your students would just pop the window and start to test it. We could get a better feedback than just trying to write them and write a test [indiscernible] themselves. And we found sometimes [indiscernible] where we would just pop the browser and we needed a little parser and we [indiscernible]. So it was completely correct and it copied back into [indiscernible]. >> Alex Orso: Yea, that might be also a way to do it. 
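The clustering idea mentioned in this exchange (collect all submitted solutions, group the similar ones, then label each group and give feedback per group) can be sketched with nothing more than token sets and a similarity threshold. This is a deliberately simple Python illustration, not a description of any existing Code Hunt feature: the function names, the Jaccard measure, the greedy grouping, and the 0.6 threshold are all assumptions made for the example, and a real study would likely compare syntax trees or behavior rather than raw tokens.

```python
import re
from typing import List, Set


def tokens(src: str) -> Set[str]:
    """Split source text into identifiers, numbers, and single punctuation marks."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", src))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection over size of the union (1.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(solutions: List[str], threshold: float = 0.6) -> List[List[int]]:
    """Greedy clustering: each solution joins the first cluster whose
    representative (its first member) is similar enough, else starts a new one.
    Returns lists of indices into `solutions`."""
    clusters: List[List[int]] = []
    for i, src in enumerate(solutions):
        t = tokens(src)
        for c in clusters:
            if jaccard(t, tokens(solutions[c[0]])) >= threshold:
                c.append(i)
                break
        else:
            clusters.append([i])
    return clusters


submissions = [
    "return x * 2;",
    "return x*2 ;",
    "for (int i = 0; i < n; i++) s += a[i]; return s;",
]
groups = cluster(submissions)
```

A teacher could then inspect one representative per group and attach feedback to the whole group, which is the kind of semi-automated feedback the discussion has in mind for assignments without a single correct answer.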
>>: So I will draw the line where the question is: Is there a correct solution that you can characterize or is there some subjective aspect to, like, an Android app? So how it behaves, how it looks, and that’s clearly outside of the scope of what this is all about. So what we have heard when we run contests or from recruiting and from teachers is that the appealing aspect is really this fully automated system which doesn’t require someone to assess the quality of solutions. So if one keeps that in mind that should help to figure out which problems fit and which don’t. So there are certainly some aspects of design patterns where the beauty is in the eye of the beholder, but then if it’s about having a visitor for a tree either the visitor produces the right [indiscernible] or it doesn’t. So that’s how I would characterize the problems and then see if it fits or not. >> Alex Orso: Also for mobile apps for example you might think about constraining a little bit and you have to pick your app in a suitable way. For example there is a display for the app and you have to have these input fields and so on. At that point you could characterize that and that becomes your input for the system. And there you could really do something along those lines. So it has to be a specialized system, but maybe if there is enough interest in a specific domain you could have a customized version of Code Hunt for mobile apps. Who knows, it’s just something that might be worth considering. That’s actually all I had on my list. >> Judith Bishop: Well I can immediately fill in on that opportunities list. We didn’t mention it, but one of the aspects within the sandbox that Nikolai has already been talking about, or just on the surface or edges of it, has been the idea that we should be able to have levels that build on levels. So this is a very typical game experience that if you pass one level what you built in that level enables you to unlock and get onto the next level, but also you use that. 
So what that would mean in the context of Code Hunt is if you built a procedure in one function in one level you would be able to call that function in the next level. So you’ve actually built something and you can move on with that. So that’s a sense of achievement that you are not currently getting. It would be another fun aspect you know, build you X, get your elixir and then move on and kill the dragon’s sort of thing, right. >>: Yea, actually I thinking whether you could break down even very complex system? >> Judith Bishop: Yes so then the second level of that would be what you are actually doing is building a class. So since we don’t yet have objects maybe the idea would be that you could structure Code Hunt so that what happens is that within a sector you can build a class and then in the next section you can instantiate your class. So that’s a thought. We haven’t thought it through completely and it would obviously be something that would require a lot of building, but I think it would fit within the model nicely. >>: I think that’s a really great idea and I think that actually ties back to some of the questions about difficulty too, because you want to structure those levels in a way that that difficulty progression makes sense. So really having a good grasp of what difficulty is and being able to measure it for an individual is really essential I think to have that experience be a good one for learning. So I am really interested in that. >>: And remember like the last summer we had some discussion on that. >> Judith Bishop: Oh exactly, no we did, it was on our list and it’s still on our list. >>: Okay I just wanted to check the progress. >> Judith Bishop: Well so this kind of brings us to implementation of action to some extent. 
The bare fact is that Microsoft will eventually, and probably sooner rather than later, stop development on the project itself, and for many good reasons, not for bad reasons, but simply because it makes sense actually to involve other people’s good ideas and get other people working on it. So, each of these suggestions is a packet of work that somebody could work on should they wish to. That assumes there is a decent interface that enables them to come in and we would have to ensure that is available. The most decent interface is that you send an intern and the intern works on it. Unfortunately, that is also not possible all the time, because we might not have slots or we might be busy with other interns at the time. So it’s not a silver bullet, the intern option. >>: I mean I’ve already had this discussion with the current team, but basically assuming that for example my group [indiscernible], how to really have them hook it to be live, at least on a small scale so that we have feedback, data to see how things go. I think from the outside, for example, how do we get the Cloud resources? I mean I think the easy way is like we host the service on our server or our Cloud, but it’s difficult for a university to get that kind of a budget to really have hosting, but maybe [indiscernible]. >>: So Microsoft Research has [indiscernible] research and you could apply for that and get VMs and get resources on Azure. >> Judith Bishop: Are you talking about hosting the service or hosting the data? >>: The services. But, of course we need to have some way to get the data. I mean if the service will be used the data would come in. So naturally the data, or at least part of the data, would be in the service. >>: So I don’t know if it became clear to everyone how the system works, but there is a back end that does all of the groundwork of doing the [indiscernible] explorations. It can scale up to use a lot of CPU power and this potentially costs a lot of money to us. 
Then there’s a front end aspect that optimizes web sites and guides you through the sectors. So those are basically two different aspects of the system and some of the things we talked about are really front end related. So if you want to do AB testing where some users see certain things that other users don’t that’s mainly a front end issue. Even if you want to generate some kinds of hints then if that’s your own hint generation engine then it’s again about showing something to some users and maybe not to others if you want to do AB testing. So there’s a back end that possibly costs a lot of money to run and then there’s a front end which is all the light weight. The one option we were thinking about is to make that front end open source, which would allow other people to pick it up, make some changes and then you could either deploy that locally, which would be you can do that on your laptop and have your students party on it or if you did something really great we could actually deploy that on our servers since it’s open source and we all work together, assuming that the basic idea of the distinction between front end and back end is reasonably clear. How does that sound? Does anyone have any thoughts of what kind of experiments would be possible in that setting, or what kind of addition you would want to add? Does anyone feel that what you really want to do is change something about the back end you are interested in generating different kinds of values or any of that? I am sure some of you have some ideas of a research question you want to answer or some feature you would like to add. Does anyone have an opinion of what an open source front end would help or if that’s not really a help? >>: Can I ask a little bit of a clarification question? >>: Yea. >>: So how hard would it be with the infrastructure you are proposing to access things like a set of samples of code attempts rather than data about numbered tries and things like that? 
>> Judith Bishop: Well you probably weren’t here this morning, but we are releasing that. >>: Okay so that’s easy and then the back end, the only thing you are talking about with access to the back end here is to change the way that back end works, rather than just [inaudible]. >> Judith Bishop: Sure. >>: No, I like the idea of being able to use that front end and getting access to the data simultaneously. >> Judith Bishop: So the plan is to regularly release chunks of data. The first one would come out within a couple of days, just making sure it’s right, and then every few months we will bring out another lot with different data. >>: One idea would be that if you upload your universe you get access to the data. >>: [inaudible]. >>: Then if you upload the universe and you own it you own all the programs that are entered through that universe. >> Judith Bishop: Yea, that’s already the case, but –. >>: No that’s not the case. >>: Not accessible to the teacher. >> Judith Bishop: Ah. >>: It’s there in the server right. >>: Right. >> Judith Bishop: Okay, I think that’s one of the changes we asked –. >>: [inaudible]. If you want to do studies then you would be able to query the Cloud and download on demand. >> Judith Bishop: Yea. >>: Ideally you would like to have something like Code Hunt as a service where you can access the different aspects. Say for example I want to use the back end, but I want to use my own front end and I can just provide you with whatever the proposed solutions are and you give me back the results, or I can use the different pieces. But, I don’t know how modular you can make that. >>: That already exists. >>: But where does it break down? Where are the individual modules that you can use? >>: The big distinction is front end/back end. So the back end manages the data and performs the test case generation and it is driven by the front end, which does all the UI basically. 
If we make the front end open source then you could go in and change the [indiscernible] and tag to the back end. So instead of using [indiscernible] to generate test cases you could call your own test generation to produce that. >>: That’s the REST interface that you were talking about. >>: Yes, exactly. >>: We had an intern this summer that built Code Hunt in TouchDevelop. >>: Yea, that’s an example. >>: So he wrote a plug-in for TouchDevelop and he basically rebuilt a complete front end based on the TouchDevelop language with a cross compiler to C#. >>: And that was done without any need to modify the back end? >>: No modification. >>: That’s good, because I think that’s the kind of thing that would make it very –. >>: With respect to what Nikolai proposed, making the front end open source, that’s very important if you wanted to [indiscernible] on the UI. But, for some of the work, for example like in generation, test data generation, selection, there are some kinds of well known important features. If you could expose that more like a callback service and then we configure through your website, it would just use the one that we hooked in. >>: So Tao that’s exactly why we want to go open source. You would define these hooks and you build your system and then it would take the pull request and see where we can implement it on our side. >>: Because you can do this with the same interface, the REST interface. You could just submit that you want to set this parameter this way and this parameter that way. If I understand what you mean. Are you saying, “Define the parameter for example like how you do symbolic execution?” >>: Right. >>: So many of such extensions you can probably do just by tweaking the front end. For the hint generation it turns out that actually the hint generation was pretty much a complete stand-alone project. There are very few exceptions. 
It needs access to the secret solution and attempts from other people, which you would actually learn over time, but if you plug it in somewhere in the middle it also would like to have access to the existing database. So the tricky part here is to figure out: What do you need to do a particular extension that you might be interested in? Then like what [indiscernible] said once you identify that, you can add the right hooks to make that happen. So again if anyone at this point has some particular research ideas that you are very interested in then it would be interesting to know and we can discuss what particular extension points you would actually need to turn that into a reality. >>: So path condition. >>: Path conditions, very good, we have those. So the user enters the program and then when they click the button you would want all the path conditions that [indiscernible] discovered to be sent to your server to be analyzed? >>: Yea, the ones that the article and the test differ on. >>: Oh, the mismatches. That’s a very interesting concrete ask, okay. I can see that works. It’s interesting that it doesn’t even reveal the secret program in all detail. It’s kind of an abstraction and that’s interesting. >>: If the path condition boasts together the pairs you would reveal it to some extent. >>: Yea, I guess if the user program was completely trivial then it’s going to mismatch pretty much everywhere and that will reveal the entire program. Anyways, path conditions, that’s interesting and any other thoughts? I mean we heard about the C programming system and the extension mechanism in there and you said you already thought about other extensions. Do you have some experience in the surface area, the API, the interface, that is in between an external extension and your system or do you still think that you will tweak it as new extensions come up? >>: Yes, we can tweak it. [indiscernible]. I did not get much time to design these things. [indiscernible]. 
>>: Yea we are kind of in a similar situation for Code Hunt when it comes to extensions. >>: Right, right. >>: So this is kind of a logistic question, but since we are getting the data from you, are there any specific, a little bit synergetic kind of activities that all these researchers could do? I mean I could imagine some competition on a particular kind of feature. I see a lot more results would be produced by these more kind of loosely coordinated kind of [indiscernible]. >> Judith Bishop: Well I think we were thinking of doing some analysis of the data. Is that still in the chapter or did we dump it? >>: For now it’s still in the chapter. >> Judith Bishop: Okay, so you mentioned that you had some tools that can analyze programs. >>: It’s just [indiscernible]. >> Judith Bishop: Uh huh. >>: It’s just some [indiscernible] tools that give some hint about code quality. >> Judith Bishop: So you would be looking at the grades, whether they are going up or going down and I think you were also doing that? So I am not sure whether we would be having competitions, but we could be having paper collaborations once we start analyzing this data, because we would be analyzing it in different ways. >>: So I think another thing would be private [indiscernible] or open source. So for example in the past all these extensions that my students [indiscernible] and other people could leverage that attempt at whatever usage they could. So it seems we are really working on similar data sets or the same data sets and building some features. I think that some of them could be data analysis. Some of the tools could be just building some feature, hint generation, test data, selection, and progress indication, whatever. I think it really makes sense to have kind of a sharing in terms of the progress and the infrastructures. >>: Yea, I mean right now we give you the data, but it doesn’t exactly come with the written analysis framework, just sharing that kind of infrastructure. 
Daniel has some infrastructure based on [indiscernible] that is able to parse code and that’s a good first step for anything. And I don’t know what kind of state it is in right now, but that’s definitely something. >>: Uh, sort of the problem is that my parsing tools are all intermeshed with my synthesis code. So they are not really particularly portable to other users. >>: Yea, so you should spend some time to factor that out possibly. >> Judith Bishop: Yea. >>: In any case, having a shared Git repository that –. >>: [indiscernible]. >> Judith Bishop: So here’s a practical question: What’s the best way for the community to keep in touch and share ideas? And eventually e-mail just gets out of hand. >>: A forum seems to be a good way. >> Judith Bishop: Okay, which forums do people actually visit? >>: Well, maybe we should release the data on GitHub. >> Judith Bishop: Yea, we will release the data on GitHub and then we can set a forum up there. >>: Right. >> Judith Bishop: Okay. >>: I think if you keep everything in GitHub that would be nice. >> Judith Bishop: So the data release will spring that off then. Thank you. >>: So what to do with the data? One big problem area that we discussed and Daniel presented is hint generation. How can we help people to actually move forward? How can we identify that people are stuck? There is another dimension that I think we very briefly mentioned, but then didn’t discuss in more detail and that’s quality assessment of the code. So right now what we do is we count [indiscernible] instructions by code instructions. That’s a proxy of how compact your algorithm is and it seems to motivate people to keep going, but we don’t really know if that’s the most engaging way of keeping them going. Sometimes it’s confusing and the other is quality metric complexity or some other beauty metrics. So given the data one could certainly have different quality assessment metrics and run that over the code base. 
>> Judith Bishop: So in the data exactly the winning solutions are identified. So there is a progression of attempts and then plunk, the winning solution. So one could just take winning solutions and then apply different objective criteria and see how they stack up against each other. >>: And of course there is no information about the players, right? >> Judith Bishop: No, except that we do know of player 1. >>: You know it’s player 1, but there are no demographics and nothing. >> Judith Bishop: We could actually add to it the 3 point thing that is self added by players, which is whether they select themselves as novice, intermediate or expert. >>: That would probably be good. Anything that can be added to the data I think will be good if you can do some clustering. >>: So you can analyze it on the data, but what does it really tell us? That is something we can also turn into. So generating real hints is difficult, but another competition we could create is let’s say you come up with your own new assessment then we could [indiscernible] it out in the cloud and then give different players different assessments from different groups and see which one creates the most stickiness, that people actually stay in the system. So for that you don’t even need to know the secret solution, you just have to analyze individual program snippets. So that would be another idea. >> Judith Bishop: So we’ve also got time stamps on the data. So if you want to evaluate winning by time stamps, etc, how long it takes, you can go and do that computation if you want to. >>: I just want to say for me the criterion for the best solution is the one with the fewest paths. But, you actually are sitting on that information because you have to have it already when you do the symbolic execution. >>: Well we sample it. It’s not exhaustive typically if you have anything that’s truly interesting with loops, but there is interesting information that we should make available. 
So the data we give you contains the user’s submissions and what you can do easily with some script that again should be in that GitHub repository, you can fire that off to the Cloud to get the actual test cases out. So I think the data that we will distribute doesn’t actually have the test cases in it that are generated, but [indiscernible] in the Cloud and is available behind the API that I showed. And again today that wouldn’t give you the path, but a projection of that to the test cases and we could give you more. >> Judith Bishop: Good feedback, yea. I mean with version 2 of the data we could do that kind of a thing. Okay, so I think we should have our last question so that we don’t overrun, Tao? >>: I mean you could also run [indiscernible]. >>: No execution paths beyond a second limit, yes. >>: So you probably can’t provide this, but –. >>: In the Mining Software Repositories conference community [indiscernible]. Basically every year they just pick one or two open source projects, we [indiscernible]. Then in the end they have this small like competition, but they accept the papers and present the papers at the conference [indiscernible]. And that could be combined with the workshop idea that we discussed. >> Judith Bishop: Yea, very much so. I think data is a key that opens many doors. >>: You had a question? >>: No I had a comment that we could have some kind of advanced tool where we give a program and we ask people to shorten it. >>: Yea. >>: May I challenge that idea that shorter cycles are better in a program? I mean I come from a software evolution background and usually the programs that have the shorter cycles might not be the ones that, from a comprehension point of view, are better. So what are we trying to teach the students? Are we trying to teach the most performant way? >>: I mean I think they are truly different dimensions and all of them make sense. 
A compact program has advantages and in some settings the best complexity you want in another setting and –. >>: It’s also an exercise in program understanding. You have to be able to understand the program very deeply in order to be able to shorten it. [indiscernible]. >> Judith Bishop: It’s the same as when you are learning a language, you learn [indiscernible], you learn summary and you learn comprehension. >>: So I think we should expose all of these different dimensions to the player, because I might be interested and they can look at all of these axes, but what we also can do since we have a human being involved is try to figure out what is most fun to optimize? >> Judith Bishop: Yea. >>: So that is a component and this is all great for teaching, but what we have found is that the fun aspect is really what’s unique and what keeps people going. That is something that we should also study to keep this going by itself. We don’t want to have to have a teacher sitting behind you to do something. >>: Right, from a [indiscernible] perspective I remember in high school I tried to do the shortest solutions and see. I tried to do everything on one line and as I went on to programming for actual companies, from a self [indiscernible] or maintenance perspective, this doesn’t make sense at all. >>: Well not the on one line part, yes. >> Judith Bishop: Daniel, you wanted to add something. >>: I am just going to say that also because we are measuring [indiscernible] instructions and not [indiscernible] it’s actually often quite unintuitive what’s the shortest. In fact when I was like playing around and trying to minimize stuff in code [indiscernible] I actually [indiscernible] so I could figure out why some program is actually not shorter. To figure out what the compiler is actually doing and otherwise minimizing [inaudible]. >> Judith Bishop: A master code hunter. 
Okay, on that note I think we should close our workshop, thank our panelists very much and go forward to do great things. [Applause]
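
The panel's suggestion of taking the winning solutions and their time stamps and applying different objective criteria can be sketched roughly as follows. This is only an illustration under stated assumptions: the record fields (`player`, `level`, `time`, `won`, `code`) are hypothetical and are not the format of the released data, the sample records are made up, and the non-blank line count is a stand-in for whatever quality metric a researcher would actually plug in.

```python
from datetime import datetime


def summarize_wins(attempts):
    """Summarize each (player, level) run that ends in a winning attempt.

    `attempts` is a list of dicts with hypothetical keys:
    player, level, time (datetime), won (bool), code (str).
    """
    # Group attempts per player and level, in chronological order.
    runs = {}
    for a in sorted(attempts, key=lambda a: a["time"]):
        runs.setdefault((a["player"], a["level"]), []).append(a)

    summary = {}
    for key, run in runs.items():
        # Position of the first winning attempt in this run, if any.
        win_index = next((i for i, a in enumerate(run) if a["won"]), None)
        if win_index is None:
            continue  # this player never solved this level
        win = run[win_index]
        nonblank = [ln for ln in win["code"].splitlines() if ln.strip()]
        summary[key] = {
            "attempts": win_index + 1,
            "seconds_to_win": (win["time"] - run[0]["time"]).total_seconds(),
            # Placeholder metric: swap in any other assessment here.
            "nonblank_lines": len(nonblank),
        }
    return summary


# Made-up sample records, for illustration only.
sample = [
    {"player": "player1", "level": "1.1",
     "time": datetime(2015, 1, 1, 10, 0, 0), "won": False,
     "code": "return 0;"},
    {"player": "player1", "level": "1.1",
     "time": datetime(2015, 1, 1, 10, 2, 30), "won": True,
     "code": "int y = x + 1;\n\nreturn y;"},
    {"player": "player2", "level": "1.1",
     "time": datetime(2015, 1, 1, 11, 0, 0), "won": False,
     "code": "return x;"},
]

result = summarize_wins(sample)
for key, stats in result.items():
    print(key, stats)
```

Keeping the metric as a single replaceable line is deliberate: the discussion stresses that compactness, complexity and comprehension are different dimensions, so the same grouping step could feed several assessments side by side.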
