>> Juan Vargas: We are ready to start with the second session. This one is "Tools for Parallel Testing and Debugging." We are going to have two presentations from the University of Illinois. Darko Marinov is going to talk about the testing tools at the University of Illinois. Then Danny, who just came from the other building, is going to be talking about refactoring. Then we'll have Koushik Sen -- Koushik Sen, hey -- who just got tenure. [ Audience clapping ] >> Juan Vargas: from UC Berkeley, talking about "Active Testing and Concurrit." And finally Sunny Chatterjee, right here from Microsoft, is going to be talking about "Fighting Concurrency Bugs with Advanced Static Analysis Technology." This is going to be a very compressed session. We may or may not finish by 12:00 because we will have lunch. So if we can make it, probably 12:15, because lunch is going to be at the back wall. [Inaudible] going to be okay. >> : That means all the good stuff will be taken and we'll just eat the scraps [inaudible]. >> Juan Vargas: Yes, both. True for both. Okay. So please let's welcome Darko Marinov from Illinois, and he is going to be talking about UIUC testing tools. Okay. Thank you. >> Darko Marinov: Can I just close this? >> : Yeah. >> Darko Marinov: Okay. Thank you. So, yeah, I'll be talking about some work we've been doing on testing for [inaudible] code at Illinois. Obviously it's a short presentation. I'll focus it on my work, but there has been other work by other colleagues as well. So we've seen in the previous session that people say it's difficult to develop multithreaded code. I'd like to make that a bit more precise and say it's difficult to develop correct multithreaded code. I think it's very easy to develop an incorrect one, just not the one that we want. And it's very easy to introduce all kinds of bugs: races, deadlocks, atomicity violations. These mostly come from the non-deterministic scheduling. So not only is it difficult to develop this code, it's also difficult to test multithreaded code. So this is what's usually done. You have the code, you write some test, and then you need to explore all these different [inaudible] that you have that could potentially lead to different results. So the issue here is that these failures can be triggered only by some specific schedules, and it's difficult to explore these schedules. The [inaudible] space is usually very large, so it needs very sophisticated techniques to do that. Indeed, most research has focused on that, on exploring the schedules. A lot of good work here from Microsoft, you know, folks have worked on [inaudible]. You know, people from Berkeley have worked on active testing. I thought that in the next talk Koushik would be talking only about active testing; actually I just learned from him that he'll also be talking about other things. Well, let me just say what most of this existing work focused on: basically it assumed that someone somehow wrote the test for the code. And you have one code version, and now what you need to do is to explore these [inaudible] to find whether there is a bug. And there are a lot of techniques proposed for doing that. There are many other problems that we have in testing multithreaded code, especially if you want to unit test this code. So the issue is how to write these tests. Most previous research just assumes that the test somehow exists; the developers wrote it manually.
But it's not clear how people can write that, especially in the cases where you do want to encode a certain expected result that depends on the schedule: how do you express the specific schedule for which you want to encode the result? Then the next thing is how to explore these tests, especially when the code evolves. Again, as I said, most of the previous research focuses on analyzing only one given code version, but as we know, code evolves over time; people make changes, correct bugs, [inaudible] functionality. So in the context of sequential code there has been a lot of work on this regression testing: how do we make testing better? As the code changes, how do we make it more efficient? This was not addressed widely in the context of parallel code. And then there are also the issues of how to automatically generate some of these multithreaded unit tests: how to generate the test code itself and how to generate schedules. These are some of the challenges; there are obviously others, but I picked the ones that we worked on. So we do work at Illinois on all three of these topics. And what I'll spend the most time on today will be the first topic, writing these multithreaded unit tests. Basically, how do you manually write tests, especially in the cases where you want to say that the result depends on a certain schedule being taken? So we've developed a tool that we call IMUnit, pronounced "immunity," for improving this testing. We've also done work on the other two topics. One is this regression testing. And the idea is: I test my code once, now I go and make a change in the code. Typically the change that I'm making to the code is quite small, but it's running this testing that takes a lot of time. The question is, can I make my testing faster if I just focus on the change? So we've developed some techniques there for doing test prioritization and test selection, [inaudible] building on some very successful results from testing sequential code. And then one of the issues also that I mentioned was automatic generation of tests. So we had a recent result on that, something we call Ballerina, which can automatically generate the code and the checks for some parallel bugs. And there is also a lot of other work by colleagues in the department, some of it through UPCRC and I2PC. Madhusudan Parthasarathy has done a lot of work with Penelope and Exceptional and a few other tools and approaches, testing [inaudible] violations and looking at other problems. Grigore Rosu has done a lot of work with jPredictor and JavaMOP extensions for concurrent code and so on, but I won't be talking about that because I don't have time. So the focus will be on IMUnit. This was a project funded not only through UPCRC; we had some other funding, you know, NSF and NASA, NSA, Samsung and so on. It's part of a project basically trying to make this testing for parallel code easier to adopt into the whole software development process. Some people who worked on that: Vilas was the senior student who was leading this, a few other students, and then Grigore and myself. But let me get to the technical things. So here's the example. Let's say we have a certain class that we want to test. This comes from some open-source code from Google. Let's say we have this class, ConcurrentHashMultiset. It's basically a collection, but it can store objects concurrently. It has operations like add, remove, count, and what you would like to do is write a test for this particular class.
So let's say we want to test a certain scenario where we have two threads that execute, and they operate on the same shared object. One of them does two adds and the other thread does the remove, and what you would like to do is test whether these add and remove and count behave as we would like them to behave. Now the issue here is that the value of this count is schedule-dependent: based on the order in which we execute these adds and removes, based on the order in which the instructions from them interleave, we can get different results. For example, if you're executing this scenario, what should be the count of forty-two? Uh, one. Oh, okay. That was the answer, one. I thought it was a hand up. Okay. So the answer here should be one because we added, then we removed it, and we added it again. The question becomes, how do we encode this particular schedule in our scenario? Now notice here that, you know, we are not interleaving inside the methods here; I just want to say they need to be executed atomically. So what we've seen a lot in the open source is something that's bad. Don't do this at home. This is what people do: they use a lot of these "sleeps." They basically say, "I'm going to start two threads." This is the actual Java code. This is starting one thread. This thread does two adds. There is another thread going on here; this one does the remove. And then in order to ensure that they are getting this order, or [inaudible] in order to attempt to get the specific ordering of the events, they add these sleeps. This basically says, "Wait for forty milliseconds; meanwhile I hope something else will finish. And I'm going to, here, wait for eighty milliseconds and do this." So as you all probably know, there are a lot of problems with these sleeps. They are not the best way to do that, but we've seen that people do this a lot. So some of these problems include that these tests are fragile. And what I mean by that is the following: even when you put in these sleeps, you're not sure that you're going to get the result that you want. It may happen that your Java Virtual Machine does not execute the schedule that you intended. So we get into this situation where the test seemingly fails even though there is no [inaudible]. The test fails, and we get here, say, two or zero, not because there is something wrong with your remove or add but because they did not execute in the order in which you wanted them to execute. So in order to prevent these problems, people usually put bounds on these sleeps that are very long. Let's say, you know, "Wait here forty milliseconds," when, you know, ten would be enough. But just to make sure that you get the schedule that you want, you wait longer than necessary, and then you get inefficient tests. You also get tests that are non-modular. Namely, if I have two very similar tests, I cannot combine them. I cannot reuse that code because I am sprinkling these sleeps all over and I cannot reuse the sleeps. What I want to reuse is the fact that I have these adds and removes. And then the schedule is very implicit. You know, if you try to understand what this test is doing, what it is encoding, it's going to be very hard to figure out what's going on.
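To make the pattern concrete, here is a hedged reconstruction of the kind of sleep-based test just described. It assumes Guava's ConcurrentHashMultiset on the classpath; the forty- and eighty-millisecond sleeps are the ones mentioned above, and the rest is illustrative:

```java
// Sleep-based concurrent unit test (an anti-pattern, per the talk).
import com.google.common.collect.ConcurrentHashMultiset;

public class SleepBasedMultisetTest {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMultiset<Integer> ms = ConcurrentHashMultiset.create();

        Thread adder = new Thread(() -> {
            ms.add(42);
            sleep(80);            // hope the remove happens in the meantime
            ms.add(42);
        });
        Thread remover = new Thread(() -> {
            sleep(40);            // hope the first add has already happened
            ms.remove(42);
        });

        adder.start();
        remover.start();
        adder.join();
        remover.join();

        // Only holds if the JVM happened to honor the intended timing:
        // add, remove, add leaves exactly one copy of 42.
        if (ms.count(42) != 1) {
            throw new AssertionError("count was " + ms.count(42));
        }
    }

    private static void sleep(long millis) {
        try { Thread.sleep(millis); } catch (InterruptedException ignored) { }
    }
}
```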
So others recognized this issue, and there have been a few research proposals for how to address it: ConAn, ConcJUnit, ThreadControl. The latest solution before ours was something called MultithreadedTC, from Bill Pugh and his group at Maryland. And it addressed some of the problems of these sleep-based tests, but not all of them. So we then proposed our solution, which we call IMUnit, which we hope makes these things easier to solve. So here's what IMUnit looks like. Instead of writing those sleeps that I had there, what you do is write these events. You say there are certain events that happen in the execution. And then we write this schedule that says in what order I want these events. Basically what I am saying here is that I want this add to finish before I start this remove. And then once I finish the remove, I want to start this second add. So I insert these events and then I add that schedule there. Now this makes the test robust. What I mean by that is, once I write this schedule, there is a certain execution engine underneath that is going to ensure that the code actually executes in the order we specified. If the Java Virtual Machine wants to execute in a different order, then it will start, kind of, stopping certain threads to ensure that you get the actual order that you want. As we'll see in the experiments, these turn out to be even more efficient than the sleep-based tests. They're modular. This is something I did not show, but basically you can reuse the schedules from different tests. If you want to have different scenarios, there are actually two schedules up there. And the schedule is very explicit, so if you need to understand what this test does, if something fails, if you need to debug, if you need to change these things, it's much easier to understand. Yes? >> : [Inaudible] code a deadlock or something bad in your schedule? >> Darko Marinov: Yes. It's very, very possible. We have both static and dynamic analysis to try to help with that. What you encode is a partial order of events; you can just introduce a cycle. So you just do something wrong; let's say there, instead of two, you put one by mistake. So we can statically find some cycles. But, you know, people usually don't have too many of those. What does happen often is that dynamically, as you run this code, the schedule may be unrealizable, and then we can detect whether this is happening because of a deadlock in the code or the test, or because the execution engine got stuck here waiting on certain events to happen. Then we can give an appropriate warning or error message telling you that you are having this problem. Okay. So, to see how expressive this language is: we got about two hundred of these sleep-based unit tests from open-source Java code from various projects, and then we translated them into IMUnit by adding these events and orderings. And we found that we were able to express almost all of them. One thing that we did not support, though, is events in loops. So we cannot actually, you know, use this IMUnit thing as kind of general-purpose programming where we would enforce orderings between arbitrary events, because we do not allow events to be repeated. And the reason was simply that we did not need that for the tests. If we wanted a more expressive language, that's something we consider as future work, but it's just going to make the language much more complex and, you know, potentially less likely to be adopted because of that. We also measured the speed of the execution. So what we found was that we were about three times faster than the sleep-based tests.
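For a concrete flavor of the same scenario in IMUnit, here is a hedged sketch. The @Schedule syntax and fireEvent call follow the style of the published IMUnit examples, but treat the exact package, runner, and event names as illustrative rather than authoritative:

```java
import static edu.illinois.imunit.IMUnit.fireEvent;
import static org.junit.Assert.assertEquals;

import com.google.common.collect.ConcurrentHashMultiset;
import edu.illinois.imunit.IMUnitRunner;
import edu.illinois.imunit.Schedule;
import org.junit.Test;
import org.junit.runner.RunWith;

@RunWith(IMUnitRunner.class)
public class IMUnitMultisetTest {

    // Each "a->b" ordering says event a must happen before event b;
    // the execution engine blocks the thread firing b until a occurs.
    @Test
    @Schedule("afterAdd1->beforeRemove, afterRemove->beforeAdd2")
    public void testAddRemoveAdd() throws InterruptedException {
        final ConcurrentHashMultiset<Integer> ms = ConcurrentHashMultiset.create();
        Thread adder = new Thread(() -> {
            ms.add(42);
            fireEvent("afterAdd1");    // lets the remover proceed
            fireEvent("beforeAdd2");   // waits until afterRemove has fired
            ms.add(42);
        });
        Thread remover = new Thread(() -> {
            fireEvent("beforeRemove"); // waits until afterAdd1 has fired
            ms.remove(42);
            fireEvent("afterRemove");
        });
        adder.start();
        remover.start();
        adder.join();
        remover.join();
        assertEquals(1, ms.count(42)); // deterministic under this schedule
    }
}
```

Note there are no sleeps anywhere: the schedule annotation is the single, explicit, reusable statement of the intended ordering.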
That was across all of these two hundred tests, running with the sleep bounds that the developers put in [inaudible] without our engine; the issue here is simply that these sleeps are inefficient, besides all the other problems that they have. So, basically, to summarize on IMUnit: it's [inaudible] to write these multithreaded unit tests. The current dominant solution in practice is using these sleep-based tests, despite all the problems that they have. IMUnit addresses those problems; the schedule language is expressive and our execution is efficient. We also have a tool to help you migrate from these old, traditional sleep-based tests to the new ones. You know, more details are online in the paper and on the tool. So the tool is publicly available. We had some people who downloaded it; they sent us bug reports. On one hand we can view that as bad, like, "Ah, our code has bugs." On the other we can view that as positive: someone is using our tool, you know, and cares enough to submit a bug report. Okay. So that was IMUnit. I'm just going to skim in two minutes through two other projects. One of them is this Change-Aware Preemption Prioritization. The goal of IMUnit was just writing tests; the goal here is, as my code changes, how can I make my testing faster? So basically here is the idea: you know, I have the code, parallel code. I have tests. This testing here takes a lot of time. As the code changes, the change is typically small. Can I somehow exploit the knowledge of this change? Can I statically analyze this change and exploit that to optimize this process here? And we found that this is indeed possible. Here is a comparison using this change-aware prioritization: basically, having something that understands what's changing in the code versus just doing the exploration without understanding that. And so what this shows is the speed-up [inaudible] over the change-unaware prioritization, kind of as the best case; in some of the studies we do [inaudible], in some others we use some other exploration approaches. In some cases we obtain a speedup of five times in this testing. These are for various stateful and stateless explorations. Sometimes we obtain 2.7. There was only one case where our approach was slower than the default. So the overall take-away message here is that there are ways to make regression testing faster for parallel code. And then the last thing that we did most recently was this work on automatic generation and clustering of unit tests for multithreaded code. So rather than manually writing some of these, we try to automatically generate them. And basically we call this technique Ballerina. What it does is generate these tests, using random generation to decide, kind of randomly, what methods to put here. So it generates some complex prefix; it generates potentially complex objects. In the example that I had with that multiset [inaudible], it was simple things, but it can generate more complex things and then run some methods in parallel to try to find bugs. So the good thing here was that we could find bugs. The main potentially bad thing was that we found too many failures. So we had some clustering methods that help us identify which of the problems are likely due to the same root cause. Evaluated on some known bugs, we found that this approach worked better than some other baselines. And we also found some previously unknown bugs.
So basically, to summarize, we have, you know, a lot of work going on on testing for parallel code at Illinois. I'm just presenting some of the work from my group, on IMUnit and then this regression testing and test generation. There are other colleagues who work on other things. Okay? >> : Any questions? >> : What about the issue of coverage? If [inaudible] is good enough, then you need to get some measurements on coverage. >> Darko Marinov: Okay. That's a very good question. So the question was, can we measure coverage, and we did not look into that. But there is work proposing various coverage criteria and how to measure them: maybe interactions on a shared variable, did you cover certain interactions on that, do you cover certain, you know, locks, [inaudible] on the locks. Even just defining the coverage criteria for parallel code is [inaudible] how to do. But, yeah, in this work we did not do that. Yes? >> : Back on the IMUnit work, writing a test -- it actually seems pretty hard to read, right? Why can't you just write the thread [inaudible]? >> Darko Marinov: Okay. >> : Putting the markers and then having the, you know, [inaudible]. It seems like it's difficult to read. >> Darko Marinov: Sure. Sure. Sure. So you mean you would rather write it the way it looked on the slide? You know, kind of this way? Yeah, so that goes back to, you know, how can I have an integrated development environment that would let me visualize my code in this way? Right, if I write one thread here, one thread there, is there something to visually spread these things out and do that? >> : [Inaudible]. >> : It's [inaudible] this is thread one, this is thread two. >> Darko Marinov: Sure, sure, sure. Yeah, one can think of ways to do that. Yes? >> : This is a question for anybody who builds a debugger. It seems like if we're going to teach parallel computing online and do autograding, having a debugger that automatically looks for [inaudible] in the submitted code would be a very useful tool. But I'm wondering, is any one ripe enough to use for that purpose? >> Darko Marinov: So here we did not really focus on, you know, trying to find bugs. It's more like you encode your test. Once you run this, you still need to use some other tools to look for these bugs. But, yes, if you have [inaudible] code you likely need to run the tools to find the bugs. Actually, you can often just skim through the [inaudible] code and find the bug. You know what I mean? If... >> : [Inaudible]. >> Darko Marinov: Okay. Well, it gets harder. Sure. Sure. If you have, you know, messy code, then you need to run some tool to look for these bugs. Yes? >> : So in your unit testing, I think a better assertion at the very end would be something like: the count is either one or two. And whenever it is one -- whenever it is two, [inaudible]. So what you want to say is some assertion that captures the behaviors of all [inaudible] as opposed to [inaudible]. >> Darko Marinov: Okay, so that's a very good comment. In the test that I am writing, I say I want to get exactly the value one under this particular schedule, and the suggestion was, why not say here that the value is one or two. So if I just say the value is one or two, without saying under which schedule, then I can be missing bugs, right? If my code always returns one, that would pass. >> : [Inaudible] if it returns two, it so happened that the [inaudible] executed before [inaudible]. Right?
So essentially I want to write assertions for all possible schedules. >> Darko Marinov: That's fine. You can actually do some of that in IMUnit. Unfortunately that's not shown here. But you can write assertions that actually say, for each of the schedules, which result you're getting. You do not need to write only one schedule. You can have a larger number of these schedules, and then you can encode, you know, a set of results here. In the limit you can try all these, you know, [inaudible] possible orderings and just encode that you are going to get one of those results under certain conditions. And these events need not be only here in the test code. Sorry, let me say that again: they need not be only in the test code; you can put them in the bodies of these methods such that you get [inaudible] so that a method is not executed atomically. Is there one more question here, or...? >> Juan Vargas: [Inaudible]. >> Darko Marinov: Okay. >> Juan Vargas: Thank you very much. >> Darko Marinov: Thank you. >> Juan Vargas: So now we have Danny, Danny Dig and -- Okay. >> Darko Marinov: He's going to talk about annotations, I believe. Danny? >> Danny Dig: Refactoring. >> Darko Marinov: Refactoring. Thank you. >> Danny Dig: All right. So I'm Danny Dig. I'm a research professor at the University of Illinois, where I work in the area of software evolution. I want to give more of a high-level talk on my past work on software evolution, my current work on software evolution, and I will conclude with my future work on software evolution. Of course I will put emphasis, more emphasis, on the work that has to do with parallel programming, but I've done much other work that has nothing to do with parallel programming. Regardless, it's still very useful. So this guy here is a famous Greek philosopher. His name is Heraclitus. He is famous for many things, among others this quote: "Change is the only guaranteed constant." You know, this is [inaudible] for those of us who are using software every day, because we know that the only software that remains successful is the software that constantly changes. And here I have examples of changes: people add more features. Microsoft constantly pushes more features, new versions. You know, Windows 8 comes now. You know, people fix bugs, improve performance, improve security. In fact, the only software that doesn't change is software that is dead, that nobody uses. So here are some visual reminders that our software changes. This is from -- I just looked for updates on my Microsoft Office, and apparently Office 2011 has a bunch of updates, and some of them are critical, so probably I should go and apply them. Now here's another visual reminder. So Windows 7 also has a bunch of updates for me. And I see there are several people here also using Macs, so if you're using Macs, this is another reminder that Apple has lots of updates for you as well. Again, these are reminding us that our software constantly changes. So programming is all about change. It's: how can we manage and how can we express changes in large, complex code bases? We need the way we program to better support this changing ecosystem. So I view programming as program transformations, going from version N to version N plus 1. Now, how do we do that? This is one of the research questions that my group addresses: what are the kinds of changes that occur most often in practice?
And second, how can we automate them to improve programmer productivity and software quality? Answering these questions is not only very valuable for the practice of software development; it's extremely rewarding and challenging intellectually as well. Here is a very high-level overview of my work on successfully automating three kinds of software transformations. So for my [inaudible] I've been working on how we can upgrade clients of a library API to move from version 5.2 to version 6.9 or whatever. I've also been looking at software testing, which I mostly collaborated with Darko on. Well, it's not only the software under test, the production system, that changes. When a production system changes, you also have to change all these regression tests [inaudible]. You have to change the assertions. So how can you automatically also update the test [inaudible]? Of course, my most recent work has been in the area of how we can change sequential software for [inaudible] parallelism and [inaudible] parallelism. And of course this is a very hard problem, and I do not believe in full automation. I'd rather believe in an interactive approach where we use the smarts, the brain of the programmer, to guide the tool. The programmer [inaudible] to the tool, "This is the kind of transformation I want to do on my code." The tool is going to check: is this safe? And if it's safe, then it applies the transformation. So this is not fully automatic. It's interactive; it's driven by the human brain. More than the number of publications, you know, I am very, very happy to see real-world impact of my work. So I developed the world's first open-source refactoring engine for Java. This was developed as a plug-in for jEdit. jEdit used to be the number one IDE ten years ago. Some of my other work, in the area of software upgrading, is shipping with the official release of Eclipse. Eclipse is the number one development environment for Java developers. It is used daily by millions of Java developers. Some of the other work that I have done doesn't ship with the official release of Eclipse but as stand-alone plug-ins. It's still widely used at several companies and several research institutions; for example, it's used widely at Google and other large companies. In the area of software testing, one of our tools, ASTGen, is used in the testing infrastructure at Sun for NetBeans, where they test the NetBeans IDE. Actually, right now it is no longer Sun; this should be Oracle. And of course the most recent work on refactoring, interactive refactoring for retrofitting parallelism, is going right now into the official release of Eclipse. So if you are using the Eclipse Juno release, which is 4.2, some of this work is going into 4.2.1, which is coming late August. So I do not have time to talk about all this work. This is just one slide, a very high-level overview of our interactive refactoring for parallelism. We are supporting all kinds of changes that people find practical and need when converting their sequential code for parallelism. We have one category called enabling transformations; these are transformations for thread-safety. They do not introduce multithreading; they just take your code and prepare it for multithreading, to make it more thread-safe. One example of this is making a whole class immutable.
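A minimal before/after sketch of that enabling transformation (illustrative; not the tool's actual output):

```java
// Before: mutable state, so sharing across threads needs external locking.
class MutablePoint {
    private int x, y;
    int getX() { return x; }
    int getY() { return y; }
    void moveBy(int dx, int dy) { x += dx; y += dy; }
}

// After: every field is final and "mutators" return fresh instances, so
// objects can be shared freely between threads with no synchronization.
final class ImmutablePoint {
    private final int x, y;
    ImmutablePoint(int x, int y) { this.x = x; this.y = y; }
    int getX() { return x; }
    int getY() { return y; }
    ImmutablePoint moveBy(int dx, int dy) {
        return new ImmutablePoint(x + dx, y + dy);
    }
}
```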
So I'm changing the class so that all the instances of this class, you know, can never be updated and never change their state. Once I [inaudible] this class, now this class is what we call [inaudible] thread-safe. I can share it with all my friends in the world, and there is no need for synchronization. Here we have refactorings from the second category. These are refactorings that actually introduce multithreading. So here we have one that introduces the parallel recursive divide-and-conquer fork-join task parallel pattern that Tim also has in his book and in OPL. So we have that one refactoring, and this is one of the [inaudible] that goes into Eclipse 4.2.1. And we also have another one for loop parallelism, converting sequential loops to parallel loops, again via an interactive refactoring approach. We have a third category of refactorings. These are the ones that you want to apply if you have some locks or some other heavyweight synchronization mechanism in your code. You want to make your code more scalable, so you want to get rid of those and replace them with more scalable constructs, like the Atomic classes that use, under the hood, compare-and-swap hardware instructions. So Java has them via the Atomic package. The [inaudible] has them via the Interlocked construct. I forget what it's called in -- TBB also has something called atomic. So it's the same construct.
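Here is a hedged sketch of that third category of refactoring, replacing a lock-based counter with java.util.concurrent.atomic, which uses compare-and-swap under the hood (illustrative; not the tool's actual output):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Before: every call takes a coarse lock, which limits scalability.
class LockedCounter {
    private int value;
    synchronized void increment() { value++; }
    synchronized int get() { return value; }
}

// After: lock-free; incrementAndGet() retries a hardware CAS on contention.
class CasCounter {
    private final AtomicInteger value = new AtomicInteger();
    void increment() { value.incrementAndGet(); }
    int get() { return value.get(); }
}
```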
As an approach, as the way I develop and conduct research, I always validate empirically. So we validate our refactoring tools by running them against hundreds of files from open-source projects and also by doing controlled experiments with programmers. And we found out that these kinds of tools dramatically improve programmer productivity. You know, they are fast enough, even though they do very complex and very intelligent program analysis, which requires interprocedural pointer analysis to figure out what [inaudible]. So these are still very fast -- fast enough that programmers can actually use them in an interactive environment, in interactive mode. We found out that, unlike the open-source programmers, our tool applies the transformations correctly; whereas open-source developers apply them incompletely, so they carry out nine-tenths of the refactoring and leave out one-tenth. And exactly that one-tenth left out is a bug. Of course there is a good motivation: once we refactor this code, it exhibits good speedup. Doug Lea, who is the lead architect of the Concurrent package in the Java standard libraries, wrote on the mailing list that he was very impressed with one of our tools, which is ReLooper. He says, "I expect it'll be useful to just about anyone interested in exploring these forms of parallelization." Well, it's not only that we develop these kinds of tools and we put them into the hands of the programmers; you know, I believe that if you ship them as plug-ins for a widely-used development environment, they are much, much more likely to be used. If you want to really impact the lives of millions of developers, that's the way to ship and to package your research. So the tools that we've presented so far are all developed as plug-ins for Eclipse. Now we're starting to work on plug-ins for Visual Studio, but that's the next slide coming. You can download these tools for free, open-source, at refactoring.info/tools, or you can wait for another five weeks and download them from the official release of Eclipse. So if you get the Eclipse Juno release, the minor release number one, some of these tools will be integrated in the official release of Eclipse. This is very exciting. On the educational side, we have these successful summer schools. I have educated more than 800 participants who come to these summer schools on this topic of refactoring for multicore parallelism and on the topic of multicore parallelism itself. So we just finished last week our fourth summer school on multicore programming at Illinois, and this was very successful. We had some high-profile keynote speakers of the caliber of Doug Lea and Cliff [inaudible]. I also do one-week training courses; I've done three of them in this area. And when I come to Seattle I either go to Boeing or I come to Microsoft. Boeing is very interested in this topic; apparently they are interested enough to hire me to teach these one-week intensive programming exercises and training courses. I also do tutorials at big conferences like OOPSLA or ICSM. And more than this, actually, we have another metric: there are people out there who care enough about the tools that they not only download and use them but also bother to send us bug reports. So we are humans; we make mistakes. And our software is no different, so all the tools that I've presented so far, I know they have some bugs. We are fixing those bugs. We are making them more production-quality, and apparently some of them are good enough that the Eclipse Foundation has integrated them into the official release. So here I want to talk about some of the current and ongoing work. I looked at the industry trend, and I saw that the industry has this trend of converting the very hard problem of using parallelism into a slightly easier problem of using a parallel API, using a library. Microsoft has such libraries: TPL and PLINQ. Intel has Threading Building Blocks. And there are many other libraries out there. Yet we know very little about how programmers actually adopt these libraries in practice. We know very little about how they find examples, how they educate themselves on how to use these library APIs. You know, some of these APIs are overly complex. Some of them are very rich, so they expose several overloaded methods with several arguments. Yet programmers have very few examples from the real world. Also, if library designers don't know how people use these in practice, they do not find examples of misuse. They do not know what's tedious, what's error-prone. Researchers can make wrong assumptions, or you can build a data [inaudible] tool thinking that people only use locks. It turns out, from our study, that locks are just one construct; there's a long tail of many, many other synchronization constructs. So we analyzed all the open-source projects that we could find in the CodePlex repository -- this is Microsoft's CodePlex repository -- and in GitHub, which is a very widely popular repository. We analyzed all the projects that use Microsoft's TPL and PLINQ libraries. This is about 17 million lines of code contributed by 1600 different developers, and this is the first in-depth large study, a study on this scale, of how people in the wild adopt these parallel libraries. So some of the findings that we discovered through this study are quite interesting. For example, we found out that indeed open-source developers embrace and adopt parallelism.
We saw that 37% of all the open-source projects in the GitHub and CodePlex ecosystem use some sort of multithreading, and out of them 74% use [inaudible] concurrency [inaudible], and about 39% use multithreading for improving throughput and actually squeezing parallelism out of their code. We also found, very surprisingly, that in 10% of the cases developers think their code runs in parallel but in fact it runs sequentially; there was just some very small, minor syntactic mistake that they made in their code. We also found that developers make their parallel code unnecessarily complex. This is what we know from software engineering; we call this accidental complexity. Parallel programming is hard, but they make it unnecessarily hard. This is the first large-scale research project of this kind. We could only do this because Microsoft has this new infrastructure called Roslyn; it's an analysis infrastructure for Visual Studio. And we were, for a while -- and I think we still are -- the ones [inaudible] pushing it to its limits. When you run this on 17 million lines of code, you push it to its limits. And of course we found lots of bugs, and we reported more than 20 bugs to the developers. They were very keen on acknowledging and fixing those bugs. So in the recent release that just came out last week or so, Roslyn is much more robust now. And, you know, one of the things is, how do you report bugs without actually burning bridges and without losing friends? Apparently we managed to report more than 20 bugs in the Roslyn environment and we're still very good friends with them. So some of the implications: this is good news for developers. We have this website, learnparallelism.net. The only reason why it's called "dot net" is because it's for .NET. You know, if you're a new developer who has just heard, "Okay, I can do a parallel for in C# with TPL," you can look at this website and find tens of thousands of examples of how other developers in these more than 600 open-source projects use Parallel.For or any other construct you care to learn about. This is good news for researchers, because now I can make a more informed decision on what kinds of research tools I want to develop. Of course it's good news for library designers, because Stephen Toub, who is the lead architect of TPL, found this study very useful, and it is going to influence future API development of the TPL library. So since I'm in the Microsoft ecosystem, I thought, well, Microsoft was nice to me, so it gave me another grant to keep working on Microsoft technology. So I started another project, very recently, on refactoring for spreadsheets. The surprising figure here is the number of spreadsheet end-users -- these are people who are called end-user programmers -- which is estimated to be at least a hundred million. You know, this is very, very conservative; the number could be way, way larger than a hundred million. But anyway, this gives you a sense: there are at least [inaudible] more end-user programmers than professional programmers. Well, what does it mean practically? It means that the number of bugs that end-user programmers create is at least [inaudible] larger than the number of bugs in professional software.
And when we looked at thousands of spreadsheets from the real world, we found out that indeed they are riddled with the same smells, plagued with the same errors, the same mistakes that professional programmers make: lots of hard-coded expressions, duplicated expressions, duplicated constants, accidental complexity. So this has effects on both the performance and the future maintainability of the spreadsheets as well. So we developed RefBook. This is the world's first refactoring tool for Microsoft Excel formulas. Right now we are supporting several refactorings. Here is an example of one of them. So you can look at a table, you can take a sub-expression from a complex formula and extract it into its own column, and then the tool itself will go and find other instances of the same sub-expression in other columns and replace all those instances with, you know, the new column that you just extracted. So now, if you have to change that sub-expression in the future, you just go and change it in one single column. You don't have to go and hunt through all these other columns that previously were duplicating the same sub-expression. So like with any other tool, we always validate empirically. Here we've done an evaluation of RefBook from three different angles: a survey, a controlled experiment with 28 Excel users, and also a case study where we looked at 4000 spreadsheets from the real world. And we found that users prefer the readability and the nicer maintainability of refactored formulas. We also found that our tool, of course, as you expect with tools, is faster and more accurate than doing these changes by hand. And we also found that these refactorings are widely applicable, because in those thousands of spreadsheets, as I said before, they were riddled with all kinds of smells and all kinds of problems that could be fixed through refactoring. So I want to conclude with -- Uh-oh. Apparently I'm presenting the wrong version of the slides. What happened here? Oh, so this is a recovered file. This is -- This is the recovered file. This is not my -- Whoa. Okay. This is a surprise ending. I noticed that one of my bullets was empty before. Okay. Let's see. Okay, that's because I [inaudible] -- it's because I closed the lid so that the lid wouldn't bother you. So probably that crashed Microsoft Office and it recovered this. Anyway, what I wanted to say here -- Okay. Currently I moved it somewhere. Okay, so I'll just end with this slide. I had one more slide which in my deck of slides is nice and polished, but, you know, apparently the version that Office recovered just doesn't contain my text. So we are starting to develop and invent the next generation of development environments. This is a large project funded by an SHF: Large grant from the National Science Foundation. We are inventing a new generation of programming environments that treat software changes intelligently. We are enabling programmers to actually author and prescribe their own transformations and be able to reuse them in other contexts. We are enabling the version control system to show the history at a high level, so it makes it easier to understand the changes. Of course, we are inferring these high-level changes from low-level changes. So this is going to significantly and drastically change the ecosystem of programming environments.
And of course it would be good to see some of these things going into Visual Studio, probably in a few years. So that's all I had to say. [ Audience clapping ] >> Juan Vargas: We are running a little late. Probably, if you have questions, please see Danny during lunch. We are now going to have the presentation from Koushik Sen, "Active Testing and Concurrit." Joseph Tereles reminded me that Koushik came from the University of Illinois, and he just got tenure from UC Berkeley. So this is another example of great collaboration between the two schools. >> Koushik Sen: Thank you. >> Juan Vargas: Big transfer. >> Koushik Sen: So today I'm going to talk about Concurrit: a domain-specific language for testing concurrent programs. And this is joint work with my student Jacob Burnim, Tayfun Elmas, my [inaudible], and my colleague George Necula. So in the last few years there has been a significant amount of progress in automated test generation for both sequential programs and concurrent programs. I was in fact involved in a couple of those projects, and what I noticed in the last few years is that there is a very slow adoption rate for these kinds of techniques in the industry. Programmers are not [inaudible] to use these automated test generation tools. On the other hand, if you look at tools like JUnit, xUnit and so on, programmers use them regularly and they're very popular. So we started thinking about developing a similar kind of xUnit tool for concurrent programs, and we came up with this idea called Concurrit. And there are some other tools that you have seen, like IMUnit, and there was a tool developed by Bill Pugh [inaudible] in the same spirit. So I'll use the SpiderMonkey JavaScript engine as a running example. Suppose you want to test this code. It has 121,000 lines of code, and it is used by the Firefox browser. And you want to test several functions in this code. Okay? Now, it's easy to write a sequential test for this code. You just write a harness function, you fix the inputs, you call the methods that you want to test, and you check the output. And one of the nice properties of this kind of unit test is that if you run it multiple times you get the same outcome: either it will fail or it will say that the test passes. So this is a determinism property for tests, and it's very desirable. And we get this kind of deterministic behavior for tests written for sequential programs. Unfortunately, if you want to do multithreaded testing or concurrent testing for this program, then you create several threads and you run this code in parallel. But if you write such a test for a concurrent program, you lose this property of determinism, because the threads can interact with each other and you may not get the same output if you run it multiple times. Okay? So that is the key problem why there has not been any successful tool for unit testing of concurrent programs. Now, what do people use in the absence of such unit testing tools for concurrent programs? They do stress testing. The idea is you create numerous threads, thousands of threads, and you run the program for several minutes and see if something bad can happen or not. Okay? This is what people call stress testing, and it's really used in practice. Another set of approaches that people have developed is called model checking. It's mostly in the research area, where you try to explicitly explore all the schedules of the threads and see if there is any bug.
The problem with model checking is that if you try to apply it to real-world code, like, say, Apache or Firefox, it's very hard to scale because there are too many -- the [inaudible] is too high. Moreover, you have to have total control over all possible sources of non-determinism if you are trying to do model checking, and this is not realistic for real software. Okay? And also, we have seen that many times programmers want direct control over the scheduling in the tests that they are trying to write. And if we can incorporate the programmer's insight in the test, then our testing could be effective. Okay? Another alternative that people have explored, and which is, surprisingly, the most used approach to test concurrent programs, is to insert sleep statements, as Darko showed. If you look at the [inaudible] for various kinds of concurrent software, you'll see that most of the bug reports include sleep statements. And the idea is you create a number of threads, but you also put the sleep statements at the right locations so that you can play out the schedule and reproduce the bug. And the reason for the success of the sleep statements is that it's very lightweight and convenient to write such tests. On the other hand, it's not formal and it's not robust; you might see the bug sometimes, you may not see it other times. So what we wanted to do was come up with a testing framework that is as lightweight and convenient as sleep statements but at the same time formal and robust. Okay? And Concurrit is the result of that. So I'll give you a tutorial introduction to Concurrit using this benchmark, the SpiderMonkey JavaScript engine, which crashes on some executions with an assertion failure. And here is the bug report that was filed by some user in the bug database. And as you can see, he tells us there is some kind of schedule involving three threads, and some unknown schedule between two other threads, and if you take that particular schedule you will hit the assertion violation. Now our question is, how can we take this kind of programmer insight and translate it into a very small test script so that we can make this bug reproducible? And here is what we do in Concurrit. We take this user insight, these ideas about the thread schedule, from the programmer, and we write a test in our DSL. And then we run the DSL along with the software under test and systematically explore all and only the thread schedules that are specified by this DSL, and see if we can reproduce the bug. And I'm going to describe what the DSL looks like. Here is how it works. The software under test is instrumented, and it generates events in a similar way as IMUnit does. But whenever a thread sends an event, it gets blocked by the Concurrit runtime, and the runtime, based on the test's logic, unblocks threads from time to time. Okay?
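As a toy illustration of that blocking-and-unblocking mechanism (this is not Concurrit's actual implementation or API, just a minimal sketch of the idea in C++), a controller can park each instrumented thread at its events until the test script names it as the next one to run:

```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>

class Controller {
    std::mutex m_;
    std::condition_variable cv_;
    std::string allowed_;   // which thread may currently proceed
public:
    // Called by instrumented code at each event: block until scheduled.
    void event(const std::string& threadName) {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return allowed_ == threadName; });
    }
    // Called by the test script: pick the next thread to run.
    void allow(const std::string& threadName) {
        { std::lock_guard<std::mutex> lk(m_); allowed_ = threadName; }
        cv_.notify_all();
    }
};

int main() {
    Controller ctrl;
    std::thread ta([&] { ctrl.event("TA"); std::cout << "TA runs\n"; });
    std::thread tb([&] { ctrl.event("TB"); std::cout << "TB runs\n"; });
    ctrl.allow("TA");   // the "schedule": TA first...
    ta.join();
    ctrl.allow("TB");   // ...then TB
    tb.join();
}
```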
So let me show you what a Concurrit test would look like. The bug you have seen -- the report mentions that you will see that assertion violation if you schedule three particular threads in a particular way. Okay, so we definitely know that the bug is happening due to three threads. So let's first write a test that will see whether the bug is due to concurrency or not. Here is our first test script, which says, "Pick any three threads in the program and run them sequentially until they terminate." And this is what we do in this test: we pick TA, TB and TC, three distinct threads, and while they're running, we choose a thread and run it until the end. Now, this is a test that we write in our DSL, so it is very easy to understand. It's an embedded DSL in C and C++, so you don't have to learn a new language to write this kind of test. You have to understand a few constructs that we have provided, and once you have written the test you run it along with the software under test. And we explore six schedules in this case, six possible ways of running these threads. And we see that there is no assertion violation. At this point we know that the bug is definitely due to concurrency. Okay? Now we need to refine this test so that we can actually create that particular schedule and hit the assertion violation. So we write our next test, which will try to explore all the schedules of the program. And that's exactly what we do in the case of model checking. Okay? So here is the second test that we write. We just change one line in our test, which says, "Run thread until it reads or writes memory," or some other event, instead of saying, "Run thread until it ends." Okay? So this means [inaudible] these threads in all possible ways and explore the [inaudible]. Now, this is a good test. It actually finds the bug, but it's like model checking. We end up exploring more than one million schedules, and it runs for days and eventually finds the bug. Okay? And also in this process we learned something: if we write this kind of test, you have to take control of all the possible sources of non-determinism in your code; otherwise you cannot guarantee that your systematic search terminates. Okay? And moreover, since you are instrumenting all possible instructions in your code, it's too heavyweight and you spend a lot of time searching schedules which are not important for the bug. Okay? So we came up with this notion called tolerant model checking, where we say that it's unrealistic to instrument all possible sources of non-determinism in your source code. So why don't we allow the programmers to specify the non-determinism that matters for the purpose of the bug and control only that? Okay? And also, why don't we provide mechanisms so that we can constrain the search, so that we can say, "Only interleave the threads within these two functions; do not interleave the threads all over the place"? Okay? So these are the two things we provide in this Concurrit DSL, this Concurrit framework. And also in this framework, if you wish, you can write your own heuristics, which can be full model checking or context-bounded model checking and other heuristics. So let me show you how we can localize the search and only specify the non-determinism that we are interested in. If you look at the bug report again, it says that the bug happens only when the three threads are executing js_DestroyContext and js_NewContext: two of them are executing js_DestroyContext and one of them is calling js_NewContext. Now let's try to encode that in our test. And here is our test. We again modify two more lines. Here, instead of only waiting until a thread has entered js_NewContext, we also make threads TA and TB wait until they have entered the function js_DestroyContext. And then we try out all possible interleavings between these threads. Okay? So this is a more restricted and localized search. We do not start the search as soon as we create the threads; once we have entered those particular functions, we start the search. Okay? Now -- so this is what the model checking tree looks like.
It first takes a particular specific schedule and then it tries to search the interleaving space. Now, if we write this test, then we can actually hit the bug after exploring fifty thousand schedules, and we see the assertion failure in a few hours. So this shows the power of Concurrit. In order to write a Concurrit test, you don't have to control all possible sources of non-determinism in your code. You can just specify the non-determinism that you want to control and keep the other sources of non-determinism uncontrolled. The Concurrit test can still do a systematic search, and if at any point the systematic search fails because of the fact that you are not controlling some non-determinism, it will raise a flag. And at that point the programmer can either make the test more robust by controlling more of the non-determinism, or continue the search and not expect any soundness guarantee. Okay? So this is the idea. And finally, fifty thousand schedules are too many. So we looked at the exact thread interleaving and we tried to localize it further. And it turns out the bug report tells us where exactly to interleave the threads so we can create the bug. And we incorporated that knowledge into the test by adding three more lines. And if we now run the test along with our software under test, we actually hit the bug schedule within ten iterations. And so, after refining the test, we had a better understanding of the bug and we knew exactly why the bug was happening. So we finally came up with an exact schedule of the threads that actually leads to the bug. And this specifies a single schedule. You don't have to do a search; you can run it and it will hit the bug on the first schedule. Okay? And note one thing: this is not like specifying the entire schedule of all the threads. We are only specifying the key scheduling decisions that are important for reproducing this bug. Now, once you have this test, you can put it in your regression suite. And it's kind of robust to code changes. If the code changes, it will still run and it will be able to find the bug if it's still present. So this is a brief tutorial of the Concurrit framework that we have developed for testing concurrent programs, where the programmers get better control of what they want to write and how they want to control the schedules, and can play with various kinds of model checking heuristics and search techniques. So we have implemented this tool as an embedded DSL for C++, which is available. We can write both unit tests and system tests. For unit testing we run it in the same process, and we do both manual and automated instrumentation. For system testing, we run it as a separate process. So I guess we are running out of time. Just to give you an idea: as I mentioned, most of the model checking techniques do not scale, but we managed to run this on real software, including the Mozilla JavaScript engine, a threading library, Memcached, the Apache HTTP Server and MySQL, and we managed to reproduce a number of bugs that have been reported in the bug databases in a robust way. And the tests are like five or six lines of code. And that was a big success for us. Thank you. >> Juan Vargas: Thank you. [ Audience clapping ] >> Juan Vargas: So if you have more questions, Koushik is going to be around for lunch. And now we are going to have the last presentation by Sunny.
Sunny Chatterjee from Microsoft, and he's going to be talking about fighting concurrency bugs with advanced static analysis technology. And after his talk we will have lunch. Food is going to be in the back behind this wall, and then we are continuing with a session on applications from one to three. So at one, please come back here and we will continue with the session. >> Sunny Chatterjee: Hello, everyone. My name is Sunny Chatterjee. I am a developer in the Analysis Technologies team in Windows. And today I'm going to talk about a set of concurrency tools we have developed based on static analysis that we use for finding and fixing concurrency bugs in major software at Microsoft, like Windows, Office and other divisions. I know we are running short on time, so I'll be as brief as possible so that you can go to lunch. So first I would like to talk about our team. We are part of the Engineering Desktop team in Windows. We develop and support some of the most critical program analysis tools and services that are used across Microsoft. We have various tools at the source level and the binary level. At the source level we have a set of global analyzers, local analyzers, and we have a source code annotation language called SAL. At the binary level, we have binary instrumentation tools and we have code coverage tools based on the binary instrumentation technology. Today specifically I'll be talking about concurrency SAL, which is a source annotation language we have developed for specifying locking behavior in a program. We'll also talk about a toolset called EspC -- EspC stands for extended search over programs -- that finds concurrency defects and can also understand SAL. We'll also talk about how we are using these tools internally at Microsoft to find and fix thousands of concurrency bugs at compile time on the developer's desktop, and how we are planning to ship these tools externally so that we can help the ecosystem. So we are all aware of the common locking problems that we see every day. There is insufficient lock protection, which results in race conditions. There are lock order violations, which result in deadlocks. We forget to release locks, resulting in orphaned locks. There are no-suspend guarantee violations. There are APIs which have an implicit locking nature, like SendMessage, and we inadvertently call them without realizing that they might have the potential to block our application. There are many similar classes of locking problems, and our tools try to find and fix these problems at the developer desktop. So the key challenge for us is how to enforce a locking discipline, because we all know that a locking discipline is essential for avoiding multithreaded errors. However, it's surprisingly difficult to enforce in practice, and there is no support in high-level languages like C and C++. So our solution is a set of concurrency tools based on annotations for locking rules. Annotations are a formal way to make implicit assumptions about the locking behavior explicit. We then have a tool called EspC that uses local static analysis to catch locking violations in the program. And we have a tool called CSALInfer, which is an annotation inference and patching tool that can help jumpstart the effort on a legacy code base where you don't have any annotations. So what happens is, if you run CSALInfer, it automatically infers and patches your code with concurrency SAL, and then EspC becomes a lot more effective and accurate at catching locking violations in your program.
So the question might be asked: why do we use SAL as a solution? There are a few other approaches we explored. For example, manual review of code. We all know there can be a large number of paths in a given program, and in a multithreaded environment it's very difficult to figure out concurrency bugs just by manual inspection. Testing we found to be ineffective for two reasons. One is that often we find simple programming errors where data needs to be protected by a lock: we acquire the lock, we write to the data, and we forget to release the lock on a certain path. These are the kind of simple programming errors that we do not want to defer until testing. Testing is also ineffective because sometimes concurrency bugs are very hard to detect and debug. They might result in non-deterministic failures. They might result in [inaudible] stress tests, so mapping that back to the source code can be time consuming and expensive. We also have global analysis tools like global EspC, but it is heavyweight and time consuming. It can take a week to provide results, which renders it a bit ineffective when what we want is to provide results right on the developer desktop at compile time. So what we use is a local analysis tool called EspC, which can find all these bugs at compile time. But to make it more effective we use SAL, because SAL provides calling context to the EspC toolset, without which its accuracy would be limited. SAL is nothing but a lightweight specification language that makes implicit assumptions about a function explicit. Many times we find that people do describe the locking side effects of a function in comments. Wouldn't it be nice to have a formal language in which we can specify the same information, and at the same time let the local analysis tools take advantage of it? SAL does exactly that. We have a few annotations that we came up with; I just want to show a few of them. For concurrency annotations we have "acquires lock" and "releases lock," for example, which provide function postconditions: they state that a function acquires or releases the given lock as a postcondition, respectively. Similarly, we have function preconditions like "requires lock held" and "requires lock not held," for cases where a function needs a certain lock to be held, or not held, before it is called; these two annotations let us specify those preconditions. There is also a group of invariant annotations: sometimes specific data needs to be protected by a certain lock, and we can specify this behavior using the "guarded by" annotation. There is a "lock level order" annotation that can specify an ordering on the locks, so whenever locks are acquired in the reverse order, we can detect deadlock situations. And there is an annotation called "no competing thread," because in a multithreaded environment there are certain initialization functions or constructors that execute in a single-threaded context, and "no competing thread" tells the analysis not to worry about the multithreaded context when analyzing that function.
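To make these annotations concrete, here is a small compile-only C++ sketch using the concurrency SAL spellings that ship in Visual Studio's concurrencysal.h (_Guarded_by_, _Acquires_lock_, _Releases_lock_, _Requires_lock_held_, _No_competing_thread_). The function and variable names are illustrative, not from the talk; see the MSDN documentation mentioned at the end of the talk for the authoritative list.

```cpp
#include <windows.h>
#include <concurrencysal.h>   // ships with Visual Studio / the Windows SDK

CRITICAL_SECTION cs;
_Guarded_by_(cs) int sharedData;      // invariant: cs must be held to touch this

_Acquires_lock_(cs)                   // postcondition: returns with cs held
void LockData(void)   { EnterCriticalSection(&cs); }

_Releases_lock_(cs)                   // postcondition: returns with cs released
void UnlockData(void) { LeaveCriticalSection(&cs); }

_Requires_lock_held_(cs)              // precondition: caller must already hold cs
void WriteData(int v) { sharedData = v; }

_No_competing_thread_                 // runs before any competing thread exists
void InitData(void)   { sharedData = 0; }
```

With annotations like these in place, a call to WriteData outside a LockData/UnlockData pair becomes a local, per-function violation that the analysis can flag without whole-program reasoning.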
So our tool is based on a very rich static analysis platform that's used for a wide variety of static analysis tools in Microsoft. The beauty of this platform is that it abstracts out the high-level source language at the front end. So basically it can parse C, C++ and SAL. It can parse managed code using an MSIL driver. It can also parse JavaScript, and it reconstructs an intermediate representation using control flow graphs. And we have an analysis layer on top of the intermediate control flow graphs which provides the tools for analysis. For example, we have an alias analysis engine which can help us determine if two variables point to the same memory location. We have a pretty accurate symbolic path simulator called the Symbolic Simulation Manager, which can accurately point out which are the feasible paths in a given program. And on top of that we have a group of checkers, like EspC, that check for specific properties. EspC, for example, checks for concurrency properties. There are other checkers, like null-pointer checkers, and other specific checkers on top of that. So this is a very, very robust platform that's used extensively for writing static analysis tools. So... >> : [Inaudible]? >> Sunny Chatterjee: So the false negative rate -- The criterion is that if we want to enable a tool on the developer desktop, or run it in the daily build, then the tool has to be 80% or more accurate. So the warnings that we enable on the desktop have a false positive rate of 20% or less. As for false negatives, for the developer desktop they are not a big concern because they don't cause a negative experience. What we want to make sure is that we enable developers and make them realize that we are actually helping them find the right bugs. So we are much more tuned and optimized towards reducing false positives; we haven't done that kind of analysis on false negatives. But if you have the right annotations, then it is very, very accurate. The only scenarios where you will have false negatives are when you don't have good annotations in your code base; then sometimes the calling context might not capture all the locking behavior, and at that point you might have false negatives. So the approach we take is: we run EspC out of the box without any annotations, and it provides value. Then you run CSALInfer to infer annotations, run EspC again, and iterating this way, it provides much more accurate warnings. So here we show how the lock sequence is computed over the control flow graph. At every point in the program, EspC keeps track of the set of locks that have been acquired and released, and then it does some checking to find concurrency defects, which I'll talk about in the next slide. I also wanted to talk about a few optimizations we had to make so that this can scale to large code bases like Windows, because otherwise we did have scaling issues. One issue is that we tried to explore every path in a given function, and we found that that does not always scale. So one of the algorithms we use is path merging: within a given function, if the property we are checking for doesn't change across paths, then we merge those paths into one. This really helped the scalability of the tool. There are also heuristics we use: we try to determine whether a given function is actually concurrent or not, and if it doesn't seem like an interesting function to analyze, we skip it. The path simulation is also time-consuming, so what we do is first turn it off while we analyze to find concurrency defects, and then turn it back on to rule out the false paths, the infeasible paths. That way we can provide very accurate warnings and at the same time do it in a very efficient way.
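As a rough illustration of the lock tracking just described -- a toy sketch, not EspC's implementation -- the following program simulates one path through a function, tracks the set of locks currently held, and flags an exit while a lock is held. Paths whose lock state is equal at a join point are exactly the ones the merging optimization can collapse into one.

```cpp
#include <iostream>
#include <set>
#include <string>
#include <vector>

using LockSet = std::set<std::string>;

struct Op { std::string kind; std::string lock; };   // kind: "acquire" or "release"

// Walk one path, updating the held-lock set at each operation.
LockSet simulatePath(const std::vector<Op>& path) {
    LockSet held;
    for (const Op& op : path) {
        if (op.kind == "acquire") held.insert(op.lock);
        if (op.kind == "release") held.erase(op.lock);
    }
    return held;   // equal LockSets at a join point => paths can be merged
}

int main() {
    // An early-return path that acquires "cs" but never releases it:
    std::vector<Op> buggyPath = { {"acquire", "cs"} };
    LockSet atExit = simulatePath(buggyPath);
    if (!atExit.empty())
        std::cout << "warning: function exits holding lock '"
                  << *atExit.begin() << "' (orphaned lock)\n";
    return 0;
}
```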
We check for various classes of defects. For example, we check for cyclic locking to point out deadlock warnings. We check for insufficient locking to point out race conditions. We also find out if functions exit while holding a lock, so that we can point out orphaned lock warnings. I would like to talk briefly about the annotation inference engine that we have today. We initially developed an annotation inference based on a [inaudible] constraint solver, and when we tried to deploy it in Windows, it did not scale. So what we ended up developing is a hybrid tool which uses certain classes of heuristics to infer those annotations. In this particular example, you can see that when we are writing to the data in the process buffer, we are protecting it with the lock PCS. So the tool is smart enough to figure out that the data must be protected by that lock, and it infers "guarded by" annotations on the structure. Next, internal adoption in Windows. Today we run these tools on the engineer's desktop. Every developer that writes code and compiles it has EspC running as a background process, and all these concurrency warnings show up on the desktop. This way we are fixing thousands of warnings even before they get into the code base. We also run these tools daily as a build verification service. So even if a developer checks in code with a concurrency warning, we have a daily build verification service that will flag these errors, and the developer needs to fix them before the code can move out of the branch. This ensures we have a high-quality product that does not have these concurrency warnings. It is used very extensively in Windows. At the beginning of Windows 8, for example, we added thousands of concurrency annotations. It was a joint effort across all the [inaudible] that signed up for doing this work. Other divisions like Office, Windows Mobile, SQL and the ConCRT team have also used these tools. And we are helping the ecosystem by shipping EspC as part of the VS code analysis feature; it is available in the Pro and Ultimate versions of Visual Studio 2012. In this case, for example, we see a program where an account's balance is protected by the lock CS, a critical section, and in the unsafe withdraw method we are accessing that data without acquiring the lock. So EspC is quick to warn about the possible race condition. In this way, we do believe that by exposing this to external developers, we can help the ecosystem write better multithreaded code.
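A reconstruction of the kind of program just described (the names are illustrative, not the exact Visual Studio 2012 sample): the balance is annotated as guarded by the critical section, so the unannotated access in the unsafe method draws a warning.

```cpp
#include <windows.h>
#include <concurrencysal.h>

struct Account {
    CRITICAL_SECTION cs;
    _Guarded_by_(cs) int balance;   // balance may only be touched with cs held
};

void SafeWithdraw(Account* a, int amount) {
    EnterCriticalSection(&a->cs);
    a->balance -= amount;           // OK: cs is held
    LeaveCriticalSection(&a->cs);
}

void UnsafeWithdraw(Account* a, int amount) {
    a->balance -= amount;           // flagged: access without holding a->cs
}                                   // (possible race condition)
```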
We have a bunch of resources externally. We have our MSDN documentation that covers concurrency SAL. This documentation is still a work in progress that we are finalizing, but you can access it today and go through these annotations in detail. We also have our team blog externally on Visual Studio code analysis. And we have a couple of talks from the Build conference last year that go into a lot more detail about these tools; each is about an hour long, so you can go and take a look at those. So what we covered today is a brief primer on concurrency SAL, and we learned that we shouldn't treat locking discipline as an afterthought; it should very much be part of the interface design. We learned that we should be using the EspC toolset because, for a Microsoft product, no corner case is rare, and we do want to avoid those hangs and non-deterministic failures at the customer's desktop. We also talked about how we internally adopt these tools at Microsoft, because the cost of fixing a bug increases with time: it's cheapest to fix it at the developer's desktop, and this way we can push quality upstream. And we talked about how we are shipping these with the VS11 code analysis feature to help the ecosystem as a whole. So that's pretty much what I had. [ Audience clapping ]