>> Kathryn McKinley: I'm Kathryn McKinley, and it's my pleasure to welcome Na Meng, who got her Ph.D. from the University of Texas working with me and Professor Miryung Kim, who is now at UCLA. She has been interested in how to make software tools that make programmers more productive and produce better code with fewer errors. I'll let her tell you more about her thesis work now. Oh, and one more thing: she got my best reviews ever. Someone wrote "I love this paper" on her reviews.

>> Na Meng: Thank you, Kathryn, for the introduction. Hello, everyone. I feel greatly honored to have this opportunity to present here, and please feel free to ask me questions as I go through the talk. Today I'd like to talk about automating program transformations based on examples of systematic edits. Software development costs a lot of time, money, and effort, and within the different phases of the software life cycle, maintenance incurs the greatest cost. This cost has risen dramatically: people have estimated that nowadays maintenance accounts for more than 60 percent of the overall cost of software. Even with all this expense and effort, the U.S. economy still loses more than $60 billion every year due to software errors. So we have designed and implemented approaches to help developers improve productivity and software quality, so that we can reduce both software maintenance costs and the serious consequences caused by software errors.

During software maintenance, developers make a lot of code changes to fix bugs, add new features, migrate applications, and refactor code. There are some existing tools that help developers maintain their software by automatically making code changes, such as the search-and-replace feature of editors, refactoring tools, and bug-fixing tools. However, the tool support is very limited. For instance, the widely used search-and-replace feature of editors can help developers find keywords and replace them with alternative words, but it can only apply identical text replacements, and it does not observe the syntax or semantics of programs. Refactoring tools can help developers improve code implementation by applying a series of behavior-preserving program transformations, but they are limited to the predefined semantics-preserving transformations they support. Similarly, bug-fixing tools can help developers find bugs and even fix them, but they are limited by their predefined bug patterns or built-in fixing strategies. When developers want to make code changes which are not covered by existing tools, they are always on their own, and we would like to help developers make such code changes.

The insight of our research is based on observations made by recent studies: many code changes are systematic, meaning that there are a lot of similar, but not necessarily identical, code changes applied to different locations. One study shows that 75 percent of structural changes to mature software are systematic, and another study shows that 17 to 35 percent of bug fixes are actually recurring fixes involving similar edits to methods. So if we can automatically apply these similar code changes to multiple locations, we can significantly improve programmer productivity and software quality. To give you a scenario of systematic editing, I put an example here: a developer, Pat, wants to update database transaction code to prevent SQL injection attacks.
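To make the scenario concrete, here is a hypothetical before-and-after sketch of the kind of edit Pat would have to repeat at many call sites: replacing string-concatenated SQL, which is open to injection, with a parameterized PreparedStatement. The class and method names below are invented for illustration and are not from the talk's actual example.

```java
// Hypothetical illustration of a systematic edit: the same fix must be
// applied, with different identifiers, at every similar query site.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class UserDao {
    // Before: user input is concatenated directly into the query string.
    ResultSet findUserBefore(Connection conn, String name) throws SQLException {
        Statement stmt = conn.createStatement();
        return stmt.executeQuery("SELECT * FROM users WHERE name = '" + name + "'");
    }

    // After: the same query with a bound parameter, immune to injection.
    ResultSet findUserAfter(Connection conn, String name) throws SQLException {
        PreparedStatement stmt = conn.prepareStatement("SELECT * FROM users WHERE name = ?");
        stmt.setString(1, name);
        return stmt.executeQuery();
    }
}
```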
The programmer may need to first identify all the locations relevant to the edit and then manually change the locations one by one. In more detail, when the programmer wants to make the edit, he may first plan the edit in mind or by hand and then manually find all the edit locations. For each edit location, the programmer needs to first customize the edit and then apply it, because the different locations are similar but not identical: they may contain different statements, and they may use different identifiers. This process is tedious and error prone. After making these repetitive code changes to different locations, developers may also want to refactor the code in order to extract the common code between the different locations and reduce the redundancy. By extracting the common code, they can furthermore reduce repetitive future systematic edits to those locations into a single update to the extracted code.

In order to help developers with such tedious and error-prone tasks, we have designed and implemented approaches to automatically create program transformations from code change examples provided by developers. These program transformations can help developers apply customized edits to each edit location they want to change similarly. They can also help developers find locations, when developers find it hard to identify all the locations themselves. And after making the systematic edit, our approach can also help developers automatically refactor the code in order to reduce redundancy, so that they can handle future systematic edits by updating only the single location which contains the extracted common code. In the following part of my talk, I will first talk about how we locate and apply systematic edits, and then I will talk about how we exploit systematic edits for refactoring. Next I will talk about possible future directions of our research, and then conclude. Do you have any questions? So far so good.

In the first part of my research, we have designed and implemented two approaches. The first approach is that when developers want to make code changes to different locations in a similar way, we only require them to change one of those locations and show it as an example. By learning from this single code change example, our approach, Sydit, automatically creates a program transformation in order to change all the other locations similarly. There are many challenges in this approach. First, the code in different locations may contain different syntactic structures and identifiers; therefore, the generalized program transformation should tolerate the differences so that it is applicable to locations containing different context. Second, the edits in different locations may vary in terms of the positions where they apply and the identifiers they use; therefore, when automatically applying the program transformation to different locations, the tool needs to first examine the context in order to correctly position edits with respect to the program structures and to use the proper identifiers. When a developer provides a code change example, Sydit first leverages a program differencing algorithm to create an AST edit script which represents the code changes.
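As a rough mental model of what such an edit script contains, here is a minimal sketch in Java: an ordered list of node-level operations over the AST (statement inserts, deletes, updates, and moves). The types and the concrete node text below are assumptions made for illustration; Sydit's real representation is internal to the tool.

```java
// A minimal, illustrative data model for an AST edit script -- the output
// of tree differencing between the old and new versions of a method.
import java.util.List;

enum EditKind { INSERT, DELETE, UPDATE, MOVE }

record EditOperation(
        EditKind kind,
        String nodeType,    // e.g. "VariableDeclarationStatement"
        String oldText,     // node text before the edit (null for an INSERT)
        String newText,     // node text after the edit (null for a DELETE)
        String parentType,  // the node's parent in the tree
        int position        // the node's position under that parent
) {}

class EditScriptDemo {
    // The running example's script contains an update, a move, and several
    // inserts; the node text here is invented purely for illustration.
    static final List<EditOperation> script = List.of(
            new EditOperation(EditKind.UPDATE, "VariableDeclarationStatement",
                    "IStatus status = null;", "MultiStatus status = null;",
                    "Block", 0));
}
```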
Sydit then extracts the context relevant to the edit using program dependence analysis. The extracted context serves to help our program transformation correctly position the edit operations in new target locations. In order to generalize a program transformation which is applicable to locations containing different identifiers and different program structures, we need to abstract both the identifiers and the edit positions in the example edit script. For instance, for the edited statements, with identifier abstraction we replace the concrete identifiers of variables, methods, and types with corresponding symbolic names. At the end of this step, we can get a general, abstract program transformation from the exemplar code changes, and when developers specify the candidate locations they want to change similarly, Sydit establishes a context matching between the abstract context and each target location so that it can customize the edit and then apply it to the locations. So there are mainly two phases: the first phase creates the general program transformation, and the second phase applies it to the target locations. In the following several slides, I will talk about each step in more detail, and this example will run through all the steps. We created this example based on real code changes we mined from the open source project Eclipse debug core. In this example, there are two similarly changed methods; when developers change one method and show it as the code change example, Sydit leverages it to generate a program transformation. As you can see from this code example, the developers refactor the code by changing the configuration declaration, and they also add a new feature in order to do something else.

Sydit first creates the AST edit script using a program differencing algorithm. The AST edit script may consist of four types of edit operations: statement insert, delete, update, and move. This edit script represents the difference between the old version and the new version of the changed method. For this specific example, it creates an update, a move, and several insert operations. In the second step, Sydit extracts the context relevant to the edit in order to identify the context which can later be used to correctly position edits in the target locations, leveraging program dependence analysis. For instance, with the edited statements marked in orange, Sydit first leverages containment analysis in order to find all the nodes which contain the edited ones, such as the method declaration, the while statement, and the if statement. It then leverages control dependence analysis in order to identify nodes which the edited nodes are control dependent on. And finally, it identifies nodes which the edited statements are data dependent on, like the iterator declaration and the config declaration. So by extracting the context relevant to the example edit script, we try to encode the constraints it places on its target locations in terms of control dependence and data dependence relations. If a target location later contains context matching this edit-relevant context, it means that it meets all the constraints, in terms of data dependence and control dependence, imposed by the edit script, and in that way it is safer for us to apply the edits to that location.
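A minimal sketch of this context-extraction step, under assumed Node and dependence-graph interfaces: collect the containment ancestors of each edited node, plus the nodes it is control- and data-dependent on, following dependences transitively up to a distance K (the evaluation discussed later compares K = 1 against K = 2). The real analysis works on Java ASTs inside the tool; everything here is simplified for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

interface Node {
    Node parent();                      // containment (AST) parent
    List<Node> controlDependencesOn();  // e.g. enclosing loop/if conditions
    List<Node> dataDependencesOn();     // e.g. declarations of used variables
}

class ContextExtractor {
    Set<Node> extract(List<Node> editedNodes, int k) {
        Set<Node> context = new HashSet<>();
        for (Node n : editedNodes) {
            // Containment: every ancestor up to the method declaration.
            for (Node a = n.parent(); a != null; a = a.parent()) {
                context.add(a);
            }
            // Control and data dependences, followed up to K hops away.
            Deque<Node> frontier = new ArrayDeque<>(List.of(n));
            for (int hop = 0; hop < k; hop++) {
                Deque<Node> next = new ArrayDeque<>();
                for (Node m : frontier) {
                    for (Node dep : m.controlDependencesOn()) {
                        if (context.add(dep)) next.add(dep);
                    }
                    for (Node dep : m.dataDependencesOn()) {
                        if (context.add(dep)) next.add(dep);
                    }
                }
                frontier = next;
            }
        }
        return context;
    }
}
```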
Now, in order to generalize a program transformation which is applicable to locations containing different program structures or using different identifiers, we abstract the identifiers and edit positions used in the example edit script. With identifier abstraction, we replace the concrete identifiers with corresponding symbolic names, and with edit position abstraction, we recalculate the position of each edit operation with respect to the abstract context. At the end of this step, we get an abstract, context-aware edit script. When users specify the target locations, we try to apply the general program transformation to those locations. Before applying the edit script to a target location, Sydit needs to first establish a context matching between the abstract context and the target location, in order to identify a subtree in the target location which corresponds to the abstract context. If there is such a matching, it means that the target location meets all the constraints, like the data dependence and control dependence relations, imposed by the edits we want to apply. The basic idea of the context matching algorithm is that we first compute leaf node matches between the two contexts, and then, for each candidate leaf node match of the abstract context, we try to identify the best match in the target location based on the path between the leaf node and the root node. Through all these steps, we want to establish a one-to-one node mapping for each node in the abstract context. When the developer provides this method B as the target location to change similarly, this step identifies the nodes, marked in yellow, that match the edit-relevant context, and furthermore establishes identifier mappings between the abstract context and the concrete context. These identifier mappings let us customize the general program transformation for the target location. The customization consists of two parts: one is to concretize the identifiers, and the other is to concretize the edit positions. With identifier concretization, we replace the symbolic names with the corresponding concrete ones, and with edit position concretization, we recalculate the position of each edit operation with respect to the target location. At the end of this step, we create a customized, concrete edit script which is applicable to the target location, and by applying the edit script we actually change the code and then show it to developers to ask whether they would like to take it or not. Any questions?

>>: How do you know from one example what the right abstraction is? Like in your example, you have a method call that returned an object that you called another method on and then cast the return result. Couldn't that have been abstracted out as just a single method call? That could have been the intention of the user, that the thing they were refactoring was an invocation of some other class. Do they ever see the abstraction of the pattern?

>> Na Meng: Currently they cannot see the abstraction of the patterns; we put that inherently in the tool. In this default setting, we just generalize all the identifiers, and actually we're not quite sure what the best way is to extract the context relevant to the edit. In our evaluation, we also explored different ways to model the context and to use the context to match different target locations. We found some best settings for our evaluation, but I'm not quite sure whether those settings would also be applicable to other scenarios not covered by the evaluation.
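To illustrate the customization step described a moment ago, here is a minimal sketch, under invented types and names: once context matching yields an identifier mapping for a target location, the symbolic names in the abstract edit are replaced by the concrete identifiers found there. Sydit's real concretization works on AST nodes and also recomputes edit positions; the string-based version below is purely illustrative.

```java
import java.util.Map;

class Concretizer {
    // Replace each symbolic name (e.g. "$v0", "$m0") with the concrete
    // identifier that context matching bound it to.
    static String concretize(String abstractEdit, Map<String, String> binding) {
        String result = abstractEdit;
        for (Map.Entry<String, String> e : binding.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String edit = "$v0.setStatus($m0($v1))";  // abstract edited statement
        Map<String, String> binding =
                Map.of("$v0", "config", "$m0", "computeStatus", "$v1", "launch");
        // Prints: config.setStatus(computeStatus(launch))
        System.out.println(concretize(edit, binding));
    }
}
```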
>>: Could you show the transformation to the users?

>> Na Meng: Yes, we can. Here we have this edit script, which describes which node you should change and how to change it. Actually, another student in Miryung's group, an undergraduate, built a tool based on my project; the basic idea is to show the edit script to developers so that they can read it and even modify it if they are not satisfied with the representation of the edit script. After their modification, that project applies the modified version of the edit script to the target location. So we gave developers the opportunity to modify the [indiscernible] for the program transformation.

In order to evaluate how effective Sydit is when making systematic edits, we created a data set consisting of 56 examples, which we found in five open source projects. In each example, there are two similarly changed methods, and for each example, Sydit first infers the program transformation from one method and then applies the transformation to the other method. In order to better understand the examples and the capability of Sydit, we classified the examples into six categories based on different metrics: first, whether the similarly changed methods are changed identically or differently; second, whether the edit script involves a single node change or multiple node changes; and third, if multiple nodes are involved in the code changes, whether they are contiguous or non-contiguous. So why don't we just use the search-and-replace feature? For making single-node, identical changes, it is very easy. However, for changes which involve a lot of non-contiguous node changes, where the changes applied to different locations differ, it would be very hard, and this is also a challenge for other existing approaches. In this scenario, our tool can do a better job, and we evaluated our tool with this data set. The evaluation shows that by inferring the program transformation from one code change, Sydit can create and apply a program transformation for 82 percent of the examples, and it can produce correct edits and apply them to the target locations in 70 percent of the examples. The similarity between the tool-generated and the human-created versions is 96 percent on average, meaning that we are very close to the developer-created versions.

To answer Andrew's question: we also tried to explore the best way to extract the context in order to achieve a good balance between context matching flexibility and edit application precision. We first varied the dependence distance K in our program dependence analysis. When K equals one, we only include nodes that have direct dependence relations with the edited nodes in the context. When K equals two, we also include nodes which have transitive dependence relations with the edited nodes. Our hypothesis is that the more nodes we include in the context, the more constraints the resulting program transformation will put on the target locations.
As a result, we may fail in more cases to establish a context matching between our generalized program transformation and the target locations, and the evaluation shows that when K equals one, we get the best performance. In our second setting, we changed the abstraction settings in the identifier abstraction step. By default, we generalize the program transformation by abstracting all the identifiers, including the variables, methods, and types. However, we can also do it differently: we can generalize only a part of those identifiers, or even do no abstraction at all. The more identifiers we generalize, the more flexible the resulting program transformation will be when establishing identifier matchings with the target location. Without surprise, full identifier abstraction leads to the best performance, while no abstraction leads to the worst performance. And finally, we changed the --

>>: I have a question. What do V, T, and M stand for here?

>> Na Meng: V stands for variable identifiers, T stands for type name identifiers, and M stands for method identifiers. Andrew, please.

>>: How many of the examples in the corpus that you tested require the power of structural matching, and how many could have been handled just using regular expression matching? Because that would be the alternative: the poor man's version of search and replace is playing with braces and regular expressions and everything, and often you can get what you want. In some cases it's harder, but I'd be curious to know.

>> Na Meng: Yeah. Actually, I haven't measured that specific thing in the evaluation, but maybe from the classification of the examples I can get some idea. Say, for the single node changes: if you want to make identical code changes, sure, definitely you can do that. If you want to make different code changes -- let's say in one place you want to replace the identifier A with foo, and in the other location you want to replace the identifier B with bar -- since you want to apply slightly different changes, [indiscernible] whether regular expression patterns can help you do that. And then for multiple node changes: if you want to change a contiguous code region, maybe regular expressions can help you express that. But for the cases with non-contiguous code regions, where you want to change several code regions separated by unrelated code, regular expressions may not help much, because the regions are not contiguous, and I'm not sure how you would express the gaps standing between the edited code regions.

>>: Another way to think about answering Andrew's question: right now your tool produces insert, delete, move, and update, and that is a regular expression language, right? So in some sense, the description of the transformation could use the form he suggests, but when you are making it concrete, it's a different process than regular expression matching. Maybe you could do the matching partially in that language, just like regular expression matching is almost Turing complete, right? You could probably express anything. But that's just a different way -- like, how do you generate that?

>>: At the risk of sidetracking, the real issue is that a parse tree comes from a context-free grammar.
There are going to be some things you can't capture with a regular expression, of course, and the problem is whether you can summarize the parts you don't care about with some simplistic regular expression. I think it will be hit or miss in practice whether or not you can do that; that's just the theory behind it. But I think it's a great question. Having used this approach, it's a good question.

>>: For your matching, you could classify everything using [indiscernible] and then go after the top matches, right? That would allow some deviation in the parse tree, so you might be able to get away with editing stuff that is not an exact match. Then you would cover your gaps, right?

>> Na Meng: So how would you leverage the machine learning part?

>>: You take the AST and then have a classification for all the ASTs that you're interested in, then throw your machine learning at all of the ASTs that are candidates, and then your top matches are the ones for which you would consider doing the code edits.

>> Na Meng: The thing I'm a little concerned about with this approach is that you don't know what kind of code changes developers want to make. There is not a vocabulary for the possible changes [indiscernible], and we extract the context based on the code changes made by the developer. If they make arbitrary code changes, then the extracted code template can be arbitrary. You definitely can have a limited set of possible AST structures, but there will be a lot of ways to combine these different structures in order to construct different programs. So with the learning part, I'm not quite sure whether it would overfit the model to a specific training set; and if developers later introduce some new code changes which are not covered in the training set, which the model is not well trained for, then maybe we will have problems in terms of matching and applying the changes.

>>: And your tool doesn't require exact matching, right? That's like a shorter answer to his question.

>> Na Meng: Okay.

>>: You already can do a partial change that the programmer needs to do, and you just say that it matches if you have got, like, 85 percent similarity, right?

>> Na Meng: Yeah.

>>: Then you don't have to get [indiscernible] -- like as practice for an interview talk.

>> Na Meng: Okay. So in the third setting, we modified the upstream and downstream dependence settings. Upstream means that we only include nodes which the edited nodes depend on, while downstream means that we also include nodes which depend on the edited nodes; it's about the direction of the dependence between the edited nodes and the other nodes. The evaluation shows that upstream only leads to the best performance. So finally, after identifying all the unchanged nodes, and combining those unchanged nodes together with the edited nodes, the set of extracted context relevant to the edit is complete.

>>: I have a question. Did you look at the idea of taking more than one example? Sometimes programmers will write a change a couple of times and say, look, I'm tired of this, I have a million-line code base. Did you look at the case where maybe you took two examples?

>> Na Meng: Yeah, I will talk about this on the next slide. Thank you for the question; it's a good introduction. So in summary, Sydit makes systematic edits with high accuracy in many cases.
However, in order to make the best use of this tool, developers need to first pick a good example which covers all the locations they want to apply the systematic edit to, and second, they need to manually identify all the locations they want to change similarly. If the developers cannot find all the locations, the tool cannot help. We would like to help developers automatically find those edit locations, because in some scenarios, finding the edit locations is more challenging than making the edits, and by automating this process we can help developers avoid errors of omission. So we have designed and implemented a new approach, called Lase, which extends Sydit by adding the capability of finding edit locations. To use the tool, developers need to provide two or more code change examples, and similar to Sydit, Lase leverages a program differencing algorithm to create edit scripts. Differently, however, Lase needs to identify the common edit shared between the different edit scripts and regard that common edit as the systematic edit demonstrated by all the code change examples. In this way, Lase can filter out any edit operations specific to only some of the code change examples. Similar to Sydit, Lase also needs to generalize identifiers in order to create the general program transformation; however, it only generalizes identifiers when necessary. For example, with two edited statements that differ only in the usage of a single variable, e versus [indiscernible], Lase creates a general representation by abstracting that single variable while keeping the others concrete; a sketch of this idea follows below. In this way, Lase makes it possible for the resulting program transformation to correctly find edit locations which are specialized by accessing a certain field or calling a specific method. And similar to Sydit, Lase also needs to extract the context relevant to the edit in order to later correctly position the edits in other locations. The difference is that it needs to align the contexts extracted from the different locations and then find the common context shared between them, using that as the edit-relevant context. At the end of this step, Lase creates a partially abstract, context-aware edit script, and with this edit script, Lase tries to establish a context matching between the abstract context and every method contained in the whole project. If a method contains context matching the edit-relevant context, it is regarded as a candidate edit location, and for each such edit location Lase customizes the edit and applies the result. So, different from Sydit, Lase contains three phases, and the first phase creates a program transformation that serves double duty: both finding edit locations and applying edits.
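Here is a minimal sketch of that "abstract only when necessary" idea: tokens that agree across the aligned example statements stay concrete, while tokens that differ are replaced by shared symbolic names (concretization at a target location later reverses the mapping). The tokenization and naming scheme are invented for illustration; Lase's actual generalization works on AST nodes, not flat token lists.

```java
import java.util.ArrayList;
import java.util.List;

class PartialAbstraction {
    // Build a template from two aligned token sequences: shared tokens stay
    // concrete; differing tokens become fresh symbolic names.
    static List<String> generalize(List<String> a, List<String> b) {
        List<String> template = new ArrayList<>();
        int fresh = 0;
        for (int i = 0; i < Math.min(a.size(), b.size()); i++) {
            if (a.get(i).equals(b.get(i))) {
                template.add(a.get(i));        // shared token stays concrete
            } else {
                template.add("$v" + fresh++);  // differing token is abstracted
            }
        }
        return template;
    }

    public static void main(String[] args) {
        // e.getStatus() vs ex.getStatus(): only the receiver differs, so only
        // the receiver is abstracted; the method name stays concrete.
        List<String> t = generalize(
                List.of("e", ".", "getStatus", "(", ")"),
                List.of("ex", ".", "getStatus", "(", ")"));
        System.out.println(t); // [$v0, ., getStatus, (, )]
    }
}
```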
In order to understand how effective Lase is when finding edit locations and applying edits, we created a data set consisting of 24 systematic editing tasks which we mined from Eclipse JDT and SWT. For each systematic editing task, developers checked in multiple repetitive bug-fix patches in order to fix the same bug in different locations. The reason they checked in multiple patches is that the initial patch was incomplete: they didn't fix the same bug in all the locations which contained it. For each task, Lase first creates the program transformation based on the code changes in the initial patch and then uses that program transformation to find the other edit locations and suggest edits. In order to evaluate Lase's capability to find edit locations, we measured several things; rather than go through the meaning of each column, I will just show you an example. For instance, for example two, we expected Lase to identify 16 edit locations to change similarly. Lase actually found 13 locations, and by examining the locations found by Lase, we found that 12 of them are correct suggestions. So we use precision, recall, and accuracy to show how well we do here. In terms of evaluating the performance of Lase when making edits, we show the evaluation results in the last columns. For the first example, we expected Lase to correctly infer nine edit operations, and the results show that Lase actually inferred all the edit operations correctly. From these numbers of edit operations, you can see that Lase can handle some very tricky program transformations, containing around 34 edit operations. So the tool is really powerful. Okay, Andrew.

>> Andrew: In that first example, you said that it should have found 13 but it only found 12. What did it get wrong? What is it about the matching that couldn't find that last example?

>> Na Meng: This is a good question. The reason why it doesn't find the rest of the locations still depends on the quality of the examples provided by developers, since currently we create the context by extracting the commonality shared between the examples. If that commonality doesn't hold for some of the edit locations, we cannot find those locations. It's like this: among the 16 edit locations, maybe they only share four statements in common; however, from the examples, perhaps we find five statements and regard those five statements as the context. That context is actually a superset of the actual context shared among all the locations. Does that answer your question?

>> Andrew: Yeah. That brings up a related point, which is: do you think it's possible to fix Lase to do better? Does the algorithm itself have this limitation, or is there some way to improve it?

>> Na Meng: Yeah. I think maybe one possible way is to try a similar approach to [indiscernible], because [indiscernible] has approaches like the spreadsheet data transformation work: it requires users to provide input-output examples, and it infers the program to make the transformation. If the transformation is wrong, it asks the users to provide more examples to refine the program transformation. Similarly, I think we can improve Lase in this way by requiring developers to provide some counterexamples. Let's say a developer provides another example which shares only the four statements with the previous examples; then that is doable. So on average, Lase can find edit locations with 99 percent precision and 89 percent recall, and it makes edits which are very similar to the developer-created versions: the similarity is 91 percent. And the most interesting finding we got in this evaluation is that for three bugs, Lase actually suggested in total nine edits which developers had missed and later confirmed.
For one bug, the font size shouldn't have been a float; we found that, recommended the fix to the developers, and they said, oh yes, we actually missed it. So that's good. In summary, Lase finds edit locations with high precision and recall and [indiscernible] edits with high accuracy. However, it may encourage the bad practice of creating and maintaining duplicated code by facilitating the process. On the other hand, a systematic edit repetitively applied to multiple locations may indicate a good refactoring opportunity to reduce the code redundancy. So we have designed and implemented a new approach, Rase, to exploit systematic edits for refactoring. With this approach, we would like to investigate two research questions: first, can we always refactor code which undergoes systematic edits? And second, are the refactored versions always preferred to the systematically edited versions? In our approach, we require developers to first provide all the code locations they changed systematically, and with Lase we create a partially abstract, context-aware edit script. The edit script helps us identify all the edited statements in the new version of each code location and then scope a contiguous code region for refactoring; the code region includes all the edited statements in each location. Finally, our approach applies a series of refactorings in order to extract the common code and replace the corresponding code snippet in each location with a method call. There are several challenges to automatically refactoring code which undergoes systematic edits. First, the edited code may interleave with unchanged code, so the extracted code regions from different locations need to be aligned with each other so that Rase can create a method for the extracted code. Second, the code extracted from different locations may vary in terms of the identifiers and expressions it uses, so we need to create a general representation, a template, to represent the code regions from different locations while tolerating the differences between them. And third, extracting code may break the data flow or control flow of the original code, so the tool needs to insert some extra code into the extracted method, or into the original locations, so that data flows and control flows are not modified after refactoring. In our implementation, Rase leverages systematic editing to extract the common code; it can also create new type and method parameters as needed [indiscernible] to handle differences in the usage of types, methods, identifiers, and expressions, and insert return objects and labels in order to deal with data flows and control flows. In order to evaluate how effective Rase is when refactoring systematically edited code, we used two data sets: the first consists of 56 similarly changed method pairs, borrowed from our Sydit evaluation, and the second consists of 30 similarly changed method groups. Our evaluation shows that Rase can automate 30 cases out of the 56 similarly changed method pairs and 20 cases out of the 30 method groups. It can always refactor more cases than whole-method extraction refactoring, meaning that systematic-editing-based refactoring is always more flexible than whole-method clone-based refactoring; the reason is that systematic edits can scope a smaller code region for extraction.
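As a hypothetical before-and-after illustration of the clone-removal refactoring Rase automates: two systematically edited locations share a code region, which gets extracted into one method parameterized over their differences, so a future systematic edit becomes a single update. All names below are invented; this is a sketch of the idea, not Rase's output.

```java
class Logger {
    // Before: the validation logic is duplicated in both methods, so any
    // future change to it is a systematic edit across both locations.
    void logInfoBefore(String msg) {
        if (msg == null || msg.isEmpty()) throw new IllegalArgumentException("empty message");
        System.out.println("[INFO] " + msg);
    }
    void logWarnBefore(String msg) {
        if (msg == null || msg.isEmpty()) throw new IllegalArgumentException("empty message");
        System.out.println("[WARN] " + msg);
    }

    // After: the common region is extracted; the varying part (the level)
    // becomes a parameter, and each location is reduced to a method call.
    void log(String level, String msg) {
        if (msg == null || msg.isEmpty()) throw new IllegalArgumentException("empty message");
        System.out.println("[" + level + "] " + msg);
    }
    void logInfo(String msg) { log("INFO", msg); }
    void logWarn(String msg) { log("WARN", msg); }
}
```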
In order to understand why Rase cannot automate the rest of the cases, we examined the code and classified the reasons into four categories. The first reason is limited language support for generic types: if the code to extract contains type variations in a type-casting expression, it is really hard to extract. The second is that if we want to extract some code out of a class and the code itself contains invocations of private methods of that class, it is very difficult to extract that code, because the private methods are inaccessible to any code outside the class. The third is that if the systematic edits only delete statements from the code locations, there are no edited statements to be found in the new version of each code location, so systematic editing cannot help us scope a code region for refactoring. And the fourth reason is that if the extracted code regions from different locations are not very similar to each other and do not share enough commonality, it is very hard to extract code. Among these four reasons, some, such as no edited statement found, are caused by limitations of our implementation of the systematic-editing-based approach. However, others, such as the limited language support for generic types and no common code to extract, reveal the fact that refactoring is not always applicable to code which undergoes systematic edits, and this answers our first research question. Our second research question is: is refactoring always desirable compared to systematic editing? In order to answer this question, we manually examined the version control history of the subject programs to see how developers actually maintain their software after making systematic edits. The columns feasible and infeasible correspond to the cases when Rase can automate a refactoring and when it cannot, and the refactored row corresponds to the cases when developers actually refactored the code after making systematic edits to the code locations. Our hypothesis was that the number at the intersection between refactored and feasible should be as large as possible; if that were the case, we could predict refactoring based on systematic edits. However, as you can see, the results show that in most of the cases, developers actually don't refactor, even though they made systematic edits to a lot of code locations. For the cases that were not refactored, there are roughly three ways that developers maintain their code: they may co-evolve the code locations by making repetitive code changes to those locations again and again; they may make divergent code changes to different locations, so that the code becomes more and more different; or they may not touch the code anymore, because they think the code is stable and don't want to introduce any changes to it. In most cases, developers actually don't touch the code anymore. We e-mailed the developers of these subject programs asking them why and when they would do refactoring, and they answered: we don't typically refactor unless we have to change the code for some bug fix or new feature. That means that refactoring is not always desirable compared to systematic editing.
So in summary, automatic clone-removal refactoring is not always feasible for code which undergoes systematic edits, and clone-removal refactoring is not always desirable even when we can provide automatic tool support to conduct it. Automated refactoring doesn't obviate the need for systematic editing; developers need tool support for both approaches.

Here are some future directions I would like to pursue. I would like to recommend code changes across different projects, so that developers can leverage the expertise and knowledge of other developers when maintaining their own software. The hypothesis is that developers of different projects are motivated to share code changes, and there are two scenarios for this code sharing. When developers build their applications on the same library or framework, they may share code changes when migrating their applications from one version of the library to another. Another scenario is that when developers migrate their applications from the desktop to mobile devices or to the cloud, they may share code changes in order to tailor the applications for the new computing resources. Another research direction I would like to pursue is to automatically check the correctness of programs. Currently, when we recommend code changes to developers, we rely on developers to make a good judgment about whether the edits are correct, whether they should take them or not. If we could automatically check the correctness of the recommended edits, developers would have more confidence in our code recommendations. Finally, I would like to recommend code changes for mixed-language programming. The motivating scenario is that nowadays a lot of scientists develop their prototype tools in MATLAB, because they are trained that way; later they translate the code into C code so that they can get better performance, and later still, when the sequential program does not fit their needs, maybe they also want to translate it again into a parallel program. Although we can currently get a lot of tool support from compilers and refactoring tools, the support is still not sufficient. So I would like to recommend code changes to help developers translate their programs from one language to another, or to help programmers turn their monolingual programs into multilingual programs, so that developers can benefit from the advantages provided by different languages without worrying too much about the interaction between languages.

To conclude: we have designed two approaches to recommend code changes based on code change examples provided by developers, and in addition to recommending code changes, we can also find edit locations with the program transformation. We also provide experimental evidence showing that clone-removal refactoring is not always feasible or desirable, and that developers actually need tool support for both automatic clone-removal refactoring and systematic editing. Thank you for your attention. This concludes my talk. Do you have any other questions?

[Applause]

>>: My first question is: I think I missed the description of the software corpuses that you used; I don't recall seeing a slide there. Could you talk a little bit more about that? Because that obviously affects what conclusions you can draw from your research.

>> Na Meng: Yeah. To evaluate our first approach, I mined the software repositories of five open source projects,
which are mostly under Eclipse --

>>: Projects?

>> Na Meng: Yes, projects. Eclipse JDT.

>>: What does JDT mean?

>>: [indiscernible].

>> Na Meng: Yeah.

>>: I'm buried under acronyms.

>> Na Meng: And there are some other plug-ins that are also open source projects.

>>: So what five things did you look at? All parts of Eclipse? All things in Eclipse?

>> Na Meng: No, not all parts, just some of the parts, yeah. In those, we tried to find similarly changed methods, and in each example --

>>: He's more interested in what the programs did. So you had a compiler; what were the other things that they did?

>>: Yes. That was all.

>> Na Meng: Okay. So Eclipse JDT provides the basic support in Eclipse for the development of Java. Another is the compare plug-in, which aims at comparing the differences between two texts. Another is debug core, which is roughly the kernel of the debugging feature of Eclipse. And another one is dnsjava -- it is not Eclipse; it's a domain name implementation in Java.

>>: And the web service protocols.

>> Na Meng: Yeah.

>>: [indiscernible], it's on the list. Very good question.

>> Na Meng: Yeah. And for Lase, one subject is JDT, and the other is SWT. SWT is used to develop GUIs in Java, and the interesting thing about SWT is that it actually has a product line: it has different implementations of SWT for different OS platforms. So it is very common that we can find similar code changes applied to different projects, because they all branched out from the same product, and these code changes are sometimes identical and sometimes different.

>>: My next question: how do you think Java influenced your results? Would things be different if you looked at C or C++ or whatever other languages are out there?

>>: C#. [Laughter]

>>: C++.

>> Na Meng: Yeah. Currently, in terms of the implementation, it depends on static analysis conducted on the syntax tree, and I think the approach is not limited by this implementation; I think we would see similar results in programs in other languages. In terms of refactoring, I think the question is about the refactoring capability, because I mentioned some limitations, like the limited language support, as a result of which we cannot do the refactoring. For some other programming languages, even if a language doesn't contain that constraint -- like C++ -- it still has constraints: if the code from different locations is not very similar, with the common code interleaved with uncommon code, it will still be difficult to refactor. So this is not limited by the language feature; it is just common across different languages.

>>: But it could be that people program differently in different languages. To follow up on David's question: because you're using an object-oriented style, it's actually more likely that people are doing systematic editing in some languages than in other languages. Do you have an opinion on that?

>> Na Meng: I don't think so, because it's similar to code clones: although people have a lot of code clones in Java programs, they also have code clones in C programs and C++ programs. It is unavoidable that developers may make similar code changes to all these code clones, and developers totally agree that code clones are not avoidable; they may need the redundancy sometimes.
>>: I think in C++, at least, you can see a lot of subtleties around the language semantics. Picking up code and moving code around in C++ can have non-obvious side effects that you wouldn't encounter in Java -- like the destructors. I've seen really horrible code where you declare a variable, and when the variable goes out of scope, a destructor runs. It's like, you've got to be kidding me, but that's what people do.

>>: So you think systematic edits would have higher potential to introduce errors because of the badness of the language used?

>>: I think -- and this is all conjecture on my part -- either you have the potential for introducing subtle problems because you're working at a lower level as a programmer, or, the flip side, you might be more constrained in terms of needing more semantic information, saying, well, I could do this, but I'm not sure about that because of the semantics involved here. So you might have to be a little bit more informed by the semantics.

>>: I'm on a team that does automatic code refactoring and fix-ups in Windows, and the platform sits on top. We've kind of tried to build this global picture of the entire product, and then we say, oh well, knowing all of the things we know at this point, we can try to make this edit. So that kind of takes care of at least the finding part of it, because you're leveraging the entire thing -- you have this massively huge nuclear weapon that you're trying to do a little code edit with.

>>: It's actually a question for both of you, then. In the C and C++ languages, you've got the preprocessor, and you had mentioned [indiscernible]. If the same code you want to edit is under several different conditions of a pound-if, maybe editing it for one condition might actually break it for another. How do you deal with the preprocessor? If you were to apply this to C or C++, do you just have to test every single possible definition of the pound-ifs before you're allowed to change something?

>> Na Meng: Yeah. I think that would be the only way you could guarantee the correctness.

>>: In the pretext world, do you scope to the architecture and the code base and stuff? You're going to get whatever the pound-ifs are for whatever the build is for that particular configuration.

>>: I can imagine that when you make the change, you could insert the right pound-ifs with the change, to make sure it only applies to the configuration you analyzed and not anything else.

>> Na Meng: Yeah, I think that also applies here: with different configurations, we may get different versions of the same program based on the configuration. So, similar things.

>>: Yeah. Understanding the registry stuff, for example: are you talking about a 64-bit registry? A 32-bit registry? A lot of that stuff depends on --

>> Na Meng: Yeah.

>>: One thing: when you talked about your future work, you talked about leveraging multiple edits.
I think if you go back two slides, to code change recommendation across products: have you thought about crowd-sourcing change recommendations? Say you used Sydit to look at all the changes to 50,000 projects on SourceForge -- and I probably just dated myself, but let's say you found 50,000 projects -- do you think you'd find, for all the possible check-ins and changes they made, like 50,000 change templates, or would it be more like ten? And if it's ten -- very commonly you'd probably find refactoring templates, which are probably a lot more common, but even for the ones at the next level down that didn't make it into refactoring tools -- could you imagine that if you mined all the projects around the world, you'd come up with, say, 30 change templates that pretty much cover everything any programmer has ever really wanted to do, except maybe the one percent? It would be kind of a cool tool to be able to leverage that.

>> Na Meng: Yeah. In terms of the coverage of the code changes we want to handle, currently I do not have an explicit boundary. I would like to recommend code changes based on something which is shared between developers: let's say developers are sharing some code, or they are sharing some libraries. In that case they have motivation to share their code changes. If developers check in some random changes which are specific to their own projects, I do not want to recommend those code changes to other developers, because they are very specific and limited to those projects, and it is really hard to argue that my code recommendation would actually help other programmers.

>>: There's a paper from [indiscernible] where they looked at changes made by students who received a particular kind of compiler warning. This is over like 50 or 60,000 editing sessions by students across ten different universities, and they abstracted that out into a set of rules that say: people who have this kind of compiler error usually make this kind of change.

>> Na Meng: Oh, really?

>>: And there were only like 5 or 10 different changes that were even possible for lots of these compiler errors, for something simple like, you need to add an extra parameter to your method.

>> Na Meng: Yeah, that's interesting. So in order to identify the common code changes, definitely we need something to give us the oracle. In your scenario, the compiler gives the oracle through the warnings, so it can help you automatically label what kinds of code changes are interesting and how to mine the code changes. In our case, maybe our oracle can come from the commit messages made by developers: if they have similar commit messages talking about similar bugs or about similar feature additions, then that is a good indicator.

>>: Stack Overflow.

>> Na Meng: Yeah, Stack Overflow. I really think that Stack Overflow is a good resource to learn from other developers. I always find it hard to ask the right question and to find the relevant answers to that question. Basically, I'm a silent user: I don't always want to ask my own questions; I just want to mine the relevant answers to my question. Sometimes I will be fortunate enough to use the correct keywords in order to find answers, but sometimes I just cannot. I really like Stack Overflow, but I think there should be some better way to recommend code changes to developers.
>>: Just do it while they're asleep. [Laughter]

>>: I'll ask one or two more questions. So what have you done with your system? Did you make it available for other people to share?

>> Na Meng: Yeah. I made the tools publicly available, and sometimes, if I have not put one on my website, people will send me e-mails asking whether they can get a copy of it, and I just send it to them. The data sets I also make publicly available, and I will also do that for the third part, because we got reviews saying you should make this data set publicly available so other people can run the same tests.

>>: So your tool is publicly available. And are people using it?

>> Na Meng: Some researchers are using it. I currently do not see any developers using it.

>>: Is that just because you haven't advertised it, or does it need a lot more work to be product quality?

>> Na Meng: I think both.

>>: Obviously it's your Ph.D. project, so nobody is expecting you to --

>> Na Meng: No.

>>: -- produce something a large number of developers can use.

>>: Why do you think it needs to be more -- what would you do to it if you were going to make it into a product and try to sell it, so people would want to use it?

>> Na Meng: Yeah. I will first do the enhancements [indiscernible], and second, I will also try to do some internships in some companies and at the same time ask the developers to try the tools and give us feedback.

>>: What does the UI look like?

>> Na Meng: The UI is an Eclipse plug-in.

>>: But how?

>> Na Meng: You need to click some buttons to show the example: you just select the old version and the new version, and with both versions selected, you right-click a button and we automatically generate the program transformation. And if they want to select target locations, they need to pick those manually, and then the user selection is shown as a table in the UI.

>>: So you think the implementation is such that people can use it? You don't think there are a bunch of bugs that you would fix, or would you try to make it more generic?

>> Na Meng: I think the UI is good. In terms of bugs, actually, I'm not quite sure how many bugs are there.

>>: So because only you have used it, you don't have experience yet on what you would need to fix, probably.

>> Na Meng: Yeah. When I saw a bug, I just fixed it. I'm not quite sure about the other parts, yeah.

>> Kathryn McKinley: More questions? All right. Thank you.

>> Na Meng: Thank you.

[Applause]