>> Kathryn McKinley: I'm Kathryn McKinley and it's my pleasure
to welcome Na Meng who got her Ph.D. from the University of Texas
working with me and Professor Miryung Kim, who is now at UCLA, and
she's been interested in how to make software tools that make
programmers more productive and produce better code with fewer
errors. I'll let her tell you more about her thesis work now.
Oh, and one more thing, she got my best reviews ever. Someone
wrote: I love this paper on her reviews.
>> Na Meng: Thank you, Kathryn, for the introduction. Hello
everyone. I feel greatly honored to have this opportunity to present
here. And please feel free to ask me questions as I go through
the talk. So today I'd like to talk about automating program
transformations based on examples of systematic edits. Software
development costs a lot of time, money, and effort, and among the
different phases of the software life cycle, maintenance incurs
the greatest cost. This cost has risen dramatically; people have
estimated that nowadays maintenance accounts for more than 60
percent of the overall cost of software. Even with all this
expense and effort, the U.S. economy still loses more than $60
billion every year due to software errors. So we have designed
and implemented approaches to help developers improve productivity
and software quality so that we can reduce both software
maintenance costs and serious consequences caused by software
errors. So actually, during
software maintenance, developers make a lot of code changes to
fix bugs, add new features, migrate applications, and refactor
code. There are some existing tools to help developers maintain
their software by automatically making code changes, such as the
search and replace feature of editors, refactoring tools, and bug
fixing tools. However, the tool support is very limited. For
instance, the widely used search and replace feature of editors
can help developers find keywords and then replace the keywords
with alternative words. However, the feature can only apply
identical text replacement, and it doesn't actually observe the
syntax or semantics of programs. Refactoring tools can help
developers improve code implementation by applying a series of
behavior-preserving program transformations. However, these tools
are limited by the predefined semantics-preserving transformations.
Similarly, bug fixing tools can help developers find bugs and even
fix bugs. However, they're limited by the predefined bug patterns
or the inherent fixing strategies. So when developers want to make
some code changes which are not covered by existing tools, they're
always on their own. And so we would like to help developers make such
code changes. So the insight of our research is based on some
observations made by recent studies, which is that many code
changes are systematic, meaning that there are a lot of similar
but not necessarily identical code changes applied to different
locations. One study shows that 75 percent of structural changes
to mature software are systematic. And another study shows that
17 to 35 percent of bug fixes are actually recurring fixes
involving similar edits to methods. So if we can automatically
apply these similar code changes to multiple locations, we can
significantly improve programmer productivity and software
quality. So to give you a scenario of the
systematic editing, I put an example here. Suppose a developer
wants to update this database transaction code to prevent SQL
injection attacks. The programmer may need to first identify all
the locations relevant to the edits and then manually change the
locations one by one. In this process, with more detail, when the
programmer wants to make the edit, he may first plan the edit in
mind or by hand and then manually find all the edit locations.
For each edit location, the programmer needs to first customize
the edit and then apply the edit, because these different
locations are similar but sometimes different: they may contain
different statements, and they may use different identifiers. So
this process is tedious and error prone. After making these
repetitive code changes to different locations, developers may
also want to do code refactoring in order to extract some common
code between different locations so that they can reduce the code
redundancy. By extracting out the common code between different
locations, they may furthermore reduce the repetitive future
systematic edits to other locations into single updates to the
extracted code. So in order to save developers from performing
such tedious and error-prone tasks, we
have designed and implemented approaches to automatically create
program transformations from code change examples provided by
developers. Such a program transformation can help developers
apply customized edits to each of the edit locations they want to
change similarly. It can also help developers find locations when
developers find it hard to identify all the locations themselves.
And after making the systematic edit, our approach can also help
developers automatically refactor the code in order to reduce
code redundancy, and in that way they can prevent future
systematic edits by only updating the single location which
contains the extracted common code. So in the following part of
my talk, I will first talk about how we locate and apply
systematic edits, and then I will talk about how we exploit
systematic edits for refactoring. Next I will talk about the
possible future directions of our research and then conclude. Do
you have any questions? So far so good. In the first part of my
research, actually we have designed and implemented two
approaches. The first approach is that when developers want to
make code changes to different locations in a similar way, we
only require them to change one of those locations and show it as
an example. And by learning from this single code change example,
our approach, Sydit, will automatically create a program
transformation in order to change all the other locations
similarly. There are many challenges in this approach. First,
the code in different locations may contain different syntactic
structures and identifiers. Therefore, the generalized program
transformation should tolerate the differences so that later it
is applicable to locations containing different context. And the
second challenge is that the edits in different locations may
vary in terms of the positions where they apply and the
identifiers they use. Therefore, to automatically apply the
program transformation to different locations, the tool needs to
first examine the context in order to correctly position edits
with respect to the program structures and use the proper
identifiers. So when developers provide a code change example,
Sydit first leverages a program differencing algorithm to create
an AST edit script which can represent the code changes. It then
extracts context relevant to the edit using program dependence
analysis. The extracted context will serve to help our program
transformation correctly position the edit operations in the new
targeted locations. In order to generalize a program
transformation which is applicable to locations containing
different identifiers and different program structures, we need
to abstract both identifiers and edit positions in the example
edit script. For instance, for the edited statements, with
identifier abstraction, we replace the concrete identifiers of
variables, methods, and types with corresponding symbolic names.
At the end of this step, we can get a general, abstract program
transformation from the exemplar code changes, and when
developers specify the targeted locations they want to change
similarly, Sydit establishes context matching between the
abstract context and each targeted location so that it can
customize the edit and furthermore apply the edits to the
locations. So there are mainly two phases. The first phase is
to create the general program transformation and the second phase
is to apply it to the targeted locations. So in the following
several slides, I will talk about each step in more detail.
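To make the edit script idea concrete, here is a minimal Python sketch of how such a script might be represented. The four operation kinds are the ones the talk names (insert, delete, update, move), but the node labels and data layout are invented for illustration, not Sydit's actual representation:

```python
from dataclasses import dataclass

# An AST edit script is a sequence of operations over tree nodes.
# The four operation kinds mirror those named in the talk.
@dataclass
class EditOp:
    kind: str              # "insert" | "delete" | "update" | "move"
    node: str              # label of the AST node being edited
    parent: str = None     # where the node is (re)attached, for insert/move
    new_value: str = None  # replacement label, for update

def describe(script):
    """Render an edit script as human-readable lines."""
    lines = []
    for op in script:
        if op.kind == "update":
            lines.append(f"update {op.node} -> {op.new_value}")
        elif op.kind in ("insert", "move"):
            lines.append(f"{op.kind} {op.node} under {op.parent}")
        else:
            lines.append(f"delete {op.node}")
    return lines

# A transformation like the running example: one update, one move,
# and an insert (node names invented).
script = [
    EditOp("update", "VarDecl:config", new_value="VarDecl:config2"),
    EditOp("move", "Call:configure", parent="IfStmt"),
    EditOp("insert", "Call:check", parent="WhileStmt"),
]
```

A real tool would compute such a script from a tree differencing algorithm rather than write it by hand.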
And this example will run through all the steps. We created this
example based on real code changes we mined from the open source
project Eclipse debug core. In this example, there are two
similarly changed methods. When developers change one of the
methods and show it as a code change example, Sydit will leverage
this to generate a program transformation. As you can see from
this code example, the developers actually refactor the code by
changing the config declaration, and they also add some new
feature in order to do something else. So Sydit first creates an
AST edit script using a program differencing algorithm. The AST
edit script may consist of four types of edit operations:
statement insert, delete, update, and move. This edit script
represents the difference between the old version and the new
version of the changed method. For this specific example, it
creates an update, a move, and several insert operations. So in
the second step,
Sydit extracts the context relevant to the edit in order to
identify the context which later can be used to correctly
position edits in the target locations. It leverages program
dependence analysis. For instance, with the edited statements
marked in orange, Sydit first leverages containment dependence
analysis in order to find all the nodes which contain the edited
ones, such as the method declaration, the while statement, and
the if statement. It then leverages control dependence analysis
in order to identify nodes which the edited nodes are control
dependent on. And finally it identifies nodes which the edited
statements are data dependent on, such as the relevant variable
declarations. So by extracting the context of the example edit
script, we try to encode the constraints it has for its targeted
locations in terms of control dependence and data dependence
relations. If later a targeted location contains some context
matching this edit-relevant context, it means that it can meet
all the constraints, in terms of the data dependence and the
control dependence, posed by the edit script, and in that way it
will be safer for us to apply the edits to those locations. So
now, in order to generalize a program transformation which is
applicable
to locations containing different program structures or using
different identifiers, we abstract the identifiers and edit
positions used in the example edit script. With identifier
abstraction, we replace the concrete identifiers with
corresponding symbolic names, and with edit abstraction, we
recalculate the position of each edit operation with respect to
the abstract context. At the end of this step we get an abstract,
context-aware edit script. When users specify the targeted
locations, we will try to apply the general program
transformation to those locations. Before applying the edit
script to a targeted location, Sydit needs to first try to
establish a context matching between the abstract context and the
targeted location, in order to identify a subtree in the targeted
location which corresponds to the abstract context. If there is
such a matching, it means that the target location can meet all
the constraints, like the data dependence and the control
dependence, posed by the edits we want to apply. The basic idea
of the context matching algorithm is that we first do leaf node
matching between the two contexts, and for all the candidate leaf
node matches of the abstract context we try to identify the best
match in the targeted location, based on the path between a leaf
node and the root node. Through all these steps, we want to
establish a one-to-one node mapping for each node in the abstract
context. When the developer provides this method B as the
targeted location to change similarly, this step can identify the
nodes which are marked in yellow as the relevant context and then
furthermore establish an identifier mapping between the abstract
context and the concrete context. In this way, the identifier
mappings will help us customize the general program
transformation for the targeted location. The customization
consists of two parts: one is to concretize the identifiers and
the other is to concretize the edit positions. With identifier
concretization, we replace the symbolic names with the
corresponding concrete ones. And with edit position
concretization, we recalculate the position of each edit
operation with respect to the targeted location. At the end of
this step, we can create a customized, concrete edit script which
is applicable to the targeted location, and by applying the edit
script we actually change the code and then show it to developers
to ask whether they would like to accept it or not. Questions?
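The identifier concretization step just described can be sketched in a few lines of Python. The symbolic-name syntax ($v0, $m0) and the statement are invented placeholders, not Sydit's actual representation:

```python
import re

# A partially abstract edited statement, with symbolic names standing
# in for the concrete identifiers of the example (names are invented).
abstract_stmt = "$v0 = $m0($v1.getType())"

def concretize(stmt, mapping):
    """Replace each symbolic name with the identifier that the
    context matching bound it to in the targeted location."""
    return re.sub(r"\$\w+", lambda m: mapping[m.group(0)], stmt)

# Identifier mapping established for one targeted location.
mapping = {"$v0": "result", "$m0": "resolve", "$v1": "config"}
print(concretize(abstract_stmt, mapping))
# -> result = resolve(config.getType())
```

Applying the same abstract statement with a different mapping yields the customized edit for a different location, which is the point of the abstraction step.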
>>: How do you know from one example what the right abstraction
is? Like in your example you have a method call that returned an
object that you called another method on and then cast the return
result. I mean, couldn't that have been abstracted out as just
another, a single method call? That could have been the intention
of the user, that the thing that they were refactoring was an
invocation of some other class? Do they ever see the abstraction
of the pattern?
>> Na Meng: Currently they cannot see the abstraction of the
patterns. We put that inherently in the tool. So in the default
setting we just generalize all the identifiers, and actually
we're not quite sure what is the best way to extract the context
relevant to the edit. In our evaluation, we also explored
different ways to model the context and use the context to match
different targeted locations. We found some best settings for our
evaluation, but I'm not quite sure whether those settings are
also applicable to other scenarios which are not covered by the
evaluation settings.
>>: Could you show the transformation to the users?
>> Na Meng: Yes, we can. So here, we have this edit script which
describes which node you should change and how to change it.
Actually another student in Miryung's group, an undergraduate
student, built a tool based on my project; the basic idea is to
show the edit script to developers so that developers can read it
and even modify it if they are not satisfied with the
representation of the edit script. After their modification, that
project will apply the modified version of the edit script to the
targeted location. So we gave the developers an opportunity to
modify the [indiscernible] for the program transformation. So in
order to evaluate how effective
Sydit is when making systematic edits, we created a data set
consisting of 56 examples. We found those examples in five open
source projects. In each example, there are two similarly changed
methods, and for each example, Sydit [indiscernible] first infers
a program transformation from one method and then applies the
transformation to the other method. In order to better understand
the examples and the capability of Sydit, we classified the
examples into six categories based on different metrics: first,
whether the similarly changed methods are changed identically or
differently; second, whether the edit script involves a single
node change or multiple node changes; and third, if multiple
nodes are involved in the code changes, whether they are
contiguous or non-contiguous. So why don't we just use the search
and replace feature? Making single node identical changes with it
is very easy. However, for the changes which may involve a lot of
non-contiguous node changes, where the node changes applied to
different locations are different, it will be very hard. And this
is also a challenge for other existing approaches. However, in
this scenario, our tool can do a better job, and we evaluated our
tool with this data set. The evaluation shows that by inferring
the program transformation from one code change, Sydit can create
and apply a program transformation for 82 percent of the
examples. And it can produce correct edits and apply the correct
edits to the targeted locations in 70 percent of the examples.
The similarity between the tool-generated and the human-created
version is 96 percent on average, meaning that we are very close
to the developer created versions. And to answer Andrew's
question, we also tried to explore what is the best way to
extract the context in order to achieve a good balance between
context matching flexibility and edit application precision. So
we first changed the dependence distance K in our program
dependence analysis. When K equals one, we only include nodes
that have direct dependence relations with the edited nodes into
the context.
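As a rough sketch of this dependence-distance parameter, the following Python models context extraction as a bounded traversal over a toy dependence graph. The graph, node names, and edge direction are invented for illustration:

```python
# Dependence graph over AST nodes: an edge u -> v means u depends on v
# (control or data). Graph and node names are invented for illustration.
deps = {
    "edit1": ["declA", "ifCond"],
    "declA": ["declB"],
    "declB": [],
    "ifCond": [],
}

def context_nodes(edited, k):
    """Collect nodes within k dependence hops of the edited nodes:
    k=1 keeps only direct dependences, k=2 also keeps transitive
    dependences one step further out."""
    seen, frontier = set(), set(edited)
    for _ in range(k):
        nxt = set()
        for n in frontier:
            for d in deps.get(n, []):
                if d not in seen:
                    seen.add(d)
                    nxt.add(d)
        frontier = nxt
    return seen

print(context_nodes(["edit1"], 1))  # direct dependences only
print(context_nodes(["edit1"], 2))  # includes the transitive declB
```

The hypothesis in the talk is that larger k means a bigger context, hence more constraints on candidate locations and more matching failures.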
When K equals two, we also include nodes which have transitive
dependence relations with the edited nodes into the context. Our
hypothesis is that the more nodes we include into the context,
the more constraints the resulting program transformation will
put on the target location. As a result, we may fail in more
cases to establish context matching between our generalized
program transformation and the target locations, and the
evaluation shows that when K equals one, we get the best
performance. In our second setting, we also changed the
abstraction settings in the identifier abstraction step. By
default, we generalize the program transformation by abstracting
all the identifiers, including the variables, methods, and types.
However, we can also do it differently: we can generalize only a
part of those identifiers or even do no abstraction at all. The
more identifiers we generalize, the more flexible the resulting
program transformation will be when establishing identifier
matchings with the targeted location. Without surprise, full
identifier abstraction leads to the best performance while no
abstraction leads to the worst performance.
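A toy version of these abstraction settings, distinguishing variable, type, and method identifiers; the statement tokens and kind tags are invented. Full abstraction replaces every identifier with a symbolic name, while passing a smaller kind set abstracts only part of them:

```python
# Toy statement tokens tagged with an identifier kind: V(ariable),
# T(ype), M(ethod), or None for non-identifier tokens.
tokens = [("config", "V"), (".", None), ("resolve", "M"),
          ("(", None), ("IPath", "T"), (")", None)]

def abstract(tokens, kinds):
    """Replace identifiers of the selected kinds with symbolic names
    ($v0, $t0, $m0, ...), keeping all other tokens concrete."""
    names, counters, out = {}, {}, []
    for text, kind in tokens:
        if kind in kinds:
            key = (text, kind)
            if key not in names:
                i = counters.get(kind, 0)
                names[key] = f"${kind.lower()}{i}"
                counters[kind] = i + 1
            out.append(names[key])
        else:
            out.append(text)
    return "".join(out)

print(abstract(tokens, {"V", "T", "M"}))  # full abstraction
print(abstract(tokens, {"V"}))            # partial: variables only
print(abstract(tokens, set()))            # no abstraction
```

More kinds abstracted means the template matches more locations, at the cost of weaker constraints, which is the trade-off the settings explore.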
And finally, we changed the --
>>: I have a question. What does V, T and M stand for here?
>> Na Meng: V stands for variable identifiers, T stands for type
name identifiers, and M stands for method identifiers. Andrew,
please.
>>: How many of the examples in the corpus that you tested
require the power of structural matching, and how many could have
been handled just using regular expression matching? Because that
would be the alternative: the more basic version of search and
replace is playing with braces and regular expressions and
everything, and often you can get what you want. In some cases
it's harder, but I'd be curious to know.
>> Na Meng: Yeah. So actually, I haven't measured that specific
thing in the evaluation, but maybe from the classification of the
examples, I can get some idea. Say for the single node changes,
if you want to make identical code changes, sure, definitely you
can do that. If you want to make different code changes, let's
say in one place you want to replace the identifier A with foo
and in the other location you want to replace the identifier B
with bar, since you want to apply slightly different changes,
[indiscernible] whether regular expression patterns can help you
do that. And for the multiple node changes: if you want to change
a contiguous code region, maybe regular expressions can help you
express that. But for the cases with non-contiguous code regions,
where you want to change several code regions which contain some
intervening code, regular expressions may not help much, because
the regions are not contiguous. I'm not sure how you would
express the gaps standing between the edited code regions.
>>: Another way to think about answering Andrew's question is:
right now your tool produces insert, delete, move, and update.
And that is a regular language, right? So in some sense, the
description of the transformation could use the form he suggests,
but when you are making it concrete, it's a different process
than regular expression matching. Like maybe you could do the
matching partially in that language. Regular expression matching
is almost Turing complete, right? You could probably express
anything. But that's just a different way; like, how do you
generate that?
>>: At the risk of sidetracking, the real issue is the parse
tree; that's a context free grammar. There are going to be some
things you can't capture with a regular expression, of course,
and the problem is can you summarize the parts that you don't
care about with some simplistic regular expression? I think it
will be hit or miss in practice whether or not you can do that.
That's just the theory behind it. But I think it's a great
question. Having used this approach, it's a good question.
>>: For your matching, you could classify everything using
[indiscernible] and then go after the top matches, right? So then
that would allow you to actually allow some deviation in the
parse tree, so you might be able to get away with editing stuff
that is not an exact match. So then you would cover your gaps,
right?
>> Na Meng: So how do you leverage the machine learning part?
>>: You take the AST and then have a classification for all the
ASTs that you're interested in, and then throw your machine
learning at all of the ASTs that are candidates, and then for
your top matches, those are the ones that you would consider
doing the code edits for.
>> Na Meng: The thing I'm a little concerned about with this
approach is that you don't know what kind of code changes
developers want to make. There is not a vocabulary for possible
changes throughout [indiscernible]. And the way we extract the
context is based on the code changes made by the developer. So if
they make arbitrary code changes then the extracted code template
can be arbitrary. So definitely you can have a limited set of
possible AST structures, but there will be a lot of ways to
combine these different structures in order to construct
different programs. So with the learning part, I'm not quite sure
whether it will overfit the model for a specific set, and if
later developers introduce some new code changes which are not
covered in the training set, which the model is not well trained
for, then maybe we will have problems in terms of matching and
applying changes.
>>: And your tool doesn't require exact matching. That's like a
shorter answer to his question. Right?
>> Na Meng: Okay.
>>: You already can do a partial change that the programmer needs
to do, and you just say that it matches if you have got like
85 percent similarity, right?
>> Na Meng: Yeah.
>>: Then you don't have to get [indiscernible]. Like, as a
practice for an interview talk.
>> Na Meng: Okay. So in the third setting we modified the
upstream and downstream dependence settings. Upstream means that
we also include nodes which are depended on by the edited nodes,
while downstream means that we also include nodes which depend on
the edited nodes. So it's like the direction of the dependence
between the edited nodes and the unchanged ones. And the
evaluation shows that upstream only leads to the best
performance. So finally, after identifying all the relevant
unchanged nodes, we combine those unchanged nodes together with
the edited nodes to get the extracted context relevant to the
edit.
>>: I have a question. So did you look at the idea of taking
more than one example? Like sometimes programmers will write a
couple of times and say, look, I'm tired of this. I have a
million-line code base. Did you look at the case where maybe you
took two examples?
>> Na Meng: Yeah, I will talk about this in the next slide. Thank
you for the question. It's a good introduction. So in summary,
[indiscernible] systematic edits with high accuracy in many
cases. However, in order to make the best use of this tool,
developers need to first pick a good example which covers all the
edit locations they want to apply the systematic edit to. And
second, they need to manually identify all the locations they
want to change similarly. If the developers cannot find all the
locations, then the tool cannot help. We would like to help
developers automatically find those edit locations, because in
some scenarios, finding edit locations is even more challenging
than making the edits. And by automating this process of finding
edit locations, we can help developers avoid some errors of
omission. So we have designed and implemented a new approach
called Lase which extends Sydit by adding the feature of finding
edit locations. To use the tool, developers need to provide two
or more code change examples, and similar to Sydit, Lase also
leverages a program differencing algorithm to create edit
scripts. However, differently, Lase needs to identify the common
edit shared between the different edit scripts and then regard
that common edit as the systematic edit demonstrated by all the
code change examples.
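One simple way to realize "keep only the common edit" is a longest-common-subsequence pass over the two edit scripts. This Python sketch uses difflib and invented operation labels; it is an illustration of the idea, not Lase's actual algorithm:

```python
from difflib import SequenceMatcher

# Two edit scripts inferred from two code change examples; each
# operation is reduced to a comparable string (labels invented).
script_a = ["update config", "insert check", "delete log", "move call"]
script_b = ["update config", "insert check", "move call", "insert trace"]

def common_edit(a, b):
    """Keep only operations shared by both scripts, in order: the
    systematic edit demonstrated by all examples. Operations specific
    to one example are filtered out."""
    sm = SequenceMatcher(None, a, b)
    out = []
    for tag, i1, i2, _, _ in sm.get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
    return out

print(common_edit(script_a, script_b))
# -> ['update config', 'insert check', 'move call']
```

Here "delete log" and "insert trace" are each specific to one example, so they drop out of the inferred systematic edit.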
In this way, Lase can filter out any edit operations specific to
some of the code change examples. And similar to Sydit, Lase also
needs to generalize identifiers in order to create the general
program transformation. However, it only generalizes identifiers
when necessary. For example, with these two edited statements,
since they are different only in the usage of a single variable,
e versus [indiscernible], Lase creates a general representation
by abstracting the single variable while keeping the others
concrete. In this way, Lase makes it possible that the resulting
program transformation can correctly find edit locations which
are specialized by accessing a certain field [indiscernible] a
specific method. And similar to Sydit, Lase also needs to extract
the context relevant to the edit in order to later correctly
position the edits in other locations. However, a slightly
different thing is that it needs to align the context extracted
from different locations and then find the common context shared
between them, using that as the edit-relevant context. At the end
of this step, Lase creates a partially abstract, context-aware
edit script, and with this edit script, Lase tries to establish
context matching between the abstract context and all the methods
contained in the whole project. If a method contains a context
matching the [indiscernible] context, it is regarded as a
candidate edit location. And for each such edit location, Lase
customizes the edit and applies the result. So different from
Sydit, Lase contains three phases, and the first phase creates a
program transformation that serves double duty: both finding edit
locations and applying edits. So in order to understand how
effective Lase is when finding edit locations and applying edits,
we created a data set consisting of 24 systematic editing tasks
which [indiscernible] JDT and SWT. For each systematic editing
task, developers checked in multiple repetitive bug fix patches
in order to fix the same bug in different locations. The reason
why they checked in multiple patches is that the initial patch
was incomplete: they didn't fix the same bug in all the locations
which contained the bug. So for each task, Lase first infers the
program transformation based on the code changes in the initial
patch and then uses that program transformation to find the other
edit locations and suggest edits. In order to evaluate Lase's
capability to find edit locations, we measure several things. I
won't talk about the details of each column's meaning; I will
just show you an example. For instance, for this example two, we
expected Lase to identify 16 edit locations to change similarly.
And actually, Lase finds 13 locations. By examining the locations
found by Lase, we find that 12 of them are correct suggestions.
So we compute precision, recall, and accuracy to show how well we
do here. In terms of evaluating the performance of Lase when
making edits, we show the evaluation results in the last columns.
For the first example, we expected Lase to correctly infer nine
edit operations, and the results show that Lase actually can
infer all the edit operations correctly. From the numbers of edit
operations, you can see that Lase can handle some very tricky
program transformations which contain around 34 edit operations.
So the tool is really powerful. Okay. Andrew.
>> Andrew: In that first example, you said that it should have
found 13 but it only found 12. What did it get wrong? What is it
about the matching that couldn't find that last example?
>> Na Meng: So this is a good question. The reason why it doesn't
find the remaining locations still depends on the quality of the
examples provided by developers, since currently we create the
context by extracting the commonality shared between the
examples. If that commonality doesn't hold for some of the edit
locations, we cannot find those locations. It's like, among the
16 edit locations, maybe they only share four statements in
common; however, with the examples, perhaps we find five
statements and regard those five statements as the context.
Actually that is a superset of the actual context shared between
the examples. Does that answer your question?
>> Andrew: Yeah. That brings up a related point, which is: do you
think it's possible to fix Lase to do better? Or does the
algorithm itself have this limitation, or is there some way to
improve it?
>> Na Meng: Yeah. So I think maybe one possible way is to try a
similar approach as [indiscernible], because [indiscernible] has
approaches like the spreadsheet data transformation: it requires
[indiscernible] to provide input output examples and they
[indiscernible] the program to make the transformation. And if
the transformation is wrong, they ask the developers to provide
more examples to refine the program transformation. So similarly,
I think we can improve Lase in a similar way by requiring
developers to provide some counterexamples. Let's say developers
provide another example which only has the four statements
previously shared with the previous examples; then that is
doable. So on average Lase can find edit locations with
99 percent precision and 89 percent recall, and it makes edits
which are very similar to the developer-created versions: the
similarity is 91 percent.
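For reference, precision and recall figures like these can be reproduced from the per-task counts; using the example mentioned earlier (16 expected locations, 13 found, 12 of them correct):

```python
def precision_recall(expected, found, correct):
    """Precision = fraction of suggested locations that are right;
    recall = fraction of true edit locations that were found."""
    return correct / found, correct / expected

# The example discussed earlier: 16 expected locations, 13 found,
# 12 of them correct suggestions.
p, r = precision_recall(expected=16, found=13, correct=12)
print(f"precision={p:.2f} recall={r:.2f}")
# -> precision=0.92 recall=0.75
```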
The most interesting finding we got in this evaluation is that
for three bugs, Lase actually suggested in total nine edits which
developers missed and later confirmed. For one bug, the font size
shouldn't be a float; we found that and recommended it to them,
and they said, oh, yes, we actually missed it, so that's good. In
summary, Lase finds edit locations with high precision and recall
and [indiscernible] edits with high accuracy. However, it may
encourage the bad practice of creating and maintaining duplicated
code by facilitating the process. On the other hand, a systematic
edit repetitively applied to multiple locations may indicate a
good refactoring opportunity to reduce the code redundancy. So we
have designed and implemented a new
approach that is to export systematic edits for refactoring. And
with this approach, we would like to investigate two research
questions. First, can we always refactor code which enter those
systematic edits and the second, are the refactored versions
always preferred to the systematic edited versions? So in our
approach, we require developers to first provide all the code
locations they change systematically, and with Lase, we can
create a partially abstract, context-aware edit script. The
edit script helps us identify all the edited statements in the
new version of each code location and then scope a contiguous
code region for refactoring, so the code region will include all
the edited statements in each location. And finally, our
approach applies a series of refactorings in order to extract
the common code and replace the corresponding code snippet in
each location with a method call. So there are several challenges to
automatically refactor code which undergoes systematic edits.
First, the edited code may be interleaved with unchanged code,
so the extracted code regions from different locations need to
be aligned with each other so that Rase can create a method for
the extracted code. Second, the code extracted from different
locations may vary in terms of the identifiers and expressions
it uses, so we need to create a general representation, or
template, to represent the code regions from different locations
while tolerating any difference between them. And third,
extracting code may break the data flows or control flows of the
original code, so the tool needs to insert some extra code into
the extracted method or at the original locations so that data
flows and control flows are not modified after refactoring. And
actually, in our implementation, Rase is implemented to leverage
systematic editing to extract the common code, and it can also
create new types and methods as needed [indiscernible]
differences in terms of the usage of types, methods,
identifiers, and expressions, and insert some return objects and
access labels in order to deal with data flows and control
flows. So in order to evaluate how effective Rase is when
refactoring systematically edited code, we use two data sets:
the first consists of 56 similarly changed method pairs, which
is borrowed from our Sydit evaluation, and the second consists
of 30 similarly changed method groups. So our evaluation shows
that Rase can automate 30 cases out of the 56 similarly changed
method pairs and 20 cases out of the 30 method groups. It can
always refactor more cases than whole-method extraction
refactoring, meaning that systematic-editing-based refactoring
is always more flexible than whole-method clone-based
refactoring. The reason is that systematic edits can always
scope a smaller code region into a method. So in order to
understand why Rase cannot automate the rest of the cases, we
manually examined the code and classified the reasons into four
categories. The first reason is that there is limited language
support for generic types: if the code to extract contains some
type variations, it will be really hard to extract the code.
The second is, if we want to extract some code out of a class
and the code itself contains method invocations of private
methods in that class, it will be very difficult to extract
that code, because the private methods will become inaccessible
to any code outside this class. The third is that if systematic
edits only delete statements from code locations, there are no
edited statements to be found in the new version of the code
location, so as a result, systematic editing cannot help us
scope a code region for refactoring. And the fourth reason is
that if the extracted code regions from different locations are
not very similar to each other, they do not share enough
commonality, and it will be very hard to extract the code. So
among these four reasons, some, such as no edited statement
found, are caused by the limitation of our implementation of the
systematic-editing-based approach. However, others, such as the
limited language support for generic types and no common code to
extract, reveal the fact that refactoring is not always
applicable to code which undergoes systematic edits, and this
answers our first research question.
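The clone-removal refactoring described above can be sketched in Java as follows; this is a hypothetical illustration with invented names and constants, not a case from the Rase data sets. The duplicated, systematically edited region at two locations is extracted into one method, with the differences between locations turned into parameters.

```java
// Hypothetical clone-removal refactoring: two locations shared a
// clamp-and-scale region, now extracted into one parameterized method.
public class CloneRemovalExample {
    // The extracted method; before refactoring, both callers below
    // duplicated this logic inline.
    static int clampAndScale(int value, int max, int scale) {
        if (value > max) {
            value = max;          // the common (systematically edited) statements
        }
        return value * scale;     // location-specific differences became parameters
    }

    static int fontSize(int requested) {
        return clampAndScale(requested, 72, 1);   // was inline duplicated code
    }

    static int iconSize(int requested) {
        return clampAndScale(requested, 48, 2);   // was inline duplicated code
    }

    public static void main(String[] args) {
        System.out.println(fontSize(100)); // prints 72
        System.out.println(iconSize(100)); // prints 96
    }
}
```

Note how the extracted region is smaller than either whole method, which is why systematic-editing-based scoping can refactor cases that whole-method extraction cannot.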
So our second research question is: is refactoring always
preferable to systematic editing? In order to answer this
question, we manually examined the version control history of
the subject programs to see how developers actually maintain
their software after making systematic edits. The columns
feasible and infeasible correspond to the cases when Rase can
automate a refactoring and when it cannot. The refactored row
corresponds to the cases when developers actually refactored
the code after making systematic edits to code locations. Our
hypothesis is that the number at the intersection between
refactored and feasible should be as large as possible. If that
is the case, we can predict refactoring based on systematic
edits. However, as you can see, the result shows that in most
of the cases, developers actually don't do refactoring even
though they made systematic edits to a lot of code locations.
And for these unrefactored cases, there are maybe three ways
that developers maintain their code. They may co-evolve the
code locations by making repetitive code changes to these
locations again and again. Or they may make divergent code
changes to different locations so that the code becomes more
and more different. Or they may not touch the code anymore
because they think the code is stable and they don't want to
introduce any changes to it. In most cases, developers actually
don't touch the code anymore. Actually, we e-mailed the
developers of these subject programs asking them why and when
they would like to do refactoring, and they answered: we don't
typically refactor unless we have to change the code for some
bug fix or new feature. That means that refactoring is not
always desirable compared to systematic editing. So in summary,
automatic clone removal refactoring is not always feasible for
code which undergoes systematic edits; clone removal refactoring
is not always desirable even if we can provide automatic tool
support to conduct the refactoring; and automated refactoring
doesn't obviate the need for systematic editing. Developers
need tool support for both approaches. So
here are some future directions I would like to pursue. I would
like to recommend code changes across different projects so that
developers can leverage the expertise and knowledge of other
developers when maintaining their own software. The hypothesis
is that developers of different projects are motivated to share
code changes, and there are two scenarios for this code sharing.
When developers build their applications on the same library or
framework, they may share code changes when migrating their
applications from one version of the library to another.
Another scenario is that when developers migrate their
applications from the desktop to mobile devices or to the cloud,
they may share code changes in order to tailor the applications
for the new computing resources. And another research direction I
would like to pursue is to automatically check the correctness of
programs. So currently when we recommend code changes to
developers, we rely on developers to make a good judgment about
whether the edits are correct or not and whether they should
take them or not. However, if we can automatically check the
correctness of the recommended edits, developers will have more
confidence in the code recommendations made by us. And finally,
we would like
to recommend code changes for mixed-language programming. The
motivating scenario is that nowadays a lot of scientists develop
their prototype tools in MATLAB because they are trained that
way. Later they will translate the code into C code so that
they can have better performance, and later, if they think the
sequential program may still not fit their needs, maybe they
also want to translate it again into parallel programs. So
although currently we can get a lot of tool support from
compilers and refactoring tools, the tool support is still not
sufficient. I would like to recommend code changes to help
developers translate their programs from one language to
another, or to help programmers translate their monolingual
programs into multilingual programs, so that developers can
benefit from the advantages provided by different languages
without worrying too much about the interaction between
languages. So to conclude,
we have designed two approaches to recommend code changes based
on code change examples provided by developers. In addition to
recommending code changes, we can also find edit locations for
the program transformation. And we have provided experimental
evidence showing that clone removal refactoring is not always
feasible or desirable to developers, and that developers
actually need tool support for both automatic clone removal
refactoring and systematic editing. So thank you for your
attention. This concludes my talk. Do you have any other
questions?
[Applause]
>>: So my first question is, I think I missed the description of
the software corpora that you used. I don't recall seeing a
slide there. Could you talk a little bit more about that?
Because that obviously affects what judgments or conclusions you
can draw from your research.
>> Na Meng: Yeah. So to evaluate our first approach, I mined
software repositories of five open source projects, which are
under Eclipse --
>>: Projects?
>> Na Meng: The projects. Eclipse JDT.
>>: What does JDT mean?
>>: [Indiscernible].
>> Na Meng: Yeah.
>>: I'm buried under acronyms.
>> Na Meng: And there are some other plug-ins that are also open
source projects.
>>: So what five things did you look at?
All parts of Eclipse?
All things in Eclipse?
>> Na Meng: No, not all parts. Just some of the parts, yeah.
And for those, we tried to find similarly changed methods, and
in each example --
>>: He's more interested in what the programs did. So you had a
compiler. What other -- what were the five other things that
they did?
>>: Yes. That was all.
>> Na Meng: Okay. So Eclipse JDT: it provides the basic support
in Eclipse for Java development. Another is the compare
plug-in, which aims at comparing the difference between two
texts. Another is debug core, which is maybe the kernel of the
debugging feature of Eclipse. And another is DNS. It is not
Eclipse; it's a domain name implementation in Java.
>>: In the web service protocols.
>> Na Meng: Yeah.
>>: [Indiscernible]. It's on the list. Very good question.
>> Na Meng: Yeah. And so for Lase, JDT, and the other one is
SWT. SWT is used to develop GUIs in Java, and the interesting
part about SWT is that it actually has a product line: it has
different implementations of SWT for different OS platforms. So
it is very common that we can find similar code changes applied
to different projects, because they all branched out from the
same product. And these code changes are sometimes identical
and sometimes different.
>>: My next question. How do you think Java influenced your
results? Would things be different if you looked at C or C++ or
whatever other languages are out there?
>>: C#.
[Laughter]
>>: C++.
>> Na Meng: Yeah. So currently, in terms of the implementation,
it depends on static analysis conducted on the syntax tree. And
I think the approach is not limited by this implementation; I
think we can see similar results in programs in other languages.
In terms of the refactoring, I think the question is about the
refactoring capability, because I mentioned some limitations,
like the limited language support, such that as a result we
cannot do the refactoring. And for some other programming
languages, even if a language doesn't have that constraint,
like C++, it still has the constraint that if the code from
different locations is not very similar, with common code
interleaved with uncommon code, it will still be difficult to
refactor the code. So this is not limited by a language
feature; it is just common across different languages.
>>: But to follow up on David's question, it could be that
people program differently in these languages, and because
you're using an object-oriented style, it's actually more likely
that people are doing systematic editing in some languages than
in other languages. Do you have an opinion on that?
>> Na Meng: I don't think so, because -- so it is similar to
code clones: although people have a lot of code clones in Java
programs, they also have code clones in C programs and C++
programs. It is unavoidable that they make similar code changes
to all these code clones, and developers totally agree that code
clones are not avoidable; they may need the redundancy
sometimes.
>>: I think in C++ at least you can see a lot of subtleties
around the language semantics. Picking up code and moving it
around in C++ can have non-obvious side effects that you
wouldn't encounter in Java. So like destructors: I've seen
really horrible code where you declare a variable and when the
variable goes out of scope a destructor will run. It's like,
you gotta be kidding me, but that's what people do.
>>: So you think systematic edits would have a higher potential
to introduce errors because of the badness of the language used?
>>: I think you would either -- this is all conjecture on my
part, but either you have a potential for introducing subtle
problems because you're working at a lower level as a
programmer, or the flip side is that you might be more
constrained in terms of needing more semantic information,
saying, well, I could do this, but I'm not sure about that
because of the semantics involved here. So you might have to be
a little bit more informed by the semantics.
>>: I'm on a team that does automatic code refactoring and
fix-ups in Windows, and we -- so the platform sits on top.
We've kind of tried to build this global picture of the entire
product, and then we say, oh, well, knowing all of the things we
know at this point, we can try to make this edit. So that kind
of takes care of -- at least the finding part of it, because
you're leveraging the entire -- you have this massively huge
nuclear weapon that you're trying to do a little code edit with.
>>: It's actually a question for both of you then. In C and
C++, you've got the preprocessor, and you had mentioned
[indiscernible]. If the same code you want to edit is under
several different conditions of a #if, editing it for one
condition might actually break it for another. How do you deal
with that kind of thing with the preprocessor? If you were to
apply this to C or C++, do you just have to test every single
possible definition of the #if before you're allowed to change
something?
>> Na Meng: Yeah. I think that should be the only way that you
could guarantee the correctness.
>>: In the pretext world, do you scope to the architecture and
the code base and stuff? You're going to get whatever the #ifs
are for whatever the build is for that particular build.
>>: I can imagine if you made the change, you could insert the
right #ifs with the change to make sure it only applies to the
one you analyzed and not anything else.
>> Na Meng: Yeah. I think that also applies to this: if we have
different configurations, maybe we will get different versions
of the same program based on the configuration. So, similar
things.
>>: Yeah. Understanding the registry stuff for example, are you
talking about a 64-bit registry? 32-bit registry? A lot of that
stuff depends on --
>> Na Meng: Yeah.
>>: One thing, when you talked about one of your future works,
you talked about leveraging multiple edits. I think if you go
back two slides, code change recommendation across products:
have you thought about crowd-sourcing change recommendations?
In a way that if you used, let's say, Sydit to make all these
changes to 50,000 projects on Source[indiscernible] -- and I
probably just dated myself -- but let's say you found 50,000
projects, do you think you'd find, for all the possible
check-ins and changes that they made, like 50,000 change
templates, or would it be more like ten? And if it's ten, very
commonly you'd probably find refactoring templates that are
probably a lot more common, but even the ones at the next level
down that didn't make it to refactoring -- could you imagine, if
you mined all projects around the world, you come up with, say,
30 change templates that pretty much cover everything any
programmer has ever really wanted to do, except maybe the one
percent? It'd be kind of a cool tool to be able to leverage
that.
>> Na Meng: Yeah. So in terms of the coverage of the code
changes we want to handle, currently I do not have an explicit
boundary for that. So in terms of, like, when developers
migrate their applications from -- so I would like to recommend
code changes based on something which is shared between
developers. Let's say developers are sharing some code or they
are sharing some libraries; in that case they have motivation to
share their code changes. But if developers check in any random
changes which are just specific to their projects, I do not want
to recommend those code changes to other developers, because
they are very specific and limited to those projects, and it is
really hard to argue that my code recommendation will actually
help other programmers.
>>: There's a paper from [indiscernible] where they looked at
changes made by students who received a particular kind of
compiler warning. And this is over like 50 or 60,000 editing
sessions by students across ten different universities, and they
abstracted that out into a set of things that says, like, people
who have this kind of compiler error usually make this kind of
change.
>> Na Meng: Oh, really?
>>: And there were only like 5 or 10 different changes that were
even sort of possible for lots of these compiler errors, or
something where it's simpler, like you need to add an extra
parameter to your method.
>> Na Meng: Yeah. That's interesting. So in order to identify
the common code changes, definitely we need something to give us
an oracle. So in your scenario, the compiler gave the oracle
about the warnings, so it can help you automatically label what
kind of code changes are interesting and how to mine the code
changes. In our case, maybe our oracle can come from the commit
messages made by developers: if they have some similar commit
messages talking about similar bugs or talking about similar
feature additions, then that is a good indicator.
>>: Stack Overflow.
>> Na Meng: Yeah. Stack Overflow. I really think that Stack
Overflow is a good resource for learning from other developers.
I always find it hard to ask the right question and to find the
relevant answers to that question. So basically, I'm a shy
person. I don't always want to ask my own questions; I just
want to mine the relevant answers to my question. Sometimes I
will be fortunate enough to use the correct keywords in order to
find answers, but sometimes I just cannot. And I really like
Stack Overflow, but I think there should be some better way to
recommend code changes to developers.
>>: Just do it when they're asleep.
[Laughter]
>>: I'll ask 1 or 2 more questions. So what have you done with
your system? Did you make it available to other people to share?
>> Na Meng: Yeah. Yeah. Yeah. So I made these tools publicly
available, and sometimes, if I do not put them on my website,
people will send me e-mails asking, can I get a copy of it, and
I just send it to them. And for the data set, I also make it
publicly available, and I will also improve that, for the third
part, because we got a comment saying you should make this data
set publicly available so other people can use the same tests.
>>: So your tool is publicly available. And are people using
it?
>> Na Meng: Some researchers are using it. I currently do not
see any developers using it.
>>: Is that just because you haven't advertised it or it needs a
lot more work to be product quality?
>> Na Meng: I think both.
>>: Obviously it's your Ph.D. project, so nobody is expecting
you to --
>> Na Meng: No.
>>: -- make something a large number of developers can use.
>>: Why do you think it needs to be more -- what would you do to
it if you were just going to make it into a product and try to
sell it so people would want to use it?
>> Na Meng: Yeah. Yeah. I will first do the enhancements
[indiscernible], and second I will also try to do some
internships in some companies and at the same time ask the
developers to try to use the tools and give us feedback about
it.
>>: What does the UI look like?
>> Na Meng: The UI is a [indiscernible] plug-in.
>>: But how?
>> Na Meng: They need to try some buttons to show the example:
they just select the old version and the new version, and with
both versions, right-click some button, and we will
automatically generate the program transformation. And if they
want to select target locations, they need to pick them
manually, and then the user selection will be shown as a table
in the UI.
>>: So you think the implementation is -- like, people can use
it? You don't think there are a bunch of bugs that you would
fix, or try to make it more generic?
>> Na Meng: I think the UI is good. In terms of -- actually,
I'm not quite sure how many bugs there are.
>>: So because only you have used it, you don't have experience
yet on what you probably need to fix.
>> Na Meng: Yeah. If I saw one, I just fixed it. I'm not quite
sure about the other parts, yeah.
>> Kathryn McKinley: More questions? All right. Thank you.
>> Na Meng: Thank you.
[Applause]