>> Andy Begel: Hi, everybody. Welcome to Anita
Sarma's talk. I'm Andy Begel, a researcher in the
VIBE group and Anita is a friend of mine who has had
a nice long history in software engineering. She did
a Ph.D. at UC Irvine with André van der Hoek. She's
the author of the famous Palantir conflict avoidance
tool, which when I saw it, it was like, damn it, why
did she invent that before I did, because that was a
really awesome idea. After she finished her Ph.D.
dealing with coordination issues for software
developers, she went off to a post-doc at Carnegie
Mellon, worked with Jim Herbsleb, Marcelo Cataldo, a
bunch of people there doing kind of neat things in
socio-technical congruence, trying to match up
software teams and the work that they do together and
try to keep that in sync, and now she's a professor
for the last five years?
>> Anita Sarma:
Four.
>> Andy Begel: Four years at the University of
Nebraska in Lincoln, Nebraska, which is where my
father-in-law works, so that's kind of exciting, too,
and she is -- has been working on a lot of things
continuing in conflict avoidance for software
engineers and trying to help generally software
engineers improve their coordination, and she's going
to be giving a talk today about distributed software
development and how to deal with coordination there.
So I give you Anita.
>> Anita Sarma: Thank you. So that was a nice
introduction. Thank you. So right now I'm at UNL
and our research group is called ESQuaReD Lab,
Empirically-based Software Quality Research, so we
don't just build tools, we try to make sure that the
tools are useful, and that's one of the reasons why
I'm here. We have had this amazing idea of building a new tool, and then after talking with Andy, suddenly realized: what is the context in which this is actually going to be used? So one of the things I'm trying to do here is talk to developers to see whether the kinds of problems I'm trying to solve actually exist, and if not, what kinds of problems they actually have.
So a little brief introduction about what I do. I
have three strands of research. The first of them is empirically based: to understand what's really happening in these communities that are actually doing some kind of software development or any programming-related tasks. So in understanding
online communities, because they are easy to get the
information from, open source development or Q&A,
programming-based Q&A sites like Stack Overflow, so
one of the questions in that area I'm interested in
knowing is what motivates them to contribute, and now
we don't even have a single project anymore. Like, if you are doing your work, you realize you need some other project, you need people's expertise from some other project maybe, so it's almost like an ecosystem. There's no monolithic project anymore. Like, no islands, right? We have an
ecosystem. People have not yet looked into what happens in an ecosystem: can I learn the social norms and technical knowledge from one project and transfer them to another project? So that's this little chunk of my work, looking at these online communities, understanding how people migrate from one project to another, whether there is transferable knowledge, whether there are specialized roles. Another part is end user software
engineering. A lot of guys use Excel and there's a
whole bunch of people out there who do some kind of
programming, not because that's their job, but that's
because they need to do that to get their work done,
right? So Excel Web mashups. So over here I'm
interested in understanding what kind of software
principles and techniques we can actually get from
software development and help this class of users,
looking at how people do debugging in end user
software. People just don't write from scratch.
They pick some example from one place, put another
example from another place, try to glue them together
and see if it works, but when that happens, you don't
know the kinds of examples that you picked, and when you put them together, where's the problem? In the glue code? In the first code you got? In the second code you got? You changed it a little bit, but, you
know, you don't know what's really happening. So can we help these end users debug? And we're looking at how they even go out and look for examples, how they try to look for problems, using information foraging theory to study the foraging behavior. The other
strand, the one that I had started with in my Ph.D.
project, still continuing, which is how can we
support coordination in teams. And in these strands
one of the things that I already talked about, you
want to look at the state of practice that's
existing, try to understand the theories behind
what's happening. Based on those theories and
insights, build tools, and then evaluate them, and
that kind of encompasses my research directions.
Today I'm going to talk about mainly supporting
coordination in teams, so starting with software
development. This is a real nice piece of, you know,
jigsaw puzzle that's out there and that's really
where software development is. We have these
different pieces of code that we have built by
different people. There's no more one person
building all the project and all the code, right? We
have to divide labor, we build things together, and what usually happens is, you know, these little spaces you have are almost like workspaces. You take from the main repository, or [indiscernible] where the main archive code is, you take it out, you work on it.
Once you are done, you try to put it back into the
system, and hopefully all these interfaces line up.
While you are making changes, nobody has changed any
of these other interfaces, right? So you're hoping
as I work and I put things back in, it will be an amazing piece of technology and it will work. What happens in real life? Here are the dependency packages in the Perl language. It's a pretty simple language, right? In the picture we show complexity, but if you see this picture over here, the blue stuff is packages, and what you're really seeing is a
whole bunch of spaghetti code. There's calls from
one package to another package, calls from a file in
a package to another file in another package. What
that leads to is then social dependencies among
people, because I'm dependent on, say, Chris, for his file, I have to coordinate with him, find out what he's changing. If he's mucking around with his
code, that might impact me. So if we look at the
social dependencies among these people, this is what
you get. And to make matters worse, these
dependencies are not static. They are changing over
time because everybody is evolving code. So here's
the same Perl language. In this case you have the
vertical lines, which means the particular version or
revision, and between revisions you can see how much
the code has changed, right? So some versions don't have as many changes, just a little bit. Some have a lot of change. It might be because of a new release, or it wasn't an important part of the project,
right? So the take-home over here is there are
dependencies, people depend on each other, and it's a
moving target. Things continuously change.
There's been a whole set of studies done to understand how developers work in these kinds of settings, and more specifically looking at what kinds of questions developers have to ask. Some of them are like: who do I go to for help? I just started
on the project. I need to work on this work. I need
to work on this piece of code. Who has the
expertise? Who can help me out? This is especially true if you start on a new project, right? You have to understand who has the expertise. Who should be
assigned to this task? Who has the right expertise
who can get this done in the shortest amount of time?
Sometimes bugs get -- keep getting reassigned because
you assign it to someone, they don't have the
expertise, then they assign it to someone else. So
you call these hot-potato bugs. They keep getting
tossed from one person to another person. Which
tasks need to be completed before the others? There
is dependency. If I am depending on some release or
some code to be done before I can get my work done,
I'm getting blocked on this other person finishing
their work, right? So how do we manage or interleave
these tasks. The other two questions are -- oops -- which
other artifacts are affected by my change. I'm here
making pieces -- making changes to my code. Who else
might be affected because of that? On the other
side, as I'm working, I'm working on this really
important topic, who else is working in the space in
my team that can affect my work, right? So these are
the questions you have to ask because as I'm working
in my space, I need to know what's the impact of
other people's work and what's the impact of my work
on others. So a bunch of the questions are about impact, and a breakdown in coordination can lead to two kinds of conflicts, I'd say. First is a direct
conflict, which is I'm working on this piece of code,
someone else works on the same piece of code, and
when we're ready to synchronize, we cannot do that.
In the case where we have configuration management
systems like Subversion or Git, usually what this
translates to is a merge conflict. You have changed
the file and he goes in first, he's happy because he
gets to check in. I go to check in second and whoop,
I have a merge conflict, and then I have to kind of
see how to merge them together. It might be that it's the same file but the changes are in different places, so
I can merge them pretty easily, but sometimes it
could be that there is some dependency that goes from
one block to another block, so what happens then? I
stick it in, we run the build or we run the tests and then
something still doesn't work. Or another case where
I think things look fine, I built fine, everything is
fine, my unit tests go fine, I stick it into the
configuration management system because all the files
are fine, but then something fails, build or test,
and that might be because something else depended on
an API or some behavior got changed on the side, and
when we actually ran the build test that's when
things failed. So these are undesirable consequences, and there have been studies in different kinds of projects -- some pretty large commercial projects, some telecom projects, some at NASA, some smaller projects. A lot of studies have been done that show this is a problem that keeps happening. Work is frequently restructured because
what you would want is all these little pieces to
work, but these little pieces should be given to
separate teams with good APIs. But as you keep
working, things get changed, right? So you have to
restructure work to bring back the modularity.
Parallel work does tend to lower the quality of the
code, and that's because you will have merge conflicts, and sometimes even when you automatically merge, there might still be some problems. Developers
recognize the significance of conflicts. Nobody
would like to do conflict resolution, even if it's
just a merge. Merges are hard because, going back to the old days of SVN, when you actually do an update and merge, you get a lot of squiggly lines: mine, theirs, mine, theirs. You have to go through all of them and try to fix it, right? So people have seen that people
try to race -- this is very old work, like from '95, from Becky Grinter, and what she saw was people used the CM system logs almost like a coordination tool. Like, okay, Tom checked it out one day back, the release is coming up in one week, so he would be at this stage, so I'd better hurry up because I want to check in first. I don't want to do resolution. People also sometimes use another
practice called partial commits, so even though my
thing is not completely done, it's not tested yet, I
want to put part of my code in because at least then
that part of the code will be saved from me having to
do resolution, right? And these kind of informal
practices kind of go against the basic software
engineering principles we have that you should have
tested code that goes into the system. Another thing
is informal communications take place. This was from
a study in an aerospace company, and what they did, they
had really long check-in cycles, that is, almost two
weeks long. So people would check out and keep
working on their code. In two weeks' time they'll be
ready to finish and put it back in the system. That
is a long time when things can change, so what they
would do, they sent an e-mail out: hey, I'm ready to check in this piece of code. This piece of code has
dependencies on these other artifacts. Is there
someone out there who would be affected, you know?
Speak now or forever hold your peace kind of stuff.
And then they have this -- all this e-mail
communication back and forth outside of the CM
system. Once they knew what was happening, who is
going to be affected, and how changes will actually
be integrated, then they will go ahead and put it in
the system. And this is something I found when we were doing a class project with my workspace awareness tool. We were like, nobody had any conflicts. It was amazing. All these people working on the project, and nobody had a conflict, and then I got the
e-mail logs and there was a whole bunch of actual
informal communication and coordination going on
outside of the configuration management system. And
they only had -- in this case, the students had only one person who could check things in, so he was the manager. He would get all the code, make sure
everything worked fine before checking it in.
Resolution is time consuming. A recent study by Brun et al. and my work has shown in open source projects that resolution takes hours to days to happen. One thing I have an intuition about and
has been a little bit talked about in the literature
is everybody knows about these merge conflicts and, you know, that is a coordination problem, but not many people recognize that indirect conflicts like test failures or build failures are also actually caused by coordination problems. They think that's regular daily life. You have nightly builds and you have all this process and setup to overcome this problem. Nobody thinks that if we could actually coordinate better, if we can understand these dependencies, maybe this whole class of problems would not even be here. That's what I want to focus more on. In my study, we looked at four projects, which were also the four projects that Udi [phonetic] and Escalise [phonetic] had looked at, so we wanted to see, okay, do conflicts occur, how big is the problem, right? That was
before I go trying to solve this. So if you look at these projects -- all these projects were in Git. So we looked at the number of merges there were, and the way we looked at the Git tree structure was: if there was a merge, that is, two branches actually merged together, and it was a clean merge, then we would run a build script, and these projects had build and test scripts. If the build passed, then we would run tests on them. So we could measure how good the quality of the code was. And when
we look at this, you see the total number of
conflicts ranged from 40 percent to 54 percent.
That's quite a large number of merges that had some
sort of problem, right? Further looking at breaking
these conflicts into merge, build or test, we see
that it ranges very differently for different
projects. Some had 14 percent, some had only 18 percent. Then we also wanted to look at the time
it takes to resolve these problems, and this was
surprising to me because I thought merges would be
the easiest to solve. In the case of Perl, it took an average of 23 days and a median of ten days to
resolve this merge conflict. Caveat: these are open source projects. When there was a problem, it does not mean that they actually went and, you know, resolved it then and there. This is how long the merge or the problem existed. Going through, it seems like build failures are the easiest for Perl to manage. Tests and merges are difficult. And
different projects show different profiles, but
take-home message is you have merge conflicts: eight to 19 percent of all the merges would have problems. Builds off the clean merges range from two to 15 percent. If you look at the test conflicts, up to 35 percent. So there's a lot of problems out there.
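As a rough sketch of the analysis described above -- classify each merge by attempting the textual merge first, then the build, then the tests, stopping at the first failure -- here is a minimal version; the function and the sample data are illustrative, not the study's actual scripts:

```python
def classify_merge(merged_cleanly, build_passed=None, tests_passed=None):
    """Classify a merge the way the study does: textual merge first,
    then build, then tests, stopping at the first failure."""
    if not merged_cleanly:
        return "merge conflict"
    if build_passed is False:
        return "build conflict"
    if tests_passed is False:
        return "test conflict"
    return "clean"

def conflict_rate(outcomes):
    """Fraction of merges with any kind of conflict."""
    conflicted = sum(1 for o in outcomes if o != "clean")
    return conflicted / len(outcomes)

# Hypothetical sample of five merges:
sample = [
    classify_merge(False),                                    # textual conflict
    classify_merge(True, build_passed=False),                 # build breaks
    classify_merge(True, build_passed=True, tests_passed=False),
    classify_merge(True, build_passed=True, tests_passed=True),
    classify_merge(True, build_passed=True, tests_passed=True),
]
print(conflict_rate(sample))  # 3 of 5 merges have some conflict -> 0.6
```

In the real study these booleans would come from replaying each merge commit in the Git history and running the project's own build and test scripts.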
>>: So I -- merge and build conflicts and test conflicts occur among teams where multiple people are working together?
>> Anita Sarma:
Yes.
>>: And increasingly there's lots of pieces of
software that are individual developers or small
numbers, I'm thinking of like apps, for instance.
How do you see -- certainly there are socio-technical
issues there, but they're entirely different than
these type of conflicts, I think. So what -- how do
you see sort of the, in general, like software
development, do you see these types of conflicts
growing?
>> Anita Sarma: It depends on the process. So if
you are doing Agile, it's less time that people have
to make changes, but if you are talking about
individual pieces are being developed by individual
people, like in a branch?
>>: No, I don't even mean that. I mean some 16-year-old is developing a game for iOS and he
doesn't have -- he's the only person on the project,
right? So he doesn't have these type of conflicts.
>> Anita Sarma: So then exactly these merge conflicts and build failures would not exist. Where he would have problems is when he tries to have the game work on a particular operating system. That's when things will change. So this is really, if he had
worked from version one and now it's version two,
forward compatibility versus backward compatibility
and then he'll have to fix it. So I think these kinds of problems are based on a team setting. If it is an individual, the problem is the dependency
that you have with one API, multiple APIs, and
keeping up to date about which API you have built on
and how far it has moved, so that's where I think it
will go.
>>: Right. So I guess I'm asking you to speculate about API conflicts versus this type of conflicts: in the future, which are we going to see more of?
>> Anita Sarma: I think these kinds of conflicts are still going to stay. If you have individual projects -- individual people building stuff -- I think the APIs
will not be as big a problem because there should be
backward compatibility in the APIs and they are
slower moving than the semi-structured APIs within a
team. So yes, they will stay, but it will not be as
bad as these. And the way you resolve it will be
different because you really do not have any control
over what this other API is moving towards and where
it's going because you are kind of a consumer. You
are a client for that API. So unlike -- until you
have like a Facebook kind of application, which is
very big, you probably will not be able to have an
impact on the API development, how fast they're
moving. Yes, sir? All right. So as part of my work, what I want to do is, as I said -- the reason these problems occur is because there are some kinds of dependencies -- ask: can individuals visualize what these dependencies are, understand who is interdependent with their own work, and then can we help them coordinate the tasks? So
first I'll start with visualizing the software
dependencies. So we built a tool called Tesseract,
which is a multifaceted way of exploring your own
project. What it does, it has an environment that
correlates and understands relationships across
different entities. In any particular software
development, you have the different silos in which
your data exists. You have the code versioning system where all your code exists, the bug tracking system where issues and bugs are. Then you have
e-mail communication that's going on the side. But
anytime you want to look into this data, you have to
go into one database, look into it, then look at the
other part and try to decipher the links or
relationships between them. So what this tool tries
to do is say: can we explicitly state what these relationships are and how they change over time?
So the tool -- this is on open source data, Rhythmbox in GNOME -- so what this does, it has four panes. On
the top pane you can choose a particular project, and
for that project it then shows the activity level of
that project. On the top part over here you see the
blue lines, and those are the code commits that
happen over the course of time. And then the bottom
is the green lines, which is communication, how much
communication happened. And in this case in GNOME,
we had the e-mail exchange from the mailing list, we
had any kind of comments that were taking place in
the bug tracking database, Bugzilla, and anytime there were any patches. So we're assuming that if I
submitted a patch to Andy, Andy looked at it, he
probably knows what I was trying to communicate, and
if Chris comments on that particular bug, he probably
read what I had written. So that kind of shows the
communication, and this is all three of us are
communicating with each other. On the left pane it
shows the file relationship graph, which shows how files
are connected with each other, so this is open source
data. What we used is: if two files had been committed together, we call it a logical commit, or co-committed. There must be some kind of logical dependency between these files because of which
they are committed together. On the right-hand side
here we show the social network of the people -- who's communicating with whom -- and the
bottom part is the bug database, showing the number of bugs that were open for this period of time. If I look more into this communication network, what we
have tried to understand is congruence, and
congruence really means if -- what is the fit between
the people who have to communicate and who are
actually communicating, right? So in this case if
Chris, Gina, and Andy are working on some project together, there are some dependencies, right? So what
we are saying, if they have checked in some files
together, there is dependency between the files,
there's dependency between these people, so we say
there is a need to speak. And the green network here shows the people who are actually communicating with each other. Then we do a mapping between the [indiscernible] to see: if they had a need to communicate and were communicating, that's green. If they had a need to communicate but they were not communicating, that's red. And then there's this gray line, which means we didn't see any technical need for them to communicate, but they are still communicating for some reason, right? So we call this the congruence. And the thickness of the line shows how many times they had communicated with each other. It's just edge weight. So going back to the
picture, so I had that big cluster of files, and I say there are too many for me to understand, so let me filter it. So over here I've filtered it to only show me edges that have been co-committed five or
more times. Then I get this network. And I say,
okay, this particular node over here, shell RC, is
kind of central, so I want to know who has been
working on that. So if I click on it, in this
communication network, it highlights the people who
had ever worked on this file. And then I can see,
oh, these people are actually communicating with each
other, so maybe whatever changes had been done they
actually knew about.
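The congruence idea described earlier -- derive a "need to communicate" network from files people co-commit, compare it against who actually communicates, and color each edge green, red, or gray -- can be sketched like this; the developer names and edge sets are made up for illustration:

```python
from itertools import combinations

def need_to_communicate(commits):
    """Derive 'should talk' pairs: any two people who have committed
    to the same file are technically dependent on each other."""
    authors_by_file = {}
    for author, files in commits:
        for f in files:
            authors_by_file.setdefault(f, set()).add(author)
    need = set()
    for authors in authors_by_file.values():
        for a, b in combinations(sorted(authors), 2):
            need.add(frozenset((a, b)))
    return need

def congruence_edges(need, actual):
    """Color each edge: green = needed and talking,
    red = needed but silent, gray = talking with no technical need."""
    colors = {}
    for edge in need | actual:
        if edge in need and edge in actual:
            colors[edge] = "green"
        elif edge in need:
            colors[edge] = "red"
        else:
            colors[edge] = "gray"
    return colors

# Hypothetical data: (author, files touched in one commit)
commits = [("chris", ["shell.rc"]), ("gina", ["shell.rc"]), ("andy", ["ui.c"])]
# Hypothetical 'actually communicating' edges from mail/bug archives:
actual = {frozenset(("chris", "gina")), frozenset(("gina", "andy"))}
print(congruence_edges(need_to_communicate(commits), actual))
```

Here chris and gina co-committed `shell.rc` and also talked, so their edge comes out green; gina and andy talked with no co-commit, so their edge is gray. Edge weight in the tool would just be the count of communications per pair.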
>>: What's the significance of the edge lines in the
file graph?
>> Anita Sarma: This is just a force-directed layout, so if you go out farther, it just tries to push them apart. So if they have been connected
multiple times, they'll be closer. It's just a
layout picture. So if they are farther away, it's
just trying to, like, keep the graphs as far as
possible.
>>:
Okay.
>> Anita Sarma: So here you have them closer because
there are multiple connections going on.
>>:
Okay.
>> Anita Sarma: All right. So another thing I could be interested in knowing is: all right, here is my bug list. If I click on a particular bug, this tool tells me the people who had worked on this bug are these two people, and for this particular bug, these files had been changed. So
what Tesseract does is it allows you to go from one
perspective of your project, say my bug reports, to
kind of my developer history to kind of my project
history. So it allows you to explore how things are
connected and give you more idea of what these
relationships are and what they might mean. When we did user studies as well as interviews with GNOME developers, what we found, especially for
GNOME was, people who were seasoned developers who
had been doing this project for, like, five or six
years, they said I know this network. I am perfectly
aware of which files changed with what. I know who's
working on what. Then we asked, like, so how do you know that? And this particular person would spend every morning at least three to four hours going through the entire mailing list, kind of seeing what's the status of the project. He was that into the
project, right? But he would say it would be really useful for on-boarding or expert finding, for people who do not care about the project that much. So his view was like, if you really care about this project, you're an open source person, you should know this. You should have this mental model in your head. If you are not that serious or if you're new, this tool will be really helpful. A couple of them
were managers, and the thing that they picked on was
like, I love when I see a red line, right? Because I
know these two people should talk, but they are not
talking, so I should be able to facilitate them.
Another thing that came out was what we have in this
data is only things that are archived. So even if
I'm not talking with someone over e-mail and maybe
over bug tracker, it might be I'm sitting right next
to the person and I talk to them all the time. So
one of the tool features they wanted was like,
yes, red line, but I want to click it, add it to
green because I now have communicated with that
person, or as the manager, I know these two people
talk to each other during the meeting. Another interesting thing was we had brokers in the communication network -- not in this picture, but we
will see that A and B had a red line between them,
but there was this other person, C, that they
communicated with. So we would say, like, you
know, even though they're not talking directly to
each other, there is a manager, there's some
facilitator through which information is passing from
one node to another node. Of course, the success of
this project really depends on the existence of links.
Oftentimes you don't find that in open source data,
at least, so if you have the comments that have
happened, the files that have changed, how do you
link that to the bugs or how do you link that to any
of the patches that have happened? So that depends
on the links, and Chris Word [phonetic] over here had this tool that allowed you to actually decipher the links, or add links between the comment and the bug, so it would be really useful in a tool like this. So going on to
the next set of tools -- the two tools that I'll talk about are basically about how to support coordination in distributed development. So as I
said earlier, we have this complicated interrelated
piece of code that we need to work on. We all work
on in our own private workspaces, so the idea of
workspace awareness was can we monitor these private
workspaces. As people are going about their everyday work, can we know what files they're
changing, watch the dependencies that they're
changing, to identify these potential conflicts -- the merge conflicts and the build conflicts that we talked about, which are caused by dependency violations -- and can we notify developers about this as
things are happening, as people are still making
changes, so instead of waiting till they have made
their changes and committing it, can we move forward
in time and say, as you're making changes, here, this
might be a problem and you want to talk to them. And
that was the Palantir tool which was the Ph.D.
project that Andy was talking about, so what we did
was we wanted to make it really lightweight because
the more stuff you have in your development editor,
it distracts you, it interrupts you. So what we
tried to do was we just took over the Package
Explorer view in Eclipse and made these really small little icons, right? So if you look at the blue bar,
blue icon over here, what we said was you are working
on this particular file and someone else right now is
also working on this particular file. So there was
an option where you could have this on for all the
files you had in a workspace, or you could have this
view only for the files that are dirty, that is, files you have changed since you checked out.
And what we did was, let's say someone else is doing it, we also wanted to show what is the severity, how big is the change that you will have to deal with, and we did that as the percentage of lines of code that have changed over the total lines of code for the file. And the other thing was we wanted to show how this
information could be percolated up the directory structure, because it's often possible in a large project you have multiple projects or packages that are collapsed, so we wanted to show, even while you are in the top view, whether anything in any of the underlying packages or directories had changed. And
for that what we did was a very simple directory severity calculation, in which we said: how many artifacts does this project or package have; out of those, how many have been touched by somebody else; and we just aggregate that, put it on the directory, and keep going upwards, right? So if your directory had two files and one was changed, the percentage change for the directory will be 50 percent. If that belonged to another directory which had two other directories, it will become 25 percent. It just keeps propagating upwards and decaying a little bit.
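The directory severity aggregation she describes (touched artifacts over total artifacts, percolated up the tree) could look roughly like this; the directory layout and file names are invented to mirror her two-file example:

```python
def directory_severity(tree, touched):
    """Return (touched_count, total_count) for a directory tree.
    `tree` maps a name to either None (a file) or a nested dict
    (a subdirectory). Severity = touched / total, so it naturally
    decays as it percolates up through larger directories."""
    hit = total = 0
    for name, child in tree.items():
        if child is None:  # a file
            total += 1
            if name in touched:
                hit += 1
        else:              # a subdirectory: aggregate its counts
            h, t = directory_severity(child, touched)
            hit, total = hit + h, total + t
    return hit, total

# Hypothetical layout: directory A has two files, one touched
# remotely, so A shows 50%; the parent holds A plus a two-file
# directory B with no remote changes, so the parent shows 25%.
tree = {"A": {"f1.c": None, "f2.c": None},
        "B": {"g1.c": None, "g2.c": None}}
hit, total = directory_severity(tree["A"], touched={"f1.c"})
print(hit / total)  # 0.5 for directory A alone
hit, total = directory_severity(tree, touched={"f1.c"})
print(hit / total)  # 0.25 at the parent level
```

Returning the raw counts rather than the ratio is what makes the aggregation compose: each parent just sums its children's counts before dividing.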
The other thing we wanted to do: merge conflicts are still the easier ones to identify. Can we in some cases find these build failures or these other indirect conflicts that are more difficult to find?
So here we have these little red icons, and what we wanted to say is what's happening here: the change is in Address, with whatever its severity is, but changes over there are causing some kind of impact on your system, so that's what the "I" for impact is calling out. And here you're working on CreditCard -- you have this little arrow here which shows you're working on CreditCard -- and the impact is coming to you. So what we did was cross-workspace impact analysis. It was very simple
analysis, very rough analysis at this point. For each workspace, we made a lookup table that said, from the call graph, which other files and which other methods this particular file depends on, and if in a remote workspace some method or some file was changed, you'd get that notification. We just do a simple lookup: if you are using this method and this method has been changed, there might be an impact. And this is right now at the signature level, so if any actual behavior changed, we would not have found out; if anything in the signature changed, we would have found out. And we get more
information on this little view, and this was my
attempts at drawing bombs. So a red bomb would mean
that someone has changed a method that we're
depending on. In this case, someone has deleted this
particular method, getName. Ellen's change is
impacting Pete's work, and the red means that the
changes have been committed, so this is definitely
going to be a problem. A yellow bomb meant something
is happening in the workspace, so this might be a
problem, but we don't know. They might actually
merge this change back. And then green was for
outgoing impact: you have changed something in your
own code and you want to find out what else your
change will impact in the project. The other interesting
thing was this exclamation point. As I said, what we
had done was a very simple call graph signature level
analysis at this point, but what we wanted to say
was, hey, in this particular case your credit card
was depending on payment, and something new has been
added -- for example, an init method. The system was
not smart enough to know that a lot of the
initialization code had been moved to this new
method, so the behavior had changed. We said that's a lot of
complicated analysis. The user usually has a lot
better understanding of the domain knowledge, has a
lot better understanding of what the systems are,
what people are doing, so we wanted to say by
exclamation point, like, a new method has been added,
what's the details of this new method. So as a user,
if I know payment initialization is important to me,
I will go and check what's happened or talk to Ellen
to see what she's doing. So we tried to offload some
of the more complicated computation part to the
developers, because they would have a better idea.
So when we looked at the results, we saw conflicts
were detected pretty well as they emerged, as long as
they were at the syntactic level. We found
developers undertake action upon noticing a potential
conflict, and in our user study, we found a whole
spectrum of people, right? Some people, as soon as a
little bit of conflict information came in, had to go
look at what it was, and there was one particular
user who could not stand having anything in conflict,
so they had to go find out what it was and start an
e-mail or IM conversation to see why it was there.
At the other end of the spectrum were people who just
kept [indiscernible] open really wide and would not
pay attention to the Package Explorer view, so
all the details were lost. So the only time that
these people, group of people would look at the
Package Explorer view or look at the conflict was
when they were starting a new task or when they were
looking at another file because a particular task
might have multiple files so they had to go open the
Package Explorer view, or when they were taking
breaks. One of the tasks on the user study was they
had to write comments. People always wrote comments
after the fact, so they would code everything up and
then they would open the Package Explorer view or
take a break, look around, then write comments, look
around, write comments, right? They didn't like
writing comments, at least the students we had. So
those were the times where people looked at the
conflict icons. Fewer conflicts grew out of hand.
This was -- I don't have details about the experiment
setting, but this was a confederate study, so we had
one person come in and told them they were working in
a group. The other two people could only be
contacted through IM, but these were our
confederates, research helpers, and 15 minutes into a
particular task that would be conflicting, we would
seed the conflict and it was automatically checked
in. So we saw some
people trying to race. They would have this blue
icon coming and said, oh, I need to go fix it before
this other person actually checks it in. So they
tried doing that, racing to finish it. Then they
realized it was already committed, so once they had
faced a conflict and had to resolve it, they became
really particular about avoiding the next one. This
other group of people, who wanted to have no
conflicts, as soon as an icon came in, would contact
the person and say, hey, I'm working on this task,
these are the files I'm changing, what are you
working on, how much time will you take? So they
were trying to synchronize and manage through
the IM. So the resulting code was of higher quality
in the sense that fewer conflicts were left in the
code base; this was based on the four conflicts we
had seeded in the eight-task experiment. There
was a penalty, because a lot of people were
communicating over IM and it takes time to
communicate, so the control group people were faster
than the experimental group, but they had more
conflicts left in the code base once we said, okay,
now, the experiment is done, here is all the
conflicts or dependency problems that could have
been. One of the faults in the experiment -- which I
didn't realize at that point -- was that we did not
make the control group resolve all the build
failures. We should have taken the time. We didn't do
that. Anyway, so that was fun. And there are a lot
of other workspace awareness tools that have come
after Palantir, done better jobs in UI and better
jobs in the kind of analysis being done. This is
FastDash, which was from Microsoft, so Mary's group.
And what they did was show all the files and all the
people working on them in an Agile setting, so they
will say for a particular file what its status is.
If a file is being worked on by two people, they show
the names of the people there, and indicate this is a
problem space,
you might want to look into that. This was by Uri
and his group. This one is based on Git. Git works
with all kinds of branches, so everybody has their
own branch, and sometimes people do local commits,
smaller commits, before they actually push the
changes back into the master repository. So they had
a shadow repository that pulled in all the local
commits, and they would be able to say when there is
a possibility of a problem, a build failure or test
failure, by actually running the build and test
scripts on this shadow branch. All right. So a few of
the limitations of Palantir and these other workspace
awareness tools: the way the approach works,
conflicts are only identified after they occur,
right? So as I'm making changes, as I'm doing stuff,
these monitoring approaches tell me here is the file
that is going to have a merge conflict, here is a
dependency that has been violated, and the more
changes that have happened, the more time it will
take to resolve. The impact analysis is usually
coarse-grained at this point. Because of these
notifications coming in, there is the risk of
information overload or interruption, which we saw in
the Palantir study. And there are opportunities for
improving or extending this kind of analysis to a
larger unit, like tasks, or like the scenarios and
features that Microsoft uses; the problem is that all
the analysis is done at the file or directory level,
so bringing it up to a higher logical unit like tasks
is kind of difficult in this setting. So what I
wanted to do next was, can we
go even further forward in time? Can we be proactive? Can
we find these dependency problems, or find when
changes are going to cause issues, before people
have even started doing the changes. Can we look at
the tasks that they are going to work and figure out
the dependencies at that point. So then we can also
give them the solutions at the task level. And one
of the things I wanted to look at was would this
avoid individualistic solutions, right? In Palantir,
when we had these blue icons or these red icons,
people would race in, or do partial commits, or talk
to the other person and decide, I want to put my code
in first, right? So it was a very individualistic
setting. Can we look at it so that we avoid the race
conditions and these kinds of individual strategies,
and see what is good for the
entire team. Maybe it is okay for Chris and Andy to
have some conflicts here because they are working on
a code that doesn't need to go for this release, but
maybe Gina and Tom should not be affected, right? So
there might be based on the team policies different
strategies that this could work. So we built
Cassandra. The palantír was the seeing-stone from
Lord of the Rings that lets you see what's happening
far away. So we wanted to say, can we go to
Cassandra, right? Can we predict, even before tasks
have been started, what future problems could occur?
That's the Greek mythology reference: Cassandra could
foresee the future. So we can minimize the
conflicts that can arise from the individuals working
in the workspaces. So this is an approach for
Cassandra, so first we are assuming a workflow where
this is more akin to the workflow that might happen
in open source. So I come in in the morning. I say,
okay, these are the bugs that I have in my inbox by
pulling it from Bugzilla or something. I might want
to order the task based on my preferences or any kind
of priority that we have. After that, what needs to
be done is kind of identify the files that's going to
be changed for the particular task, and once we have
the files that's going to be changed, what are the
dependent files and what's the files that can be
impacted. And then we have to analyze these tasks to
understand the dependencies so that we can understand
the conflicts, and then we formalize those
constraints into hard constraints or soft constraints
to talk about in a few minutes and then evaluate them
to find which tasks can be independently created, and
we use Microsoft Z3 actually to do the constraint
evaluation here. So let's give an example of a
constraint example. So here we have, say,
shape.java. We have three other classes that are
inheriting from Shape. If we have a scenario where
we have Alice, who has three tasks, TA1, TA2, TA3,
and Bob with TB1, TB2, TB3. In TA1, Alice is working
on rectangle and shape, and in TB1 Bob is working on
rectangle and square. Look at the dependencies in
this example, right? If Alice and Bob were to do
their first tasks, there's going to be one merge
conflict because they're both modifying the same
class here, rectangle.
They might also be in indirect conflict because
whatever changes Alice might be making to shape might
affect square or triangle that Bob needs to work on.
And there might be some order like of task
precedence, like canvas needs to be built before
panel can be inherited from canvas and be built,
right? So there is some precedence ordering that is
also necessary in tasks. So what we do is work that
into hard constraints; that is, TA2 has to be done
before TA3, there's no way around it. The rest are
soft constraints: TA1 and TB1 can be done in
parallel, but there will be some consequences. So,
looking at how we get
the constraints from the Fe and Fd sets: what we are
using is some kind of data mining. If there's a
feature or bug request that needs to be done, we can
look back in time to find out what other bugs are
similar and what files were changed for those bugs,
right? So we
have a seed set of files that will be changed for
that particular bug. And the user can definitely
refine it by adding more files or removing more
files, going back to how MyLyn does its task context:
usually you will have a task, and you can say I'm
going to change these files, I want to look into
these files, so you have some developer input for
refining. Then we can do a basic analysis -- again
going back to Dependency Finder, looking at the call
graph analysis one level deep -- to kind of
of files. So then you have the Fe and the Fd sets:
if there are two people and their Fe sets intersect,
that means there is going to be a merge conflict,
right? So there is a direct constraint between these
two tasks. If one person's Fe set has some impact on
an Fd set that somebody else is working with, then
there will be an impact, and we call this an indirect
constraint. Right now we are only looking at an Fe
set's impact on an Fd set that is being changed by
someone else. It could be that the impact is
somewhere further downstream in another set of files,
and that needs a little more refined analysis. So
once we have these
constraints, we need to evaluate them. And we're
using Microsoft Z3, so we have these constraints
between all these tasks. We put it into Microsoft
Z3, and there may exist a solution in which two tasks
do not constrain each other, right? So there is a
solution, but we want to try to match it as closely
as possible to the developer's preference or
priority. To do that, we look at the solution Z3
gave us: four, two, three, one is an order which
won't have any problems, but the developer wanted to
do one, two, three, four, so it's very far from their
preference. What we want to do is minimize that
cost. So we find out how far each task is from its
preferred position -- four was three places away, so
we said, okay, three units is the cost there. For
each solution we compute how far it is from the
developer's preference and we make a cost object out
of that.
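The two pieces just described -- deriving direct and indirect constraints from each task's edited-file set (Fe) and dependent-file set (Fd), and scoring a solver-proposed order by its distance from the developer's preference -- can be sketched roughly like this. All task and file names are invented, and the real tool hands the constraints to Z3 rather than looping over sets:

```python
# Hedged sketch, not Cassandra's actual code: constraints from Fe/Fd
# sets, and a preference-distance cost for a proposed task order.

def derive_constraints(tasks):
    """tasks: {task: (Fe, Fd)}, where Fe = files the task will edit
    and Fd = files it depends on. Returns (direct, indirect) sets of
    constrained task pairs."""
    direct, indirect = set(), set()
    names = list(tasks)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            fe_a, fd_a = tasks[a]
            fe_b, fd_b = tasks[b]
            if fe_a & fe_b:                       # same file edited twice
                direct.add(frozenset({a, b}))     # -> merge conflict
            elif (fe_a & fd_b) or (fe_b & fd_a):  # edit hits a dependency
                indirect.add(frozenset({a, b}))   # -> indirect conflict
    return direct, indirect

def preference_cost(proposed, preferred):
    """Total displacement of each task from its preferred position."""
    return sum(abs(proposed.index(t) - preferred.index(t))
               for t in preferred)

tasks = {"TA1": ({"Rectangle.java", "Shape.java"}, set()),
         "TB1": ({"Rectangle.java", "Square.java"}, set()),
         "TB2": ({"Triangle.java"}, {"Shape.java"})}
direct, indirect = derive_constraints(tasks)
print(direct)    # TA1/TB1 both edit Rectangle.java -> merge conflict
print(indirect)  # TA1 edits Shape.java, which TB2 depends on
print(preference_cost([4, 2, 3, 1], [1, 2, 3, 4]))  # 6
```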
And then we put that cost back as a constraint: in
this case there is a displacement of three and one of
two, so at least five places would change, right? We
put that back into the constraint solver and say, try
to get this cost below five, say down to three. We
kind of do a binary search to see if there's any
solution, going back and re-evaluating the constraint
space, and we keep doing that until we find the
cheapest cost we can, because that's the minimal
solution we could get. Then we want to display this
information back to the user and say, this is your
recommended task. For the UI, what we have
is MyLyn, and over here is MyLyn's task view. We
kind of hijack that task view, the task list, and
say, okay, these are the tasks that exist for this
particular person, and one, two, three is the
ordering they were going to do them in. MyLyn does
not allow task reordering, so we had to write a
plug-in that allows it, so the developer can say
implement plot should actually be number one and not
number three; they can move it, and there is this
little tick mark that says, okay, run the constraint
solver now. So we run the constraint solver at two
times. One is when the user says run it now. And
the other part is when someone has checked in their
code: the other people are still working, and the
person who finished asks, what is the next task I
should do? It takes some time for people to check in
their code, right -- writing the commit comments and
so on -- so we use that time to reanalyze the
constraint space: some constraints will be added,
some will be removed based on what other people are
doing and what files you have changed, and we say,
this is the next best task for you. So when we give
this visualization of
what's the next best task for you, we say, this is
the order in which we want you to do things, with
this little exclamation mark that says why we want
you to look at it. And over here it says, you know,
this is your task ID, it conflicts with this other
person, here is the task that is conflicting, and
whether it is a direct or indirect conflict that
could be causing this. So the idea is to say it's not
just like, hey, you have to do task number two, but
saying if you do task number two, these are some of
the problems you will face and seeing if developers
actually reschedule and do the same task or they
actually pick Cassandra's order. What happens if
there is no solution, right? In the first case, when
we ran the constraint space, we found two tasks that
were independent of each other. But what if there's
no solution, if it's an unSAT situation? Then we
have to relax some of the constraints. We cannot
relax any of the hard constraints, because one task's
output is needed as the input for another, right? So
we can relax some of these softer constraints. And the
way we relax soft constraints could be very much
dependent on the team policies. So you could have,
for example, a conflict focus. Initially, when we
started this, we assumed merge conflicts were easier
to solve, so if there is no solution, we could just
break all constraints that cause direct conflicts,
right? But after looking at the data -- we had MyLyn
as well as the Perl data -- in Perl, build failures
are easier to solve than merges are. So again, it
might depend on the project's profile which conflicts
you want or do not want. It could be
team focus. For example, as I said, it might be okay
for two people to have a problem as long as the
majority of the team doesn't have any conflicts. It
could be task focus. We could say these three tasks
are going to be released now, so nothing should ever
impact these tasks. Or it could be these files are
frozen because we're ready for release or these are a
public API so all constraints relating to this
particular file should be left alone. We started
with looking at conflict focus. That was the easiest
to do. So when we have an unSAT model, the basic
approach is to start relaxing: you could relax all
direct conflicts, or one conflict at a time. Z3
right now does not give a minimal unSAT core. It
says, okay, these are all the constraints that could
happen, but it could be that if you relax just one of
the constraints, the solution will work, right? So
there is some optimization that we can do. Right now
for starting we said, okay, first step, let's relax
all the direct conflicts and see how fast and how
much time it takes. The other approach is
empirically guided: you can look at your particular
project's profile and see, builds are easy here, so
maybe I'm okay with removing indirect conflicts. But
in this case you have to be a little more
fine-grained: remove one conflict at a time, run the
constraint solver, see if you get a solution, and
keep doing that.
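A toy version of that relax-and-retry loop might look like this. The tiny two-developer "solver" is a stand-in for Z3, and all task names are invented; a real policy would also choose *which* soft constraint to drop rather than an arbitrary one:

```python
# Hedged sketch of "relax one soft constraint at a time on unSAT".

def solve(dev_tasks, constraints):
    """Stand-in for the Z3 call: find one task per developer that can
    run in parallel under the remaining constraints, else None."""
    (d1, t1s), (d2, t2s) = dev_tasks.items()
    for a in t1s:
        for b in t2s:
            if frozenset({a, b}) not in constraints:
                return {d1: a, d2: b}
    return None                                   # unSAT

def solve_with_relaxation(dev_tasks, hard, soft):
    """On unSAT, drop soft constraints one at a time; hard constraints
    (e.g. task precedence) are never relaxed. Returns the schedule and
    the soft constraints that survived."""
    soft = set(soft)
    while True:
        result = solve(dev_tasks, hard | soft)
        if result is not None or not soft:
            return result, soft
        soft.pop()  # a team policy would pick, e.g., direct conflicts first

tasks = {"Alice": ["TA1"], "Bob": ["TB1"]}
hard = set()
soft = {frozenset({"TA1", "TB1"})}        # a direct-conflict constraint
print(solve_with_relaxation(tasks, hard, soft))
# ({'Alice': 'TA1', 'Bob': 'TB1'}, set()) -- solvable only after relaxing
```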
So there are some of the underlying assumptions here.
We are assuming developers select tasks from a given
set, so there is a set waiting for them at the
beginning of the day or week or month. There
is a task context: we know ahead of time that for
this task, these kinds of resources are going to be
changed. We're doing very coarse-grained analysis
right now. Task
assignments are done at the beginning of the period,
so at the start of the week I know these are my tasks
for this week or this sprint. There's one active task
per developer. It's not that a user has three or
four or five tasks all open because as soon as you're
working on a task, we concretize those constraints,
so we said this person's working on this, nobody can
touch these constraints, right? We also are assuming
tasks are unique across developers and
nontransferable. It's not that someone starts
something and, without checking it in, mails it to
someone else, so that there are two copies of the
task going on at the same time. And another thing is
we're assuming developers
commit their changes at task completion, and this is
important for the approach: initially, when
we are starting there's going to be a lot of noise in
the data because we are kind of looking back in time
to find which files will be changed, we're looking at
this call graph analysis, so there's going to be
overapproximation in the dependencies we have, but
we're assuming as developers are making changes, they
have much more fine-tuned, fine-grained idea of
what's changing in the dependencies, so when I have
completed my first task, the tool has a much better
idea of the constraints that are out there, and can
make a much better prediction for
commit your changes, then there's going to be a lot
of changes, a lot of constraints in the space leading
to unSAT solutions. I'm not going to go into depth
with this because I'm sort of running out of time.
So to evaluate it, just to see if Z3 and constraint
solving would work in an open source setting, we
looked at the four projects we had: Jenkins, Perl,
Voldemort, Storm. We looked at weekly, monthly,
quarterly, and six-month data, and I want you to
notice that in many cases we had both direct and
indirect conflicts -- 12, 34, 49. We ran it. We saw SAT
conditions as well as unSAT conditions. When we had
the unSAT conditions, we relaxed some of the
constraints. And if you look at it, we got
solutions. There was number of conflicts that got
avoided, four out of five, 33 of 35, 36 out of 40.
So we're pretty good at resolving these problems. Of
course, in this case we knew exactly what the changes
and dependencies were; there were no false positives
involved because this was data gathered after the
fact. Okay. Conflicts avoided. Another
thing is time was pretty -- this is in seconds, so
it's pretty fast. When we looked at it, 63 percent
of developer preferences were matched in one project,
23 and 25 percent in others, so we ran it again to
see how close we could get to the developer's
preference. And in this case,
developer's preference was just based on the order in
which the changes were done. And we saw that most
times we got pretty close to developer preferences
matched. Only in one case did it time out; the
timeout was set at three minutes. So the solution works pretty
fast. So to summarize and conclude: what I have
shown you is that there are dependencies, and it's
hard to understand what those dependencies are. Here
is a set of tools that let you explore these
dependencies: Tesseract, which looks at different
project elements, aggregates the different types of
relationships as networks, cross-links them, and lets you look
at it over time. The other set of work I talked
about, the scheduler, is how we can coordinate the
tasks or the different changes that developers are
making in the team context. And one thing is we want
to eliminate seclusion: if you're in a branch or a
private workspace, it's not that you should not know
what's happening around you; the workspace insulates
you, but you should still have some information. We
are going from early detection of conflicts
to scheduling tasks to avoid the conflicts
altogether. And another thing: a lot of what I spoke
about today was very coarse analysis at the call
graph level, but can we use the development context
to do much finer-grained analysis, or scope it, say,
to the
behavioral changes? For example, if you wanted a
semantic analysis like [indiscernible] execution for
the entire code, that is very expensive. Can we use
the development context, like code ownership or who's
active now, to scope this analysis space? Thank you.
The work is
supported by a bunch of NSF grants and an Air Force
grant. I thank my students in the lab, Josh, Corey,
Bakhtiar, Sandeep and Rafael who did a lot of the
work, most of the work, which was after I graduated.
Thank you. Questions? [applause]
>>: We have time for questions.
>>: So the question is, are people willing to put in
that upfront work to save, like, later costs?
>> Anita Sarma: That's something I need to look at
in a user study. One of the things, I think, to help
them understand that -- that's why we did the past
study, to show how much time it takes to get a
solution; the solution time is low. Otherwise they
would not put in the upfront time, right? There was
a study done by MyLyn, so they were eating their own
dog food, and they found that -- because what MyLyn
allows is putting in the degree-of-interest
information that tells you which of the files might
be changed -- people wanted that information badly
enough that they actually put in the extra time to
say, I will be changing this, this, and this file.
So in MyLyn it worked, but it was a research
group. So it remains to be seen how it's going to
work. Going back to the Palantir study, which was
less upfront work: the first time they saw a
conflict, they wanted to just race ahead and finish,
but the next time they found a problem, before going
ahead and finishing their task they always started
the communication -- I'm going to be changing these
things, what are you changing? So people are very
conflict averse in the settings I have seen. So if
this would help resolve, or not even have, these
conflicts, I believe they would use it, but it
depends on the development context and team policies.
>>: [inaudible]
>>: The cultural thing.
>>: More questions?
>>: I can ask one. So in a way, the conflict
avoidance is kind of like pessimistic concurrency in
version control, where if somebody has that file
checked out and locked, you can't check it out, you
can't touch it, which brought to mind the idea of
offline work. So one of the challenges: if you're not
connected to your source control system and
something's locked, you can't check anything out, you
can't do any work until you connect. So this
eliminates essentially parallel development or the
ability for parallel development if you're avoiding
these conflicts, and so that was just sort of an
observation. But the question I had around this was
in the studies you did of those open source projects,
the four that you put up there, how big were the
conflicts? Were they multi-person? Were they more
than two people usually? Or like if you needed to
coordinate work between -- to avoid a conflict, was
that just coordinating between two people's tasks or
was it like 15 people needed to have all their work
coordinated?
>> Anita Sarma: These are open source, so they were
smaller. And the way we defined a conflict was when
two branches were merged, so by definition a problem
in this case involved two people, because there were
two branches being merged. So it's always two
people. But we are doing a kind of
branch study, and at some point in Voldemort we had
15 branches that were active, and I think there were
five or six unique developers. So because it's Git,
everybody has multiple branches that's going on. So
I think the worst case we have seen is in these
projects, six people working simultaneously, but the
problems that I have shown is by definition two
people. So if you broaden it out, at that point, if
all the branches were to come in, there would have
been five people you had to coordinate.
Does that answer your question?
>>: It does. It has a follow-up, but I'll let
Chris --
>>: It's kind of a separate --
>>: Oh, just --
>>: I'll let you ask your question.
>>: The follow-up was so you put in a lot of work
into this SAT system to try to automatically schedule
things. If it's potentially just two people, or
let's say it's a small number of people, could you
get away with just a visualization -- like the
Palantir visualization -- showing, here are the tasks
that might overlap with this person or that person, and if
it's not too complex to see, maybe the people can
just figure it out for themselves which ones to
choose.
>> Anita Sarma: That's possible. So if it is -- so
we are going for the hard-core case where we have 15,
20 people working together, but if it's a smaller
graph of people and relationships, we could just
leave it as a visualization and let them choose. The
only problem would be that it might become
combinatorial, depending on how many tasks each
person has.
>>: Sure.
>> Anita Sarma: So that's where having some
automated tool would help. Z3 was pretty easy; we
plugged it in and it does its magic behind the
scenes, so --
>>: You have been looking at this kind of awareness
and conflicts so you may not have called it like back
in Palantir, whatever. What's -- what is your
feeling about, like, the type of granularity that
people should be looking at when you talk about
conflicts? So in one of your tools you're looking at
changes to individual signatures on methods; I've
seen things at the level of files or components. What are
your thoughts on how one should think about this, or
what level of granularity we should look at these
things at?
>> Anita Sarma: It's a difficult question. It's a
good question, right? So --
>>: I don't ask simple questions.
>> Anita Sarma: The granularity of the changes has
two issues. One is the human level: the capability
of understanding, I am working on this particular
file, I know it's going to affect these other files,
which is much easier than reasoning about another
team or another component. But if you go to a
coarser granularity of changes, doing any kind of
impact analysis gets more complicated, with a lot
more possibility of false positives. So if you're looking
at the component change, you have a lot of changes
and it could be all these changes have a lot of
effects, and if you're overapproximating, you'll have
everything as being affected.
>>: Even if it's deadlocked for no real reason,
right?
>> Anita Sarma: Yes, you can have deadlocks, because
even if you look at control flow graphs, right, most
of them get overapproximated. If you want to be
sound, that is, if you want to catch all the
problems, then it gets overapproximated and
everything gets affected, so you have one big lump.
If you go too fine-grained, the analysis is easier:
the smaller the change sets, the easier they are to
understand. So one way of
thinking about this, I would say, depends on the
user's needs -- it's completely driven by the
client's needs, right? So as a manager I might want
to say, tell me all the components that have a problem,
but I think underlying that we should have for each
of these components these files, these methods, this
line was changed, and from there do the impact
analysis and aggregate the results back to a form
that users would like to see. So for the technical
part, the smaller the change, the better it is, the
lower granularity. For the people part, you need
some kind of aggregating and moving things up.
Otherwise, it's too much information, which is
difficult. And that's one of the reasons I'm
moving away from the Palantir stuff, which was all
file based and we're trying to get to a more high
level task-based understanding.
>>: For -- to address Gina's question about
requiring people to put in a lot of metadata about
what each task might touch, could you use some sort
of requirements traceability analysis to take a task
and automatically figure out what files or functions
it's going to be touching, at least as a first
approximation?
>> Anita Sarma: Yes, you could. I was not going
toward requirements traceability; I was looking more
at the data mining part, but that would only work for
past bugs. So if there is a bug request, we would
find similar past bugs and then seed the set with the
files that changed for those similar bugs, and the
user can refine that. But if it's a new feature,
then we could do some kind of requirements
traceability to say, these are the possible files you
might want to touch, and then they can refine it. So
that's the goal. I think starting with a blank
slate, nobody would do that, but as we were
discussing, if there's a false positive -- you say
for this task these five files need to be changed --
I think a user will be more apt to say, no, not this
file, but this file. Correcting something might be
easier for the user than working from scratch. So
that's something we want to look at.
>>: More questions? All right. Well, let's thank
Anita for her talk.
>> Anita Sarma: Thank you.
[applause]