>> Andy Begel: Hi, everybody. Welcome to Anita Sarma's talk. I'm Andy Begel, a researcher in the VIBE group, and Anita is a friend of mine who has had a nice long history in software engineering. She did a Ph.D. at UC Irvine with André van der Hoek. She's the author of the famous Palantir conflict avoidance tool, which when I saw it, it was like, damn it, why did she invent that before I did, because that was a really awesome idea. After she finished her Ph.D. dealing with coordination issues for software developers, she went off to a post-doc at Carnegie Mellon, worked with Jim Herbsleb, Marcelo Cataldo, a bunch of people there doing kind of neat things in socio-technical congruence, about trying to match up software teams and the work that they do together and trying to keep that in sync, and now she's been a professor for the last five years? >> Anita Sarma: Four. >> Andy Begel: Four years at the University of Nebraska in Lincoln, Nebraska, which is where my father-in-law works, so that's kind of exciting, too, and she has been working on a lot of things, continuing in conflict avoidance for software engineers and trying to help software engineers generally improve their coordination, and she's going to be giving a talk today about distributed software development and how to deal with coordination there. So I give you Anita. >> Anita Sarma: Thank you. So that was a nice introduction. Thank you. So right now I'm at UNL and our research group is called the ESQuaReD Lab, Empirically-based Software Quality Research and Development, so we don't just build tools, we try to make sure that the tools are useful, and that's one of the reasons why I'm here. We have had this amazing idea of building a new tool, and then after talking with Andy, suddenly realizing, what is the context in which this is actually going to be used? So one of the things I'm trying to do here is talk to developers to see whether the kinds of problems I'm trying to solve actually exist, and if not, what kinds of problems they actually have.
So a brief introduction about what I do. I have three strands of research. The first of them is empirically based: understanding what's really happening in these communities that are actually doing some kind of software development or any programming-related tasks. So, understanding online communities, because they are easy to get information from: open source development, or programming-based Q&A sites like Stack Overflow. One of the questions in that area I'm interested in knowing is what motivates them to contribute, and now we don't even have a single project anymore. Like if you are doing your work, you realize you need some other project, you need people's expertise from some other project maybe, so it's almost like an ecosystem. There's no monolithic project anymore. No project is an island, right? We have an ecosystem. People have not yet looked into what happens in an ecosystem: can I learn the social norms and technical knowledge from one project and transfer them to another project? So that little chunk of my work is looking at these online communities, understanding how people migrate from one project to another, whether knowledge is transferable, whether there are specialized roles. Another part is end user software engineering. A lot of people use Excel, and there's a whole bunch of people out there who do some kind of programming, not because that's their job, but because they need to do that to get their work done, right? So Excel, Web mashups. Over here I'm interested in understanding what kinds of principles and techniques we can actually take from software development to help this class of users, looking at how people do debugging in end user software. People don't just write from scratch.
They pick some example from one place, pick another example from another place, try to glue them together and see if it works, but when that happens, you don't know the kind of examples that you picked, and when you put them together, where's the problem? In the glue code? In the first piece of code you got? In the second piece of code you got? You changed it a little bit, but, you know, you don't know what's really happening. So can we help these end users debug, and look at how they even go out and look for examples, how they are trying to look for problems, using information foraging theory to look at the foraging behavior. The other strand, the one that I started with in my Ph.D. project and am still continuing, is how can we support coordination in teams. And across these strands, one of the things that I already talked about: you want to look at the state of practice that exists, try to understand the theories behind what's happening, and based on those theories and insights, build tools, and then evaluate them, and that kind of encompasses my research directions. Today I'm going to talk mainly about supporting coordination in teams, starting with software development. This is a real nice, you know, jigsaw puzzle that's out there, and that's really where software development is. We have these different pieces of code that have been built by different people. There's no more one person building the whole project and all the code, right? We have to divide labor, we build things together, and what usually happens is, you know, these little spaces you have are almost like workspaces. You take from the main repository or [indiscernible] where the main archived code is, you take it out, you work on it. Once you are done, you try to put it back into the system, and hopefully all these interfaces line up. While you are making changes, nobody has changed any of these other interfaces, right?
So you're hoping that as I work and I put things back in, it will be an amazing piece of technology and it will work. What happens in real life? Here are the dependencies among packages in the Perl language. It's a pretty simple language, right? But if you see this picture over here, the blue dots are packages, and what you're really seeing is a whole bunch of spaghetti code. There are calls from one package to another package, calls from a file in a package to another file in another package. What that leads to is social dependencies among people, because if I'm dependent on, say, Chris, for his file, I have to coordinate with him, find out what he's changing. If he's mucking around with his code, that might impact me. So if we look at the social dependencies among these people, this is what you get. And to make matters worse, these dependencies are not static. They are changing over time because everybody is evolving the code. So here's the same Perl language. In this case you have the vertical lines, which mean a particular version or revision, and between revisions you can see how much the code has changed, right? So some versions don't have as many changes, just a little bit. Some have a lot of change. It might be because of a new release, or because it was an important part of the project, right? So the take-home over here is there are dependencies, people depend on each other, and it's a moving target. Things continuously change. There's been a whole set of studies done to understand how developers work in these kinds of settings, and more specifically looking at what kinds of questions developers have to ask. Some of them are like: who do I go to for help? I just started on the project. I need to work on this piece of code. Who has the expertise? Who can help me out? This is especially true if you start a new project, right? You have to figure out who has the expertise.
Who should be assigned to this task? Who has the right expertise, who can get this done in the shortest amount of time? Sometimes bugs keep getting reassigned because you assign it to someone, they don't have the expertise, then they assign it to someone else. So you call these hot potato bugs. They keep getting tossed from one person to another person. Which tasks need to be completed before the others? There are dependencies. If I am depending on some release or some code to be done before I can get my work done, I'm getting blocked on this other person finishing their work, right? So how do we manage or interleave these tasks? The other two questions are -- oops -- which other artifacts are affected by my change? I'm here making changes to my code. Who else might be affected because of that? On the other side, as I'm working -- I'm working on this really important topic -- who else is working in this space in my team that can affect my work, right? So these are the questions you have to ask, because as I'm working in my space, I need to know what's the impact of other people's work and what's the impact of my work on others. So a bunch of the questions are about impact, and a breakdown in coordination can lead to two kinds of conflicts, I'd say. First is a direct conflict, which is: I'm working on this piece of code, someone else works on the same piece of code, and when we're ready to synchronize, we cannot do that. Where we have configuration management systems like Subversion or Git, usually what this translates to is a merge conflict. You have changed the file and he goes in first; he's happy because he gets to check in. I go to check in second and whoop, I have a merge conflict, and then I have to kind of see how to merge them together.
It might be that it's the same file but the changes are in different places, so I can merge them pretty easily, but sometimes it could be that there is some dependency that goes from one block to another block, so what happens then? I stick it in, we run the build or we run the tests, and then something still doesn't work. Or another case: I think things look fine, I build fine, everything is fine, my unit tests go fine, I stick it into the configuration management system because all the files are fine, but then something fails, the build or a test, and that might be because something else depended on an API or some behavior got changed on the side, and when we actually ran the build and tests, that's when things failed. So these are undesirable consequences, and the literature has studies in different kinds of projects -- some pretty large commercial projects, some were telecom projects, some were at NASA, some were smaller projects. A lot of studies have been done that show this is a problem that keeps happening. Work is frequently restructured, because what you would want is all these little pieces to work, and these little pieces should be given to separate teams with good APIs. But as you keep working, things get changed, right? So you have to restructure work to bring back the modularity. Parallel work does tend to lower the quality of the code, and that's because you will have merge conflicts, and sometimes even when you automatically merge, there might still be some problems. Developers recognize the significance of conflicts. Nobody likes to do conflict resolution, even if it's just a merge. Merges are hard because, going back to the old days of SVN, when you actually do an update and merge, you get a lot of squiggly lines: me, them, me, them. You have to go through all of them and try to fix it, right?
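Those "me, them" squiggly lines are the textual conflict markers the CM system leaves in the file when it cannot merge two edits automatically. A minimal illustration (the file content and labels here are made up, but the marker format is the one SVN and Git both use):

```text
<<<<<<< mine (my working copy)
total = price * quantity;
=======
total = price * qty + tax;
>>>>>>> theirs (Tom's committed revision)
```

The developer has to pick one side, or hand-write a combination, for every such block before the file can be checked in.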
So people have seen that developers try to race. This is very old work, like from '95, from Becky Grinter, and what she saw was that people used the CM system logs almost like a coordination mechanism. Like, okay, Tom checked it out a day ago, the release is coming up in one week, so he would be at this stage, so I'd better hurry up because I want to check in first. I don't want to do resolution. People also sometimes use another practice called partial commits: even though my thing is not completely done, it's not tested yet, I want to put part of my code in, because at least then that part of the code will be saved from me having to do resolution, right? And these kinds of informal practices go against the basic software engineering principles we have, that tested code should go into the system. Another thing is informal communication takes place. This was from a study in an aerospace company, and they had really long check-in cycles, almost two weeks long. So people would check out and keep working on their code. In two weeks' time they'd be ready to finish and put it back in the system. That is a long time in which things can change, so what they would do is send an e-mail out: hey, I'm ready to check in this piece of code. This piece of code has dependencies on these other artifacts. Is there someone out there who would be affected, you know? Speak now or forever hold your peace kind of stuff. And then they have all this e-mail communication back and forth outside of the CM system. Once they knew what was happening, who was going to be affected, and how changes would actually be integrated, then they would go ahead and put it in the system. And this is something I found when we were doing a class project with my workspace awareness tool. We were like, nobody had any conflicts. It was amazing.
All these people working on the project, and nobody had a conflict, and then I got the e-mail logs and there was a whole bunch of actual informal communication and coordination going on outside of the configuration management system. In this case, the students had only one person who could check things in, so he was the manager. He would get all the code and make sure everything worked fine before checking it in. Resolution is time consuming. A recent study by Brun et al. and my own work have shown in open source projects that resolution takes hours to days. One thing I have an intuition about, and it has been talked about a little bit in the literature, is that everybody knows about these merge conflicts and, you know, that that is a coordination problem, but not many people recognize that indirect conflicts like test failures or build failures are also actually caused by coordination problems. They think that's just regular daily life. You have nightly builds and you have all this process and setup to overcome this problem. Nobody thinks that if we could actually coordinate better, if we could understand these dependencies, maybe this whole class of problems would not even be here. That's what I want to focus on more. In my study, we looked at four projects -- these were also the four projects that Udi [phonetic] and Escalise [phonetic] had looked at -- and we wanted to see, okay, do conflicts occur, how big is the problem, right? That was before going and trying to solve it. All these projects were in Git. So we looked at the number of merges there were, and the way we looked at the Git tree structure was: if there was a merge, that is, two branches actually merged together, we would check whether it was a clean merge. If it was a clean merge, then we would run a build script -- these projects had build and test scripts. If the build passed, then we would run the tests.
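The replay-and-classify loop just described can be sketched roughly like this (a minimal sketch; the function name and the category labels are mine, not the study's actual scripts):

```python
def classify_merge(merge_clean, build_ok, tests_ok):
    """Bucket one replayed merge: a failed textual merge is a merge
    conflict; a clean merge that fails the build script is a build
    conflict; a successful build whose tests fail is a test conflict."""
    if not merge_clean:
        return "merge conflict"
    if not build_ok:
        return "build conflict"
    if not tests_ok:
        return "test conflict"
    return "clean"

# Tally outcomes over a (made-up) replayed merge history.
merges = [
    (True, True, True),     # clean merge, build and tests pass
    (False, False, False),  # textual merge failed
    (True, False, False),   # merged cleanly, but the build broke
    (True, True, False),    # built fine, but a test failed
]
counts = {}
for outcome in merges:
    kind = classify_merge(*outcome)
    counts[kind] = counts.get(kind, 0) + 1
print(counts)
```

Note the ordering matters: builds are only attempted on clean merges, and tests only on passing builds, which is why the reported percentages for build and test conflicts are fractions of the clean merges rather than of all merges.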
So we could gauge, at each merge, how good the quality of the code was. And when we look at this, you see the total number of conflicts ranged from 40 percent to 54 percent. That's quite a number of merges that had some sort of problem, right? Further breaking these conflicts into merge, build, or test conflicts, we see that it varies quite a bit across projects. Some had 14 percent, some had only 18 percent. Then we also wanted to look at the time it takes to resolve these problems, and this was surprising to me, because I thought merges would be the easiest to solve. In the case of Perl, it took an average of 23 days and a median of ten days to resolve a merge conflict. Caveat: these are open source projects. It does not mean that when there was a problem, they actually went and, you know, resolved it then and there. This is how long the merge problem existed. Going through this, it seems like build failures are the easiest for Perl to manage. Tests and merges are difficult. And different projects show different profiles, but the take-home message is: eight to 19 percent of all the merges would have merge conflicts. Build conflicts off the clean merges range from two to 15 percent. If you look at the test conflicts, up to 35 percent. So there's a lot of problems out there. >>: So I -- merge and build conflicts and test conflicts happen among teams where multiple people are working together? >> Anita Sarma: Yes. >>: And increasingly there's lots of pieces of software that are built by individual developers or small numbers -- I'm thinking of, like, apps, for instance. How do you see -- certainly there are socio-technical issues there, but they're entirely different from these types of conflicts, I think. So how do you see, in general, in software development, do you see these types of conflicts growing? >> Anita Sarma: It depends on the process.
So if you are doing Agile, there's less time in which people can make changes, but if you are talking about individual pieces being developed by individual people, like in a branch? >>: No, I don't even mean that. I mean some 16-year-old is developing a game for iOS and he doesn't have -- he's the only person on the project, right? So he doesn't have these types of conflicts. >> Anita Sarma: So then exactly these merge conflicts and build failures would not exist. Where he would have problems is when he tries to have the game work on a particular operating system. That's when things will change. So this is really, if he had worked from version one and now it's version two, forward compatibility versus backward compatibility, and then he'll have to fix it. So these kinds of problems are based on a team setting. If it is an individual, it is the dependencies you have with one API or multiple APIs, and keeping up to date about which API you have built on and how far it has moved, so that's where I think it will go. >>: Right. So I guess I'm asking you to speculate about API conflicts versus these types of conflicts -- in the future, which are we going to see more of? >> Anita Sarma: I think these kinds of conflicts are still going to stay. If you have individual people building stuff, I think the APIs will not be as big a problem, because there should be backward compatibility in the APIs, and they are slower moving than the semi-structured APIs within a team. So yes, they will stay, but it will not be as bad as these. And the way you resolve it will be different, because you really do not have any control over where this other API is moving and where it's going, because you are kind of a consumer. You are a client of that API. So unless you have, like, a Facebook kind of application, which is very big, you probably will not be able to have an impact on the API development, how fast they're moving. Yes, sir?
All right. So as part of my work, what I want to do is -- as I said, the reason these problems occur is because there are some kinds of dependencies -- and I want to ask: can individuals visualize what these dependencies are, understand who is interdependent with their own work, and then can we help them coordinate their tasks? So first I'll start with visualizing the software dependencies. We built a tool called Tesseract, which is a multifaceted way of exploring your own project. What it does is provide an environment that correlates and understands relationships across different entities. In any particular software development, you have different silos in which your data exists. You have the code versioning system where all your code exists, the bug tracking system where issues and bugs are. Then you have e-mail communication that's going on the side. But anytime you want to look into this data, you have to go into one database, look into it, then look at the other part and try to decipher the links or relationships between them. So what this tool tries to do is say: can we explicitly show what these relationships are and how they evolve over time? So the tool -- this is on open source data from Rhythmbox in GNOME -- has four panes. In the top pane you can choose a particular project, and for that project it then shows the activity level of that project. On the top part over here you see the blue lines, and those are the code commits that happened over the course of time. And then at the bottom are the green lines, which is communication, how much communication happened.
And in this case in GNOME, we had the e-mail exchanges from the mailing list, we had any kind of comments that were taking place in the bug tracking database, Bugzilla, and any patches, so we're assuming that if I submitted a patch to Andy and Andy looked at it, he probably knows what I was trying to communicate, and if Chris comments on that particular bug, he probably read what I had written. So that kind of shows the communication, and that all three of us are communicating with each other. The left pane shows the file relationship graph, how files are connected with each other. This is open source data, so what we used is: if two files have been committed together, we call them logically coupled, or co-committed. There must be some kind of logical dependency between these files because of which they are committed together. On the right-hand side here we show the social network, the people network: who's communicating with whom. And the bottom pane is the bug database, the bugs that were open for the period of time that we have. If I look more into this communication network, what we have tried to capture is congruence, and congruence really means: what is the fit between the people who have to communicate and the people who are actually communicating, right? So in this case, if Chris, Gina, and Andy are working on some project together, there are some dependencies, right? So what we are saying is, if they have checked in some files together, there is a dependency between the files, and so there's a dependency between these people, so we say there is a need to speak. And that's the gray network here. The green edges say these are the people who are actually communicating with each other. Then we map between the two [indiscernible] to see if they had a need for communicating and were communicating, and that's green.
If they had a need to communicate but they were not communicating, that's red. And then there's this gray line, which means we didn't see any technical need for them to communicate, but they are still communicating for some reason, right? So we call this the congruence. And the thickness of the line shows how many times they have communicated with each other. It's just edge weight. So going back to the picture: I had that big cluster of files, and it's too much for me to understand, so let me filter it. So over here I've filtered it to only show me edges where the files have been co-committed five or more times. Then I get this network. And I say, okay, this particular node over here, shell RC, is kind of central, so I want to know who has been working on that. So if I click on it, in this communication network it highlights the people who have ever worked on this file. And then I can see, oh, these people are actually communicating with each other, so maybe whatever changes were made, they actually knew about. >>: What's the significance of the edge lengths in the file graph? >> Anita Sarma: This is just a force-directed layout, so if nodes go out farther, the layout is just trying to keep them apart. If they have been connected multiple times, they'll be closer. It's just the layout. So if they are farther away, it's just trying to, like, keep the graph spread out as much as possible. >>: Okay. >> Anita Sarma: So here you have them closer because there are multiple connections going on. >>: Okay. >> Anita Sarma: All right. So another thing I could be interested in knowing is, all right, here is my bug list. If I click on a particular bug, the tool tells me, okay, the people who worked on this bug are these two people, and for this particular bug, these files were changed. So what Tesseract does is it allows you to go from one perspective of your project, say my bug reports, to kind of my developer history, to kind of my project history.
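The edge coloring just described can be computed from two simple sets: the pairs who need to talk (they touched a common file) and the pairs who actually talked. A minimal sketch, assuming co-commits as the only source of "need" edges (the function name and the sample data are mine, not Tesseract's actual implementation):

```python
from itertools import combinations

def congruence_edges(commits, talked):
    """Classify developer pairs the way Tesseract colors its edges.

    commits -- list of (developer, set_of_files) tuples: who changed what
    talked  -- set of frozenset({a, b}): pairs who actually communicated
    """
    # Need-to-talk edges: two developers changed a common file.
    touched = {}
    for dev, files in commits:
        for f in files:
            touched.setdefault(f, set()).add(dev)
    need = {frozenset(pair) for devs in touched.values()
            for pair in combinations(sorted(devs), 2)}

    green = need & talked   # needed to talk and did
    red = need - talked     # needed to talk but didn't
    gray = talked - need    # talked without a visible technical need
    return green, red, gray

commits = [("chris", {"shell.rc"}), ("gina", {"shell.rc"}), ("andy", {"util.c"})]
talked = {frozenset({"chris", "gina"}), frozenset({"andy", "gina"})}
green, red, gray = congruence_edges(commits, talked)
# chris-gina is green (shared file, communicated); andy-gina is gray.
```

The edge weight mentioned in the talk would just be a count of messages per pair instead of a set membership.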
So it allows you to explore how things are connected and gives you more of an idea of what these relationships are and what they might mean. When we did user studies as well as interviews with GNOME developers, what we found, especially for GNOME, was that people who were seasoned developers, who had been on this project for, like, five or six years, said, I know this network. I am perfectly aware of which files change with what. I know who's working on what. Then we asked, so how do you know that? And this particular person would spend every morning at least three to four hours going through the entire mailing list, kind of seeing what's the status of the project. He was that into the project, right? But he would say it would be really useful for on-boarding or expertise finding, for people who do not care about the project that much. So his view was: if you really care about this project, you're an open source person, you should know this. You should have this mental model in your head. If you are not that serious, or if you're new, this tool will be really helpful. A couple of them were managers, and the thing that they picked up on was: I love it when I see a red line, right? Because I know these two people should talk, but they are not talking, so I should be able to facilitate them. Another thing that came out was that what we have in this data is only the things that are archived. So even if I'm not talking with someone over e-mail, and maybe not over the bug tracker, it might be that I'm sitting right next to the person and I talk to them all the time. So one of the tool features they wanted was: yes, a red line, but I want to click it and turn it green, because I have now communicated with that person, or, as the manager, I know these two people talk to each other during meetings.
Another interesting thing was that we had brokers in the communication network -- not in this picture, but we would see that A and B had a red line between them, but there was this other person, C, that they both communicated with. Actually, we would say, you know, even though they're not talking directly to each other, there is a manager, some facilitator, through which information is passing from one node to another node. Of course, the success of this approach really depends on the existence of links, and oftentimes you don't find that in open source data, at least. So if you have the comments that have happened, the files that have changed, how do you link those to the bugs, or how do you link those to any of the patches that have happened? So that depends on the links, and Chris Word [phonetic] over here had a tool that allowed you to actually decipher the links, or add links between the commit and the bug, so that would be really useful in a tool like this. Moving on, the next two tools that I'll talk about are basically about supporting coordination in distributed development. So as I said earlier, we have this complicated, interrelated piece of code that we need to work on. We all work in our own private workspaces, so the idea of workspace awareness was: can we monitor these private workspaces? As people are going about their daily, everyday life, can we know what files they're changing, watch the dependencies that they're changing, to identify these potential conflicts -- the merge conflicts and the build conflicts that we talked about, which are dependency violations -- and can we notify developers about this as things are happening, as people are still making changes? So instead of waiting till they have made their changes and committed them, can we move forward in time and say, as you're making changes, here, this might be a problem and you want to talk to them? And that was the Palantir tool, which was the Ph.D.
project that Andy was talking about. So what we did was, we wanted to make it really lightweight, because the more stuff you have in your development editor, the more it distracts you, it interrupts you. So what we did was just take over the Package Explorer view in Eclipse and add these really small little icons, right? So if you look at the blue icon over here, what we said was: you are working on this particular file, and someone else right now is also working on this particular file. There was an option where you could have this on for all the files you had in a workspace, or you could have this view only for the files that are dirty, meaning you have changed them since you checked out. And let's say someone else is changing a file -- we also wanted to show the severity, how big is the change that you will have to deal with, and we computed it as the percentage of lines of code that have changed over the entire lines of code for the file. The other thing was, we wanted to show how this information could be percolated up the directory structure, because it's quite possible in a large project that you have multiple projects or packages that are collapsed, so we wanted to show, even while you are in the top view, whether anything in any of the underlying packages or directories had changed. And for that we did a very simple directory severity calculation: how many artifacts does this project or package have, out of that, how many have been touched by somebody else, and we just aggregate that, put that on the directory, and keep propagating it upwards, right? So if your directory had two files and one was changed, the percentage change for the directory will be 50 percent. If that belonged to another directory which had two subdirectories, it will become 25 percent. It just keeps going up and decaying a little bit. The other thing we wanted to do was -- merge conflicts are still the easier ones to identify.
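One way to read that decaying severity calculation is as a recursive average over the directory tree; the sketch below reproduces the 50-percent/25-percent example from the talk (the tree shape and file names are illustrative, not Palantir's actual data structures):

```python
def directory_severity(node):
    """Severity for a tree decoration in the style Palantir describes.

    node is either a float (a file's severity: the fraction of its lines
    changed elsewhere) or a dict of child name -> node (a directory).
    A directory's severity is the average of its children's severities,
    so the signal decays as it propagates toward the root.
    """
    if isinstance(node, dict):
        return sum(directory_severity(c) for c in node.values()) / len(node)
    return node

# The talk's example: a directory with two files, one fully changed,
# next to an untouched sibling directory.
tree = {
    "pkg_a": {"Foo.java": 1.0, "Bar.java": 0.0},  # 50 percent
    "pkg_b": {"Baz.java": 0.0, "Qux.java": 0.0},  # 0 percent
}
print(directory_severity(tree["pkg_a"]))  # 0.5
print(directory_severity(tree))           # 0.25
```

The decay is the point: a single changed file deep in the tree still shows up at the top, but faintly, so the top-level view is not swamped by every remote edit.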
Can we in some cases find these build failures, or these other indirect conflicts that are more difficult to find? So here we have these little red icons, and what we wanted to say is what's happening here: the changes are in Address, with whatever the modest severity is over there, but the changes over there are causing some kind of impact on your system -- that's what the "I" for impact means -- and here it's your CreditCard, because you have this little arrow here which shows you're working on CreditCard, and the impact is coming to you. So what we did was cross-workspace impact analysis. It was a very simple analysis, a very rough analysis at this point. For each workspace, we made a lookup table that said, from the call graph, which other files and which other methods this particular file depends on, and if in a remote workspace some method or some file was changed, you'd get that notification. We'd just do a simple lookup: if you are using this method and this method has been changed, there might be an impact. And this is right now at the signature level. So if only the actual behavior changed, we would not have found out. If anything in the signature changed, we would have found out. And we get more information in this little view -- and this was my attempt at drawing bombs. So a red bomb would mean that someone has changed a method that we're depending on. In this case, someone has deleted this particular method, getName. It's an impact by Ellen on Pete's work, and the red means that the changes have been committed, so this is definitely going to be a problem. A yellow bomb meant something is happening in a workspace, so this might be a problem, but we don't know; they might actually merge this change back. And then green was: if you have changed something in your own code and you wanted to find out what else your change will impact in the project. But the other interesting thing was this exclamation point.
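The signature-level lookup is simple enough to sketch in a few lines. This is a toy version under my own assumptions (the table contents, method signatures, and function name are illustrative, not Palantir's actual implementation): each workspace keeps a table, built from the call graph, mapping a method signature to the local files that depend on it, and a remote change event triggers one lookup.

```python
# Lookup table built from the call graph: remote method signature ->
# local files that call it.
dependencies = {
    "Payment.getName()": ["CreditCard.java", "Invoice.java"],
    "Payment.charge(int)": ["CreditCard.java"],
}

def impacted_files(changed_signature):
    """On a remote change event, a plain dictionary lookup finds which
    local files might be impacted; no deeper analysis is attempted."""
    return dependencies.get(changed_signature, [])

# Ellen deletes Payment.getName() in her workspace; the event arrives
# and the lookup flags the locally dependent files.
print(impacted_files("Payment.getName()"))
```

This also shows why the analysis misses behavior-only changes: if a method body changes but its signature does not, no key in the table changes, so nothing is flagged.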
As I said, what we had done was a very simple call-graph, signature-level analysis at this point, but what we wanted to say was, hey, in this particular workspace you were depending on Payment — your CreditCard was depending on Payment — and something new has been added. For example, an init method has been added. The system was not smart enough to know that a lot of the initialization code had been moved to this other method, so the behavior had changed. We said that's a lot of complicated analysis, and the user usually has much better domain knowledge, a much better understanding of what the systems are and what people are doing. So with the exclamation point we wanted to say: a new method has been added, here are the details of this new method. As a user, if I know payment initialization is important to me, I will go and check what's happened, or talk to Ellen to see what she's doing. So we tried to offload some of the more complicated computation to the developers, because they would have a better idea. When we looked at the results, we saw conflicts were detected pretty well as they emerged, as long as they were at the syntactic level. We found developers undertake action upon noticing a potential conflict, and in our user study, we found a whole spectrum of people, right? Some people, as soon as a little bit of conflict information came in, had to go look at what it was; there was one particular user who could not tolerate having anything in conflict, so they had to go and find out what it was, start an e-mail or IM communication to see why. And at the other end of the spectrum were people who just opened [indiscernible] really wide and would not care about what happened in the Package Explorer view, so all the details were lost.
So the only times this group of people would look at the Package Explorer view or look at the conflicts were when they were starting a new task, or when they were opening another file — because a particular task might involve multiple files, so they had to go open the Package Explorer view — or when they were taking breaks. One of the tasks in the user study was to write comments. People always wrote comments after the fact, so they would code everything up and then open the Package Explorer view or take a break, look around, then write comments, look around, write comments, right? They didn't like writing comments, at least the students we had. So those were the times when people looked at the conflict icons. Few conflicts grew out of hand. I don't have details about the experiment setting here, but this was a confederate study: we had one person come in and we told them they were working in a group. The other two people could only be contacted through IM, but these were our confederates — research helpers — and every 15 minutes into a particular task we would seed a conflict, which was automatically checked in. We saw some people trying to race: this blue icon would come up and they'd say, oh, I need to go fix this before the other person actually checks it in. So they tried racing to finish. Then they realized it was already committed. Once they faced a conflict and had to resolve it, they became really particular — like the group of people who wanted to have no conflicts, as soon as an icon came in, they would contact the other person and say, hey, I'm working on this task, these are the files I'm changing, what are you working on? How much time will you take? So they were trying to synchronize and manage through IM.
The resulting code was higher quality in the sense that fewer conflicts were left in the code base — and this was meaningful because we had seeded four conflicts in an eight-task experiment. There was a penalty, because a lot of people were communicating over IM, and it takes time to communicate, so the control group was faster than the experimental group; but the control group had more conflicts left in the code base once we said, okay, now the experiment is done, here are all the conflicts or dependency problems there could have been. One of the faults in the experiment — which I didn't realize at that point — was that we did not make the control group resolve all the build failures. We should have taken that time. We didn't. Anyway, that was fun. There are a lot of other workspace awareness tools that have come after Palantir and done better jobs in the UI and in the kind of analysis being done. This is FastDash, which was from Microsoft, Mary's group. What they did was show all the files and all the people working on them in an Agile setting: which particular file, what the status of the file is. If a file is being worked on by two people, they show the names of the people there and flag that this is a problem space you might want to look into. This other one was by Uri and his group, and it is based on Git. Git works with all kinds of branches, so everybody has their own branch, and sometimes people do local commits — smaller commits before they actually push the changes back into the master repository — so they had a shadow repository that pulled in all the local commits, and they could say when there is a possibility of a problem, a build failure or test failure, by actually running the build and test scripts on this master shadow branch. All right.
So a few limitations of Palantir and these other workspace awareness tools: the way the approach works, conflicts are only identified after they occur, right? As I'm making changes, these monitoring approaches tell me here is a file that is going to have a merge conflict, here's a dependency that has been violated — and the more changes that have accumulated, the more time it will take to resolve. They usually do coarse-grained impact analysis at this point. Because of these notifications coming in, there is a risk of information overload or interruption, which we saw in the Palantir study. And there are opportunities for extending this kind of analysis to a larger unit, like tasks, or like the scenarios and features that Microsoft uses; the problem is that because all the analysis is done at the file or directory level, bringing it up to a higher logical unit like tasks is difficult from this setting. So what I wanted to do next was: can we go even further forward in time? Can we be proactive? Can we find these dependency problems, find when changes are going to cause issues, before people have even started making the changes? Can we look at the tasks they are going to work on and figure out the dependencies at that point? Then we can also give them solutions at the task level. And one of the things I wanted to look at was whether this would avoid individualistic solutions, right? In Palantir, when we had these blue icons or red icons, people would go race in, or do partial commits, or talk to the other person to get their own code in first, right? It was a very individualistic setting. Can we avoid the race conditions and these kinds of individual strategies, and instead look at what is good for the entire team?
Maybe it is okay for Chris and Andy to have some conflicts here because they are working on code that doesn't need to go into this release, but maybe Gina and Tom should not be affected, right? So based on team policies there might be different strategies. So we built Cassandra. Palantir was the seeing stone from The Lord of the Rings, so you know what's happening in distant parts of the world. With Cassandra we want to go further: can we predict, even before tasks have been started, what future problems could occur, right? That's the Greek mythology reference, Cassandra. So we can minimize the conflicts that arise from individuals working in their workspaces. Here is the approach for Cassandra. First, we are assuming a workflow more akin to what might happen in open source. I come in in the morning and say, okay, these are the bugs I have in my inbox, pulled from Bugzilla or something. I might want to order the tasks based on my preferences or some kind of priority we have. After that, what needs to be done is to identify the files that are going to be changed for each particular task, and once we have those, the dependent files and the files that could be impacted. Then we analyze these tasks to understand the dependencies, so that we can understand the conflicts; then we formalize those into hard constraints and soft constraints — which I'll talk about in a few minutes — and then evaluate them to find which tasks can be done independently. We actually use Microsoft Z3 to do the constraint evaluation here. So let's look at a constraint example. Here we have, say, Shape.java, and three other classes that inherit from Shape. We have a scenario with Alice, who has three tasks, TA1, TA2, TA3, and Bob, with TB1, TB2, and TB3.
In task TA1, Alice is working on Rectangle and Shape, and in TB1 Bob is working on Rectangle and Square. Look at the dependencies in this example, right? If Alice and Bob were to do their first tasks, there's going to be a merge conflict, because they're both modifying the same class, Rectangle. They might also be in indirect conflict, because whatever changes Alice makes to Shape might affect Square or Triangle, which Bob needs to work on. And there might be some task precedence, like Canvas needs to be built before Panel can inherit from Canvas and be built, right? So some precedence ordering is also necessary among tasks. We can work that into hard constraints — TA2 has to be done before TA3, there's no way around it — and the rest are soft constraints: TA1 and TB1 can be done in parallel, but there will be some consequences. Now, how do we get the constraints, the Fe and Fd sets? Either we use some kind of data mining: if there's a feature or bug request that needs to be done, we can look back in time to find which other bugs are similar and which files were changed for those bugs, right? That gives a seed set of files that will be changed for this particular bug. The user can definitely refine it by adding or removing files, going back to how MyLyn does its task context — you have a task, you can say I'm going to change these files, I want to look into these files — so there's some developer input for refining. And then we can do a basic analysis, again going back to Dependency Finder, a call graph analysis a level or two deep, to say these files are also dependent on these other files.
So then you have the Fe and the Fd sets. If there are two people and their Fe sets intersect, that means there is going to be a merge conflict, right? So there is a direct constraint between those two tasks. If one person's Fe set might have some impact on an Fd set that somebody else is working with, then there will be an impact, and we call these indirect constraints. Right now we are only looking at an Fe set's impact on an Fd set being changed by someone else; it could be that the impact is somewhere downstream in another set of files, and that needs more refined analysis. Once we have these constraints, we need to evaluate them, and we're using Microsoft Z3. We put all the constraints between these tasks into Z3, and if a solution exists — an ordering in which the tasks do not constrain each other — then we want to match it as closely as possible to the developer's preference or priority. To do that, we look at the solution Z3 gave us: say four, two, three, one is an order that won't have any problem, but the developer wanted to do one, two, three, four, so it's quite far from their preferences. So we want to minimize the cost: we find how far each task is from its preferred position — task four was three places off, so that's three units of cost there. For this solution, we compute how far it is from the developer's preference and make a cost objective out of that. In this case there is a three and a two, so at least five places would be changed, right? We put that cost back into the constraint solver as a constraint and say: try to minimize this cost from, say, five to three.
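The Fe/Fd reasoning above reduces to two set intersections per task pair. Here is a minimal sketch of that classification using the Alice/Bob example; the representation of a task as a dict with `Fe` (files edited) and `Fd` (files depended on) sets is an assumption for illustration, not Cassandra's actual data model.

```python
def classify(task_a, task_b):
    """Classify the constraints between two tasks from their Fe/Fd sets."""
    conflicts = []
    # Both tasks edit the same file -> guaranteed merge conflict.
    if task_a["Fe"] & task_b["Fe"]:
        conflicts.append("direct")
    # One task edits a file the other depends on -> possible impact.
    if (task_a["Fe"] & task_b["Fd"]) or (task_b["Fe"] & task_a["Fd"]):
        conflicts.append("indirect")
    return conflicts

# Alice's TA1 edits Rectangle and Shape; Bob's TB1 edits Rectangle and
# Square, and Square depends on Shape (it inherits from it).
ta1 = {"Fe": {"Rectangle", "Shape"}, "Fd": set()}
tb1 = {"Fe": {"Rectangle", "Square"}, "Fd": {"Shape"}}
assert classify(ta1, tb1) == ["direct", "indirect"]
```

Each "direct" or "indirect" result becomes a soft constraint between the two tasks; precedence relations (TA2 before TA3) would be added separately as hard constraints before handing everything to the solver.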
So we kind of binary search: we go back, re-evaluate the constraint space, and keep doing that until we find the cheapest cost we can, because that's the minimal distance we could get. Then we display this information back to the user and say: this is your recommended task order. For the UI, we use MyLyn, and over here is MyLyn's task view. We kind of hijack that task list and say, okay, these are the tasks that exist for this particular person, and one, two, three is the ordering they were going to use. MyLyn does not allow task reordering, so we had to write a plug-in for that, so the developer can say "implement plot" should actually be number one and not number three and move it, and there is this little tick mark that runs the constraint solver. So we run the constraint solver at two times. One is when the user says run it now. The other is when someone has checked in their code: the other people who are still working ask, I've finished my part, what is the next task I should do? It takes some time for people to check in their code, right — the commit comments and such — so we use that time to re-analyze the constraint space: some constraints will be added, some removed, based on what other people are doing and what files you have changed, and we say this is the next best task for you. When we give this visualization of the next best task, we show the order in which we want you to do things, and this little exclamation mark explains why we want you to look at that. Over here it says: this is your task ID, it conflicts with this other person, here is the task that is conflicting, and whether it is a direct or indirect conflict that could be causing this.
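One plausible reading of the preference-distance cost described above is the sum, over tasks, of how far each task sits in the candidate schedule from where the developer wanted it. This sketch uses that reading with its own illustrative numbers; the exact cost function Cassandra uses may differ, and re-running the solver with a tightened cost bound stands in for the binary-search loop.

```python
def preference_cost(solution, preference):
    """Total displacement of a candidate schedule from the preferred order."""
    wanted = {task: i for i, task in enumerate(preference)}
    return sum(abs(i - wanted[t]) for i, t in enumerate(solution))

# Developer wanted 1,2,3,4; the solver proposed 4,2,3,1.
# Task 4 is three places off and task 1 is three places off: cost 6.
assert preference_cost([4, 2, 3, 1], [1, 2, 3, 4]) == 6

# A schedule matching the preference exactly has zero cost, so the
# minimization loop (re-solve with "cost <= bound", shrinking the
# bound) terminates at the cheapest feasible schedule.
assert preference_cost([1, 2, 3, 4], [1, 2, 3, 4]) == 0
```

In the real system this cost is encoded back into Z3 as an extra constraint, and the solver is re-run with progressively smaller bounds until no cheaper satisfiable schedule exists.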
So the idea is to say not just, hey, you have to do task number two, but: if you do task number two, these are some of the problems you will face — and then to see whether developers keep their own order or actually pick Cassandra's order. What happens if there is no solution, right? In the first case, when we ran the constraint space, we found two tasks that were independent of each other. But what if there's no solution — an unSAT situation? Then we have to relax some of the constraints. We cannot relax any of the hard constraints, because something has to be done before the others can be done, right — the output is needed as the input for another task. But we can relax some of the soft constraints. And the way we relax soft constraints can very much depend on team policies. You could have, for example, a conflict focus. Initially, when we started this, we thought merge conflicts were easier to solve, so if there is no solution, just break all the constraints that cause direct conflicts, right? But then we looked at the data we had mined, like the Perl data, and in Perl, build failures are easier to solve than merges are. So again, it might depend on the project's profile which conflicts you do or do not want. It could be team focus: as I said, it might be okay for two people to have a problem as long as the majority of the team doesn't have any conflicts. It could be task focus: these three tasks are going to be released now, so nothing should ever impact them. Or it could be that these files are frozen because we're ready for release, or these are a public API, so all constraints relating to these particular files should be left alone. We started with the conflict focus; that was the easiest to do.
So when we have an unSAT model, a basic approach is to start relaxing: you can relax all direct conflicts, or one conflict at a time. Z3 right now does not give a minimal unsatisfiable set — it says, okay, these are all the constraints involved, but it could be that if you relax just one of them, a solution exists, right? So there is some optimization we could do. For a start we said, okay, first step: relax all the direct conflicts and see how much time it takes. The other approach is empirically guided: you can look at your particular project's profile and see that builds are easy, so I'm okay with removing indirect conflicts, maybe. But in that case you have to be a little more fine-grained: remove one conflict at a time, run the constraint solver, see if you get a solution, and keep doing that. There are some underlying assumptions here. We assume developers select tasks from a given set — there is a set awaiting them at the beginning of the day or week or month. There is a task context: we know ahead of time which resources are going to be changed for this task, and we're doing very coarse-grained analysis right now. Task assignments are done at the beginning of the period — at the start of the week or a particular sprint, I know these are my tasks. There's one active task per developer; it's not that a user has three or four or five tasks all open, because as soon as you're working on a task, we concretize those constraints and say this person is working on this, nobody can touch these constraints, right? We also assume tasks are unique across developers and nontransferable: it's not that someone starts something and, without checking it in, mails it to someone else, so that two copies of the task are going on at the same time.
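The one-conflict-at-a-time relaxation described above can be sketched as a simple loop: drop a soft constraint, re-run the solver, repeat until satisfiable. The `solve` callable stands in for the Z3 invocation, and which soft constraint to drop first (e.g. direct conflicts before indirect, per the project's profile) is a policy choice; hard constraints are never dropped. This is an illustrative sketch, not the actual Cassandra implementation.

```python
def relax_until_sat(hard, soft, solve):
    """Drop soft constraints one at a time until the solver succeeds.

    `solve` takes a list of constraints and returns a solution or None.
    Returns (solution, soft constraints that were kept).
    """
    remaining = list(soft)
    while True:
        result = solve(hard + remaining)
        if result is not None:
            return result, remaining
        if not remaining:
            return None, []  # even the hard constraints alone are unSAT
        remaining.pop()      # relax one soft constraint per iteration

# Toy solver standing in for Z3: satisfiable only when at most two
# constraints remain in play.
toy_solve = lambda cs: "plan" if len(cs) <= 2 else None
plan, kept = relax_until_sat(["h1"], ["s1", "s2", "s3"], toy_solve)
assert plan == "plan" and kept == ["s1"]
```

The ordering of `soft` encodes the team policy: putting direct-conflict constraints last means they are the first to be relaxed, which matches the "conflict focus" strategy from the talk.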
Another assumption is that developers commit their changes at task completion, and this is important for the approach: initially there's going to be a lot of noise in the data, because we are looking back in time to find which files will be changed and using this call graph analysis, so there's going to be overapproximation in the dependencies. But we assume that as developers make changes, we get a much more fine-grained picture of what's changing, so when I have completed my first task, the tool has a much better idea of the constraints out there and can make a much better prediction for the next task. If you don't commit your changes, there are going to be a lot of stale constraints in the space, leading to unSAT solutions. I'm not going to go into depth here because I'm running out of time. To evaluate it — just to see whether Z3 and constraint solving would work in an open source setting — we looked at four projects: Jenkins, Perl, Voldemort, Storm. We looked at weekly, monthly, quarterly, and six-month data, and I want you to focus on the fact that in many cases we had direct and indirect conflicts — 12, 34, 49. We ran it and saw SAT conditions as well as unSAT conditions. When we had unSAT conditions, we relaxed some of the constraints. And if you look at it, we got solutions: a number of conflicts got avoided — four out of five, 33 out of 35, 36 out of 40. So we're pretty good at resolving these problems. Of course, in this case we knew exactly what the changes and the dependencies were; there were no false positives involved, because this was data gathered after the fact. Okay. Conflicts avoided. Another thing is the time — this is in seconds, so it's pretty fast.
When we looked at it, 63 percent of developer preferences matched, then 23, 25 — so we ran it again to see how close we got to the developer's preference. In this case, the developer's preference was just the order in which the changes were actually done. We saw that most times we got pretty close to the developer's preferences. Only in one case did it time out, with the limit set at three minutes. So the solution works pretty fast. To summarize and conclude: what I have shown you is that there are dependencies, and it's hard to understand what those dependencies are. There is a set of tools that lets you explore these dependencies, through Tesseract, which looks at different project elements, aggregates the different types of relationships as networks, cross-links them, and lets you look at them over time. The other piece of work I talked about is the scheduler: how we can coordinate the tasks and the changes developers are making in the team context. One thing is we want to eliminate seclusion: if you're in a branch, in a private workspace, you are insulated, but it's not that you should not know what's happening around you — we want you to still have some information. We're going from early detection of conflicts to scheduling tasks to avoid the conflicts altogether. And one thing we have not talked about: a lot of what I showed today was very coarse analysis at the call graph level, but can we use the development context to do much finer-grained analysis, or scope it, say, to behavioral changes? For example, a semantic analysis like Austin, [indiscernible] execution, [indiscernible] for the entire code is very expensive. Can we use the development context, like code ownership or who's active now, to scope this analysis space? Thank you. This work is supported by a bunch of NSF grants and an Air Force grant.
I thank my students in the lab — Josh, Corey, Bakhtiar, Sandeep, and Rafael — who did a lot of the work, most of the work, which happened after I graduated. Thank you. Questions? [applause] >>: We have time for questions. >>: So the approaches with — are people willing to put in that upfront work to save later costs? >> Anita Sarma: That's something I need to look at in a user study. One of the things to help them understand — that's why we did the timing study, to show how long a solution takes; the solution time is low. Would they put in the upfront time? There was a study done by the MyLyn group — they were eating their own dog food — and they found that because MyLyn lets you put in degree-of-interest information, which tells you which files might be changed, people wanted that information badly enough that they actually put in the extra time to say I will be changing this, this, this, and this file. So in MyLyn it worked, but that was a research group, so it remains to be seen how it works elsewhere. Going back to the Palantir study, which had less upfront work: the first time they saw a conflict, they wanted to just race ahead and finish, but after they had faced the problem once, before they went ahead and finished their task they always started the communication and said, I'm going to be changing these things — what are you changing? So people are simply very conflict-averse in the settings I have seen. If this would help them avoid these conflicts entirely, I believe they should be willing, but it depends on the development context and team policies. >>: [inaudible] >>: The cultural thing. >>: More questions? >>: I can ask one.
So in a way, conflict avoidance is kind of like pessimistic concurrency in version control, where if somebody has a file checked out and locked, you can't check it out, you can't touch it — which brought to mind the idea of offline work. One of the challenges is, if you're not connected to your source control system and something's locked, you can't check anything out, you can't do any work until you connect. So this essentially eliminates parallel development, or the ability for parallel development, if you're avoiding these conflicts — that was just an observation. But the question I had was: in the studies you did of those four open source projects, how big were the conflicts? Were they multi-person — more than two people, usually? If you needed to coordinate work to avoid a conflict, was that just coordinating between two people's tasks, or did, say, 15 people need to have all their work coordinated? >> Anita Sarma: These are open source projects, so they were smaller. And the way we defined a conflict was when two branches were merged, so by definition a problem in this case involved two people, because there were two branches being merged. So it's always two people. But we are doing a branch study, and at some point in Voldemort there were 15 branches active, and I think five or six unique developers. Because it's Git, everybody has multiple branches going on. So I think the worst case we have seen in these projects is six people working simultaneously, but the problems I have shown are, by definition, two people. If you broaden it out and look at the point where all the branches were to come in, there would have been five people to coordinate. Does that answer your question? >>: It does. Chris — I have a follow-up, but I'll let — >>: It's kind of a separate — >>: Oh, just — >>: I'll let you ask your question.
>>: The follow-up was: you put a lot of work into this SAT system to try to automatically schedule things. If it's potentially just two people, or let's say a small number of people, could you get away with just a visualization — like the Palantir visualization — showing that your tasks might overlap with this person or that person, and if it's not too complex to see, maybe people can just figure out for themselves which ones to choose? >> Anita Sarma: That's possible. We are going for the hard-core case where we have 15 or 20 people working together, but if it's a smaller graph of people and relationships, we could just leave it as a visualization and let them choose. The only problem is it might become combinatorial, depending on how many tasks each person has. >>: Sure. >> Anita Sarma: So that's where having some automated tool would help. Z3 was pretty easy — we plugged it in and it does magic behind the screen. >>: You have been looking at this kind of awareness and conflicts since back in Palantir, even if you may not have called it that. What is your feeling about the granularity people should be looking at when we talk about conflicts? In one of your tools you're looking at changes to individual method signatures; I've seen things at the file or component level. What are your thoughts on what level of granularity we should look at these things at? >> Anita Sarma: It's a difficult question — a good question, right? >>: I don't ask simple questions. >> Anita Sarma: The granularity of the changes has two issues. One is the human capability of understanding: I am working on this particular file, I know it's going to affect these other files — that's much easier than reasoning about another team, another component.
But if you work at a higher granularity of changes, doing any kind of impact analysis gets more complicated, with a lot more possibility of false positives. If you're looking at a component-level change, you have a lot of changes, and it could be that all these changes have a lot of effects; if you're overapproximating, you'll have everything marked as affected. >>: Even if it's deadlocked for no real reason, right? >> Anita Sarma: Yes, you can have deadlocks, because this is like what happens with control flow graphs, right? Most of them get overapproximated. If you want to be sound — that is, if you want to catch all the problems — then it gets overapproximated and everything gets affected, so you have one big lump. If you go very fine-grained, it's easier: the smaller the change sets, the easier they are to understand. So one way of thinking about this, I would say, is that it depends on the user's needs — it's completely driven by the client's needs, right? As a manager I might want to be told all the components that are a problem, but underlying that we should have, for each of these components, these files, these methods, this line was changed, and from there do the impact analysis and aggregate the results back up to a form the user would like to see. So for the technical part, the smaller the change, the better — the lower the granularity. For the people part, you need some kind of aggregation, moving things up; otherwise it's too much information and it's difficult. And that's one of the reasons I'm moving away from the Palantir work, which was all file-based, and trying to get to a higher-level, task-based understanding.
>>: To address Gina's question about requiring people to put in a lot of metadata about what each task might touch: could you use some sort of requirements traceability analysis to take a task and automatically figure out what files or functions it's going to touch, at least as a first approximation? >> Anita Sarma: Yes, you could. I was not going toward requirements traceability; I was looking more at the data mining part, but that only works for past bugs. If there's a bug request, we can find similar past bugs and seed the set with the files that changed for a similar bug, and then the user can refine that. But if it's a new feature, we could do some kind of requirements traceability to say these are the possible files you might want to touch, and then they can refine it. That's the goal. I think starting with a blank slate, nobody would do it, but — as we were discussing — if there's a false positive where you say for this task these five files need to be changed, I think a user will be more apt to say, no, not this file, but this one. Correcting something might be easier for the user than working from scratch. So that's something we want to look at. >>: More questions? Well, let's thank Anita for her talk. [applause] >> Anita Sarma: All right. Thank you.