Document 17889720


>> Christian Bird: Alright, well we'll get started. I'm really pleased to introduce Bogdan Vasilescu. I've known him since he was a graduate student at Eindhoven University of Technology. Since then he's gotten his Ph.D. and gone to the wonderful UC Davis, where Vladimir Filkov and Prem Devanbu are; they were here for the summer, and some other wonderful people have come from there as well. He's a postdoc there, and he's visiting us yesterday and today.

If anybody would like to visit with him later today after this talk, you can send me an email, or I can set you up; my [indiscernible] is CBird, C-B-i-r-d. He's been doing a lot of study of social dynamics and the great things that happen in GitHub collaboration. He's going to tell us about that today.

I'll hand the time over to you. But real quick, I don't know if I told you how we do things here. Generally at Microsoft when people give talks it's generally accepted, if you're okay with it, to ask questions and have a discussion during the talk. You don't have to wait until the end of the talk to ask a question about something.

It's perfectly fine, Bogdan, to have prepared a talk for forty-five minutes without questions. We've got the room; we have it for an hour and a half. But my goal is to take about an hour. That's kind of how we run things here. Alright, thanks Bogdan.

>> Bogdan Vasilescu: Thanks a lot Chris. Thanks for inviting me. Good morning everyone. Feel free to interrupt anytime, that's fine. I want to tell you a bit about some of the work we've been doing recently on GitHub. This is not just me doing it; it's work with lots of other people. There's Yue, who's a visiting student from China in our lab currently. There's Daryl Posnett and Baishakhi Ray at UC Davis. There's Alexander, my former Ph.D. advisor from Eindhoven, and of course Prem and Vladimir, who are here from UC Davis.

Now the plan is to give you a little bit of background about this work, and a little bit of background about myself, before I go into it. I'll do just that, but feel free to interrupt anytime. We are talking about how software development has changed in recent years, if at all. Some of the picky ones among you will argue that it hasn't changed at all; that the only thing that has happened to software engineering since software engineering began was maybe agile ways of doing it, as opposed to other stuff.

There will be some others among you that will be more lenient and will say: well, you know, maybe Git was a nice idea. It changed the way people are coding. They went from the central repository, where you need to have access in order to push, to this very distributed model where everyone can code, and so on.

I'm talking about something a little bit more subtle. I want to focus on something else, which is the emergence of the social programmer. Okay, and this is a person who is familiar with these venues that I've listed here. Let's do a quick round. Who is on one of these? Who among you has a profile on one of these things?

>>: [inaudible]

>> Bogdan Vasilescu: How about two? How about three? How about four, already four, four of these?

How about five? Four, who did four, that’s you, where are you on?

>>: Stack Overflow, LinkedIn, GitHub, and Twitter.

>> Bogdan Vasilescu: Cool.

>>: [indiscernible]

>>: Same.

>>: Same.

>> Bogdan Vasilescu: Yep, alright. The social programmer…

>>: This room has an age [indiscernible] there’s a four.

>> Bogdan Vasilescu: How many were you on?

>>: Not me but there were four before.

[laughter]

>> Bogdan Vasilescu: Perhaps we should have talked about the anti-social programmer.

[laughter]

Okay, so the social programmer is a person that is very visible online in all these social venues. For instance, he is a person who has a profile on GitHub and shares code with other people. He exhibits a very social behavior: he follows other developers on GitHub just like you do on Twitter, and is followed by others in return. He watches and stars repositories, and is interested in receiving notifications and news about these things that other people are working on.

Okay, but perhaps more interestingly, the social programmer is a person who signals all the time. This is what they call the contribution graph, or the activity graph, on GitHub. These are days of the week and these are months; it colors a day in this calendar based on how active you've been during that day.

This is like a game; think of it as a game. The goal is to fill up this contribution graph, to have it as dark green as you can, meaning you've been active all the time. You've been super active every single day, right. People are playing this game. They want to signal that they're doing a lot and they're out there. They're doing this very publicly.
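Mechanically, the coloring is just a per-day event count mapped to a handful of intensity buckets. Here's a minimal sketch of that idea; the function name and the thresholds are made up for illustration (GitHub's real cutoffs differ):

```python
from collections import Counter

def contribution_levels(commit_dates):
    """Count events per calendar day and map each count to an
    intensity level 0-4 (0 = blank square, 4 = darkest green).
    Thresholds are illustrative, not GitHub's actual scheme."""
    counts = Counter(commit_dates)  # date string -> events that day
    def level(n):
        for threshold, lvl in [(10, 4), (6, 3), (3, 2), (1, 1)]:
            if n >= threshold:
                return lvl
        return 0
    return {day: level(n) for day, n in counts.items()}

graph = contribution_levels(["2015-06-01"] * 7 + ["2015-06-02"])
print(graph)  # {'2015-06-01': 3, '2015-06-02': 1}
```

Filling the graph, in game terms, just means making sure every day's count clears at least the lowest threshold.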

Okay, maybe also Stack Overflow, right. The social programmer is a person who shares knowledge with others on Stack Overflow. You can see, for instance, Jon Skeet. Are you familiar with Jon Skeet? Was he at Microsoft at some point? He wrote a book on C#. He might have been, right. Anyway, Jon Skeet is a person that answered thirty thousand plus questions by other people on Stack Overflow.

That's a lot, right: a very visible, active, social person who is also signaling, right. Stack Overflow is this gamified environment where people receive points and badges depending on how active they've been. Turns out Jon Skeet is the number one ranked person on Stack Overflow of all time, okay.

This is the extent to which folklore has developed around this persona. I'll let you read these; "Jon Skeet can divide by zero", for instance, if you didn't know. My personal favorite is the last one: he wrote a book on the latest version of C#, and in a few years Anders Hejlsberg is going to open the book and see if the design team got it right.

Okay, so I want to point out something here. Through his contributions to Stack Overflow, Jon Skeet reached about a hundred and twenty million people. Okay, so let's put this in perspective. That's more people than have read Harry Potter, and about as many as the Lord of the Rings. Hey, that's a lot.

To put things even more into perspective, a survey from last year estimates that there are only about, you know, less than twenty million software developers in the world, right, and this is including professionals and hobbyists; less than twenty million.

Through his answers Jon Skeet has reached almost a hundred and twenty million people. Okay, so that's a huge impact. Can you believe this?

>>: Well, I’m wondering how you count the [indiscernible]?

>> Bogdan Vasilescu: I think they're counting views of his answers. Of course they're counting the same person multiple times, because you may come back to the answer and so on. I think they're sort of counting views of questions and answers by him.

>>: They don't say views; they say "people reached" under there.

>> Bogdan Vasilescu: Yeah, I know; maybe they do unique IPs or something like that, based on views of these answers, I think. Anyway, the…

>>: What percentage of those twenty million would you say use GitHub…

>> Bogdan Vasilescu: But, so that's the thing: they needn't contribute, right. This is a public venue. You can just Google something and end up in one of his answers, with a solution to one of your problems, without you asking anything.

>>: Are there estimates though on…

>> Bogdan Vasilescu: How many are there?

>>: Yeah.

>> Bogdan Vasilescu: I think very, very few relatively.

>>: I don't have insight into the whole population. But for undergrads it's everywhere. Literally everyone has GitHub at the undergrad college level…

>> Bogdan Vasilescu: A few years back, at this MSR Vision 20/20 event we had in Kingston, there was a talk about the Dark Matter Developers. I think it was Greg Wilson that brought this up. The point that he was trying to make was that, you know, in the mining software repositories community we're analyzing these online communities and open source software, and so on, where people are developing code and they're doing it as [indiscernible].

But the question was: how representative is this of the entire developer population? Because the point he was trying to make is that the average software developer would not be on any of these venues, right. We might not be analyzing a representative population. But I don't have an estimate of how many there are in general. I don't know. I know it's popular in the US, but I don't have an estimate.

Okay, so what do people do with all this information online? Well, one thing they do is aggregate it on profile aggregators like Masterbranch, with the goal of showing it off both to your peers and to potential employers. People are signaling their availability for hire and their interest in getting a better job or whatever. All of this online public information from all of these communities is aggregated and is accessible to recruiters. People are using it for these purposes.

Right, so the point I’m trying to make here is that the social programmer is a person that signals in all of these online venues. There’s a lot of work showing how these signals are being used for recruiting and other purposes.

Okay, so just to briefly mention some of the stuff that we've been doing over the years. For one, we had a paper at Social Computing a couple of years ago where we looked at developers who contribute simultaneously to GitHub and Stack Overflow. We looked at this population of people who are in both communities. We wanted to see if there's any association between their activity in one and the other. Right: if you're a rock star in one, are you likely to be a rock star in the other, and things like this.

What do people do in these places? How does signaling in one match signaling in the other, so to say?

I won't go into details; you can look at these yourselves if you're interested. I'm just going to tell you briefly how we did this. We mapped their commits on GitHub, the C events here, and their Q&A activities on Stack Overflow. We mapped these onto timelines, time series. We looked at the distributions of the intervals between events happening in different communities, say the distributions of intervals between commits happening on GitHub and questions or answers on Stack Overflow, and so on.
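As a rough sketch of that analysis (the function name and toy data are mine, not the paper's actual code), you can tag each user's events with the community they came from, merge them into one timeline, and collect the gaps between consecutive events that cross communities:

```python
from datetime import datetime

def cross_community_intervals(gh_events, so_events):
    """Merge two per-user event streams (lists of datetimes) and
    return the gaps, in hours, between consecutive events that
    occurred in *different* communities."""
    tagged = [(t, "gh") for t in gh_events] + [(t, "so") for t in so_events]
    tagged.sort()  # one chronological timeline across both communities
    gaps = []
    for (t1, c1), (t2, c2) in zip(tagged, tagged[1:]):
        if c1 != c2:  # only GitHub<->Stack Overflow transitions
            gaps.append((t2 - t1).total_seconds() / 3600.0)
    return gaps

# Toy example: one user's commits on GitHub and answers on Stack Overflow.
commits = [datetime(2015, 3, 1, 9), datetime(2015, 3, 1, 14)]
answers = [datetime(2015, 3, 1, 11)]
print(cross_community_intervals(commits, answers))  # [2.0, 3.0]
```

The distributions the talk mentions would then be built over these gap lists across many users.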

Okay, and we found strong evidence for this signaling behavior. We found that experts are likely to be experts everywhere: those who rank highly, say, on GitHub are likely to also rank highly on Stack Overflow, and vice versa. We found that participating in Stack Overflow has this catalyzing effect on their coding activities: those who are on Stack Overflow tend to speed up their coding, to be more active.

Okay, another thing we did, that was at CSCW last year, is how people are migrating away from mailing lists. This is a plot from the R mailing list; anybody using R? Some of you, thanks. Right, so you can see here that in the last few years activity on the mailing list has been dropping sharply. Where are people moving to? They're moving to Stack Overflow.

Here we did a similar kind of analysis. Of course, some of the reasons are perhaps obvious: it has a wider audience, it has a better user interface, and things like this. But we were particularly interested in how gamification catalyzes this transition, if at all, right. The fact that on Stack Overflow people get points and badges for being active, whereas on the mailing list they don't; and on Stack Overflow they have these profile pages where they're signaling, whereas on the mailing list they don't. Again, we were trying to see if there are any effects related to gamification.

Similarly we looked at people, yes?

>>: I know that Stack Overflow positions itself as a repository of existing answers, not just as a place for asking questions; you can actually look at what's been there. Do you think discoverability plays a role in that transition? Do you think mailing lists were less discoverable than Stack Overflow?

>> Bogdan Vasilescu: Not directly, because you can also query stuff on mailing lists. But perhaps indirectly, in that search engines are more likely to point to something on Stack Overflow than on the mailing list. They're indexed better. Yeah, Chris.

>> Christian Bird: I've gotten some sort of R programming help on Stack Overflow. But there's also a difference: I go to, what I call it, Cross Validated…

>> Bogdan Vasilescu: That's right, we considered that too. I've simplified the plot here to simplify the presentation, but we looked at that too. We looked at the two StackExchange websites where R activity happens most often, Stack Overflow and Cross Validated. There are actually many others, Math and so on, where some R stuff is popping up, but to a much lesser extent. Yeah?

>>: But wouldn't that gamification essentially just affect the rate of answered questions rather than asked questions? When you ask a question you only care about gamification so much. It's more like when you answer questions: you answer a question on a system where gamification is present, whereas you're less likely…

>>: Well you’re probably more likely to ask the question in a location where you think you’re going to get an answer that’s both accurate and quickly answered.

>>: Okay.

>> Bogdan Vasilescu: Because people get… first of all, you can get points for good questions too. But say that's not your motivation to ask a question on Stack Overflow. I think the main motivation is, one, the bigger audience; you simply reach more people on Stack Overflow. And two, the fact that the people answering get points motivates them to answer very quickly. You're more likely to get a quick answer on Stack Overflow than you are on the mailing list, just because people are more likely to answer quicker on Stack Overflow. I think that's why you would post it there.

Okay, so here we did a similar analysis as before. We looked at the people who are both on the mailing list and on StackExchange at the same time, and whether, if at all, they behaved differently in the two. We found something very, very interesting. The same group of people answer questions on the mailing lists and on StackExchange, exactly the same group, those in the intersection, answering questions. And they do it so much quicker when they're doing it on StackExchange. The same people are so much quicker on StackExchange than they are on the mailing list, by a factor of four basically.

Okay, and why? For many reasons, some of which, as laid out in a survey that we ran, point directly to gamification. Okay, they enjoy playing the game. They like this. They get points, they're rewarded, it's a nice thing to do. Right, so they are doing it quicker because otherwise somebody else would beat them to it and they wouldn't get the points anymore. Yeah?

>>: Just a clarification: what is the profile of people who answer these questions? Are they professional scientists in physical disciplines, or are they developers? I think the incentives for those two groups will be fairly different; developers can put this on their resume, for scientists not as much.

>> Bogdan Vasilescu: If you look at the paper you'll find the breakdown of the kinds of people answering and asking on these venues. It's mostly developers and R package maintainers, or package developers. It's people who develop R that are answering the questions, and people who develop packages for R, mostly. There are academics too, and other people, but it's mostly people who develop R and packages for R that are offering user support on the mailing list versus on StackExchange.

The questions come from people like us, who are using R and need help. The answers come from the people who develop these packages. You know, we have trouble with them and so on.

>>: Was it not so much about monetization?

>> Bogdan Vasilescu: Monetization in what sense?

>>: [indiscernible] applying for jobs and so on.

>>: You asked me what was working, what did it say? You asked me [indiscernible]

>>: Yeah.

>>: Yeah.

>> Bogdan Vasilescu: Of course delivering a good product and offering support to the users motivates them too; I'm not saying it doesn't. But I'm saying that, in addition to all of these motives that were obvious, they are motivated by the gamification incentives that exist; they enjoy this. Because all of the other motives were valid for the mailing list too, right: they're trying to help the people that are using their software, they're trying to do all these other things.

In addition to that, they're getting points here, and this incentivizes them to do more, to do it faster. Does that make sense?

>>: It did.

>> Bogdan Vasilescu: Okay, alright, so to conclude this, the point that I'm trying to make is again that the social programmer is a person who has a public profile somewhere online. Who shares code, who is social, follows other developers, connects, builds a network, and so on, and signals.

What I’d like to do today for the rest of the talk is focus on two topics. One of them is what I call the social cost of social coding and the other, the predictability of social coding. I’ll go into each one of these.

What do I mean by the social cost of social coding? Right, we saw that these platforms are very competitive. If you don't hurry to give answers, somebody else will beat you to it. You want to show off that you're a good coder, so you want to fill up your contribution graph, and do a lot, and so on; be very visible. But there's a lot of literature that points to negative effects of a competitive environment on gender balance and gender representativeness.

Okay, so there's work saying that women tend to shy away from competition, whereas men tend to embrace it; also that in mixed-gender competitive environments women tend to underperform. Okay, and these environments that we've seen, like GitHub and Stack Overflow and so on, are obviously very competitive.

>>: Right, were there women among the authors, or were those only men? Can you tell from the submissions? [indiscernible] because that might be interesting.

>> Bogdan Vasilescu: I don’t remember could you look it up and see, I don’t remember.

>>: Okay, that’s a good question.

>> Bogdan Vasilescu: Yeah, I don't… please look it up, I don't remember. I imagine there would have been, but I don't know. Right, this brings me to the next point: the fact that on both of these very competitive platforms women tend to be underrepresented. If we look at Stack Overflow, a survey that they run every year, the most recent reincarnation of it showed that only about five point eight percent of their contributors are women. Yeah?

>>: Did you check how that relates to the gender distribution in code in general, like in the industry, or in other corners of it?

>> Bogdan Vasilescu: Like this?

[laughter]

Right, so if you compare these competitive online social coding environments to what happens elsewhere, you'll see that they do much worse, okay. The industry, across the board, does so much better. These are women employees in tech at the big companies; not just overall, but in tech particularly.

>>: I would say industry does less poorly.

[laughter]

That’s better.

>> Bogdan Vasilescu: Sorry, yeah, thanks. Okay, I rephrase: industry does less poorly than the social coding environments. Right, and also open source in general does less poorly than these particular two.

Yeah?

>>: I’m surprised because I’ve mostly seen open source as also highly competitive.

>> Bogdan Vasilescu: Yes, so that's why it's… it does more poorly than the industry, which does less poorly than we would like, and they should… What, I'm lost.

[laughter]

You've confused me now. Look what you did. I know what I said; scratch that. Okay, so perhaps, right, perhaps there's a reason. These are organic environments that tend to self-organize. Perhaps there's a reason why there are so few women on there. Perhaps it's not useful to have more.

Okay, so the question we tried to answer in a paper that just came out at CHI this year is: which kind of GitHub team is more effective? Is it the homogeneous one, where everyone is similar to everyone else? Or is it the more diverse team? I'll give you the sneak preview: the answer is obviously that the more diverse team is the one that is also performing better, the more effective team.

>> Christian Bird: You say obviously, why is that obvious?

>> Bogdan Vasilescu: Okay, I’ll get to that.

>> Christian Bird: That’s not obvious to the [indiscernible].

>> Bogdan Vasilescu: Okay, so, thank you Chris for playing devil’s advocate. Can I get back to you in a couple of slides?

>> Christian Bird: Sure.

>> Bogdan Vasilescu: I’ll answer this.

>> Christian Bird: Okay.

>> Bogdan Vasilescu: Alright, so the reason why it's not obvious, perhaps, Chris, was just one slide away: there are documented explanations as to why diversity has a negative effect on a team's performance. There's a lot of theory from sociology and social psychology and so on pointing to these negative effects of diversity.

One of them is called similarity-attraction theory. This basically says that we as humans prefer to hang out with others that are similar to ourselves, in terms of anything you can think of, demographics being one of them. You know, if you're, say, Macedonian you'd prefer to hang out with a Greek than with someone else, things like this.

[laughter]

Okay, the other one is social categorization theory. This says that cliques are likely to form in these diverse groups, where people form cliques based on things that make them similar. Members of one's own clique are treated preferentially, treated better than outsiders. Imagine cliques forming in this room, where we segregate based on whatever views we have. We would be more lenient towards people that share our views, but less so towards our opponents, and so on. This happens with demographics too. Yeah?

>>: You can imagine a direct effect: you have a better idea of how to treat well someone who's in your own label or group, rather than someone who's outside of it?

>> Bogdan Vasilescu: Yeah, so…

>>: Treating, like, two percent the same way… it's going to work out better if it's in your own group, because you're already optimizing for them, for their, like, subjective [indiscernible].

>> Bogdan Vasilescu: I guess the point is you're more likely to discriminate against people that are different from you. This will lead to, maybe, communication breakdown: somebody not speaking your language, or some language, as well as you; you're more likely to not communicate well with them. This will in the end lead to friction in your collaboration, which will negatively affect the cohesiveness of your team and therefore its performance.

This is kind of what the theory says. Okay, so, Chris, does this answer your question?

>> Christian Bird: Uh-huh.

>> Bogdan Vasilescu: Yeah.

>>: As an extreme, some people only work with those who share their last name, for instance, as you can see in this [indiscernible].

>> Bogdan Vasilescu: Yeah.

[laughter]

Yes, exactly, yeah. That's a very good point, thank you. You know, Prem will come and say: oh, Chris, you're wrong, because diversity has great effects. There are all of these examples of how diversity helps. Companies that have more diverse executive boards do better. People that work in multicultural settings are more creative; they are able to solve problems better. They're more innovative, and so on. There's lots of references pointing to this.

The reason why this works comes again from sociology. It's called information processing theory, which says essentially that by having access to different cultural and educational backgrounds, and broader networks, your group of people, your team, has better problem-solving skills, better adaptability, more creativity, and so on, which positively affects its performance. Teams that are better equipped to handle uncertainty, if you will, tend to perform better. Okay, if they can live with each other.

We did this on GitHub. The difference between what we did and the previous work on this is that GitHub is this very organic, open-source-like environment, where people are geographically and culturally distributed and communication happens most of the time through online channels. Very rarely are people actually co-located.

We also did this differently in that we analyzed large amounts of historical trace data from GitHub, rather than what people have done in sociology, where they go and observe some teams over a short period of time, run a controlled experiment, and see how these teams do. We did this historically, from large amounts of data.

We looked at two aspects of diversity. We looked at gender diversity; the reasons, I think, should be obvious by now, because women are underrepresented in open source, and also on GitHub. People are talking about how the hacker culture tends to be male-dominated and sexist. There are examples and references that point to discrimination against women, and so on.

We also looked at tenure diversity, another diversity attribute. You can think of tenure within a team with respect to overall experience: bringing very senior people and very junior people together in a team. But you can also think of it at the project level, where, regardless of how senior or junior people are overall, they're all junior or senior with respect to the project they're working on. They're all new to the project or not; some people have been with the project for longer, or everybody's coming in at the same time.

This is interesting in open source because open source has this natural, high turnover ratio; there are people coming in and out all the time. Okay, so it's great that we can do this. There's trace data publicly available from GitHub. People like Georgios Gousios from the Netherlands have been collecting it. There's the GitHub Archive. There are lots of venues where people can mine GitHub data from.

This is great. But what we've done is not a straightforward task, and I will try to explain why. Some of the reasons are theoretical. First of all, open source is known to be very meritocratic. It's known to be a meritocratic culture in which nothing but the quality of your contributions should matter.

There are, you know, plenty of references pointing to this: that it's your code, how much you contribute, and how well you can code that matters, not who you are. Right, open source advocates meritocracy.

So this shouldn't make any difference. Diversity shouldn't make any difference. Yeah?

>>: I get the impression that that model is right within contributors. But I feel like for users of open source it gets to [indiscernible] personality really quickly. Then your technical merit becomes less meaningful than your personal identity. Does that make sense?

>> Bogdan Vasilescu: We've looked at people who develop code on GitHub, rather than people who use code developed on GitHub. I don't know about the users in general. These are people who collaborate on GitHub to create something, whatever that something is.

>>: Right.

>> Bogdan Vasilescu: Alright, okay, another reason that this is not straightforward is that demographics are known to be less salient in open source in general, and in GitHub in particular. For instance, gender is not explicitly recorded. How could we even analyze gender diversity? Gender is not one of the things that you record on your profile page.

Finally, GitHub has made it very easy for people to contribute to a project, through the pull request mechanism, right. With a few clicks you can just fork any project that you want. You can make whatever changes you want in your local copy. Then you can submit a pull request, and anyone can do this. It doesn't take anything; you don't need access to anything. You just need a GitHub account, basically.

You need nothing. It's very easy for anyone to contribute to these things. Therefore, it's not entirely clear what constitutes a team anymore, compared to the very controlled environments that centralized version control was imposing, where you had people with commit access to a repository or not. Nowadays anyone can contribute to any project, so this is very fluid. Yeah?

>>: I disagree on that. Isn't a pull request just kind of a more formalized, tool-supported version of submitting patches to an open source mailing list? Is there really a difference? I mean, there's tooling that makes it easy, sure. But people weren't locked out before; they could contribute just fine.

>> Bogdan Vasilescu: That's right, they weren't locked out. But GitHub advocates this ease of contributing as one of their selling points. They're saying, you know, we made it so much easier than it used to be. It used to be more difficult: you had to maybe talk to some people, communicate with them, not just randomly get access to the code. Maybe you weren't able to clone the thing before, or to create a…

>>: Oh no, I don’t want to be [indiscernible].

>> Bogdan Vasilescu: I know.

>>: Before, like, this was [indiscernible] or some type of project. You could download a repository. You could create a change, create a patch, send it to a mailing list, and people would discuss it. It would be accepted or not. I mean, in terms of access, it seems like the access was exactly the same as it is now. It's just a tooling business.

>> Bogdan Vasilescu: Right, but I'm claiming… you're right, essentially pull requests are patches, patches 2.0, because of the tooling. But I'm claiming the more accessible tooling also makes for a lower barrier to entry for newcomers.

People have been documenting this in the literature. You know, [indiscernible] who have analyzed GitHub and done interviews with GitHub developers have pointed to this: how it's much easier for new people to come to GitHub projects than it used to be in traditional open source.

>>: Okay.

>>: I just want to point out a related thing at Amazon, for example. They've been taking the purchasing process down from two clicks to one, right, in a similar sense, because people will spend more money if the barrier is smaller. Even by one click, it's still a noticeable difference.

>>: I think there's also the fact that they put tooling on it. It's much easier as a researcher to gather pull requests than it is to gather patch contributions on mailing lists.

>>: Yeah, yeah.

>> Bogdan Vasilescu: We had…

>>: That's why we suddenly get a proliferation of papers on this: now you can automate the data collection. But anyways, go on.

>> Bogdan Vasilescu: Well, they're throttling you; you can't pull too much. It takes months to pull a large amount of data, unless you use one of these archived versions. If you pull directly from GitHub they throttle you and you can't do much.
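The throttling he's describing is GitHub's API rate limit. The `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers are real parts of GitHub's v3 API, but the helper below is just a sketch of the back-off logic a miner needs, not anyone's production crawler:

```python
import time

def seconds_to_wait(headers, now=None):
    """Given GitHub v3 API rate-limit response headers, return how long
    to sleep before the next request (0 if quota remains).
    X-RateLimit-Reset is a Unix timestamp for when the quota refills."""
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset - now) + 1.0  # small buffer past the reset time

# If the quota is exhausted and resets in 90 seconds, back off that long.
print(seconds_to_wait({"X-RateLimit-Remaining": "0",
                       "X-RateLimit-Reset": "1000090"}, now=1000000.0))  # 91.0
```

At a few thousand requests per hour per token, fetching millions of events this way is exactly why pulls take months, and why archived dumps are attractive.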

>>: But the reviewers are tougher nowadays.

>>: That’s true, damn them.

>> Bogdan Vasilescu: Alright, a couple of technical challenges to this. One was that gender is not explicitly recorded, as I mentioned. The other one was that people often contribute under multiple aliases: because these are long periods of time, they're likely to have changed their email addresses and things like this, so it's likely that their contributions are not all linked to the same account. Finally, this leads to very complicated and complex data; it's not entirely clear how to best model and analyze it.
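One common heuristic for the alias problem is to merge identities whose normalized names or email local parts coincide. This is a deliberately crude sketch with made-up data, not the paper's actual merging method, which would need fuzzier matching and manual validation:

```python
def merge_aliases(identities):
    """Group (name, email) identities that likely belong to the same
    person. Two identities merge if they share a key: the lowercased
    name with punctuation removed, or the email's local part."""
    def keys(name, email):
        norm = "".join(ch for ch in name.lower() if ch.isalnum())
        local = email.split("@")[0].lower()
        return {norm, local}

    groups = []  # list of (key_set, members)
    for name, email in identities:
        ks = keys(name, email)
        for group_keys, members in groups:
            if ks & group_keys:       # shares a key with this group
                group_keys |= ks
                members.append((name, email))
                break
        else:
            groups.append((ks, [(name, email)]))
    return [members for _, members in groups]

people = [("Jon Skeet", "jon@example.com"),
          ("jon.skeet", "skeet@work.example"),
          ("Someone Else", "other@example.com")]
print(len(merge_aliases(people)))  # 2
```

The two Skeet aliases collapse into one identity via the shared "jonskeet" key; the third person stays separate.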

I'll tell you how we did some of these things. We used a mixed-methods approach, where we tried to address some of these theoretical challenges through a user survey that we ran; this came out at CHASE this year. We combined this with mining trace data. Yeah? Oh, you were pointing.

>>: No, no, no.

>> Bogdan Vasilescu: I thought you had a question. Thank you. In the survey we asked many things.

Some of them are here. We asked people what they perceived to constitute a team in the teams they've worked on, what differences they recognize among other team members, and finally how, if at all, they feel affected by diversity.

We got eight hundred and something responses from seventy-two countries. We were very happy when we did that. We sent about forty-five hundred invitations out, so about a twenty percent response rate. Twenty-four percent of our respondents were women. This was on purpose, because we targeted a stratified sample rather than a random sample; we knew women are otherwise very underrepresented, and we wouldn't have gotten many unless we did a stratified sample.

Here you see the countries from which we had a higher responsiveness from women. The US is one of them: there tend to be more women respondents from the US than from other countries. For instance, in Germany, France, and Russia there are significantly fewer. This line is the median responsiveness of women, which is twenty-four percent overall. The dotted lines are one median absolute deviation above and below the median. Countries like the US and Brazil stand out as being very responsive, while countries like Germany and Russia stand out as being not very responsive.

>>: Does the [indiscernible] higher and [indiscernible] rate here is not.

>> Bogdan Vasilescu: That’s right Brazil is not, yes, we had few responses in Brazil so that’s not very trustworthy. But we had the most responses from the US, so this is very trustworthy.

>>: Survey invites, they went out in English? They didn't go out in local languages as well?

>> Bogdan Vasilescu: Survey invites went out in English, and we contacted a stratified sample of forty-five hundred people. Stratified by gender was one of the criteria. The other was, I think, their overall activity levels, something like this. We wanted both people who are hyper-active and less active people to contribute.

>>: I’m just saying that you know setting this up in English probably [indiscernible] select [indiscernible].

>> Bogdan Vasilescu: Yes, thanks [indiscernible].

>>: Does this graph adjust for the gender differences in the different countries, by gender ratios in their industries, or is…

>> Bogdan Vasilescu: No, this is just the response rate per country to our survey, no controls. Alright, otherwise in terms of age there was no difference between genders; the median was about twenty-nine. In terms of experience there was a very significant difference: women tend to have about two years less programming experience than men in our sample of respondents.

Okay, so some of the questions. What constitutes a team? We gave them many options to choose from. Some of them were less inclusive: explicitly the repository owners only, or those who have write access to the repository. Some of them were very inclusive, going all the way to everyone who does anything in the repository, no matter what they do, whether they submit a bug report or comment on a pull request, or whatever, anyone.

What do you think came out, anyone?

>>: [indiscernible]

>>: More inclusive.

>>: [indiscernible]

>> Bogdan Vasilescu: More inclusive, right. The number one ranked option, by far, was everyone who does something in the repository. People acknowledge, regardless of… you're doubting this.

>>: I'm just thinking about the biases introduced here by asking the questions this way. The moment you mention the word diversity you're likely to shift answers in that direction.

>> Bogdan Vasilescu: Right, so, people don’t want to portray themselves…

>>: Just as when you mention the word privacy, people will say privacy is tremendously important.

>> Bogdan Vasilescu: But, okay, when you ask them about gender diversity, or say sexism or whatever, they're unlikely to want to portray themselves in a negative light, right? Who would, why would you? If I asked you whether you're a racist or a sexist, why would you say yes?

But this was a fairly uncharged question, I think, in that it refers to contribution: perceptions of your team based on how much people contribute to a project. I don't think this is as charged as asking about diversity directly. But you're right, you're right, they may have…

>>: [indiscernible] fairly right it’s probably there.

>> Bogdan Vasilescu: That’s why…

>>: You know to go to a difficult [indiscernible].

>> Bogdan Vasilescu: That’s why we always try to mix the methods. We tried to mitigate the biases that one of the methods introduces by using the other method. Thanks.

Okay, in terms of differences that people recognize among others on their team, we gave them choices between some technical ones, like programming skill, reputation as a programmer, experience on GitHub, and so on, and demographics: age, gender, ethnicity, hobbies, and things like this. What came out is that, as you would expect, programming skills are the most recognized feature among the respondents. But the second most recognized, very interestingly, was gender. People are aware of each other's gender on GitHub, right.

This comes somewhat as a surprise, given previous literature from open source saying how demographics are not very salient in open source in general. Well, on GitHub, because of these very public profile pages, it turns out they are. Things like gender and one's real name are very visible, relative to other features.

Okay, finally some cherry picked quotes on their opinions about diversity.

>>: I, thank you for your honesty.

[laughter]

>> Bogdan Vasilescu: But I gave the tutorial yesterday, so I’m trying to adjust.

[laughter]

Many people said that GitHub is meritocratic and diversity does matter. “Code sees no color or gender”.

Some people said that diversity is very positive. That it’s a pleasure to discuss new ideas with people that are different and enriching, and so on. They find it an enriching experience.

Some people also pointed to negative effects. Here I've listed some of the ones related to gender in particular; you'll find more examples in the paper. Some of the quotes refer to women developers who had to quit projects because of sexist behavior against them, or who switched to using male-sounding handles or male-sounding profiles in order not to be recognized as women, because they felt they would have been discriminated against otherwise.

Okay, so bottom line: the team is everyone, demographics are very salient, and there are mixed opinions when it comes to diversity. Some say positive, some say negative, some say it doesn't matter at all.

We went on and mined GitHub data. In the end we had a sample of about four thousand projects after filtering, inferring gender, and so on. We inferred people's gender based on their names and countries as listed in their profile pages. For instance, you'd start by taking my location on GitHub and pass it through Bing Maps, seriously, through Bing Maps. Then it would tell you that Davis, California is located in the USA.

Then you would use this together with the first name to try to guess the likely gender, based on a tool that we developed a while back that has name lists for all these different countries. The reason why you need to know the country as well, in addition to just the first name, the classical example is Andrea, which in Italy is almost exclusively a male first name, while pretty much everywhere else it's almost exclusively a female first name. That's why you need country information to decide on the gender with higher accuracy.
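The country-aware lookup just described can be sketched roughly like this. This is a toy illustration, not the speakers' actual tool; the name tables below are invented stand-ins for its per-country name lists.

```python
# Hypothetical sketch of country-aware gender inference from first names.
# Real tools use large per-country name frequency tables; these tiny
# tables only illustrate the Andrea example from the talk.

# name -> {country where the usual guess flips}
NAME_EXCEPTIONS = {
    "andrea": {"Italy": "male"},   # Andrea is male in Italy...
}
# name -> most likely gender everywhere else
DEFAULT_GENDER = {
    "andrea": "female",            # ...but female almost everywhere else
    "bogdan": "male",
}

def infer_gender(first_name, country):
    """Guess gender from a first name, using the country to disambiguate."""
    name = first_name.lower()
    by_country = NAME_EXCEPTIONS.get(name, {})
    if country in by_country:
        return by_country[country]
    return DEFAULT_GENDER.get(name)   # None when the name is unknown

print(infer_gender("Andrea", "Italy"))     # male
print(infer_gender("Andrea", "Germany"))   # female
```

Without the country, "Andrea" alone would always resolve to the majority guess; the exception table is what makes the Italian case come out right.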

Okay, then, I'll skip through how we solved the aliasing problem, because Chris wrote the textbook on this; I'm sure you're aware. I'll tell you a bit about what features we computed. We looked at productivity as the number of commits by these teams per unit time: how many commits they make per quarter to GitHub projects, for each team. We looked at turnover as the fraction of change that happens in a team in the current quarter with respect to the previous quarter. You'd have a hundred percent turnover if the team this quarter is completely different from the team last quarter, and so on.
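One plausible reading of that turnover metric, sketched in a few lines (the paper may formalize it slightly differently):

```python
def turnover(prev_team, curr_team):
    """Fraction of the current quarter's team that was not present in the
    previous quarter; 1.0 means a completely different team, 0.0 no change.
    One plausible formalization of the metric described in the talk."""
    prev, curr = set(prev_team), set(curr_team)
    if not curr:
        return 0.0
    return len(curr - prev) / len(curr)

print(turnover({"ann", "bob"}, {"ann", "bob"}))   # 0.0: same team
print(turnover({"ann", "bob"}, {"cam", "dee"}))   # 1.0: complete turnover
print(turnover({"ann", "bob"}, {"ann", "cam"}))   # 0.5: half the team is new
```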

Then we measured diversity in terms of gender using the Blau index, which is standard practice in sociology and so on, and diversity in terms of tenure using the coefficient of variation, which is again standard practice.

Note that gender is a categorical variable, and we only had two classes; that's why you need the Blau index. Tenure is a numerical variable: you have how many years of experience a person has, and so on. That's why we used a different measure of diversity there.

Then we had a bunch of controls, like team size, since of course larger teams are likely to do more work; time and project age; and project activity, since projects that are overall more active are also much more likely to be active in a particular time window, and things like this.

The data looks something like this. It's nested, in that you have different quarters of data for a project, different windows over time. It's also cross-classified, in that different projects can find themselves in the same time window. This is a way to say that maybe in January two thousand twelve GitHub was overall more active than it was in January two thousand eight, or whatever.

Okay, so to model this we used linear mixed-effects hierarchical models. You can find the gory details in the paper. Just to tell you a bit about the results: controls behave as expected. Larger teams do more work per unit time. Projects that are overall more active of course also do more work in a particular time window. As projects grow older they tend to slow down their activities; perhaps they mature, so there's not as much change happening in older projects as in newer ones. All of these make sense.

Then in terms of diversity measures both of them have very significant positive effects on productivity after controlling for all of these things. The effect sizes are small however, so don’t get overly excited.

But the effects are very significant and positive; this is very clear. Okay, so teams that are more diverse with respect to gender, more balanced with respect to gender, tend to do better when controlling for all these other things. They get more stuff done. Yeah?

>>: So in diversity you're really comparing mostly male teams to diverse teams. Are there any teams that you found that are predominately female?

>> Bogdan Vasilescu: Very few, there are very few teams that are predominately female.

>>: Are the diverse ones more productive than the predominately female ones?

>> Bogdan Vasilescu: Diverse means more balance with respect to gender.

>>: Yeah.

>> Bogdan Vasilescu: If you have an all-female team, that would do worse on average than a more balanced male/female team, just like an all-male team would do worse on average than a more balanced male/female team.

>>: Is that from your model? The reason I ask is that your model is looking at gender diversity versus no gender diversity, which is probably going to be drowned out by the all-male versus diverse comparison.

>> Bogdan Vasilescu: That’s right, yeah, yes.

>>: Dynamic, so I'm wondering, have you actually compared the all-female, or mostly female, to the diverse? Just those two subsets, to see if there's a difference.

>>: You already have enough data…

>> Bogdan Vasilescu: No I don’t have enough data do that.

>>: Okay.

>> Bogdan Vasilescu: But what we have done in some recent experiments that we're running right now is look at the effect of adding one person to a team, broken down by gender. We see, at least preliminarily, that adding a male contributor to a team has about a one point three percent effect on productivity, whereas adding a female contributor has about a five percent effect on productivity.

Again we’re…

>>: On the positives.

>> Bogdan Vasilescu: On the positive side. We’re still working on this so it’s not out yet.

>>: On tenure diversity, I feel like projects which are more lively and productive would be able to retain more newcomers, so the effect could be going the other direction. You know, in a project which is not productive, someone's going to try to contribute and is going to quit right away, because they'll say, to hell with this.

>> Bogdan Vasilescu: We haven't seen that. But what we have seen is that teams that are more homogeneous with respect to tenure tend to stick around longer. Here you see that tenure diversity has a positive effect on turnover.

But you should read this the other way around, because turnover is in principle a negative thing. The more diverse, the more turnover; that's what this means. Teams that are more homogeneous with respect to tenure have less turnover. That's what we saw. Yeah?

>>: It seems like you have a perfect setting for finding control groups, right. You should have enough anonymous accounts, [indiscernible] anonymous accounts that don't have, you know, names embedded in them.

>> Bogdan Vasilescu: Right.

>>: Right, where the gender is unclear.

>> Bogdan Vasilescu: Right, so…

>>: [indiscernible], so have you tried to figure out whether advertising the gender versus not advertising the gender makes the same difference?

>> Bogdan Vasilescu: We have not, yet. Thanks, that's a very good point. We have only looked at a sample of four thousand projects over many quarters of their history, after all of this filtering to ensure that we could know the gender of the people in these teams. GitHub has millions and millions of projects, and because of all of these unknowns it shrank down to only about four thousand after the filtering. But we have not yet compared the projects for which we know the gender to those for which we do not, in terms of activity.

>>: Related question. Is there some research when female computer scientists were stating that they are male, for the sake of gender equality?

>> Bogdan Vasilescu: Yes, it’s called gender swapping. It has been documented somewhat in other fields, but very little in the computer science, software engineering domain. There is anecdotal evidence. There’s no…

>>: How do you find out?

>> Bogdan Vasilescu: No definitive because it’s very hard to know.

>>: Is it possible to do some data mining, for example some semantic analysis of the comments, to see some gender cues?

>> Bogdan Vasilescu: People have been trying to do this for authorship identification when it comes to writers in general: is this written by Shakespeare or someone else? These kinds of questions, and it turns out they can do so quite accurately. They can infer one's gender quite accurately from their writing.

>>: From natural text or…

>> Bogdan Vasilescu: All natural text.

>>: Because I can't imagine that you cannot do that on, for example, GitHub comments and messages. Because people tend to pick up the kind of language that only a…

>> Bogdan Vasilescu: Yeah, right, so we were very excited. We thought, well, we don't need to use this name-based approach at all, because these people have been doing it for years with natural writing quite successfully. Why don't we just do that instead? We can take all their comments and process them, and figure out their gender accurately based on that. We tried this on a small scale and it works horribly.

The reasons are exactly these: they're very technical comments. They use slang and all kinds of language different from what people use in proper writing. They're short as opposed to long, and so on. It's very hard to do this accurately.

>>: Maybe it would even be hard in the natural text, because they are basically in a predominantly male environment all the time. Even in natural language they might pick up some kind of male…

>> Bogdan Vasilescu: Although there are some documented differences in use of articles and prepositions, and things like this based on gender. Right, but we tried this very briefly and it didn’t work. That’s why we did this instead. Thanks.

Okay, so to conclude this part. I believe that we should start thinking about the social cost of these very gamified competitive platforms. Even more so as we have quantitative evidence that diversity has positive effects on a team’s performance.

Very quickly, how am I doing on time, Chris?

>> Christian Bird: You're pretty much at the end. But you can go ahead, ten minutes-ish.

>> Bogdan Vasilescu: Ten minutes.

>> Christian Bird: That would be good.

>> Bogdan Vasilescu: Alright, thanks. Yeah?

>>: What about different countries have you looked at that?

>> Bogdan Vasilescu: Not yet, thanks, no. We are working on that. You mean diversity in terms of different countries or?

>>: Geographical.

>> Bogdan Vasilescu: Or biases induced by say one’s country?

>>: Either way.

>> Bogdan Vasilescu: Yeah, we're working on that. Just to remind you about the pull-based model: traditionally, people who have write access to a repository can push code directly, and anyone who cannot push code directly creates a fork and submits a pull request. There are likely few people who can push directly and many who can't and submit pull requests.

Okay, and there's still typically a main development branch that everyone forks and that everyone pushes back into. But what we see happening more recently is that even people who could push code directly stop doing so. That's because pull requests go through this code review process: why not have all code go through the review process, instead of just the code coming from pull requests?

What we see both in industry and in open source, and there have been recent papers on this at ICSE this year, is that projects tend to migrate to the pull-based model entirely. Even people who have write access to these repositories tend to submit their contributions as pull requests, because then they go through the code review process.

This is great, right: all code gets reviewed nowadays. But this also creates a huge burden on the people who have to decide what goes in and what doesn't. This is a recent snapshot from Ruby on Rails on GitHub. They have tens of thousands of incoming pull requests, and at any given time they have hundreds of them open that still need processing. There's a huge review load for the typically very few people who need to decide what goes in and what doesn't, and how to prioritize these things.

We've looked at this in a paper that came out at MSR this year. We saw that about a third of all pull requests get processed within one hour, so this is great, and about another third within one day, so that's not too bad. But there's still another third or so that take, you know, up to years.

What do people do in order to handle this huge review load? One of the things they do is automate the process, the testing process specifically. Here you see these green flags indicating that these pull requests have been automatically tested by a continuous integration service and that all the tests have passed. Perhaps you can talk about how you guys are doing it here, but this is how people are doing it on GitHub.

Here's the anatomy of the code review of a pull request on GitHub. Somebody opens a pull request and you can see who that was. There's a title and a description associated with the pull request, and you can see that this one contains one commit that touches four files. Then comments start coming in from the project maintainers, pointing to different flaws in the contribution. There can even be inline comments on the code; here they're pointing directly to something that is missing in the code.

Then some more time passes and more discussion happens. At some point the continuous integration service comes back and says you know the tests are broken so you did something wrong, you screwed up something. Then you go again back and forth into this process. You keep fixing whatever failed and updating your pull request and so on. Until finally if you’re blessed then your pull request gets in, is merged by one of the integrators.

Okay, so we looked at how long this takes. As I mentioned, it takes about eleven hours, median, for a pull request to be closed, either accepted or rejected, from when it's submitted. It turns out the first response by one of the developers comes very quickly, about fifteen minutes in, median time. The continuous integration service comes back with all of the test results about forty minutes into the process.

Okay, so we tried to see how well we can model the time it takes for a pull request to be evaluated, to understand this process better. People have already been doing work on this; they've been trying to understand how it works on GitHub, and have pointed to a number of factors that influence this process. For example, how big the pull request is: if your pull request is huge, it will take longer to be evaluated.

Similarly, if it generates a lot of discussion it will take longer to be closed. Finally, the more ties you have with people on the project, the more connected you are with them, the better your social connections, the less time it will take for your pull request to be evaluated. They prioritize contributions from their friends; there are some social effects here as well.

We took all of these features and tried to model how well we can predict pull request evaluation time. We saw that it works fairly well, but it's not great: we get about a thirty-six percent fit using these features.

We went one step further and computed a number of other things, to see how much we can improve. One of them is the size of the description of the pull request. We think of this as a proxy for how complex the code you're submitting is: if your change is very complex, you're likely to describe it in a lot of detail so that people understand what you're trying to do. We confirmed that pull requests with longer descriptions take longer.

Also, of course, because this is a human process, if people are very busy, loaded at the time, with hundreds of pull requests to review, they're unlikely to be able to process yours very quickly. The higher their workload at the time you submit yours, the longer it will take for yours to be processed.

Similarly, it depends on when you're submitting your pull request: whether it comes on a Friday, or how available these people are at the time, maybe they're all sleeping, or whatever. We mined the history of their activity and built a pattern of their working hours. We saw when they're more active and when they're less active. Using this we could estimate how available they were at the time you submitted your pull request.
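One simple way to approximate such an availability estimate from trace data is a per-hour-of-week activity histogram. This is a sketch, not the paper's exact method, and the timestamps below are invented:

```python
from collections import Counter
from datetime import datetime

def hourly_activity_profile(event_times):
    """Histogram of a developer's past events (commits, comments, ...)
    by (weekday, hour-of-day) slot."""
    return Counter((t.weekday(), t.hour) for t in event_times)

def availability(profile, when, total_events):
    """Estimated availability at `when`: the fraction of this developer's
    historical activity that fell in that hour-of-week slot."""
    return profile[(when.weekday(), when.hour)] / total_events

# Toy history: a maintainer active mostly on Monday mornings (assumed data).
history = [datetime(2015, 3, 2, 9), datetime(2015, 3, 9, 9),
           datetime(2015, 3, 16, 9), datetime(2015, 3, 13, 22)]
profile = hourly_activity_profile(history)

print(availability(profile, datetime(2015, 3, 23, 9), len(history)))  # 0.75
print(availability(profile, datetime(2015, 3, 24, 9), len(history)))  # 0.0
```

A pull request arriving Monday at 9am hits this maintainer's busiest slot; one arriving Tuesday morning hits a slot where they have never been active.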

Similarly, the busier they are, the longer it takes for your pull request to be evaluated. We also saw that the first human response and the continuous integration service have very strong effects on this. The longer it takes for somebody to acknowledge your pull request with a comment saying, oh thanks for the contribution, we're going to look into this, or whatever, the longer it will take overall.

Similarly, the longer it takes for the tests to run, the longer it will take overall. Because they trust the test and they’re waiting for the test results to come back before they make any decisions.

>>: Isn't the time to first response probably related to the availability and workload thing?

>> Bogdan Vasilescu: I think it's related to prioritization. They see your pull request anyway; they don't need to acknowledge it in writing. What I imagine happens is that when they do acknowledge it in writing, they've already built an estimate of how much priority to assign to your pull request. If they don't acknowledge you for a long time, they're probably not very interested in your pull request at that time. Whereas if they respond immediately, they're probably very excited about what you've sent and more likely to process it quickly.

Okay, and then finally we saw that assignments happening in the pull request, people tagging other developers in the project, have strong negative effects on how long this takes. If you already know a person that is able to review your pull request, or interested in reviewing it, it takes significantly less time for it to be evaluated. Similarly, if it comes as a response to an open issue it takes less time to process: they prioritize pull requests that answer issues that are open at the time over others.

With all of these things added to the previous model we do much better: we get about a sixty percent fit. The conclusion here is that already forty minutes into the process, when you have received a response from someone acknowledging your contribution and maybe the tests have run, already this early we can estimate quite well how long your pull request will take. I thought this was interesting.

Finally, I want to conclude with something. Because we saw that this continuous integration service has such a strong effect on how long it takes to evaluate a pull request, we went on to see whether it has other effects too. Anecdotal evidence suggests that, one, continuous integration should speed up development, because people are free to focus on other, more interesting things; and two, it should also lead to better, less buggy code.

In a paper that's coming out at FSE this year, we found that teams that have switched to using continuous integration are able to merge twenty percent more pull requests coming from their core developers. At the same time, they reject significantly fewer pull requests coming from both their core developers and external contributors.

The reason why this is phrased differently is that we suspected there might be different processes affecting the pull requests that get merged and the pull requests that get rejected, so we built separate models for those that are merged and those that are rejected. Overall, from both, we can conclude that teams that have switched to using continuous integration are able to process significantly more pull requests: significantly more get in and significantly fewer get rejected, which again means that significantly more get in.

Okay, so we thought this was very interesting. Yeah, did you have a question? Oh, sorry. Two, teams who have switched to using continuous integration find forty-eight percent more bugs; I thought you might be interested in this. These are bugs reported internally. Again, we distinguish between bugs reported by members of the projects and bugs reported by users. We see no effect on the number of bugs reported by users, but we see a very strong effect on the number of bugs found by developers.

Chris is doubting.

>> Christian Bird: No, I mean, at Microsoft we care most about bugs that affect external users, right. I believe that the more bugs of value we find before something is released, the fewer external customers would see. It's interesting to me that devs are finding more problems, but the bugs found by external customers don't appear to be going down.

>> Bogdan Vasilescu: Yes, there’s no…

>> Christian Bird: It makes you wonder how critical are those extra forty-eight percent that are being found if the customer experience, external experience is in fact unchanged.

>>: Maybe because there are more features.

>>: Because of more features, yeah. The customer experience is better.

>> Bogdan Vasilescu: Because they merge more pull requests, there are probably more features being shipped.

>> Christian Bird: But the bug rate is the same.

>> Bogdan Vasilescu: But the bug rate isn’t affected negatively. Let’s spin this…

>> Christian Bird: It’s not affected negatively but it’s not affected positively either.

>> Bogdan Vasilescu: Well you know it’s, things are not getting worse. More stuff gets done.

>> Christian Bird: Don't you think it's a matter of equilibrium, right? We could have bug-free software, and we'd have four programs we could run on it.

>>: Yeah, if it did nothing. Yeah, I see what you’re saying.

>> Christian Bird: Or we could have four hundred, right, so I bet just evening out to you know a good market balance.

>>: Or could you just attribute the forty-eight percent to the increased number of pull requests getting in?

>> Bogdan Vasilescu: Yes, yes, yes, we’re adjusting too many things.

>>: You’re not making a causality statement here, right?

>> Bogdan Vasilescu: No.

>>: This is a correlated…

>> Bogdan Vasilescu: This is an association, the effect of one of the predictors we used in our regression models. There's no causality statement at all. It's purely a statistical association between the number of defects per unit time, which was our bugginess measure, and usage of CI versus not, as one of the predictors, when controlling for many different things.

>>: But is the usage of CI also correlated with other good things, like, I don't know, code complexity, or other good habits, good [indiscernible]?

>> Bogdan Vasilescu: We haven't looked at that yet. We've only looked at whether they can merge more code when they're using CI versus not, as a measure of how effective the process is, and whether this leads to buggier code or not. Because, arguably, if more stuff gets in maybe they cannot keep up with the bugs as much, and there would be more bugs coming out as a result.

But that’s not the case. It’s also not improving. It’s not helping either but at least it’s staying there. We haven’t looked at other measures yet.

>>: I just wanted to add that it's stronger than just correlation, because we're looking over time. While we're not confirming causality, we are making a stronger statement. If we assume some continuity of behavior, that people behave similarly in one period to how they behaved in the previous period overall, then an effect that takes place over time is stronger than a plain association, because we're looking at using CI versus not using CI, which comes later in time.

>>: Right.

>>: It’s an unconfirmed causality.

>>: But what you have to look at switches which is to say…

>>: Of course you…

>>: Then [indiscernible] and then toast could [indiscernible].

>>: Sure, sure, but it’s stronger than just the correlation which is no time, no temporal.

>>: Yeah, I know.

>>: [indiscernible] the continuous integration that [indiscernible] are all dynamic methods, right. There are no static methods here; well, I don't recall seeing any static methods in CI, like running some kind of static analysis [indiscernible].

>> Bogdan Vasilescu: Some projects are starting to do this, but I don't have figures on how widespread it is. They're starting to build static analysis checkers into the continuous integration services and run those.

>>: I think it’s more typical for static checking to be built into the build system rather than the testing system, because CIs were like running tests but…

>> Bogdan Vasilescu: It’s also building.

>>: It’s also building, totally, [indiscernible].

>>: Like [indiscernible] tools run over night, right. They’re not as interactive as…

>>: No.

>> Bogdan Vasilescu: The only proxy for that that we control for here is the number of tests available in these projects at the time. But we don't know exactly what the tests are doing, whether they're static checkers or other things; we just know how many tests there are in the project.

>>: It is [indiscernible] started [indiscernible]

>> Bogdan Vasilescu: Yeah, alright, so this brings me to the end of my talk. To wrap up this part: pull request evaluation is very predictable, and the usage of continuous integration has some interesting effects on bugginess and on how effective this evaluation process is. That's it. Thank you.

[applause]

>> Christian Bird: Alright, any questions? Thanks a bunch.

>> Bogdan Vasilescu: Sure, my pleasure, any time.

>>: [indiscernible]

>> Christian Bird: Okay, so Bogdan is in town for the rest of today if people want to contact him afterwards. Again, I’ll send out a follow up email to any questions with his contact info. Yeah, that was really interesting.

>> Bogdan Vasilescu: Thank you.
