>> Tom Zimmermann: Thank you all for coming, my name is Tom Zimmermann. I [indiscernible]
Microsoft Research and it's my pleasure to welcome Tony Wasserman to Microsoft Research.
Tony is a professor at Carnegie Mellon University in Silicon Valley and he has a very long history
with open source and he's also very engaged and very active in the open source community.
He's on the Board of Directors of the Open Source Initiative. He has many other
accomplishments; he has received Distinguished and Influential Educator awards from
[indiscernible]. He is also an ACM Fellow and an IEEE Fellow. We are very excited to have him here and
today he will talk to us about how to evaluate open source software, and I guess he's indicating
it's not just open source, so I'm really looking forward to your talk.
>> Anthony I. Wasserman: Great. Thanks. Thank you Thomas, and thank you all for coming.
It's a pleasure to be here. It's my first time to Microsoft Redmond which strikes me as unusual
considering how long I've been doing this and how long Microsoft has been up here, but thank
you for inviting me. This talk really has its origins back in the day when I was the founder and
CEO of a proprietary software company and this is really back in the dark ages, but, you know,
we would have our product and it was an enterprise software product that maybe a few people
know. It was Software through Pictures which is an integrated development environment that
ran on a heterogeneous network of UNIX workstations. We had some competitors that we ran
into in some of these sales opportunities, and we would always wonder why we would lose,
because you don't win 100 percent of the time. No salesperson does, any more than a sports
team wins all of its games. So I got interested in how people evaluate, how is it that people
chose us? Now a marketing approach to it said well there are certain features, advantages and
benefits that we promote, but it turns out that in most cases the choice didn't have anything to
do with that. And of course it was a closed source product so people didn't go dig around and
look at the source code to see how well it was written and they didn't have a chance in general
to meet our development team. So how did they evaluate it? So that was the genesis of my
interest in the process of evaluation. And we thought that some people appreciated the
architecture of our product, which we emphasized a lot and that sometimes was, in fact, the
difference for people to choose our product over somebody else's, the customizability and the
extensibility, all of those kinds of things. The product, the Software through Pictures, actually
included open source software. We were one of the first companies to take open source
software and put it in a commercial product. Sun Microsystems was ahead of us but we
weren't far behind, and so when people looked at the license for our product it had an
acknowledgment to the University of California Regents and it had, you know, a BSD license
associated with that little piece. So my interest in history of open source goes back to the same
era being around Berkeley at the time of the development of BSD UNIX. So of course looking
back on all that years later, we see that open source is everywhere and almost any type of
application, almost any piece of the infrastructure has open source components and in some
areas is dominated by open source products, so if you are looking at Java development you
see that Eclipse is far and away the dominant development environment. If you look even at
HTTP servers, the Apache HTTP Server is running a majority of the world's websites. So
particularly in infrastructure and application development tool layers, open source has a very,
very strong position and has won the hearts and minds of developers everywhere for the most
part. Not exclusively but enough that people sort of assume, okay. This is how I'm going to do
things. And it really has a huge impact on the way that people start software companies these
days. You know, when I started this company way back we had to buy hardware. We had to
buy software licenses. We had to buy all this other stuff. Today, you don't have to do any of
that, because you can use hosted trials of various software. You can get the developer-level
resources available to you on Azure or on Amazon or on any one of a number of cloud hosting
services. So the mentality of developers has changed and the cost of building something has
changed. So the question still, though, is how do people find software that meets their needs?
Now there's some people who just go and scrounge and they look for whatever they can find
that's free, but when you're talking to a broader audience, when you're talking to people who
are the decision-makers in industry and government and nonprofit organizations and you look
around the world, enterprises as well, you know, they have traditionally used packaged
software and they've often either bought the software themselves as is the case in many
enterprises, or they've contracted with somebody to build software for them either using
existing packages like SAP, for example, or they've had custom software built. And we're not
really going to worry about custom software other than to note that, you know, most people
buy their suits off the rack to the extent that they buy suits and a very small percentage of
people go and have them made specifically to order. Now that's going to change because, you
know, you can get people's measurements now. But for the sake of our discussion today we're
going to focus on people who are choosing existing packages of software without the idea that
they are going to make major modifications to it. So when you go and you ask people so how
did you choose that piece of software? Why did you decide to use Visual Studio, or why did you
decide to use SAP for your enterprise product, or salesforce.com, or whatever. You get a whole
bunch of answers. And, so oh, I've used it before. And now here's a new project. I'm in a new
company and I think that'll meet my needs. Maybe they read a review in some publication.
You know, it used to be that we would get these monthly PC World, MAC World, Byte
Magazine, Computerworld and so on and they would do professional reviews of various
software. Today we find a lot of that online. The sources may be more or less reliable, but you
certainly can take the name of any product, put the word review next to it, and you are going
to generate a lot of options for you to go consider. Professional review
also involves the analyst firms, so, like, Gartner put this in an upper right-hand quadrant; that
means it must be good. All right. Of course, the industry analysts probably haven't actually
gotten their hands on the software to use it for the intent for which it was built, so they're
picking up that information from others or from the vendor. Some people are sold by the
salesperson. It's not that the software is that much better than anybody else's, but it's an
enterprise product and you're building a relationship with the company and the vendor and you
like the sales guy. He answered your questions. He brought you the sales engineer, got you to
talk to the executives of the company and you just say okay. I think these guys will take good
care of me. Sometimes there will be an organizational standard. Gee, I'd like to use this
database system but the corporate standard is that one and, you know, I don't want to go fight
that battle so I'll just do that. There may be a discussion in a forum. People might be talking
about content management systems, let's say and you might do a trial. You might get a
recommendation from a friend or a colleague and a lot of decisions get made that way, and not
just in software. Should I see this movie? Should I go to this restaurant? How do you like your
Ford? We all do rely on personal recommendations from trusted friends and colleagues. And
last is this idea of a detailed internal evaluation where somebody in the organization has the
job of doing a bake off among several competing products and it always sort of goes the same
way. They create a matrix of some features that they think they want and some metrics about
the product and they go beat on these things and they have some sample example that they
are going to work on and they go down and check, check, check, check and somebody gets the
highest score, but often the scores don't differ by very much and so the decision ends up being
fairly arbitrary anyway. So this whole issue of evaluation seems to be very flaky and not
anywhere near as scientific as you would like to think. There is also now as the open-source
area has grown this whole issue of evaluating proprietary versus open source software. So
when you're dealing with proprietary software you certainly have a vendor to talk to for many
products, although that's turning out to change because when you're buying a product for 99
cents you're probably not doing much of an evaluation. You're just buying it and the like. But if
we focus our discussion on software that's going to be used in an enterprise that has a significant
price tag associated with it, capital expense, ongoing maintenance and the like, you know, you
have, in the evaluation of any proprietary product, a professional sales team. Maybe
that's somebody who actually shows up at your location and makes a presentation and maybe
is able to give a demo personally or bring along a second person who can do a demo.
Making a sales call is very expensive, so if you're selling a product for a couple of hundred
dollars you're not going to send a salesperson, but you may have also a telesales team,
somebody who smiles and dials and calls up everybody who's visited the website or expressed
interest at a tradeshow or something. One of the things about packaged commercial software
is that a lot of those reviews exist; the industry analysts follow things. You can talk to the
vendor and they'll tell you something, at least a little bit about their product roadmap if the
opportunity is large enough. When you look at open source software, the picture changes
because for the most part open source software doesn't come with a marketing team and a
system engineer and doesn't come with salespeople with field offices who are going to drop in
on potential customers. You have three categories for our discussion here. The vendor led
projects, the foundation based ones and the community based ones. And let me just take a
moment on each of them. So a vendor led open source project really looks a lot like a
traditional commercial software vendor. The reason for that is that when you're making a pitch
to an enterprise they're going to be interested in support. They're going to be interested in the
roadmap. They're going to be interested in whether you as a company are going to be around
to help them. They're going to evaluate the open-source vendors the same way that they
evaluate the proprietary vendors. The main difference being the availability of the source code,
maybe not exactly the same edition, but close enough that they can use the open source
version indefinitely, try it out, and that turns out to be very useful particularly for start ups, you
know, because I don't want to spend any money on software until I get to the point where I'm
able to generate revenue for my product. So maybe it's only at that point that I'm going to
want commercial support and the security and peace of mind that offers. So when you look at
these open-source products that come from vendors, products from companies like Talend,
Jaspersoft, SugarCRM, they provide indemnification, which is something you won't find in most
open-source software. They have an in house development team so, you know, everybody who
builds the software works for the company. There's support, training, all of the things that you
would find in a proprietary company. So when you start to evaluate those companies’
products, you know, it's very similar to the evaluation that you would do for a proprietary
product. Now as you get into foundation-based and community-based projects you don't have
the same idea of a sales team. You don't have the people that you can call up in the middle of
the night to ask for help, and so companies that are particularly unfamiliar with open source or are
risk averse are going to be reluctant in general to choose these types of projects, although
they do. They don't always know that they're doing it. They are using something that's using
the Apache HTTP Server, or they're using Mozilla Firefox, or something like that, but, you know,
there are a number of foundations that develop and maintain these open-source projects. We
know about the Eclipse Foundation, the Apache Foundation, the Linux Foundation and a few others,
so when you use software that's foundation sponsored, all the software from that foundation
comes with its own license, so all the Apache projects have an Apache license, just as all the
Eclipse projects have an Eclipse license, and those licenses are pretty permissive in terms of
what you can do with them and your ability to redistribute them and use them in other products
and so on. You will find in some of these cases that somebody will start a company whose job it
is to provide support for those products. So that's a way that those kinds of organizations can
generate some revenue and can also provide some comfort to enterprises that want to use
those products but want to have somebody to hold their hand and help them do that. So
Hortonworks, for example, provides commercial support and enhancement, in fact, for Hadoop,
but all the contributions that they make go right back into the Hadoop project into Apache
Hadoop, so there's no notion that here's our product and here's their product. CollabNet has
an enhanced version of Subversion. There's Pivotal Cloud Foundry, and there are commercially
supported versions of OpenStack, so you have a bunch of these kinds of things. And then you
have the pure open source projects and I put pure in quotes, but in the sense that there's no, in
most of these there's no intent to commercialize. That doesn't mean that you can't use them
and in fact lots of R&D organizations and students and pieces of businesses that are not
business-critical will use these, but a lot of them have very limited histories and very small
teams. There are some that are, in fact, very well-established and widely used. Drupal is a
good example that is thought of as a community open source project, but in fact there is a
company, Acquia, that provides commercial support for the community-based Drupal project.
And things like MediaWiki, there's, you know, all of Wikipedia runs on MediaWiki among other
things, but it's a community-based project. So volunteers, some set of the volunteers are
committers. Those are the ones that can actually commit code to the code base. Support
comes from forums mostly, so each project will have its own discussion forum, but there are
also very interesting and useful third-party forums like Stack Overflow that field questions and
have discussions about many of these kinds of projects. One key difference on many of these
projects is that there's no roadmap and release schedule, which is what you expect on
commercial products and on vendor-supported open source; everybody asks, well, when's it
going to be released, and the answer is it's ready when it's ready. And, you know, with volunteer
projects and I'll say something more about that later, you're really counting on people to
devote time to it and people have family emergencies of various sorts. They have business
commitments that get in the way and gee, I can't work on that this month. So there's an aspect
of uncertainty about when some of these things will be released. Now if you are out there
looking for software to use, how do you find something that's worthy? I put worthy in quotes,
but what I mean of course is how do I find some open source software that's going to meet the
needs that I have whether that is as an end-user application to do video editing or a component
that I can include in some product that I'm going to build and distribute to others considering
the appropriate licenses. But how do you find it? You go out to GitHub and the last time I checked
there were 6 million projects out there. Not all of them have any license description at all, so
I'm not going to use any of those, but there are still millions that do have licenses, and then I can
go to SourceForge and a bunch of other sources that host open-source projects. So going out
and finding something that meets my needs is not trivial. If I go to CMS Matrix, which is a
website that keeps track of all of the content management systems, the last time I checked
there were about 400 open-source content management systems. I mean that's mind-boggling.
Where do you start? So the first thing I would ask informally is what does it mean to be worthy
in this sense. Well the first is that if you're going to use it for something, you know, where you
are expecting it to work and to support your business or your organization or your government,
you probably want it to have relatively widespread use. People who use it early on will be
research labs and universities and other people who are not doing business-critical kinds of
things, but you want people to use it. And you want some documentation, support and
training. Now maybe the documentation is a book. Maybe it's a website with Q&A. Maybe it's
a video or some combination of those things. You want an active project. When you, when I go
look at themes and modules for Drupal, I'll look at when the last version was released. Oh, it's
7.0 Beta 12 released in July of 2010. You know, to me that doesn't look like it's a project that
somebody is continuing to work on, so I'll be reluctant to pick that up and use it for any purpose
where I might want support. I'd like to have a stable core team, so I can go in fact and look at
who are the committers on that team and how long they been there and how many people
have committed code over time and so on. And then of course it has to do what I want it to
do and it has to perform relatively well even if it's open source and free, because I'm going to
work on what I want to work on and I'm not coming to this evaluation for the purpose of
finding a project to work on. That's a completely different kind of search. Because people do
that all the time, right? They say gee, I'd like to go contribute to an open source project and
then they will go and look at things and find things that have long issue lists and are looking for
people to contribute. And last thing it might be that it has some innovative features, but that
may not be what I'm looking for. So what I took away from this and maybe what I'd like you to
take away from this discussion so far is that it is all very soft. There's not much in the way of, a
way to measure, to do an evaluation that's quantitative, and so that's what we looked at in the
issue of evaluation and we came up with this project that was called the Business Readiness
Rating. That's what we called it in 2005 and it was started by people at SpikeSource which is a
company that is no longer with us. They got acquired by Black Duck, but really they were never
successful at their original mission. There was somebody from O'Reilly. There was some
funding from Intel and there was me. The idea was, the business readiness rating was to be
able to quantify and help people do some kind of quantitative evaluation of open source
software so that if they did want to use open source software, they had a way to figure out
what was going to be worthy from the standpoint of business, commercial, and government use. So
one of the things that always comes up when you talk about acquiring software or building any
kind of project is risk. So when you go and work with a startup, they are willing to accept a lot
of risk in terms of using new technology if it helps them get their product to market faster or if
it has some cool feature that they see as giving them an advantage. But as organizations get
larger and they have a larger user base and there's more revenue, then they become
increasingly risk averse. And you see that even, you know, as some of our favorite startups
have grown up. I mean Yahoo now is a totally different company than it was when it started in
1994 and we could argue as well that Facebook and Google and other companies that have
grown quickly, they're still willing to accept a fair amount of risk, more so than insurance
companies and medical instrument companies and banks, but one of the issues for them is to
reduce the risk in their choices. And that's one of the reasons why many of these companies
and governments rely on a relatively small number of vendors because those are the ones in
which they have business relationships. Those are the ones that speak the right language and
they are willing to pay a premium to use those. When I first looked at SAP, you know, I was just
appalled at what people pay to have an SAP installation for their own organization. I mean, this
is millions of dollars, even for the basic functionality, and fewer people are doing that now
which is not such a good thing for SAP’s future. But anyway, the idea of the business readiness
rating was to provide a trusted and unbiased source for evaluating open source software. The
business readiness rating turns out for me to not be a very satisfactory name and so I thought
well, can we come up with a better name? And so I recently chose OSSpal, and the reason for
pal is that it has a double meaning. The guy from SpikeSource who worked on this at the
beginning was Murugan Pal, who is no longer with us, and I thought that I would take advantage
of the nice name OSSpal, which would also serve as a tribute to his memory. So the idea is to be
empirical, to be able to come up with some way to calculate from a piece of open source
software which projects are worth exploring. We came up with, originally came up with 12
categories, and my basic understanding of cognitive science told me, well, that's clearly too
many. And then there was my own personal behavior: somebody would say, well, what are the 12
categories? And I would get to seven or eight and then the rest would kind of... ha. So I reduced
it to seven and grouped them a little bit and I don't think there are any surprises here in terms
of the categories that we chose: functionality, nonfunctional aspects, support and services,
licensing, project management, documentation, and community. So those are important areas
for an open source project. For example, licensing may be a showstopper. Some companies
just won't use any software at all that has a GPL license, for example, and some companies
won't use software that has a BSD license because they are going to distribute that software
and they don't want somebody to then take that source code and replicate it and close it and
put it in some other product. So in some circumstances the licensing carries a very, very heavy
weight. So the general concept, then, is this: here are the seven categories, and when you do an
evaluation you are going to assign relative weights to each of these categories. Functionality is
probably going to get a relatively high number. The licensing may get a very high number or a
very low number, depending, and you may or may not care about how that project is managed,
because sometimes you just say okay. I'm going to take the software that exists right now and
that's what I'm going to work from and I'm not going to worry about how that project works,
because we all know that people come and go on projects. Things get dropped; things come
back. It may or may not be important. You may want to read the code. Some people do.
That's an advantage that you have choosing open source. You can say, you know, this is good
code by whatever measures you like; you can evaluate it. And one of the things we talked
about earlier is that we want an active community. Well, you can actually see that, and you can go into
the forge and look at the project and you can see how many contributors there are and how
frequently the code has been updated and you can see how many downloads there are. You
can see how many open issues there are and whether they've been assigned or picked up or
volunteered. So there is a lot of quantitative information that you can dig out from the project
itself. So we had the idea that you do a four-phase process. The first one we call the quick
assessment filter, which is really a way to rule out things. So for example, I have these 400
content management systems and I'd say, well, I'm going to rule out all of the ones that are GPL,
or I'm going to rule out all of the ones that are written in C. Why? Just because I feel that way,
you know, so it's just a quick assessment to narrow things down.
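To make that quick-assessment step a bit more concrete, here is a minimal sketch, in Python, of what such a rule-out filter might look like; the candidate records, field names, and criteria are invented for illustration and are not drawn from any real forge or from the BRR tooling.

    # Minimal sketch of a quick-assessment filter (illustrative only; the
    # candidate records, field names, and rule-out criteria are hypothetical).

    candidates = [
        {"name": "CMS-A", "license": "GPL-2.0", "language": "PHP"},
        {"name": "CMS-B", "license": "MIT", "language": "PHP"},
        {"name": "CMS-C", "license": "BSD-3-Clause", "language": "C"},
    ]

    # Rule-out criteria chosen by the evaluator, e.g. "no GPL, nothing written in C".
    excluded_licenses = {"GPL-2.0", "GPL-3.0"}
    excluded_languages = {"C"}

    shortlist = [
        p for p in candidates
        if p["license"] not in excluded_licenses
        and p["language"] not in excluded_languages
    ]

    for project in shortlist:
        print(project["name"])  # only the candidates that survive the filter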
Now the target usage assessment lets you think about, what am I going to use it for? If I'm going
to use it in an R&D setting, then my priorities for those aspects are going to be different than if I'm going to use it in
a production environment. In the production environment I'm going to be much more
concerned about the functional and nonfunctional aspects, the support that I can get and the
response time and so on. And then the third phase involves the data collection, pulling the
information that you need to do the evaluation mostly out of the forge, but not entirely. And
then normalizing the data, and that gives you the Business Readiness Rating or the
OSSpal score or whatever. So here's a picture of how it might work. You've got open-source
project data over there on the left and, you know, you have to sort of take the data that you've
got and group it somehow. So, you know, if I have downloads, I can have anywhere from 0
to a hundred million downloads, so I might want to have a normalized score which says, all right,
look: if it has more than a million downloads I'm going to treat those all as a score of 5, let's
say, and if it's got fewer, then you normalize. You normalize each of those categories into
some kind of a metric that reflects the category. And then those metrics actually are used to
calculate the various functionality, quality, and adoption metrics. So sometimes you'll use more than
one metric from the project data to help you compute the functionality or the quality.
Adoption is going to draw on the number of downloads primarily, but it may draw on the
number of comments or something else as well. So you can see how those metrics map their
way into the different categories and then you can have weights associated with the category.
If you've got something that's, in this case, some kind of internal component, usability may get a
score of 0, and there are other situations, of course, where usability is going to get a very high
score by contrast. Those numbers can be tuned for each evaluation that you do, and they can
be tuned by each organization, and even within an organization, because you have different
purposes for the software. So that's the model: being able to draw on hard data to help
you assess, as much as you can, the suitability of that software for your intended use.
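As a rough illustration of that normalize-then-weight step, here is a small sketch; the download thresholds, category weights, and scores are invented numbers for illustration, not the official BRR or OSSpal values.

    # Rough sketch of the normalize-then-weight calculation described above.
    # All thresholds, weights, and scores are invented for illustration.

    def normalize_downloads(downloads):
        """Map a raw download count onto a 1-5 score."""
        if downloads >= 1_000_000:
            return 5
        if downloads >= 100_000:
            return 4
        if downloads >= 10_000:
            return 3
        if downloads >= 1_000:
            return 2
        return 1

    # Normalized 1-5 scores for each of the seven categories, however derived.
    scores = {
        "functionality": 4.5,
        "nonfunctional": 4.0,
        "support_and_services": 3.0,
        "licensing": 5.0,
        "project_management": 3.5,
        "documentation": 4.0,
        "community_adoption": normalize_downloads(2_400_000),
    }

    # Relative weights chosen for this particular evaluation; they sum to 1.
    weights = {
        "functionality": 0.30,
        "nonfunctional": 0.15,
        "support_and_services": 0.15,
        "licensing": 0.10,
        "project_management": 0.05,
        "documentation": 0.10,
        "community_adoption": 0.15,
    }

    overall = sum(scores[c] * weights[c] for c in weights)
    print(round(overall, 2))  # overall rating on the same 1-5 scale

An organization that doesn't care about an aspect can set its weight to zero, and a showstopper such as licensing can be handled either with a dominant weight or back in the quick-assessment filter.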
So that's the concept, and we tried it on some examples, and it worked. Getting the data out
turns out to be an interesting problem. I'm going to skip over some of this in the interest of
time, but within a forge you can go in and of course extract data from a particular project. The
people who manage these forges don't like the idea that you kind of go in and suck data out of
100,000 projects. That's considered unfriendly in terms of the use of the forge, but the idea is
that you are going to pull that data. But there are also other sources of data. There may be
books, for example. You might want to go and look at Amazon for instance and, you know, if I
go and I am doing an evaluation of Drupal, I can go out to the Amazon site and discover, oh,
okay. There's some number of books on Drupal. I can go to Google Scholar and see maybe
there are articles written, you know, papers are written about MapReduce and Hadoop. So
there are a lot of different sources you can use in addition to the forges. You can also use some
of the techniques that are used by some of the search engines. You can find how many links
are there to a particular project, how many people point you to this. Very often when you have
a content management system and you set up your page, it says powered by [indiscernible] or
powered by Drupal and so you actually can get some metric as to the number of inbound links
to a page. So there are just a whole lot of things that you can do that give you additional data
on the popularity and the adoption of a particular open source project. So where do you get
the data from? Well there are some people who have actually gone out and gotten the data.
So there's a project called Ohloh, which was started by four ex-Microsoft guys, and over time the
project got acquired by Black Duck. If you go to Ohloh.net it runs as an independent site and
you can see for any project how many lines of code and how much has been committed and
how many people committed and how many lines of code each person wrote and how many
different contributions they made and so on. There's a lot of data there and it is available. The
one that's easier to use in some respects is an academic project called FLOSSmole and
FLOSSmole, through agreements with different forges, is able to extract data
and store it in a handy way for future analysis and use, so that's a very, very good source of
data. And then there's a site called FLOSShub which has links to lots more projects. So I'm
going to skip over the slides, but Ohloh lets you show the lines of code and the number of
commits and the number of developers, and those that come in via Subversion versus Mercurial, so
you can see some crossovers. You can get project detail, here's the code, how many files are
there? What languages are they written in? How many lines of code are there? How many of
those are code and how many are comments? That kind of thing. FLOSSmole gives you kind of
summary data about projects, and all of this is just presenting information that you can extract
through a programmatic interface from the FLOSSmole libraries.
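FLOSSmole distributes what it collects as downloadable delimited text files and database dumps, so, as a hedged sketch, here is one way you might summarize such a dump once you have fetched it locally; the file name and column name below are placeholders to be checked against the actual dump you download.

    # Hedged sketch: summarize a locally downloaded FLOSSmole flat-file dump.
    # The file name and column name below are placeholders, not the real schema.

    import csv
    from collections import Counter

    projects_per_language = Counter()

    with open("flossmole_projects_sample.txt", newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter="\t")
        for row in reader:
            language = row.get("programming_language", "unknown")  # placeholder column
            projects_per_language[language] += 1

    for language, count in projects_per_language.most_common(10):
        print(language, count)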
We tried doing this in an earlier version of the Business Readiness Rating. This is like the quick assessment piece where
you could say okay. Well I want something that's in a particular language or I want it running
on a particular database or running on a particular operating system, so you could check those
off and then it would screen them for you. So what happened to the project? We think the project
was a good idea, but it was an open source project and the community, we didn't grow the
community. That happens to lots of open-source projects. They kind of fade out. In our case it
was the guy from O'Reilly moved back to New Zealand and was starting a business out there
and so this was off of his radar. Murugan Pal, who was the guy from SpikeSource, got sick
and so that left me and, you know, I didn't have enough resources. So the other thing is that
we can talk about how to do that evaluation and you can do it on paper. You can access that
data, but in fact, that's a pain. If you're doing two or three tools, yeah, sure fine. And, you
know, I had student teams do it and they came up with results that were believable and as you
would suspect the products that had vendor support, the vendor-based open-source projects
came out with higher scores than the foundation ones which in turn came out with higher
scores than the community ones, no surprise. So that was part of it. The next
thing is we'd like to build tools to support the evaluation, so that you could actually take
that FLOSSmole data and do the computation and be able to display the data in the same kind
of way that the Ohloh data is displayed. So we've been negligent in going out and trying to
raise research support for that, but that'll happen. The other thing that happened is actually a very
strange phenomenon, which is that people didn't want just a number,
and that ties us back to the very first slide. The way that people choose is not systematically
quantifiable. It's like when you choose an automobile: you can go read Consumer Reports all
day or you can go to the various automobile manufacturers’ websites and pull up all this data
about the specifications of the automobile, but at the end of the day that helps you reduce your
choices but it doesn't help you make your choice. So as we talk to people they said well yeah,
it's nice to know that this is a 4.3 and this is a 4.1 or whatever, but we want to know more. We
want to know what people thought of it. It's like a movie review or restaurant review. If you
look at Zagat's restaurant reviews, you know, here's a restaurant that gets a 28 for food and a 26
for service and that's good. I'm interested in that. But is that the only metric I'm going to use?
And the answer is no. And that's why sites like TripAdvisor exist and Zagat's site exists and
people put reviews. One of the great things about social media is that you can be buried in
personal opinions of things and in today's world people don't want just the quantitative piece.
They want the subjective data. They want a review from somebody. Now it would be nice if it
was somebody that they trusted, but even so, when I look at, you know, I need a
plumber or an electrician and I go to Yelp, and I don't know anybody who's reviewed them, but, you
know, if I have 1000 people that have all said that this is a great place, then I probably am going
to check it out, and if half the people say don't ever do business with these people, that's
probably going to drive me away. So the same kind of thing seems to apply in software
evaluation, that we can do the quantitative assessments, but that the qualitative aspect of it in
many cases will override the systematic, empirical, data-driven selection. And that, you know,
for me was a bit surprising, but that is what we discovered. So we would like to come back and
revitalize the project and build tools and relaunch it and try to have a community, so that's
where we are with it, but I think that there are still a lot of lessons that we learned from this.
The next step -- we're getting towards the end here -- is that, you know, are people evaluating
just open-source software, or are they evaluating all software products? So are there
comparable metrics? So when we looked at that list of evaluation categories, you can see that
some of them work a lot better for open-source software than for proprietary software and in
other cases it's the other way around. So if I want to study architecture and code quality,
commercial vendors are probably not going to let me look at the source code, whereas, in an
open source project I can look at it and decide whether it's good or bad. Support and services,
I'm going to get much more information out of a vendor than I'm going to get from a
noncommercial product, so there are, if you look at these categories you see that some of them
allow you to evaluate commercial and noncommercial software side-by-side and others don't
match up quite as well. But there was a fair amount of overlap, and the reason why this is interesting is
that software is different today. We've had this huge growth in the use of open-source. We've
had a very significant acceptance of open source. You look at the majority of mobile phones
running open-source operating systems. You look at cloud computing and database
management systems, the NoSQL database systems; open source dominates in these
categories. Part of this came about because we had this economic downturn where companies
no longer had the luxury of big IT budgets to buy commercial products and pay the
maintenance fees on them and they said well, we've got this project that we've got to get done.
How are we going to do it? And they would go and use software as a service, or they would go
use open-source products and make them work. So five years ago people would go and say I'm
looking at open source software. I don't want to use any of this commercial stuff, or vice versa.
We don't use this open source here. That turned out not to be true, but that's what people
thought. So there were two camps and very little overlap. But today, people have used open source. They know that they have it in their company. They have developers who have been
using it for years and years who are now moving up in the organization and having more impact
on decision-making processes. So the answer today is, I need the best solution. So it's got to be
fitness for purpose; it's got to be cost; it's got to be stability or reliability, whatever those things
are so people are looking at proprietary software that they are going to run on their own
machines. They are looking at open source that they are going to run on their own machines
and they are also looking at hosted solutions, that they are going to run on cloud and most
organizations have a mix of these. So a company like SugarCRM offers a community version
of their software, which you can download and install on your own servers and use forever,
open source. They have a version that is proprietary that they will host, so you pay them
by the month, the same model that is used for salesforce.com, a
little less expensive. So they're in the position of offering an open source
version, a proprietary version with a commercial license, and a hosted version, so they're trying
to cover the bases for customers who are choosing a mix of these things. So in terms of where
we'd like to go with this idea is to focus initially on the open-source side of things drawing on
the ideas and the measurement stuff that I showed you that were part of the original BRR idea.
We'd use the FLOSSmole data sources because they collect them. They have them every month
and you can just go get them. We'd look at the various kinds of weightings, maybe at a finer grain
than the ones we showed you there. I just showed you seven areas, but in the nonfunctional
area, particularly, people may want to have sub-weightings where different things are
important.
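As a sketch of what such finer-grained weighting might look like, the nonfunctional category could itself be a weighted combination of sub-scores; the sub-categories and numbers here are invented for illustration.

    # Illustrative sub-weighting: the nonfunctional category score is itself a
    # weighted combination of sub-scores. All names and numbers are invented.

    subscores = {"security": 4.0, "performance": 3.5, "usability": 2.0}
    subweights = {"security": 0.5, "performance": 0.3, "usability": 0.2}  # sum to 1

    nonfunctional = sum(subscores[k] * subweights[k] for k in subweights)
    print(round(nonfunctional, 2))  # feeds into the top-level category scores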
And then the idea is to be able to, you know, pay some graduate students or research assistants,
tell them this is what you're going to do right now, and then do it in the
open as a community-based project so that over time volunteers will join in as well. So that's
a wish list for where we'd like to take this, and I think that's probably a good place to stop. This
phone booth, I took a picture of it the other day. When was the last time you saw a phone booth?
So if you find yourself at Lake Crescent on the Olympic Peninsula on the west end of the lake,
there's a phone booth. It looks like it comes from the 1970s or so, but I think it's newer.
Anyway, that's it. You can find me by the usual means, and thanks for your attention. I'll
be glad to answer some questions. [applause] There being none... yes, Emerson?
>>: So I'm wondering, you said you got some community feedback about the people using this
system when they wanted to evaluate open source projects. Did you get any feedback from
the projects themselves?
>> Anthony I. Wasserman: We didn't talk very much to the projects. You know, it's a funny
thing. Maybe the people in community open-source projects are not as concerned as they should be
with how many people are using it and how people are reacting to it at a macro
level; they tend to be more concerned at a micro level. You know, here's this bug that
needs to be fixed. And maybe it's the personality, the human factors around the people that
run these projects, but you can't go to them and say, you know, if you use this tool it'll help you
get more users. That'll get you a yawn in most cases. We don't need them. They bother us.
They just ask us newbie questions. Let us do our work. We are building this software. Go
away. It's, um, a different personality. That's been my experience. Have you talked
to a lot of different open source projects?
>>: I'm thinking of this in terms of, I'm sticking to the parallel between this and rating students,
right?
>> Anthony I. Wasserman: Yeah.
>>: It's great for the outside organizations that want to see the grades and evaluate
students, but the students realize that they are being evaluated that way, so they want to
see their scores tweaked. So I'm wondering, if this becomes an influential thing, whether the
organizations are going to disagree with you about the amount or quality of documentation that
they have.
>> Anthony I. Wasserman: I think that could happen. That could be wonderful if we got there.
I think, you know, you can always game the evaluation criteria. And the vendors, the
commercial open source vendors may in fact want to be tweaked that way, but in fact, the
commercial vendors in general get the highest scores because they have the documentation
and the support and the roadmap and the issue tracking that are there and they tend to have
both paid support and unpaid support. So any time we tried scoring MySQL or any of the widely
used vendor supported open source projects, they ended up with very high scores. Yes, sir?
>>: So when an organization decides that they need some software built [indiscernible]
component, right? They have some idea of who the big players in the area are. Let's say I want
to use databases. I need databases in open-source and [indiscernible] do you see any cases
where you got feedback on this tool where people went in with some idea and then came out
with some kind of surprising result? Like we thought it would be close to MySQL but in the end
what we found, based on your tool, was that this other project that I was completely unaware of was
actually a better fit for our needs? Here's what I'm saying. Are you getting information that you
don't already have?
>> Anthony I. Wasserman: I see, yes. If we had the tool I think we would have much more of
that. But because people were using it manually they would tend more or less to create a short
list of familiar names. The place where it's the most interesting to look is in the content
management systems because there are so many of them and even if they say okay. I'm only
using the ones that are written in PHP, that will throw out Plone and some other things. It
gives them a list then. I look at that list and gee, I think I know a fair amount about content
management systems, but, you know, Concrete5 and eZ Publish and a lot of those,
they have followings and big audiences. So I think in that category particularly there are so
many that people would often come up with things that were not immediately intuitive. Good.
All right. Thanks everybody. I appreciate it. [applause]