Hsiao-Ming Tsou

The FreeBSD Project: A Replication Case Study of Open Source Development
“Our goal is to identify 1) the common characteristics in the
development processes of successful OSS projects and 2) the
quality of software that was produced using these processes.”
Agenda



Background
 • Open Source Software (OSS) development
 • Mockus et al.
Research
 • Objective
 • Data of interest
 • Data sources
 • Data extraction tool
 • Findings
Conclusion
OSS Development


Claimed advantages:
 • Anybody can contribute to improvement of the system
 • Developers can learn from each other
 • Developers can work without interference, in their own time
Claimed disadvantages:
 • Lack of formal process
 • Poor design and architecture
 • Development tools are inferior
 • Developers don't understand users' needs
Mockus et al.


Case study on the Mozilla and Apache projects
 • ...addresses the requirements of the development process for a
successful OSS project
 • ...addresses the quality of such an OSS product
 • ...measures product defect density
 • ...quantifies many aspects of OSS development
Seven hypotheses
 • H1) “Open source developments will have a core of developers
who control the code base, and will create approximately 80%
or more of the new functionality. If this core group uses only
informal ad hoc means of coordinating their work, the group will
be no larger than 10 to 15 people.”
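
H1 can be checked against commit data by asking how many of the most active developers are needed to reach 80 percent of the contributions. The following is a minimal sketch, not the study's actual tool: it assumes delta counts per developer are already available and uses them as a stand-in for "new functionality". The function name `core_size` and the sample counts are hypothetical.

```python
from collections import Counter


def core_size(deltas_by_author, share_percent=80):
    """Return the smallest number of top contributors whose combined
    deltas reach `share_percent` percent of all deltas."""
    total = sum(deltas_by_author.values())
    if total == 0:
        return 0
    covered = 0
    # most_common() yields authors from most to least active.
    ranked = Counter(deltas_by_author).most_common()
    for rank, (_author, count) in enumerate(ranked, start=1):
        covered += count
        # Integer comparison avoids floating-point rounding at the threshold.
        if covered * 100 >= share_percent * total:
            return rank
    return len(ranked)


# Hypothetical delta counts, for illustration only.
example = {"alice": 50, "bob": 30, "carol": 10, "dave": 6, "erin": 4}
print(core_size(example))
```

With these made-up counts, two of five developers cover 80 percent of the deltas. On real data the result would be compared with the 10 to 15 person bound stated in H1.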
Mockus et al. (2)

Seven hypotheses (continued)
 • H2) “If a project is so large that more than 10 to 15 people are
required to complete 80 percent of the code in the desired time
frame, then other mechanisms, rather than just informal ad hoc
arrangements, will be required in order to coordinate the work.
These mechanisms may include one or more of the following:
explicit development processes, individual or group code
ownership, and required inspections.”
 • H3) “In successful open source developments, a group larger
by an order of magnitude than the core will repair defects, and
a yet larger group (by another order of magnitude) will report
problems.”
Mockus et al. (3)

Seven hypotheses (continued)
 • H4) “Open source developments that have a strong core of
developers but never achieve large numbers of contributors
beyond that core will be able to create new functionality but will
fail because of a lack of resources devoted to finding and
repairing defects.”
 • H5) “Defect density in open source releases will generally be
lower than commercial code that has only been feature-tested,
that is, received a comparable level of testing.”
 • H6) “In successful open source developments, the developers
will also be users of the software.”
 • H7) “OSS developments exhibit very rapid responses to
customer problems.”
Research Objective


A replicated case study with FreeBSD to obtain further evidence to
help determine whether or not the seven hypotheses are valid
Two case studies (Apache and Mozilla) are not enough to
conclusively determine the nature of OSS development
Data of Interest
 • Lines of code
 • Project growth rate
 • Who worked on what?
 • How many worked on what?
 • How many core/non-core developers?
 • Who reported bugs?
 • Who fixed bugs?
 • Did people tend to have multiple roles?
 • Defect density
 • Code ownership
 • ...and more!
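
Of the measures listed above, defect density is the most directly computable. A minimal sketch, assuming one common normalization (defects per thousand lines of code added); the slides do not state which normalization was used, and the function name and figures below are hypothetical:

```python
def defect_density(defects, lines_added):
    """Defects per thousand lines of code added (an assumed normalization)."""
    if lines_added <= 0:
        raise ValueError("lines_added must be positive")
    return defects / (lines_added / 1000)


# Hypothetical figures, for illustration only.
print(defect_density(50, 20000))
```

The same function works with deltas in place of lines if defects per thousand deltas is the measure of interest.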

Data Sources



Concurrent Versions System (CVS) archive
 • Developers can check out code, modify it, and commit it back into
the source tree
 • Logs and developer names are recorded for every modification
 • Stores history of modifications
 • FreeBSD has two source code branches:
    • Stable: source tree at time of major release
    • Current: continuous snapshots of on-going development
Developer e-mail archive (freebsd-bugs@FreeBSD.ORG)
 • A source for bug reports and reporter names
Bug report database (GNATS)
 • Contains details of reported bugs, such as priority, state,
problem reporter, problem fixer, reported date, etc.
Data Extraction Tool

Procedures:
 • Read a CVS log file
 • Search for segments of changes to desired files
 • Extract all deltas to these files
 • Retrieve information from the deltas
 • Save the information into text files
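
The procedure above can be sketched in a few lines. This is an illustrative sketch, not the tool shown in the slides: it assumes the usual `cvs log` text layout (an `RCS file:` header, then one `revision` / `date: ...; author: ...;` pair per delta after a line of dashes). The names `parse_cvs_log`, `format_delta`, `save_deltas` and the sample log are hypothetical.

```python
import re

# Matches the metadata line of one delta; "lines:" is absent on initial revisions.
DATE_LINE = re.compile(
    r"date: (?P<date>[^;]+);\s+author: (?P<author>[^;]+);\s+state: (?P<state>[^;]+);"
    r"(?:\s+lines: \+(?P<added>\d+) -(?P<removed>\d+))?"
)
FIELDS = ("path", "revision", "date", "author", "added", "removed")


def parse_cvs_log(text, wanted=lambda path: True):
    """Collect one record per delta for every file accepted by `wanted`."""
    deltas, path, revision, after_separator = [], None, None, False
    for line in text.splitlines():
        if line.startswith("RCS file: "):
            path, revision = line[len("RCS file: "):], None
        elif after_separator and line.startswith("revision "):
            # Only trust "revision" lines that directly follow a dashed separator.
            revision = line.split()[1]
        elif revision is not None and line.startswith("date: "):
            match = DATE_LINE.match(line)
            if match and path is not None and wanted(path):
                deltas.append({
                    "path": path,
                    "revision": revision,
                    "date": match["date"],
                    "author": match["author"],
                    "added": int(match["added"] or 0),
                    "removed": int(match["removed"] or 0),
                })
            revision = None
        after_separator = bool(line) and set(line) == {"-"}
    return deltas


def format_delta(delta):
    """Render one delta as a tab-separated text line."""
    return "\t".join(str(delta[field]) for field in FIELDS)


def save_deltas(deltas, out_path):
    """Write all deltas to a plain text file, one per line."""
    with open(out_path, "w", encoding="utf-8") as out:
        for delta in deltas:
            out.write(format_delta(delta) + "\n")


# Made-up log excerpt, for illustration only.
SAMPLE = """\
RCS file: /home/ncvs/src/sys/kern/kern_exec.c,v
Working file: kern_exec.c
head: 1.2
description:
----------------------------
revision 1.2
date: 2002/05/01 10:00:00;  author: alice;  state: Exp;  lines: +12 -3
Fix a hypothetical bug.
----------------------------
revision 1.1
date: 2002/04/01 09:00:00;  author: bob;  state: Exp;
Initial revision
=============================================================================
"""

for delta in parse_cvs_log(SAMPLE, wanted=lambda p: "/sys/kern/" in p):
    print(format_delta(delta))
```

The `wanted` predicate plays the role of "desired files"; a log message that happens to contain a dashed line followed by a line starting with `revision` could still confuse this simple state machine.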
Data Extraction Tool (2)
Findings
Findings (2)

Revised seven hypotheses:
 • H1’) A core of 15 or fewer core developers will control the code
base and contribute most of the new functionality. A group of 50
or fewer top developers at any one time will contribute 80
percent of the new functionality. The group will represent less
than 25 percent of the set of all developers.
 • H2’) As the number of developers needed to contribute 80
percent of OSS code increases, a more well-defined
mechanism must be used to coordinate project work.
 • H3) [ no changes ]
 • H4) [ N/A ]
 • H5) [ no changes ]
 • H6) [ no changes ]
 • H7) [ N/A ]
Conclusion
“Our goal is to identify 1) the common characteristics in the
development processes of successful OSS projects and 2) the
quality of software that was produced using these processes.”