The FreeBSD Project: a Replication Case Study of Open Source Development

“Our goal is to identify 1) the common characteristics in the development processes of successful OSS projects and 2) the quality of software that was produced using these processes.”

Agenda
  - Background
      - Open Source Software (OSS) development
      - Mockus et al.
  - Research Objective
  - Data of interest
  - Data sources
  - Data extraction tool
  - Findings
  - Conclusion

OSS Development
  Claimed advantages:
  - Anybody can contribute to improvement of system
  - Developers can learn from each other
  - Developers can work without interference, in their own time
  Claimed disadvantages:
  - Lack of formal process
  - Poor design and architecture
  - Development tools are inferior
  - Developers don't understand users' needs

Mockus et al.
  Case study on Mozilla and Apache projects
  - ...addresses the requirements of development process for successful OSS project
  - ...addresses quality of such an OSS software product
  - ...measures product defect density
  - ...quantifies many aspects of OSS development
  Seven hypotheses
  H1) “Open source developments will have a core of developers who control the code base, and will create approximately 80% or more of the new functionality. If this core group uses only informal ad hoc means of coordinating their work, the group will be no larger than 10 to 15 people.”

Mockus et al. (2)
  Seven hypotheses (continued)
  H2) “If a project is so large that more than 10 to 15 people are required to complete 80 percent of the code in the desired time frame, then other mechanisms, rather than just informal ad hoc arrangements, will be required in order to coordinate the work. These mechanisms may include one or more of the following: explicit development processes, individual or group code ownership, and required inspections.”
  H3) “In successful open source developments, a group larger by an order of magnitude than the core will repair defects, and a yet larger group (by another order of magnitude) will report problems.”

Mockus et al. (3)
  Seven hypotheses (continued)
  H4) “Open source developments that have a strong core of developers but never achieve large numbers of contributors beyond that core will be able to create new functionality but will fail because of a lack of resources devoted to finding and repairing defects.”
  H5) “Defect density in open source releases will generally be lower than commercial code that has only been feature-tested, that is, received a comparable level of testing.”
  H6) “In successful open source developments, the developers will also be users of the software.”
  H7) “OSS developments exhibit very rapid responses to customer problems.”

Research Objective
  - A replicated case study with FreeBSD to obtain further evidence to help determine whether or not the seven hypotheses are valid
  - Two case studies (Apache and Mozilla) are not enough to conclusively determine the nature of OSS development

Data of Interest
  - Lines of code
  - Project growth rate
  - Who worked on what?
  - How many worked on what?
  - How many core/non-core developers?
  - Who reported bugs?
  - Who fixed bugs?
  - Did people tend to have multiple roles?
  - Defect density
  - Code ownership
  - ...and more!

Data Sources
  Concurrent version control archive (CVS)
  - Developers can check out code, modify, and commit back into source tree
  - Logs and developer names are recorded for every modification
  - Stores history of modifications
  - FreeBSD has two source code branches:
      - Stable: source tree at time of major release
      - Current: continuous snapshots of on-going development
  Developer e-mail archive (freebsd-bugs@FreeBSD.ORG)
  - A source for bug reports and reporter names
  Bug report database (GNATS)
  - Contains details of reported bugs, such as priority, state, problem reporter, problem fixer, reported date, etc.
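
Because CVS records a log entry and a developer name for every modification, per-developer activity can be read straight from `cvs log` output. The following is a minimal sketch, not the tool used in the study. The sample log text, the file path, the developer names, and the function names `parse_deltas` and `commits_per_author` are all invented for illustration. It also assumes the classic `cvs log` layout, in which each delta is introduced by a line of dashes followed by a `revision` line and a `date: ...; author: ...;` line.

```python
import re
from collections import Counter

# Hypothetical sample in the classic `cvs log` layout (path and names invented).
SAMPLE_LOG = """\
RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v
Working file: sys/kern/kern_exit.c
head: 1.3
description:
----------------------------
revision 1.3
date: 2001/03/04 12:00:00;  author: alice;  state: Exp;  lines: +10 -2
Fix a race in exit handling.
----------------------------
revision 1.2
date: 2001/02/01 09:30:00;  author: bob;  state: Exp;  lines: +3 -1
Style cleanup.
----------------------------
revision 1.1
date: 2001/01/15 08:00:00;  author: alice;  state: Exp;
Initial import.
=============================================================================
"""

# The "lines: +A -D" field is optional: the first revision of a file has none.
DELTA_RE = re.compile(
    r"^date: (?P<date>[^;]+);\s+author: (?P<author>[^;]+);\s+state: [^;]+;"
    r"(?:\s+lines: \+(?P<added>\d+) -(?P<deleted>\d+))?"
)


def parse_deltas(log_text):
    """Return one dict per delta found in `cvs log` style text."""
    deltas = []
    working_file = None
    revision = None
    expect_revision = False
    for line in log_text.splitlines():
        if line.startswith("Working file: "):
            working_file = line[len("Working file: "):]
        elif line and set(line) == {"-"}:
            # A line made only of dashes separates deltas.
            expect_revision = True
        elif expect_revision and line.startswith("revision "):
            revision = line.split()[1]
            expect_revision = False
        elif revision is not None:
            # The line right after "revision X.Y" carries date and author.
            match = DELTA_RE.match(line)
            if match:
                deltas.append({
                    "file": working_file,
                    "revision": revision,
                    "date": match.group("date"),
                    "author": match.group("author"),
                    "added": int(match.group("added") or 0),
                    "deleted": int(match.group("deleted") or 0),
                })
            revision = None
        else:
            expect_revision = False
    return deltas


def commits_per_author(deltas):
    """Count deltas per developer name."""
    return Counter(delta["author"] for delta in deltas)


if __name__ == "__main__":
    for delta in parse_deltas(SAMPLE_LOG):
        print(delta)
    print(commits_per_author(parse_deltas(SAMPLE_LOG)))
```

A real archive would need extra care for cases this sketch ignores, such as newer date formats with time zones, lock lines after the revision line, and log messages that themselves contain separator-like text.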

Data Extraction Tool
  Procedures:
  1. Read a CVS log file
  2. Search for segments of changes to desired files
  3. Extract all deltas to these files
  4. Retrieve information from the deltas
  5. Save the information into text files

Data Extraction Tool (2)

Findings

Findings (2)
  Revised seven hypotheses:
  H1’) A core of 15 or fewer core developers will control the code base and contribute most of the new functionality. A group of 50 or fewer top developers at any one time will contribute 80 percent of the new functionality. The group will represent less than 25 percent of the set of all developers.
  H2’) As the number of developers needed to contribute 80 percent of OSS code increases, a more well-defined mechanism must be used to coordinate project work.
  H3) [ no changes ]
  H4) [ N/A ]
  H5) [ no changes ]
  H6) [ no changes ]
  H7) [ N/A ]

Conclusion

“Our goal is to identify 1) the common characteristics in the development processes of successful OSS projects and 2) the quality of software that was produced using these processes.”
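
The 80 percent threshold in H1 and H1’ can be checked with a small calculation: rank developers by contribution and find the smallest group whose combined share reaches the threshold. The sketch below is illustrative only. The contribution figures and developer names are invented, the function name `smallest_group_for_share` is hypothetical, and treating lines added as a stand-in for "new functionality" is an assumption rather than the measure used in the study.

```python
def smallest_group_for_share(lines_by_dev, share_percent=80):
    """Return the top contributors whose combined lines reach share_percent.

    lines_by_dev maps a developer name to lines added (an assumed proxy
    for new functionality). Integer arithmetic avoids rounding surprises.
    """
    total = sum(lines_by_dev.values())
    if total == 0:
        return []
    ranked = sorted(lines_by_dev.items(), key=lambda item: item[1], reverse=True)
    group = []
    running = 0
    for developer, lines in ranked:
        group.append(developer)
        running += lines
        if running * 100 >= share_percent * total:
            break
    return group


if __name__ == "__main__":
    # Invented figures, for illustration only.
    sample = {"alice": 500, "bob": 300, "carol": 100, "dave": 60, "erin": 40}
    group = smallest_group_for_share(sample)
    print("top group:", group)
    print("fraction of all developers:", len(group) / len(sample))
```

Comparing the size of the returned group with the total number of developers gives the fraction that H1’ expects to stay below 25 percent.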