Running head: USING OPEN SOURCE TO SCAN OPEN SOURC

Using Open Source to Scan Open Source: A Review of FOSSology
Ryan A. Arnold
Coleman University
April 23, 2015

Abstract

Using Free and Open Source Software (FOSS) code in conjunction with proprietary software code can accelerate development and improve functionality; however, failure to comply with the terms of a license can result in costly litigation, compromised intellectual property, a weakened image within the open source community, and loss of consumer confidence. Software proprietors have migrated towards scanning source code for license content and origination, using a combination of automated tools and processes to help manage their use of open source. Companies like Black Duck, Protecode, and OpenLogic have developed business models based on scanning code and providing consultation, but this comes at high cost and disruption to the software development lifecycle. FOSSology, which is itself a free and open source project, was created by Hewlett Packard as an open source compliance tool. Although FOSSology has benefited from years of development, in-depth testing must be conducted to see if it is robust enough to support the needs of a proprietary software company. A mixed-methods study was conducted to test this supposition. First, open source litigation and estimated cost are discussed to establish the necessity of scanning code. Next, a series of tests was conducted using sample files with various licensing and origination, and the results were analyzed to gauge the strengths and weaknesses of the FOSSology tool. Finally, a sample open source compliance program targeted for an enterprise is outlined and recommendations are offered for enhancing the FOSSology Project.
Using Open Source to Scan Open Source: A Review of FOSSology

“Open source is doing for mass innovation what the assembly line did for mass production. Get ready for the era when collaboration replaces the corporation” (Goetz, 2003). From sharing hand-written recipes to worldwide project collaboration, open source software is in more places and products than ever before. While many may consider free software to be that which can be acquired without cost, this is not entirely correct. The truth is a bit more abstract and is closer to freedom of speech or assembly than to free-of-charge. As the name suggests, open source software means that the source code is “open” and the user has the right to “…run it, to study and change it, and to redistribute copies with or without changes” (Stallman, 2009), which is a freedom not usually associated with proprietary software code.

The Open Source community is arguably the most successful example of a User Innovation Network in existence (Trott, Duin, & Hartmann, 2013), and is integrated into more than 85% of enterprises in one form or another (Ullah & Khan, 2011). Recognizing the potential to save time and money in development, proprietary businesses often borrow software code from Open Source projects. In fact, a recent Forrester research report shows that 76% of developers have used some form of open source technology for development operations including code reuse, database, testing, debugging, and distribution (Baldwin, 2014). In a business sense, time is money, and when software can be developed more quickly by implementing open source components, businesses stand to make more money. One could compare this to using a word processing program and a template to send correspondence to customers and clients rather than recreating each letter from scratch on a typewriter.
Just as word processing became an industry standard, open source software appears to be following the same path.

While this all seems like an idyllic solution to the needs of many enterprises, there are often competing interests that hold patents, licenses, and copyrights on similar technologies. When a company uses open source in a product, it must adhere to the terms of the license for the material that is being leveraged, lest it risk the consequences of non-compliance. Customers may decline to accept the product, or sections of code may have to be rewritten and tested, which will delay release times. If an enterprise is found to have violated the terms of an open source license, the result can be undesirable press coverage and loss of good standing within the open source community. Finally, the integrity of intellectual property can be jeopardized by lawsuits, injunctions, and monetary penalties, which may drastically impact the ability of an enterprise to remain profitable.

With several benefits and risks associated with the use of open source software, any company that leverages it is presented with a dilemma. How can an enterprise most accurately, effectively, and economically mitigate the risks of using open source packages in a proprietary business model? The FOSSology scanner, a fully open source solution originally developed by Hewlett Packard, may be an answer to this question. FOSSology, when used in a proprietary business model, may help to ensure open source license compliance and the protection of intellectual property while allowing a company to meet deadlines in conjunction with rapid lifecycle development. To effectively assess the legal ramifications of noncompliance, variations of open source licenses, as well as pertinent case law, will be studied and analyzed.
Similar technologies from other vendors will be compared and an in-depth assessment of the FOSSology tool will be conducted using practical instances. Finally, a suggested program for open source compliance will be discussed and the FOSSology tool will be critiqued as to its viability in a proprietary environment.

Open Source Licensing

To those who do not generally work with open source, it may appear that engineers are simply offering up their work for nothing; quite the contrary. With very few exceptions, when someone creates, modifies, or improves a work and contributes it to an open source project, they retain their copyright or patent and can still have a say in how others are allowed to use that work (Chapman, 2010). A license is a statement that gives permission to use copyrighted material, and an open source software license simply means that the source code is available to be viewed. All software files should contain an explicit copyright statement to inform individuals of their rights to reuse or distribute the code. If a file does not have a license, one cannot assume that it is not copyright protected; in fact, even if a file does not have a copyright notice, rights are implied and the file cannot be copied or distributed without permission. The only way to disclaim copyright, and thus remove any caveat from the use of a file or files, is to place the code into the Public Domain (Free Software Foundation, 2014).

Open source licenses, which number in the hundreds, can be roughly categorized into three main types: academic, reciprocal, and hybrid (Fontana, 2010). Figure 1 below lists popular examples of each of these styles of licenses. Academic, or “permissive,” licenses, as the name suggests, originated in colleges and universities.
These licenses generally require that the original author’s copyright statement be maintained in all derivative works or copies and that the software is available without warranty. Reciprocal licenses, also known as “copyleft,” require that anyone distributing copies or derivative works must also release their code under the same terms. The Free Software Foundation (FSF) has been the main proponent of this sentiment, as it is responsible for authoring the most widely used “copyleft” license, the GNU General Public License. The FSF seeks to secure freedom of software and operating systems for future generations and advocates against patenting software (Free Software Foundation, 2015). Proprietary businesses generally avoid reciprocal licenses, as they do not usually wish to release their confidential, and often patented, material. Hybrid licenses contain language that places them in between the aforementioned types, which means that they can be implemented into proprietary products, but must allow for the original and modified files to be available as open source. These licenses may also be referred to as “weak copyleft.”

Of the most common licenses, the GNU General Public License (GPL) 2.0 is at the top of the list, accounting for roughly a quarter of open source license usage; the MIT license follows at 19% and Apache 2.0 makes up 16%. Figure 2 below lists the most commonly used open source licenses (Black Duck Software, 2015).

[Figure 1 diagram. Categories shown: Permissive (academic), Weak Copyleft (hybrid), Strong Copyleft (reciprocal). Example licenses shown: MIT, BSD, BSL, LGPL, MPL, CPL, GPL, AGPL.]
FIGURE 1. Categories of open source licenses with examples

With 16 billion lines of code, or 27% of all open source licensed material, falling under the GPL, it stands to reason that there is a statistically higher chance of encountering software which is subject to its terms (Radcliffe, 2014).
This is alarming for entities who incorporate open source libraries into their proprietary software code as they could inadvertently expose themselves to risk. The two portions of the GPL license (1991) that require a company to grant their IP freely are:

1) The copyright license grant, which reads, “You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License.”

2) The patent license grant, which reads, “If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all.” This means that if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program.

As an example, suppose that fictitious company ‘A’ owns a patent and copyright on their proprietary software that serves as their primary source of income. Company ‘A’ decides to incorporate a GPL licensed library into their source code which is a deliverable to customers ‘B’ and ‘C’. Since Company ‘A’ has proprietary material that they distribute on a contract basis, they release their software code in binary or “object” form as a safeguard. This ensures that the patented code that is the lifeblood of their business model is not revealed. If Company ‘A’ does release the GPL licensed code within their product, they are obligated to supply the source code as well according to the copyright license grant of the GPL. Company ‘A’ can no longer charge a fee for a patent license according to the patent license grant of the GPL.
This essentially renders their patent and corresponding trade secret code moot and may damage, possibly fatally, the viability of the company. This is not all, however. Non-profit organizations like the Software Freedom Law Center (SFLC) and the Software Freedom Conservancy go to bat for developers who cite violations of the GPL license. In addition to being forced to release proprietary source code to the public, entities must also consider the possibility of court-ordered injunctions and payment of monetary damages (Software Freedom Conservancy, 2015).

License                    Share
GNU LGPL 2.1               5%
GNU LGPL 3.0               2%
Artistic License (Perl)    6%
BSD License 2.0            8%
Eclipse Public License     2%
GNU GPL 2.0                27%
GNU GPL 3.0                11%
Apache License 2.0         18%
MIT License                21%
FIGURE 2. Top ten open source licenses

Open Source Litigation

An increasing trend in open source license litigation shows that the courts have been ruling in favor of the open source license terms. Penalties and subsequent losses range from poor publicity to filing for bankruptcy. The following case summaries are represented in Figure 3 below.

Jacobsen v. Katzer

First, and probably most well-known, is the case Jacobsen v. Katzer (Arne, 2008). Robert Jacobsen and Matthew Katzer, plaintiff and defendant respectively, were both involved in developing software which was used to control model trains. Jacobsen was a developer who initiated the open source project Java Model Railroad Interface (JMRI) and a software package known as “DecoderPro,” which was licensed as open source under the Artistic License (JMRI, 2015). Katzer developed a similar software product, which was known as “Decoder Commander” and was made available for purchase with a proprietary license. He also obtained patents related to this software.
Katzer sued Jacobsen claiming infringement and Jacobsen responded with a countersuit regarding the validity of the patents in question. It was discovered that Katzer’s product, Decoder Commander, had used portions of Jacobsen’s product, DecoderPro, and had failed to abide by the terms of the Artistic License, which said that copies must contain attribution and the license text, amongst other things (Artistic License, n.d.). Claims on issues such as copyright, patent validity, and cybersquatting were presented to the United States District Court for the Northern District of California. Jacobsen sought an injunction against Katzer, which was initially denied, but the Federal Circuit Court reversed the decision. As a result of the reversal, Katzer was subject to a permanent injunction and was enjoined from copying, redistributing, or modifying the JMRI software and from registering a trademark or domain name for a name used by the JMRI project. Katzer was additionally stripped of two patents and ordered to pay $100,000 to Jacobsen (Perens, 2010).

Progress Software v. MySQL

While Jacobsen v. Katzer is one of the better known cases, the majority of open source litigation involves violation of the GNU General Public License (GPL). The first case to try the GPL license involved MySQL, a popular open source database, challenging Progress Software Corporation, a Software as a Service (SaaS) business. Progress Software initiated the suit by claiming that MySQL breached contract, and MySQL shot back with allegations that Progress had violated the terms of the GPL license. Progress distributed an additional, proprietary add-in to MySQL, known as “Gemini,” but did not release the source code as the terms of the GPL license require. Progress did release the source code in a later release, but the offense had already been noted.
In 2002, Judge Saris of the District Court of Massachusetts ordered Progress to cease distributing any MySQL products and using any of MySQL’s trademarks. While this ruling bolstered the validity of the GPL license, it also left some lingering questions. It was yet to be determined whether software linked to code released under the GPL qualified as a derivative work, and whether providing source code after the fact was sufficient to relieve an entity from copyright violation (Majerus, 2008).

Busybox v. Westinghouse Digital Electronics, LLC

Another GPL violation case involves BusyBox (2008), creators of an embedded Linux utility, who were defended by the SFC against Westinghouse Digital Electronics, LLC. Westinghouse was proven to be copying, modifying, and distributing BusyBox code in firmware and software that it developed for its line of high definition television products. The offending software was released into dozens of products without complying with the terms of the GPL license. Southern District of New York Judge Scheindlin granted damages in the amount of $90,000 for willful copyright infringement in addition to legal fees totaling $47,685. An injunction was also granted against Westinghouse, which was ordered to relinquish all infringing equipment for donation to charity and subsequently had to file for bankruptcy (Jones, 2010).

Busybox v. Monsoon Multimedia

Busybox was involved in another dispute with Monsoon Multimedia, also involving GPL infringement, which was settled under more agreeable terms. BusyBox agreed to abandon the lawsuit as Monsoon was willing to settle and appoint a compliance officer in its organization (SFLC, 2007).

Free Software Foundation v. Cisco

Similar to the Monsoon case, Free Software Foundation v. Cisco Systems was also settled with the agreed-upon action of appointing a GPL compliance officer (Lee, 2008).
It should be noted that although appointing a compliance officer is not a direct payout, it still involves hiring at least one person. SimplyHired.com (2015) reports that the average base salary for an open source compliance officer is $51,000 annually.

Case                                              License                Decision                    Damages
Jacobsen v. Katzer                                The Artistic License   Ruled in favor of Jacobsen  $100,000 cash & Katzer enjoined from using JMRI
Progress Software v. MySQL                        GPLv2                  Settled 2002                Injunction - Cannot sell products using MySQL
Busybox v. Westinghouse Digital Electronics, LLC  GPLv2                  Ruled in favor of Busybox   $90,000 for willful copyright infringement, lawyer's fees ($47,685), injunction against Westinghouse, donate infringing items to charity
Busybox v. Monsoon Multimedia                     GPLv2                  Settled                     Appoint compliance officer - release source code
Free Software Foundation v. Cisco                 GPLv2, LGPL 2 and 2.1  Settled                     Appoint compliance officer
FIGURE 3. Litigation involving open source software licenses

Industry Competitors

It has been established that failing to comply with the terms of an open source license, particularly the GNU GPL v2.0, can be extremely detrimental to an entity’s business model. Tools for scanning open source code are currently available, but they come at a business expense.

Protecode

Protecode, founded in 2006, is a privately held company headquartered in Ontario, Canada, and offers open source license management solutions aimed at ensuring compliance. Protecode offers “Enterprise System 4,” a comprehensive open source license management solution which can be implemented throughout the software development lifecycle. When a company uses System 4, it can scan and identify open source components, develop a process for the acceptance of open source software, and establish licensing policies.
The backend of this service leverages a proprietary database of publicly available open source projects (Global IP Signatures Database) which can be compared against an organization’s software codebase. This is done in conjunction with the software development lifecycle to find open source components and determine if license obligations are conducive to the business model. This option is likely better suited to a larger organization. For smaller organizations, Protecode offers “Protecode Compact,” a single-seat license management system, while “Protecode Cloud” allows code scanning and analysis without having to install anything on a local machine. These products all yield a software summary report which indicates corresponding licensing obligations. In addition to these three solutions, Protecode also supplies an audit service which highlights the software bill of materials and a license obligations report. These are popularly used before business transactions like an acquisition or delivery of a third-party software product. Protecode Compact costs $9,000 annually, and Protecode Enterprise sells for roughly $12,000 for one scanning installation. Additional scanning installations can be purchased for $5,000 individually (Protecode, 2015).

Black Duck Software

Perhaps the industry frontrunner and best known for code scanning is Black Duck Software out of Burlington, Massachusetts. Black Duck is also a privately held company, established in 2002 with a focus on consultation for enterprise open source adoption. The company develops tools and services to aid an organization in its ability to evaluate open source components within its code base, identify reusable code, and manage open source and third-party code approval and subsequent legal obligations. Like Protecode, Black Duck maintains a database of open source and third-party components.
According to Black Duck (2014), the “KnowledgeBase™ is the industry’s most comprehensive database of open source software and associated license and other information.” Black Duck does offer some significant and compelling features that its competitors do not. Its flagship product, the Black Duck Suite, focuses on open source compliance across the software development lifecycle by automating code scanning, authentication, approval, classification, and monitoring. It has the ability to support teams of any size and can be installed onsite or delivered as software as a service (SaaS). Black Duck also offers a service called “Deep License Data,” which identifies possible license information and code snippet matches in files with no stated license while comparing in-house projects against over one million projects from more than 7,500 sources (Black Duck Software, 2015). The cost of service is $9,500 per year for supporting two developers and $12,500 for five developers. Cloud-based services begin at $3,000 for 10 MB of scans (Morgan, 2008).

Open Logic

Open Logic, a code scanning company acquired by Rogue Wave Software, offers two main products. The free version grants scanning access to upwards of 330,000 open source packages for comparison and review, and also offers an open source license reference guide. This gives engineering, legal, and product support teams the opportunity to examine the licenses and understand the responsibilities associated with each open source package. The paid, or enterprise, edition is a more comprehensive guide to open source governance which allows a company to define open source policies. The software allows users to annotate packages as “requires approval,” “prohibited,” or “pre-approved,” and provides the functionality to block the download of prohibited packages.
The enterprise edition also incorporates features to accurately track the request-to-approval workflow with a web-based tool that can be adapted to suit the needs of the particular industry. Open Logic utilizes “OSS Deep Discovery” to help identify licenses that may not be explicitly stated (Open Logic, 2014). As for cost, Open Logic offers its services from $5,000 to $30,000 annually, depending on the project (Kanaracus, 2009).

The FOSSology Project

FOSSology, the name coined for the study and proliferation of Free and Open Source Software, or FOSS, is a free toolset aimed at achieving open source compliance and copyright discovery. The program originated at Hewlett Packard in 2003, where a custom scanner, then named “Nomos” after the Greek word for “law,” was developed to help HP manage its rather large open source portfolio. After much iterative development and inclusion of additional scanning criteria, FOSSology was released in 2007 as an open source project licensed under the GPL v2.0. The scanner is capable of analyzing license data from a single file all the way to an entire software image. FOSSology searches every file looking for licenses using a heuristic license scanner. Heuristic scanners are better known for their use in anti-virus and malware software, as they detect modified forms of files and programs. Additionally, the scanner can search for terms it has never seen before by identifying markers and delimiters in files that may suggest licensing or content origination (Gobeille, 2008).

FOSSology Usage

Currently, there are two options for using FOSSology. First, files can be submitted to a public repository maintained by the University of Nebraska, or alternatively, if a local install is desired, a Linux machine is required.
The install process creates a blank file system, a PostgreSQL database for metadata storage, a command line interface, a web interface for uploading files and viewing reports, and a variety of scanning agents targeted for different criteria (Cornec, 2010). While there is the option to run command line prompts and database queries, Gobeille (2010) suggests that the web-based GUI will be better suited to most users. The user can choose between a “one-shot analysis” or a “conventional analysis” review model. The “one-shot analysis” enables the user to upload and review a single file without interacting with the FOSSology database. Any discovered licenses or copyrights will be returned instantly for review, but cannot be archived for later use. The conventional analysis can cover any number of files, which must be unpacked to component files, and returns results on the web or local GUI, if applicable. When files are added using the web GUI, they are automatically queued in the order submitted. In addition to providing the results, these files will also be added to the FOSSology repository where they are available to other users of the tool. Once a file is uploaded to the database, it is assigned a hash value (sha1 + md5 + file size) which is used to prevent duplicate files from taking up server space as well as allowing the user to note differences between versions. The scan results are viewable in the form of a table showing the number of files which have a particular license and a hyperlink to the group of files with common attributes. Once common files are listed, the user may click on individual files to view the source code, which helps with legal analysis. When determinations are being made, the user may add a tag to a single file, directory, or even a full software package.
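The duplicate-detection key described above (sha1 + md5 + file size) can be illustrated with a short sketch. This is not FOSSology's own code: the function names `repository_key` and `deduplicate`, and the dot-separated key format, are assumptions made purely for illustration.

```python
import hashlib
from typing import Dict, List


def repository_key(data: bytes) -> str:
    """Combine SHA-1, MD5, and byte length into one key (format is hypothetical)."""
    sha1 = hashlib.sha1(data).hexdigest()
    md5 = hashlib.md5(data).hexdigest()
    return f"{sha1}.{md5}.{len(data)}"


def deduplicate(files: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Group file names by content key so identical contents are stored once."""
    groups: Dict[str, List[str]] = {}
    for name, data in files.items():
        groups.setdefault(repository_key(data), []).append(name)
    return groups
```

Two uploads with identical bytes produce the same key and fall into one group, while a changed version of a file produces a different key, which is what makes version differences detectable.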
Tags can be any information the user deems valuable regarding the file or files, but would usually contain something regarding the license name, requirements, or any instructions on distribution. In addition to referencing the licenses in a table, the user can also display other attributes that may provide useful content origination information, such as copyright dates, email addresses, and web sites. The user has the option to “diff,” or view the differences between, two versions of the same file by using a built-in license difference tool. FOSSology also allows the user to isolate specific and custom attributes for review into groupings known as “buckets.” For example, the user may have been tasked with identifying which files in a package were copyrighted in 2005, under the terms of the GPL v2.0 license, and authored by a particular person or company. FOSSology allows a user to change the declared license that it returns on a file if it is determined to be incorrect. The data can be presented in whatever means best suits the user or target audience. Most will prefer a spreadsheet, although using diagrams within presentations may also be a popular choice (Gobeille, 2010).

FOSSology Technical Overview

The basis of FOSSology involves a user invoking machine processes which communicate with databases, as seen in Figure 4 below. The user interacts with the system via a command line interface or web GUI and loads software to the repository. This repository then talks to a job scheduler which is responsible for running various “agents,” or automated scanning and analysis functions, whose results are saved to the database. The first agent to run is called “wget,” as it retrieves the software package’s source files from the internet (unless the user directly uploads the files). When wget completes successfully, the “unpack” agent extracts the source files.
Once the source has been unpacked, the “adj2nest” agent runs and establishes keys which allow the files to be browsed in a hierarchical or file tree layout. At this point, the “copyright” agent runs, which extracts copyright text, email addresses, and URL links. Next up is the “mimetype” agent, which classifies each file into a type, such as video, application, or text file, and a subtype or file extension. MIME (Multipurpose Internet Mail Extensions) types are an internet standard, used in contexts such as email and HTTP traffic, that describes the contents of files. An example of this would be a picture file classified as image/gif (Kyrnin, 2015). The “Nomos” agent tags keywords and text within files that pertain to copyright and licensing information and formats them into a list, and the “monk” agent searches for full license text. The “pkgagent” agent simply parses package headers and the “bucket” agent organizes user-defined data together for analysis (Gobeille, 2010). Nomos will scan for all standard open source licenses as they appear in the Software Package Data Exchange (SPDX). If the scanner is only partially certain of the validity of a license, it will report “license-possibility,” and if it does not know at all, it will output “unclassified license.” It can also report on license versions such as the General Public License version 2.0 or later, which is notated as “GPL-2.0+.”

FIGURE 4. FOSSology architecture (from Gobeille, 2008)

Methodology

Three test cases were conducted with a primary goal of measuring the accuracy of the FOSSology scanner. The primary scanning agents, Nomos and Monk, were reviewed for each file scanned, and if no licenses were present in the file, the “copyright” agent, which seeks copyright text, email addresses, and URL links, was relied upon as a backup. Any failure of an agent to tag the intended data, or any incorrect categorization of data by an agent, is tallied as an error.
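The tallying rule in the Methodology can be sketched as follows. The helper name `tally_errors`, the file names, and the labels are hypothetical and are not data from this study; the sketch only shows how a miss (no tag at all) and an incorrect categorization (wrong tag) both count toward the error total.

```python
from typing import Dict, Optional


def tally_errors(expected: Dict[str, str],
                 reported: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Count misses (nothing tagged) and incorrect tags for one scanning agent."""
    missed = 0
    incorrect = 0
    for name, want in expected.items():
        got = reported.get(name)  # None means the agent tagged nothing
        if got is None:
            missed += 1
        elif got != want:
            incorrect += 1
    return {"missed": missed, "incorrect": incorrect, "errors": missed + incorrect}


# Hypothetical sample data for illustration only.
expected = {"one.txt": "License-A", "two.txt": "License-B", "three.txt": "License-C"}
reported = {"one.txt": "License-A", "two.txt": None, "three.txt": "License-X"}
print(tally_errors(expected, reported))
```

Running the same function once per agent keeps the Nomos and Monk tallies separate, which matches how the results are reported per scanner.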
Test Case 1

Before FOSSology can be used as a resource when vetting code, it must be thoroughly tested to ensure viability and robustness. For the first test case, a sample set of license files was uploaded to FOSSology.org for scanning and review. The sample set included commonly used open source licenses: the Affero General Public License (AGPL) version 1.0 and version 3.0, the Apache License version 1.0 and 2.0, the Artistic License version 1.0 and 2.0, the Berkeley Software Distribution (BSD) License two-clause, three-clause, four-clause, and University of California specific, the Common Public License (CPL) version 1.0, the General Public License (GPL) version 1.0, 2.0, and 3.0, the Lesser General Public License (LGPL) version 2.0, 2.1, and 3.0, the Libpng license, the Massachusetts Institute of Technology (MIT) license, the Mozilla Public License (MPL) version 1.0 and 2.0, and the OpenSSL License (SPDX, 2013). In addition to the unmodified versions of these licenses, a file was created with a Public Domain statement and a mock commercially licensed file was created to see how the scanner handles them.

Test Case 1: Results.

The results of the scan, seen in Figure 5 below, show that FOSSology, using both the Nomos and Monk scanners, was mostly accurate. Nomos correctly identified all 24 files with the appropriate license, while Monk missed five, including the GPL v2.0 licensed file. Nomos identified the file CommercialLicense_Microsoft.txt as “No_license_found,” which is not unexpected as the scanner targets open source licenses. The “copyright” agent did tag this file, so it would not have gone unnoticed. The output of the FOSSology scan, as seen in Figure 5 below, denotes Monk as [M: %] and Nomos as [N].
File name                          Scanner Results (N: nomos, M: monk)
AGPL-1.0.txt                       AGPL-1.0 [M: 98%][N]
AGPL-3.0.txt                       AGPL-3.0 [M: 99%][N]
Apache-1.0.txt                     Apache [N]
Apache-2.0.txt                     Apache-2.0 [M: 99%][N]
Artistic-1.0.txt                   Artistic-1.0 [M: 100%][N]
Artistic-2.0.txt                   Artistic-2.0 [M: 100%][N]
BSD-2-Clause-FreeBSD.txt           BSD-2-Clause-FreeBSD [M: 100%][N]
BSD-3-Clause.txt                   BSD-3-Clause [M: 82%][N]
BSD-4-Clause-UC.txt                BSD-4-Clause-UC [M: 100%][N]
BSD-4-Clause.txt                   BSD-4-Clause [M: 88%][N]
CommercialLicense_Microsoft.txt    No_license_found [N]
CPL-1.0.txt                        CPL-0.5 [M: 99%], CPL-1.0 [N]
GPL-1.0.txt                        GPL-1.0+ [M: 99%], GPL-1.0 [N]
GPL-2.0.txt                        GPL-2.0 [N]
GPL-3.0.txt                        GPL-3.0+ [M: 100%], GPL-3.0 [N]
LGPL-2.0.txt                       LGPL-2.0+ [M: 100%], LGPL-2.0 [N]
LGPL-2.1.txt                       LGPL-2.1+ [M: 100%], LGPL-2.1 [N]
LGPL-3.0.txt                       LGPL-3.0+ [M: 100%], LGPL-3.0 [N]
Libpng.txt                         Libpng [M: 99%][N]
MIT.txt                            X11 [M: 95%], MIT [N]
MPL-1.0.txt                        MPL-1.0 [M: 100%][N]
MPL-2.0.txt                        MPL-2.0 [M: 95%], MPL-2.0-no-copyleft-exception [N]
OpenSSL.txt                        OpenSSL [N]
PublicDomain.txt                   Public-domain [N]

FIGURE 5. FOSSology scan results: Test case 1

Test Case 2

The second test case involves slightly different criteria, although the desired result is still identification of license information within files. A file was retrieved containing a copyright to Microsoft (2014) with the MIT license text, another with a Twitter (2012) copyright and the Apache license, and a third file containing an Intel copyright with the BSD license from the OpenCV Project (2014). This is to see how FOSSology handles files with proprietary copyrights and open source licenses. Files with the MIT license, Apache license, and GPL license were included, but had keywords that identify them removed. For example, the GPL v2.0 license had the terms “GPL” and “Free Software Foundation” removed to see if the scanners capture the license text.
One file contained a license-choice statement, permitting the use of either the GPL v2.0 or later, or the Affero GPL v3.0 license. A file from the OpenSSL Project (2012) with no apparent copyright or license was included to test the scanner’s reaction. Finally, the GPL license text with “Lesser” added and the Lesser GPL license text with the keyword “Lesser” removed were included, as code is sometimes released under these circumstances in an attempt to change the licensing terms.

Test Case 2: Results

Of the ten files scanned, FOSSology found twelve licenses in total. The Nomos scanner identified matches in every file except OpenSSL_makefile.txt, which it categorized as “no_license_found.” Figure 6 below shows the outcome of this test. The Monk scanner missed six files and returned incorrect results for three. For the file Apache2.0_modified.txt, the Monk scanner returned a 93% match to the ImageMagick (2015) Project, which is an open source program aimed at editing bitmap images. ImageMagick is distributed under the Apache 2.0 license, but there is no explanation of why the Monk scanner returned this project instead of a match to the Apache 2.0 license itself. The Nomos scanner did correctly identify the Apache license in this file. The two files with MIT licenses, Microsoft_MIT.txt and MIT_modified.txt, were both correctly identified by Nomos, but Monk tagged them both as “X11.” The FSF (2014) describes the X11 license as, “permissive non-copyleft free software license…sometimes called the MIT license.” Given this, it is reasonable that the Monk scanner identified these two files as X11, yet it raises the question of why its result differs from that of Nomos.
File name                       Scanner Results (N: nomos, M: monk)
Apache-2.0_modified.txt         ImageMagick [M: 93%], Apache-2.0 [N]
Apache_Twitter.txt              Apache-2.0 [N]
GPL-2.0_modified.txt            GPL [N]
GPL_withLESSERadd.txt           GPL [N]
JUCE_GPL_Affero.txt             GPL-2.0+ [N]
LesserGPL_Lesser_removed.txt    LGPL-2.1+ [M: 99%], GPL-2.1[sic] [N], LGPL-2.1 [N]
Microsoft_MIT.txt               X11 [M: 97%], MIT [N]
MIT_modifed.txt                 X11 [M: 95%], MIT-style [N]
OpenCV_BSD.txt                  BSD-style [N]
OpenSSL_Makefile.txt            No_license_found [N]

FIGURE 6. FOSSology scan results: Test case 2

Test Case 3

Having reviewed and discussed two test cases composed of sample files only, an actual open source software package is analyzed for the third test case. This is a more practical example and highlights part of the open source review process that will be discussed in greater detail following this section. The open source package selected was Zlib, a lossless data-compression library written by Jean-Loup Gailly and Mark Adler (Zlib Software, 2014). The latest stable Zlib library, version 1.2.8, was scanned with FOSSology using the same scanning agents as the previous test cases.

Test Case 3: Results

There are a total of 235 files broken down into 16 categories. FOSSology calls these categories “unique licenses found,” but it should be noted that not all of the categories are actual licenses. There are 140 files that were reported to have “no_license_found,” but some have references to licenses, such as buffer_demo.adb, which contains the text, “Open source license information is in the zlib.ads file.” Browsing the directory for zlib.ads shows that this file is licensed under the terms of the GPL v2.0 license, so it is fair to say that the file buffer_demo.adb falls under the same license terms. Recall that distributing files licensed with GPL v2.0 will likely obligate the entity to make available the source code to the entire distribution, as it could be considered a derivative work.
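The manual step of following a reference from one file to another can be pictured as a small lookup. The sketch below is an illustration only and does not use any FOSSology interface; resolve_license, the scanned dictionary, and the references dictionary are hypothetical names, and a reviewer would still confirm each conclusion by hand.

```python
def resolve_license(file_name, scanned, references):
    """Follow "see other file" pointers until a scanned license is found.

    scanned:    file name -> license reported by the scanner, or None
    references: file name -> the file it points to for its license terms
    Returns the license found, or None if the file needs manual review.
    """
    seen = set()
    current = file_name
    while current not in seen:        # guard against circular references
        seen.add(current)
        license_name = scanned.get(current)
        if license_name:
            return license_name
        if current not in references:
            return None
        current = references[current]
    return None


# The Zlib example from the text: buffer_demo.adb points to zlib.ads.
scanned = {"buffer_demo.adb": None, "zlib.ads": "GPL-2.0"}
references = {"buffer_demo.adb": "zlib.ads"}
print(resolve_license("buffer_demo.adb", scanned, references))
```

Returning None rather than guessing keeps unresolved files visible, which matches the point that files reported as “no_license_found” still need a human decision.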
Also, for files that did not contain actual license text yet did contain the kind of reference the “copyright” agent is intended to find, the agent did in fact tag those files appropriately. There are two other categories that reference separate documents, which are “see-file” and “see-doc(OTHER).” Notice in Figure 7 below that the “Concluded License Count” is 0; this indicates that a user has not applied any notes or annotations regarding any of the files. If a proper review is to be conducted, all files must be researched to discover what license obligations exist and notated accordingly.

FIGURE 7. FOSSology scan results: Test case 3 (from FOSSology, 2015)

Certain files in the category “no_license_found” did have other origination indicators. For these types of files, the “Copyright/Email/Url” agent feedback should also be viewed. Figure 8 below shows the file iowin32.h, which does not contain an actual license, but it can be noted that there are copyrights present, as well as website references. Following the URL provided in the file leads to Minizip (2010), a Zlib software utility which is licensed under the terms of the Zlib license. The concluded license can now be updated to “Zlib,” rather than “No license found.”

FIGURE 8. “copyright” Agent output (from FOSSology, 2015)

When clicking on a file and analyzing its contents shows that no license data is present, a note may be added to the file’s metadata. Where a review of the file content finds no license, it is a fair assessment that the file falls under the terms of the top-level directory, in this case Zlib. The file metadata is amended with a notation, as shown in Figure 9 below.

FIGURE 9. Sample notation 1 (from FOSSology, 2015)

As previously mentioned, there are occurrences of files that reference other files for licensing information.
The example file, buffer_demo.adb, referenced zlib.ads, which did contain the GPL v2.0 license. The file should be notated to reflect the disposition of the licensing, as shown in Figure 10 below.

FIGURE 10. Sample notation 2 (from FOSSology, 2015)

As the FOSSology scanning agents are developed to search for certain criteria to help identify licensing, copyright, and other terms that indicate ownership, it is possible that a user may encounter a false positive. The file assemblyinfo.cs, depicted in Figure 11 below, shows that Nomos tagged the keyword “trademark” as a possible reference to a proprietary object. It can be seen that the term “trademark” does not actually appear as part of a license, copyright, or author statement implying that a particular trademark is claimed. Further, Microsoft (2015) explains that this is a Visual Studio auto-generated file and that the assembly trademark field describes the trademark information for the assembly. In this case the line of code reads, “[assembly: AssemblyTrademark("")].” Had there been an actual trademark claimed, there would have been a value entered between the quotation marks, whereas there is no value seen here. Based on this research, it is fair to conclude that this is a false positive, and the file is notated as shown in Figure 12 below.

FIGURE 11. Editing Concluded License (from FOSSology, 2015)

FIGURE 12. Sample notation 3 (from FOSSology, 2015)

Once all files have been notated appropriately, due diligence should be taken to ensure that any offending files are removed from builds that are to be distributed externally. If these issues are identified early in the development process, risk can be mitigated at lower cost and with less disruption to workflow. The final concluded license, with all user notations, should be saved to a local repository for reference.
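The manual check for the AssemblyTrademark false positive can be approximated in code. The following is a minimal sketch rather than a FOSSology capability: claimed_trademarks is a hypothetical helper, and the pattern assumes the attribute is written on one line in the form quoted above. The value “Acme” in the usage example is invented for illustration.

```python
import re

# Captures the quoted argument of an AssemblyTrademark attribute in C# source.
_TRADEMARK = re.compile(r'\[assembly:\s*AssemblyTrademark\("([^"]*)"\)\]')


def claimed_trademarks(source_text):
    """Return the non-empty trademark values declared in the source text."""
    return [value for value in _TRADEMARK.findall(source_text) if value.strip()]


# An empty value, as in the scanned file, yields no claimed trademark.
print(claimed_trademarks('[assembly: AssemblyTrademark("")]'))
# A populated value (hypothetical) would be reported for review.
print(claimed_trademarks('[assembly: AssemblyTrademark("Acme")]'))
```

An empty result supports notating the keyword hit as a false positive, while any returned value would call for further review.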
The final report may also be used to generate a notice file that will be further discussed in the following step.

Developing an Open Source Compliance Program

A sample workflow employing the FOSSology tool has been outlined; combining everything into a viable compliance program is discussed next. Integrating open source code into proprietary source code and eventually shipping a safe product is the desired outcome. Open source compliance is an amalgamation of the policy, process, training, and tools that allow an organization or an individual to use OSS and offer contributions back to the community. This involves ensuring that license terms and obligations are adhered to and that copyright and trademark are respected, as well as safeguarding the organization’s intellectual property. The Linux Foundation (2010) publishes the Open Compliance Program, which serves as a good reference when developing a compliance model. All stakeholders should have the opportunity to actively participate and fully understand their roles. The compliance program is composed of workflows and policies that help to educate employees on what is appropriate for the business. The policies are enforced through a combination of training and tools, which act together to establish the best processes to follow. Figure 13 below is expounded on in the following sections.

[Figure 13 workflow steps: Name OSC Manager & Team; Identify Stakeholders; Define Requirements; Develop Documentation; Implement Project Tracking System; Conduct FOSSology Code Scan; Perform Due Diligence; Final Review & Notice File Generation; Distribute Product]

FIGURE 13. Open Source Compliance Workflow

Open Source Compliance Team

The first step in implementing an open source compliance program (OSCP) is to name an individual or team that will be responsible for overseeing the new program.
These individuals are often referred to as Open Source Compliance (OSC) managers or FOSS managers, depending on the organization. The former title will be used from this point on. This person or team should be familiar with the software development lifecycle process, particularly the manner in which it impacts deliverables. A good working knowledge of the software products and offerings that the company has, as well as the customers and vendors, is important. While experience with fields like engineering and legal is useful, these two professions tend to fall on opposing ends of the spectrum when it comes to software development and licensing. Someone with good project management skills would likely succeed in this position, as the ability to relate to people from different disciplines is paramount.

Stakeholders and Requirements

Once the manager or team has been established and has a good working knowledge of products and offerings, they should begin actively identifying key roles in other groups. Relationships should be fostered with all key stakeholders, including legal, engineering, IT, product management, procurement, and marketing. The Linux Foundation (2010) suggests that the OSC manager “identify team leaders for each product sub-system and component and ask them to go through their source code looking for open source.” The author of this paper, however, believes that this responsibility should fall to the compliance team. Engineers will be focused on developing code and products, and it makes sense to have the OSC team act as a partner rather than simply as oversight.

Documentation

Documentation should be developed to show how common licensing situations impact company IP and what software licenses are not compatible with the intended distribution model (Germonprez et al., 2012).
Simply because a license is compatible with a proprietary business model does not mean that there are no actions that must be taken. Many open source licenses require attribution, meaning that credit is given to the original authors and contributors by maintaining the copyright in all subsequent versions. The enterprise can still affix its company copyright to each file as long as the original license is maintained. Under the guidance of legal counsel, a License Compatibility Matrix, as recommended by Haddad (2013), should be developed and implemented as a reference material for all stakeholders, but especially engineering and members of the OSC team. Figure 14 below outlines this.

                Reciprocal    Hybrid    Permissive
Download        Yes           Yes       Yes
Evaluate        Yes           Yes       Yes
Redistribute    No            Yes       Yes
Modify          No            No        Yes

FIGURE 14. License use matrix

Project Management and Tracking

The inclusion of a ticketing or tracking system is of extreme importance, as there will be many different roles working together on resolving the scan findings. Redmine (2014) is a popular open source project management application that lends itself well to code reviewing. The OSC team can create a project based on the product or component to be reviewed, assign user roles, track workflow, send comments, questions, and actionable items, log time to projects, and archive data for reference. There are also hundreds of user-developed plugins that can be loaded for customizing features and other options. A new ticket should be opened for each open source scan and only closed when all questions and issues are addressed. The closure of the ticket will indicate when the product has been cleared to be distributed, and all scan reports, correspondence, action items, and distribution decisions should be maintained.
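Returning to the license use matrix in Figure 14, a reference table of this kind can also be kept in machine-readable form so that tools used by the OSC team can consult it. The sketch below simply encodes the values shown in Figure 14; the LICENSE_USE_MATRIX name and the is_permitted function are hypothetical, and an actual matrix would be defined under the guidance of legal counsel.

```python
# Values copied from Figure 14: action -> license category -> permitted?
LICENSE_USE_MATRIX = {
    "download":     {"reciprocal": True,  "hybrid": True,  "permissive": True},
    "evaluate":     {"reciprocal": True,  "hybrid": True,  "permissive": True},
    "redistribute": {"reciprocal": False, "hybrid": True,  "permissive": True},
    "modify":       {"reciprocal": False, "hybrid": False, "permissive": True},
}


def is_permitted(action, license_category):
    """Look up whether the matrix allows an action for a license category."""
    return LICENSE_USE_MATRIX[action.lower()][license_category.lower()]


print(is_permitted("Redistribute", "Reciprocal"))
print(is_permitted("Modify", "Permissive"))
```

An unknown action or category raises a KeyError, which is deliberate in this sketch: a case the matrix does not cover should be escalated rather than silently allowed.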
Conduct FOSSology Scan

When these steps are complete, an initial code scan should be conducted using FOSSology. Important findings that the scan should identify and document are the open source package names and version numbers with their associated license names and version numbers, the source that the code was received or downloaded from, whether the code was altered or left in its original state, and the intended means of distributing the code. The code scans may be broken down by product or down to individual components and should be performed as early as possible in the development cycle. The initial scan will be the most time consuming, since new information about every file will be gathered, as evidenced in the third test case discussed. All subsequent scans on newer versions of code will be much faster to review, since only changes and new files will be focused on. The OSC team should work with product management to track project milestones and intended release dates to ensure that rescans are conducted at the most strategic time in the development cycle.

Due Diligence

The next step involves leveraging the FOSSology scan results, as outlined above, to generate a report in which files with questionable licensing or origin can be identified. The important elements from the scan are the license list and the copyright/email/URL report. Any files with no licensing information should be reviewed further to locate their origin, and any false positives should be notated accordingly. The OSC team can then perform due diligence to discover if the file or files are needed for the application. If not, the simplest answer may be to instruct the engineering counterpart to remove the file. If the file or files are needed and questionable licensing exists, all stakeholders should research a possible alternative. If there is no feasible alternative, the matter should be escalated to legal for further analysis.
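The due diligence decisions described above follow a simple sequence that can be written down explicitly. The function below is a sketch of that sequence under the assumption that each question has a yes or no answer; next_action and its return strings are hypothetical names, not part of FOSSology or of any tracking system.

```python
def next_action(needed, questionable_license, alternative_available):
    """Suggest the next due diligence step for a flagged file."""
    if not needed:
        # Simplest answer: the file is not required by the application.
        return "remove file"
    if not questionable_license:
        return "keep file"
    if alternative_available:
        return "replace with alternative"
    # Needed, questionable, and no feasible alternative.
    return "escalate to legal"


print(next_action(needed=False, questionable_license=True, alternative_available=False))
print(next_action(needed=True, questionable_license=True, alternative_available=False))
```

Recording the returned action on the tracking ticket for each flagged file would leave a clear trail of how every finding was resolved.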
Final Review

Nearing the final stages of the review process, when all software components are reviewed and agreed upon, a notice file should be generated. A notice file contains a listing of all open source projects used or referenced and clear instructions on how to obtain the source code, if the license requires that. For most proprietary distributions, the notice file must contain attribution in the form of the full copyright notices that appear in the original files, if the license requires it. The company and OSC manager’s contact information should also be listed in case of questions. Finally, when all files are approved and the licensing model is satisfactory, the product may then be released to customers. Since the files in the distribution and all pertaining research are maintained on a tracking ticket, they can be referenced and leveraged for future projects, thus saving time for all employees involved.

Conclusion and Recommendations

As a free and open source project, FOSSology has proven to be a valuable and reliable tool. In all three of the test cases, the Nomos scanning agent performed at 100% accuracy, while the Monk agent trailed. Because Nomos has proven so effective, especially when utilized in conjunction with the “copyright” agent, running Monk may not seem necessary. The real value in Monk comes from its ability to identify pristine, unaltered license text. It is fair to assume that in most cases the presence of a Monk hit indicates an unaltered license. Recall, however, that in test case one, Monk missed a file that contained an unaltered GPL v2.0 license. Had Monk alone been relied upon during a review, the license could have been overlooked and the file shipped to customers, potentially resulting in legal action. One possible course of action is to analyze files that contain hits from both Nomos and Monk; if the results are the same, a determination can be reached with ease.
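The suggestion to compare Nomos and Monk hits can be sketched as a small triage function. This is an illustration of the idea only; triage and its labels are hypothetical, and the comparison is an exact string match, so near matches such as GPL-3.0+ against GPL-3.0 are reported as disagreements and left for a reviewer to judge.

```python
def triage(nomos, monk):
    """Classify a file by comparing the licenses reported by both agents.

    Each argument is the license name an agent reported, or None if the
    agent reported nothing for the file.
    """
    if nomos is None and monk is None:
        return "none"          # fall back to the copyright agent output
    if monk is None:
        return "nomos_only"
    if nomos is None:
        return "monk_only"
    return "agree" if nomos == monk else "disagree"


print(triage("Artistic-1.0", "Artistic-1.0"))
print(triage("MIT", "X11"))
print(triage("GPL-2.0", None))
```

Only the files labeled "agree" can be concluded quickly; every other label marks a file that still needs manual review.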
Perhaps most telling of the strength of Nomos was its ability to find licenses that were deliberately altered in an attempt to change the terms of those licenses. The actual language differences between the GPL license and the LGPL license are considerably more than the term “lesser.” Such alteration does take place in practice and, unfortunately, would go unnoticed had the code not been scanned. In all three such cases in Test Case 2, the Nomos scanner correctly identified the licenses despite the alterations. The file LesserGPL_Lesser_removed.txt, which was altered to remove the keyword “Lesser,” produced an interesting result. As mentioned above, Nomos correctly identified the license as LGPL-2.1, but it went a step further and identified the error by notating “GPL-2.1[sic].” The marker [sic] is an editorial convention indicating that the term or phrase it follows is reproduced exactly as found, errors included, which is interesting and appropriate in this context.

One drawback is that for FOSSology to work, a file must contain a license, copyright, or keyword that has been defined. This means that someone could theoretically copy a function or snippet from an open source project and purposely delete the original copyright. If that file then ends up in a distributed product and is discovered, the company could suffer any or all of the non-compliance penalties that have been discussed.

FOSSology, as well as virtually any open source project, can be improved by promoting more user involvement from disciplines apart from software development. Project managers, procurement, attorneys, executives, IT, and more all contribute to the success of projects in businesses, and this work ethic and ingenuity can and should transfer to open source projects.
There is an incredibly vast and knowledgeable community that works on developing and testing what is now the most innovative code available, and there is no reason that the same model cannot be applied to open source compliance.

Code scanning is extremely important because most GPL violations are reported to be unintentional. Bradley Kuhn (2009), President of the Software Freedom Conservancy, estimates that “98% of violation incidents are cases of negligence and not malice.” In addition, Murphy (2010) notes that company open source use ranges from 76% to 98%, but only 29% actually contribute back to projects. This is a compelling argument for why an Open Source Compliance team is necessary and beneficial to any company that distributes or works with open source.

References

Arne, P. H. (2008). Jacobsen v. Katzer--Open source license validation: How far does it go? Computer & Internet Lawyer, 25(11), 27-31.
Artistic License. (n.d.). Retrieved March 28, 2015, from http://opensource.org/licenses/Artistic1.0
Atlassian. (2015). JIRA - Issue & project tracking software. Retrieved April 1, 2015, from https://www.atlassian.com/software/jira
Baldwin, H. (2014, January 6). 4 reasons companies say yes to open source. Retrieved March 2, 2015, from http://www.computerworld.com/article/2486991/app-development-4-reasonscompanies-say-yes-to-open-source.html
Black Duck Software. (2015). Top 20 most commonly used licenses in open source projects. Retrieved from http://www.blackducksoftware.com/oss/licenses
Black Duck Software. (2015). Black Duck suite. Retrieved March 18, 2015, from https://www.blackducksoftware.com/products/black-duck-suite
Chapman, C. (2010, March 23). A short guide to open-source and similar licenses - Smashing Magazine. Retrieved March 3, 2015, from http://www.smashingmagazine.com/2010/03/24/a-short-guide-to-open-source-andsimilar-licenses/
Cornec, B. (2010). The Fossology project. Retrieved March 20, 2015, from https://www.projetplume.org/files/fossology-v5.2.pdf
Free Software Foundation. (2014). Categories of free and non-free software - GNU Project – Free Software Foundation. Retrieved March 25, 2015, from https://www.gnu.org/philosophy/categories.en.html#PublicDomainSoftware
Free Software Foundation. (2015). Retrieved March 25, 2015, from http://www.fsf.org/about/
Fontana, R. (2010). Open source license enforcement and compliance. The Computer & Internet Lawyer, 27(4), 1-15.
FOSSology. (n.d.). [Computer software]. Retrieved March 29, 2015, from https://fossology.ist.unomaha.edu/
Gardler, R. (2010, September 7). OSS Watch provides unbiased advice and guidance on the use, development, and licensing of free software, open source software, and open source hardware. Retrieved March 13, 2015, from http://oss-watch.ac.uk/resources/ssmm
Gartner. (2008, November 17). Gartner says as number of business processes using open-source software increases, companies must adopt and enforce an OSS policy. Retrieved April 8, 2015, from http://www.gartner.com/newsroom/id/801412
Germonprez, M., et al. (2012). Risk mitigation in corporate participation with open source communities: protection and compliance in an open source supply chain international research workshop on IT project management 2012. Paper 3. Retrieved from http://aisel.aisnet.org/irwitpm2012/3
Gobeille, R. (2008, May 10). The FOSSology Project. Copyright 2008 ACM 978-1-60558-0241/08/05
Gobeille, B. (2010). The FOSSology Project: Overview and Discussion. The Linux Foundation: The Open Compliance Program, 1-12.
GNU General Public License v2.0. (1991, June 1). GNU Project: Free Software Foundation. Retrieved March 28, 2015, from https://www.gnu.org/licenses/gpl-2.0.html
Goetz, T. (2003). Wired 11.11: Open source everywhere. Retrieved February 27, 2015, from http://archive.wired.com/wired/archive/11.11/opensource_pr.html
Haddad, I. (2013, October). Scaling open source legal compliance support (LinuxCon Eu 2013). Retrieved March 29, 2015, from http://www.slideshare.net/ibrahimhaddad/linux-coneu2013final
Huang, C., Simon, P., Hsieh, S., & Prevot, L. (2007). Rethinking Chinese word segmentation: tokenization, character classification, or wordbreak identification. Retrieved March 22, 2015, from http://www.aclweb.org/anthology/P/P07/P07-2018.pdf
ImageMagick. (2015). Convert, edit, and compose images. Retrieved March 25, 2015, from http://www.imagemagick.org/
Jones, P. (2010, August 3). Groklaw - BusyBox and the GPL prevail again. Retrieved March 13, 2015, from http://www.groklaw.net/articlebasic.php?story=20100803132055210
Kanaracus, C. (2009, March). OpenLogic revamps open-source support offerings. Retrieved March 29, 2015, from http://www.pcworld.com/article/161686/article.html
Krawetz, N. (2008, January 11). FOSSology: Symbolic alignment matrix. Retrieved March 22, 2015, from http://www.fossology.org/projects/fossology/wiki/Symbolic_Alignment_Matrix
Kuhn, B. (2009, November 8). GPL Enforcement: Don't jump to conclusions, but do report violations. Retrieved March 4, 2015, from http://ebb.org/bkuhn/blog/2009/11/08/gplenforcement.html
Kyrnin, J. (2015). MIME Types by Content Type (Web Design). Retrieved March 21, 2015, from http://webdesign.about.com/od/multimedia/a/mime-types-by-content-type.htm
Lee, M. (2008, December 11). Free Software Foundation files suit against Cisco for GPL violations. Retrieved March 28, 2015, from http://www.fsf.org/news/2008-12-cisco-suit
Linux Foundation. (2010). Open Compliance Program Data sheet. Retrieved March 26, 2015, from http://www.linuxfoundation.org/sites/main/files/publications/lf_foss_compliance_datasheet.pdf
Majerus, L. (2008, March 26). Court Evaluates Meaning of "Derivative Work" in an Open Source License. Retrieved March 10, 2015, from http://corporate.findlaw.com/intellectual-property/court-evaluates-meaning-of-derivativework-in-an-open-source.html
Microsoft. (2014, June). Microsoft.github.io. Retrieved March 29, 2015, from https://github.com/Microsoft/microsoft.github.io/commit/33be7b9d176b4315bab437d29f1bb59be4ca0ae8
Microsoft. (2015). Common Properties: Assembly. Retrieved April 1, 2015, from https://msdn.microsoft.com/en-us/library/ee277162(v=bts.10).aspx
Minizip. (2010, March 15). [Computer software]. Zip and unzip additional library. Retrieved April 1, 2015, from http://www.winimage.com/zLibDll/minizip.html
Morgan, T. (2008). Breaking news: Black duck offers free software IP scanning until 2006. Retrieved March 4, 2015, from http://www.itjungle.com/breaking/bn101905-story02.html
Murphy, D. (2010, August 15). Survey: 98 Percent of companies use open-source, 29 percent contribute back. Retrieved April 8, 2015, from http://www.pcmag.com/article2/0,2817,2367829,00.asp
OpenCV. (2014). OpenCV 2.4.10. [Computer software]. Retrieved March 29, 2015, from http://www.nuget.org/packages/OpenCV
Open Logic. (2014). Open source confidence. Retrieved March 18, 2015, from http://www.openlogic.com/
OpenSSL. (2012). Retrieved March 29, 2015, from https://chromium.googlesource.com/chromiumos/third_party/openssl/ /factory2368.B/crypto/ripemd/Makefile
Perens, B. (2010, February 22). Bruce Perens: Inside open source's historic victory. Retrieved March 6, 2015, from http://www.datamation.com/osrc/article.php/3866316/BrucePerens-Inside-Open-Sources-Historic-Victory.htm
Protecode. (2015). Open Source Compliance Products. Retrieved March 18, 2015, from http://www.protecode.com/our-products/
Radcliffe, M. (2014, December 15). GPLv2 goes to court: More decisions from the Versata tarpit Opensource.com. Retrieved March 4, 2015, from http://opensource.com/law/14/12/gplv2-court-decisions-versata
Redmine. (2014). [Computer software]. Retrieved April 1, 2015, from http://www.redmine.org/projects/redmine
SCO Group, Inc. v. Novell, Inc. (2011). Order and judgment. United States Court of Appeals Tenth Circuit, No. 10-4122. Retrieved from http://www.ca10.uscourts.gov/opinions/10/10-4122.pdf
Sen, R., Subramaniam, C., & Nelson, M. L. (2008). Determinants of the choice of open source software license. Journal of Management Information Systems, 25(3), 207-239.
SFLC. (2007, October 30). BusyBox Developers and Monsoon Multimedia agree to dismiss GPL lawsuit – Software Freedom Law Center. Retrieved March 13, 2015, from http://www.softwarefreedom.org/news/2007/oct/30/busybox-monsoon-settlement/
Singh, P. V. (2010). The Small-World Effect: The influence of macro-level properties of developer collaboration networks on open-source project success. ACM Transactions On Software Engineering & Methodology, 20(2), 6:1-6:27. doi:10.1145/1824760.1824763
Software Freedom Conservancy, Inc. (2015). Software Freedom Conservancy. Retrieved March 4, 2015, from https://sfconservancy.org/about/
SPDX. (2013, January 1). SPDX License List. Retrieved March 23, 2015, from http://spdx.org/licenses/
Stallman, R. (2009). Viewpoint: Why "Open Source" Misses the Point of Free software. Communications of The ACM, 52(6), 31-33.
Trott, P., Duin, P. D., & Hartmann, D. (2013). Users as innovators? Exploring the limitations of user-driven innovation. Prometheus, 31(2), 125-138. doi:10.1080/08109028.2013.818790
Twitter. (2012, April). Add LICENSE and README. twitter/twitter.github.com@598efe4. Retrieved March 29, 2015, from https://github.com/twitter/twitter.github.com/commit/598efe48ceb530696551e507d3773bdd44cd63d7
Ullah, K., & Khan, S. (2011). A review of issue analysis in open source software development. Journal of Theoretical & Applied Information Technology, 23(2), 9-108. Retrieved February 27, 2015.
Zlib software. (2014). Retrieved March 25, 2015, from http://www.zlib.net/