Migrating Open Science Grid to RPMs*

Alain Roy¹, on behalf of the Open Science Grid
¹ Open Science Grid Software Coordinator, University of Wisconsin–Madison, USA
Email: roy@cs.wisc.edu

* Published in Computing in High Energy Physics (CHEP) 2012.

Abstract. We recently completed a significant transition in the Open Science Grid (OSG) in which we moved our software distribution mechanism from the useful but niche system called Pacman to a community-standard native package system, RPM. In this paper we explore some of the lessons learned during this transition as well as our earlier work, lessons that we believe are valuable not only for software distribution and packaging, but also for software engineering in a distributed computing environment where reliability is critical. We discuss the benefits found in moving to a community standard, including the abilities to reuse existing packaging, to donate existing packaging back to the community, and to leverage existing skills in the community. We describe our approach to testing, in which we test our software against multiple versions of the OS, including pre-releases of the OS, in order to find surprises before our users do. Finally, we discuss our large-scale evaluation testing and community testing, which are essential for both quality and community acceptance.

1. Introduction

In this paper, we describe how, and more importantly why, we changed the distribution of the Open Science Grid (OSG) [1, 2] software stack from a useful but niche system called Pacman [3] to RPM [4], the widely used native package system that originated with the Red Hat Linux distribution but is also used on other Linux variants. (We use the term native package system to mean the packaging system that is most natural to use on a given platform, generally the same one used by the operating system installation mechanism.)

The OSG is a multi-disciplinary partnership to federate local, regional, community, and national cyberinfrastructures to help share computing and storage resources of research and academic communities at all scales. It provides common services and support for more than 100 resource providers and scientific institutions using a distributed fabric of high-throughput computational services. The OSG does not own computing resources but provides software and services to users and resource providers alike to enable the effective use and sharing of their resources based on the principles of Distributed High Throughput Computing (DHTC).

In this paper, we will focus on the OSG Software Stack (also known as the VDT, or Virtual Data Toolkit), which provides the software to enable the implementation and use of DHTC services. The OSG Software Stack provides both computing and storage middleware as well as client access to these resources. This software is intended for installation by each site's system administrator for use by domain scientists, who will use the OSG to run scientific workflows and access data. The stack includes software to:

• accept jobs at a site (i.e., Globus GRAM [5]),
• manage storage resources (Hadoop File System [6], XRootD [7], Bestman [8]),
• manage membership in virtual organizations (VOMS [9]),
• manage authorization at a site (GUMS [10], edg-mkgridmap),
• manage user jobs (Condor [11, 12]), and
• manage pilot jobs (GlideinWMS [13]).

There are many other software packages included as well—this list shows the breadth of included software.
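To give a flavor of the client side of this stack, the sketch below shows how a domain scientist might obtain VOMS credentials and run a trivial job through a site's Globus GRAM gatekeeper. It is illustrative only; the host name and virtual organization name are placeholders, not real OSG endpoints.

    # Obtain a VOMS proxy for a (hypothetical) virtual organization.
    voms-proxy-init -voms myvo

    # Run a trivial test job through a site's GRAM gatekeeper, which hands
    # it to the local batch system (Condor in this example).
    globus-job-run gatekeeper.example.edu/jobmanager-condor /bin/hostname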
When we first began distributing software as part of projects that preceded OSG (GriPhyN and iVDGL [14]), we used Pacman, a packaging system that is used to download, install, and manage software distributions. Pacman has a number of very desirable features that made it attractive to us, including the ability to install software as either a privileged user (root) or a regular user; the ability to install software multiple times in multiple locations; and support for multiple operating systems. This combination of features allowed us to support a broad mix of users. For example, previous versions of the software stack supported up to seven operating systems (including Scientific Linux, Mac OS X, and AIX) on four different architectures (x86, x86-64, IA64, and PowerPC) [15]. Also, because software installations occur in a single non-system directory, users could back up an installation before doing an upgrade. If the upgrade was a problem for any reason, they could easily move back to the previous version. Users would regularly install software once on a shared filesystem to be shared between computers in a cluster. This combination of features was particularly powerful.

In late 2011, we moved away from Pacman and switched entirely to RPM in order to better meet the needs of our community (see more below). Earlier we had made forays into the world of RPMs, but they were unsuccessful. Those efforts attempted to leverage our existing software build mechanisms, and therefore we only produced so-called binary RPMs. RPMs come in two flavors: source and binary. The community standard is to use source RPMs, which contain both the source code and all the instructions needed to build and install the software. Our early binary RPMs only allowed users to install the software; users could not easily modify or rebuild it. Therefore this time we chose to create source RPMs, which allow our RPMs to be easily reused and modified. (See more details on this in Section 2.) Providing both source and binary RPMs was seen as a significant improvement by our users. Although, as a result of moving to RPMs, we no longer have some of the desirable features of Pacman (easy rollback, installation on shared filesystems, natural support for a wide array of operating systems), we believe that this move was the right choice for both our users and us.

2. Why move to RPM?

If we lost important technical advantages by giving up Pacman, why did we switch? The key observation is that technical superiority is not always the primary reason to choose a solution. Instead, finding a solution that works well within a community is more important. Understanding the constraints, operations, and use cases of our community helped us decide to transition.

The portion of the Linux community that uses RPM (especially Red Hat and its clones) has clear guidelines [16] on the right way to package software. The guidelines include rules such as the need for anyone to be able to build the software from source, the proper locations for files, the right way to version software, and much more.

Pacman's primary disadvantage is that it is nearly completely unknown outside of our user community. When an institution hires a new system administrator, there is a good chance that they know about RPM; if they do not, it is easy to find many books and articles to teach them what they need to know [17, 18]. However, there is nearly no chance that a new hire will know anything about Pacman, and there are few community resources to help them.
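Because our packages are now ordinary source RPMs, anyone with the stock Red Hat build tools can inspect, modify, and rebuild them. The commands below are a minimal sketch of that workflow; the package name is a placeholder rather than an actual OSG RPM.

    # Install the standard RPM build tools (package names as on RHEL 5/6 and clones).
    yum install rpm-build yum-utils

    # Pull in the build dependencies declared by the source RPM, then rebuild it.
    yum-builddep some-package-1.0-1.el6.src.rpm
    rpmbuild --rebuild some-package-1.0-1.el6.src.rpm

    # The resulting binary RPMs appear under the rpmbuild output tree
    # (e.g., ~/rpmbuild/RPMS/ on RHEL 6).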
Perhaps more importantly, it is very hard to share labor with a wider community, which is essential to our ability to sustain software distributions. For example, if someone creates a new source RPM, it is usually straightforward to share it with many communities: it can usually be reused with little or no work. In addition, it is easy for people to learn how to create RPMs due to extensive existing documentation. However, with Pacman it is not so straightforward. Due to the tight coupling of packages within a Pacman distribution, it is fairly challenging for one person to create a Pacman package and have it be easily usable by other groups.

Therefore, by providing RPMs that match the community standards, we gain two significant advantages. First, it is easier for people to donate software packages to us. Second, it is easier for us to donate software packages to other software distributions (such as Fedora, Red Hat, or third-party software repositories such as EPEL [19]). When our software packages become part of these distributions, not only can we continue to maintain them, but other people in the larger Linux community may also help to maintain the software, thus sharing the labor with us. Note that our earlier efforts with binary RPMs could not benefit from this community effort at all: not only would they simply not have been accepted as part of other software distributions, but the missing source code and build instructions prevented anyone else from easily helping, even if they were willing.

There are a few potential downsides to this approach. First, we support a smaller set of operating systems than we did with Pacman. Today, we only support Red Hat Enterprise Linux and its clones (Scientific Linux and CentOS), with some minimal support for Debian Linux. However, in the last several years the majority of our users have converged on exactly the systems we support. While it would be nice to support a wider variety of systems (and we may do so in the future), the narrower support is not inhibiting most OSG users. Expanding to a wider variety of operating systems would take substantial effort, because properly fitting into a community requires understanding that community deeply. Although packaging systems like RPM and deb (used on Debian Linux) appear superficially similar, there are many differences in the tool chain that is used, the definition of a "good" package, the proper location of files, and many other technical details.

The next downside is the difficulty of using RPMs to install software on a shared filesystem for use by a cluster. While this is certainly true, most sites are already using cluster management software such as Puppet [20], Chef [21], and CFEngine [22, 23] to manage their sites, including installing RPMs. Thus, extending their sites to install our software stack via RPM is not a serious difficulty. A few sites have found it to be a hardship, and for them we are considering alternative approaches to install a subset of our software (the subset commonly installed on the cluster, called the worker node software) without the use of RPMs. This would not be a completely new implementation but would leverage the existence of our RPMs.

It is useful to think about this change not in terms of RPM versus Pacman, but in terms of leveraging and sharing with existing communities.
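One way to leverage existing binary RPMs without root privileges or a native RPM database (for example, for the worker node subset mentioned above) is to unpack them into an ordinary directory on a shared filesystem. The commands below are a minimal sketch of that general technique, not a description of a supported OSG installation method; the package name and paths are placeholders.

    # Unpack a binary RPM into a non-system directory; no root access is required.
    mkdir -p /shared/osg-wn
    cd /shared/osg-wn
    rpm2cpio /path/to/some-client-package.rpm | cpio -idmv

    # Point the environment at the unpacked files
    # (the exact subdirectories depend on the package contents).
    export PATH=/shared/osg-wn/usr/bin:$PATH
    export LD_LIBRARY_PATH=/shared/osg-wn/usr/lib64:$LD_LIBRARY_PATH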
When beginning our work on transitioning to RPMs, we defined our Principle of Community Packaging [24]: when possible, we should use packages from existing and/or broader communities; otherwise we should make our own packages but contribute them back. Therefore, we should package software only when one of the following is true:

1. The software is not already packaged; or
2. The software is packaged but needs significant changes to be acceptable to our users.

The implied corollary of this principle is that we can do this only when we adopt the same packaging mechanism (in our case RPM) that is used in the wider community.

3. How we build and distribute software

3.1. Building the software

We have chosen to use standard community mechanisms to build our RPMs. Because they are well described elsewhere, we only describe them briefly here. We use a system called Koji [25] to manage our builds; Koji is a standalone build system that allows people to submit source RPMs to be built. It enforces compliance with some rules, such as not re-using a version number on a package. Internally, Koji uses other standard mechanisms to build the software. In particular, it creates a chroot [26] environment, installs a minimal operating system plus only the declared pre-requisite software, and converts the source RPM to a binary RPM with the rpmbuild software. The chroot with a minimal OS is important because it ensures that the RPM declares all needed pre-requisite software: if a pre-requisite is not declared, it will not be installed and the build will fail.

We have implemented a small layer on top of Koji to help enforce our own internal policies. For example, we require software builds to be reproducible, so all builds that might be released to production must be pulled from our Subversion source code repository instead of from a developer's personal workspace. The source code repository contains the unpatched software source code, any patches we wish to apply to the software, and the RPM's "spec" file, which is used to create the source RPM.

3.2. Distributing the software

Users install the OSG software stack with yum [27], the standard mechanism for installing RPMs on Red Hat-like operating systems. The Koji software also allows us to manage yum software repositories. We maintain a set of repositories per supported operating system version (a sketch of how a site might configure yum to use them follows the list):

1. osg-development: The "wild west" of our work, where OSG Software Team members can place software freely without worrying about the impact on users.
2. osg-testing: When the software has had basic internal verification, it can be moved from osg-development to the osg-testing repository. This is the staging ground for future software releases. Brave early-adopter users may get software from here.
3. osg-release: The current released version of the software. Software must have gone through osg-testing first. This is the standard software repository that most users will use. This repository contains all the software we have released. Although yum by default installs the most recent version, users can choose to downgrade to older versions in case of problems.
4. osg-contrib: This is an auxiliary repository for software that the OSG does not fully support but that may be useful to some users. ('Contrib' is short for 'contributed', since much of the software may be contributed by third parties.) This allows OSG members to quickly share useful software without going through the usual testing and release processes. It is particularly useful when the software is meant for a small subset of OSG.
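As an illustration of how these repositories appear to a site administrator, the sketch below shows a yum repository definition with osg-release enabled and osg-testing available but disabled, followed by typical install and downgrade commands. The baseurl values, GPG key path, file name, and package name are placeholders, not the actual OSG repository locations or packages.

    # /etc/yum.repos.d/osg-example.repo -- illustrative sketch only
    [osg-release]
    name=OSG Software release repository
    baseurl=http://repo.example.org/osg/el6/release/$basearch
    enabled=1
    gpgcheck=1
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-example

    [osg-testing]
    name=OSG Software testing repository (for early adopters)
    baseurl=http://repo.example.org/osg/el6/testing/$basearch
    enabled=0
    gpgcheck=1
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-example

With such a definition in place, a site uses the standard yum machinery:

    yum install some-osg-package                            # from osg-release
    yum --enablerepo=osg-testing install some-osg-package   # opt in to pre-release software
    yum downgrade some-osg-package                          # fall back to an older version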
Although our Koji service hosts these yum software repositories, it is not intended to be a highly available resource. Therefore we mirror the repositories to a central repository hosted by the OSG Grid Operations Center (GOC), which provides a scalable and highly available service. In addition, users can make their own mirrors if they want to ensure reliability in the face of network outages or want to make snapshots of the repository at certain times. Currently three sites provide public mirrors for the yum repository, which improves the scalability and reliability for all of OSG. Yum, unlike Pacman, transparently supports the use of mirrors.

Figure 1: An illustration of the build and distribution process as described in the text. (1) Developers submit builds to Koji based on unmodified source code and (possibly) patches to the software. (2/3) Koji builds the binary RPMs from the source RPM and saves information about the build. (4) Koji puts the RPM into a local yum repository, which is then (5) mirrored to the OSG's central yum repository. (6) Users can either install the RPMs directly from that repository or mirror them to their own local mirrors.

We ensure that every time we modify the osg-release repository we increment the OSG Software version number and write release notes [28]. Users can be assured that no changes are made without this documentation; thus, we provide clarity for our users. The "version number" is a bit odd, because it covers a set of RPMs, and because users may change the exact set of RPMs they have. For instance, they might have installed, say, OSG 3.1.2, but have some RPMs from OSG 3.1.0, 3.1.1, and 3.1.2. Therefore the version number refers to the set of RPMs that were provided at the time of the release, not necessarily the set of RPMs that someone has installed. That is, the version number is primarily a label that we use to describe the set of RPMs, but users may choose to mix software from different releases (at their own peril).

In unusual circumstances, users may wish to get exactly the set of RPMs that were in a previous release. Even though all RPMs are available in the osg-release repository, this is hard to do without extensive archaeological work, because a user has to know exactly which RPMs should be installed. To simplify this, we have a set of versioned repositories that contain exactly the set of RPMs that were provided as part of each release. These are not widely used, but are available as a failsafe mechanism when needed.

When users install the OSG Software, they cannot rely on just our software repositories. Our software also requires software available from the standard operating system distribution as well as a commonly used third-party software distribution called EPEL.

4. How we test software

We have two kinds of software testing that we use to ensure that our releases are of high quality: internal automated tests and wide-area testing in a testbed.

4.1. Internal daily testing

We do daily, automated tests of the full software stack. These are not unit tests of the sort that developers create when writing software, but end-to-end functional tests of the entire software stack. For instance, one test may submit a job from Condor to Globus GRAM, authorize the running of the job with GUMS, and then run the job with PBS. Given the newness of our conversion to RPMs, we are still expanding the set of tests but have reasonable coverage to catch basic problems. Currently the tests are run within a single machine, not across multiple machines.
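To give a sense of what such an end-to-end test looks like, the sketch below submits a trivial job through the chain described above: Condor's grid universe hands the job to a GRAM gatekeeper, which runs it under PBS, with GUMS authorization happening implicitly on the server side. This is an illustration, not one of our actual test scripts; the gatekeeper host name is a placeholder.

    # test-gram-pbs.sub -- Condor-G submit file (sketch)
    universe      = grid
    grid_resource = gt2 ce.example.edu/jobmanager-pbs
    executable    = /bin/hostname
    output        = test.out
    error         = test.err
    log           = test.log
    queue

The test harness then submits the job, waits for completion, and checks the output:

    condor_submit test-gram-pbs.sub
    condor_wait test.log        # block until the job finishes
    test -s test.out            # minimal check that the job produced output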
Running the tests within a single machine simplifies their deployment at the expense of being unable to test more complicated scenarios. However, this approach still catches a significant number of problems.

We test various combinations of our software against the operating system. In particular, we test both the pre-release of our software (from osg-testing, above) and the current release. The purpose of testing the pre-release is clear—it allows us to see problems before we release. The reason we test the current release is to ensure that there are no surprises and that we have not accidentally broken our software repositories. We also test against a variety of operating systems (Red Hat Enterprise Linux 5 & 6, Scientific Linux 5 & 6, and CentOS 5 & 6) to ensure that there are no subtle differences between platforms. Although Scientific Linux and CentOS are supposed to be compatible rebuilds of Red Hat Enterprise Linux, we test to ensure there are no problems.† We also plan to start testing soon against pre-releases of the operating system. This is particularly important to us because we want to discover problems caused by changes to the operating system before our users do. Daily emails are sent to all members of the OSG Software Team to notify them of problems. In addition, the results can be viewed on our website. Table 1 summarizes the combinations we test.

† To date, we have not found any incompatibilities.

Operating System                        OSG Release
Red Hat Enterprise Linux 5 & 6          Current, Pre-release
CentOS 5 & 6                            Current, Pre-release
Scientific Linux 5 & 6                  Current, Pre-release
Scientific Linux 5 & 6 pre-release      Current, Pre-release (planned)

Table 1: The software combinations we test against.

4.2. Wide-area testing

When we have completed initial verification with internal testing, we proceed to end-to-end testing in a wide-area testbed. We have created a testbed for this purpose called the Integration Test Bed (ITB). The ITB is composed of four sites, each maintained independently. The ITB site administrators install software from the osg-testing repository and run a variety of real jobs in realistic scenarios to ensure the software works. While we could in theory replace them with automated scripts, the administrators provide valuable insight we do not get from automated testing. They complain (sometimes loudly) when there are inconveniences in the software installation or the software has bugs we missed in internal testing. Because they are also administrators for production sites, they help us avoid problems that would aggravate many users before we ship the final release.

In addition to basic verification of the software, we also run actual scientific workflows through the testbed to ensure that end-user needs will be met. While we do not require all scientific workflows to be run for all releases, we do run them for major upgrades. While there is a core of four regular sites in the ITB, we occasionally gather more sites for significant updates. For example, the initial release of RPMs involved significantly more sites, which provided testing and feedback.

Only after this testing has completed do we release the software stack to production (the osg-release software repository). The time from building the software to the time it is deployed at users' sites can vary considerably depending on the complexity of the software. In some cases, such as urgent security releases, it can happen in a week or less, while in other cases considerable testing may be done and it may take months.

5. Future Work and Conclusions

The RPM-based OSG Software Stack has been in production release on RHEL 5 (and clones) since December 2011.
We added support for RHEL 6 (and clones) in April 2012. In the future we will add software and support for new operating systems to meet the needs of our users.

There is one particularly important lesson to draw from this paper:

Lesson: When there is a strong community standard, it is often wise to follow the standard even if you have an alternate solution that may be technically better.

While this is not always true (there is often a need for disruptive technologies!), in the case of software packaging we were able to better leverage the existing community packages, as well as the skills of people, by adopting the community standard. This allowed us to focus our time and energy on other aspects of our work (testing, user support, documentation, etc.) that are more central to our mission.

6. Acknowledgements

Thanks to the entire OSG Software Team, who worked so hard on this transition. They are: Tim Cartwright, Scot Kronenfeld, Terrence Martin, Matyas Selmeci, Neha Sharma, Igor Sfiligoi, Doug Strain, Suchandra Thapa, and Xin Zhao. Significant contributions were also made by Dave Dykstra, Tanya Levshina, and Marco Mambelli. Brian Bockelman and Derek Weitzel were instrumental in both kicking off the transition and guiding us through the technical processes. The Build and Test Facility at the University of Wisconsin–Madison's Center for High-Throughput Computing provided the installation and support for our Koji service. The OSG Operations Team, particularly Soichi Hayashi, set up our central yum repository and dealt with regular requests when we changed our mind about how it was set up. Chander Sehgal, Dan Fraser, and the OSG Executive Team provided management consulting. Fermi National Accelerator Laboratory provided the FermiCloud service that allowed OSG Software Team members to use virtual machines to develop and test. And, of course, we could not have done any of this work without the vibrant Linux open source community that provided so many excellent tools.

The Open Science Grid is funded by the National Science Foundation and the Office of Science, Department of Energy.

References

[1] M. Altunay, B. Bockelman, and R. Pordes, “Open Science Grid”, available from http://osgdocdb.opensciencegrid.org/cgi-bin/ShowDocument?docid=800
[2] Open Science Grid web site: http://www.opensciencegrid.org/
[3] Pacman web site: http://atlas.bu.edu/~youssef/pacman/
[4] RPM Package Manager: http://en.wikipedia.org/wiki/RPM_Package_Manager
[5] K. Czajkowski, I. Foster, N. Karonis, C. Kesselman, S. Martin, W. Smith, and S. Tuecke, “A Resource Management Architecture for Metacomputing Systems”, in Proc. IPPS/SPDP '98 Workshop on Job Scheduling Strategies for Parallel Processing, pp. 62-82, 1998.
[6] HDFS web site: http://hadoop.apache.org/hdfs/
[7] XRootD web site: http://xrootd.slac.stanford.edu/
[8] Bestman web site: https://sdm.lbl.gov/bestman/
[9] R. Alfieri, R. Cecchini, V. Ciaschini, and F. Spataro, “From gridmap-file to VOMS: managing authorization in a Grid environment”, Future Generation Computer Systems, pp. 549-558, 2005.
[10] GUMS web site: https://www.racf.bnl.gov/Facility/GUMS/1.3/index.html
[11] D. Thain, T. Tannenbaum, and M. Livny, “Distributed Computing in Practice: The Condor Experience”, Concurrency and Computation: Practice and Experience, Vol. 17, No. 2-4, pp. 323-356, February-April 2005.
[12] M. Litzkow, M. Livny, and M. Mutka, “Condor–A Hunter of Idle Workstations”, Proceedings of the 8th International Conference on Distributed Computing Systems, pp. 104-111, June 1988.
[13] I. Sfiligoi, D. Bradley, B. Holzman, P. Mhashilkar, S. Padhi, and F. Würthwein, “The Pilot Way to Grid Resources Using glideinWMS”, Proceedings of the 2009 WRI World Congress on Computer Science and Information Engineering, Vol. 2, pp. 428-432, 2009. doi:10.1109/CSIE.2009.950
[14] P. Avery, K. Baker, R. Baker, J. Huth, and R. Moore, “An International Virtual-Data Grid Laboratory for Data Intensive Science”, 2001.
[15] VDT 1.10.1 system requirements web page: http://vdt.cs.wisc.edu/releases/1.10.1/requirements.html
[16] Fedora Packaging Guidelines: http://fedoraproject.org/wiki/Packaging:Guidelines
[17] Fedora RPM Guide: http://docs.fedoraproject.org/en-US/Fedora_Draft_Documentation/0.1/html/RPM_Guide/index.html
[18] Maximum RPM: http://www.rpm.org/max-rpm/
[19] Extra Packages for Enterprise Linux (EPEL): http://fedoraproject.org/wiki/EPEL
[20] Puppet web site: http://puppetlabs.com/
[21] Chef web site: http://www.opscode.com/chef/
[22] M. Burgess, “CFEngine: a site configuration engine”, USENIX Computing Systems, Vol. 8, No. 3, 1995.
[23] CFEngine web site: http://cfengine.com/
[24] A. Roy, T. Cartwright, et al., “Community Packaging: A Proposal”, available from https://twiki.grid.iu.edu/bin/view/SoftwareTeam/CommunityPackagingProposal
[25] Koji web site: http://fedoraproject.org/wiki/Koji
[26] Description of chroot from Wikipedia: http://en.wikipedia.org/wiki/Chroot
[27] Yum Package Manager web site: http://yum.baseurl.org/
[28] OSG Software 3 Release Notes: https://twiki.grid.iu.edu/bin/view/Documentation/Release3/