A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers

May 2010

Table of Contents
Executive Summary
Introduction
What is Cyberinfrastructure?
How Does Campus Cyberinfrastructure Interact with National and Global Cyberinfrastructure?
What Best Practices Can Be Identified for Campus Cyberinfrastructure?
What Specific Actions and Investments are Timely Now?
How Can CIC Be Leveraged for Better Value from Cyberinfrastructure Investment?
Notes

CIC Chief Information Officers
Sally Jackson, University of Illinois at Urbana-Champaign
Klara Jelinkova, University of Chicago
Ahmed Kassem, University of Illinois at Chicago (retired May 1, 2010)
Brad Wheeler, Indiana University
Steve Fleagle, University of Iowa
Laura Patterson, University of Michigan
David Gift, Michigan State University (Chair)
Steve Cawley, University of Minnesota
Mort Rahimi, Northwestern University
Kathy Starkoff, Ohio State University
Kevin Morooney, Pennsylvania State University
Gerry McCartney, Purdue University
Ron Kraemer, University of Wisconsin-Madison

Executive Summary

Prominent researchers in many fields have written about the disciplinary imperatives behind investment in cyberinfrastructure for research and education. In "A Research Cyberinfrastructure Strategy for the CIC," we provide a CIO perspective on how to respond to these imperatives, including ideas about how to monitor the changing environment and continuously adjust to it. Our goal should be to enable scholarship at the cutting edge of every discipline, while getting as much value as possible from every dollar spent on cyberinfrastructure. The CIC campuses are very richly endowed with cyberinfrastructure resources but can be even more effective by adopting good practices that support greater coordination at every level:

1. Plan.
2. Share (at the highest level possible).
3. Design funding models that promote scholarship and stewardship.
4. Rely on user governance.
5. Conduct cyberinfrastructure impact analysis.

Over the long run, these practices should help our institutions produce more scholarship per dollar invested. In the near term, the perspectives underlying these recommended practices allow us to identify six high-priority enhancements of campus cyberinfrastructure that will be necessary if we are to meet the present needs of researchers and the expectations of funding agencies. We should all currently be investing in:

1. Preparing for "federated identity management" and other enabling technologies for virtual organizations.
2. Maintaining state-of-the-art communication networks.
3. Providing institutional stewardship of research data.
4. Consolidating computing resources while maintaining diverse architectures.
5. Expanding cyberinfrastructure support to all disciplines.
6. Exploring cloud computing to manage long-term financial commitments.

The CIC's well-established pattern of IT-related collaboration puts our institutions in a position to contribute more together than the sum of our individual contributions. We recommend leveraging this strength to amplify the value of every dollar of cyberinfrastructure investment, along with the following Provostial actions:

1. Build interest among other Provosts in federated identity management.
2. Review recruiting and retention practices that affect campus cyberinfrastructure.
3. Continue supporting competition as a consortium.
4. Influence federal spending rules.

Introduction

Meet Donna Cox, the director of Illinois' eDream Institute. A leading figure in digital arts, Professor Cox has not simply appropriated digital media for artistic expression but has engaged deeply with the question of what art is for. She collaborates with computational scientists to help them gain insight into their data through visual representation, and her related outreach activities (such as exhibitions, documentaries, and animated media) aim to make these insights fully accessible to a broader public. In other words, she has actually redrawn her discipline's boundaries to include creation of scientific understanding and meaningful translation of that understanding for nonscientific audiences.i

Meet Ruth Stone, Professor of Ethnomusicology and director of Indiana's Institute for Digital Arts and Humanities. Professor Stone has led multiple grants that are creating new tools and skills for humanities research. The thousands of hours of video she and her team have collected through field work from all over the world require massive digital storage systems, and the data must be made available to other scholars to code for use in education and research. Her pioneering work on video software was recently adopted by the Kelley School of Business for use with their online courses.ii

As we think about how we invest in technology to support research, education, and outreach, we need to think beyond stereotyped images of computational research. We need to think about people like Donna Cox and Ruth Stone: new kinds of scholars doing new kinds of scholarship. Wikipedia defines cyberinfrastructure as "the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services over the Internet." Along with our most inventive scientists and engineers, these new kinds of scholars require access to a staggering array of advanced resources: high performance computing systems that can model enormously complex processes, huge data sets formed from instruments generating countless observations, advanced visualization tools that help create meaning from otherwise unfathomable complexity, sophisticated telecommunications networks capable of moving large streams of data and supporting synchronized distance interaction, and collaboration support platforms that allow formation of virtual organizations drawing experts from many fields working all over the globe.
Though only a small number of really creative people generate this level of demand on an institution’s support structure, many others have special needs of some kind. The perfect poster child for 21st Century scholarship is one whose work cuts across intellectual disciplines and depends on assembling a complete cross-section of what is now known as cyberinfrastructure. Science and scholarship drive the growth of cyberinfrastructure, often through grants to individuals and teams, because cyberinfrastructure components are vital to cutting-edge research. But external funding for a project never provides all that is needed by the project. Cyberinfrastructure is heavily “layered,” and institutions bear 4|Page A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers the responsibility for continuously building layers that researchers and their funding agencies expect. If we want to produce as much creative work as possible for every dollar spent, we must respect and leverage these layers, planning each new investment to contribute as much as possible to, and cohere as completely as possible with, a continuously advancing global cyberinfrastructure for research and education. What is Cyberinfrastructure? Cyberinfrastructure includes data networks, computational facilities and computing resources, large data sets and tools for managing them, and specialized software applications including middleware.iii Arguably, it also includes people: technicians involved in creating and operating all of these resources and specialized consultants who assist researchers in using the resources.iv Some of these resources are owned by individual campuses and serve the researchers at that campus; many others are shared across campuses and made available through competitive allocation processes. Cyberinfrastructure encompasses many elements (see figure below) of what is normally thought of as research computing or research technology, but extending far beyond any physical or organizational boundary. Figure adapted from Educause-Live presentation by Curt Hillegas. 5|Page v A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers Networking. Data networking for research and education is more complicated than just being connected to the Internet. The typical arrangement for a research university is a campus network connected to a regional Research & Education network that connects to a national R&E network (such as Internet2, National LambdaRail, or ESNET). The CIC is heavily invested in all layers of R&E networking. Individually and as a consortium, we have attempted to exert influence on the direction of national R&E networking, sometimes succeeding and sometimes failing. Computing. Every research university must somehow address the needs of its researchers for high performance computing (known to all computational scientists as HPC). HPC is not defined by a specific set of system characteristics, but by position with respect to the leading edge: HPC is not restricted to the "top of the pyramid," but nor is it inclusive of all supercomputing. For example, very small computer clusters in a researcher's lab are probably not HPC resources; however, these small clusters, which are very common at all of our institutions, play an important part in an institutional strategy and must be considered while planning how to meet needs at the leading edge. 
When we talk about investment in HPC, we are talking about investing in capabilities, not just buying objects. HPC does take place on hardware, in some sort of operating environment, but in addition, HPC requires appropriate software applications and code development, technical consulting and support, and often, training for the computational scientist. Buying a machine is definitely not the same as investing in HPC; a credible HPC strategy will involve balanced buildout of a whole complex system. Data. New technologies have greatly amplified the amount of data being collected, analyzed, and saved for re-use. All disciplines need access to data storage, and beyond that, all disciplines need data curation and stewardship. In many fields, the most important digital assets are large datasets that must be accessed remotely. High-profile examples of large datasets held within our institutions include the Hathi Trustvi collection of digitized library holdings and the data to be transmitted by the Large Synoptic Survey Telescope currently being constructed in Chile.vii Of course our researchers also require access to datasets produced and stewarded elsewhere, and in these cases the researcher’s home institution is responsible for the infrastructure support needed to gain access. Applications. The applications required for 21st Century scholarship go far beyond number-crunching. Some of the most dramatic breakthroughs in the disciplines are being achieved through advances in visualization, making this area an important priority for investment. Staffing. Virtually all cyberinfrastructure-enabled research depends on skilled technical staff, often PhD-trained in the discipline. As noted above, this is particularly true of high performance computing, which requires special programming for parallel processing. “Blue Waters” is the world’s first petascale computer, being built at Illinois with an NSF award that involves all of the CIC institutions as partners. To take advantage of the Blue Waters architecture, researchers will need significant improvements in software code. Part of the preparation for Blue Waters is to train a 6|Page A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers new generation of programmers, which the CIC is doing through its Virtual School for Computational Science and Engineering.viii Beyond research programming, there is an increasing need for new kinds of expertise in helping researchers negotiate the complexity of the sociotechnical environment.ix How Does Campus Cyberinfrastructure Interact with National and Global Cyberinfrastructure? As illustrated in the accompanying figure,x research support depends on “stacked” technology components. The stack differs from discipline to discipline, and even from researcher to researcher. Ideally, every researcher should experience all resources needed for his or her research as a single fullyintegrated system. Very important background knowledge for every Provost is how campus cyberinfrastructure components interact with components above the campus level. 
NSF and many higher education organizations are deeply interested in the relationships among research support resources at all levels.xi The network as experienced by the researcher is, in nearly every case, the campus network connected to a regional network connected to a national backbone connected to other worldwide networks; the better the alignment among the providers of each link, the better the experience for the researcher. We are very privileged in the CIC to have nearly seamless relationships between the campuses and the CIC-controlled OmniPoP, a regional connector to Internet2, the primary national research and education network backbone (headquartered at Michigan, with Network Operations Center managed by Indiana). We have had a more troubled relationship with National LambdaRail, a national network backbone from which we withdrew in 2008 after a decision by the National LambdaRail Board to take on a significant commercial partnership. StarLight, a primary point of connection of US networks with global research and education networks, is operated by Northwestern.

With the ascent of high-speed networking, computing resources have become nearly place-independent. A researcher in need of high performance computing does not necessarily need direct local access to computing equipment. Many computing resources are made available to the science and engineering community through national labs, national supercomputing centers, and TeraGrid (an NSF-sponsored project coordinating mechanisms for individual HPC resources). Researchers can request allocations on any of the individual computing resources that are part of TeraGrid, matching system characteristics to their computational problems, using the TeraGrid User Portal (http://portal.teragrid.org).

Illinois, Indiana, and Purdue are TeraGrid resource providers. All of the CIC schools are TeraGrid users, and collectively we account for 20% of all usage and 20% of all researchers with allocations on the TeraGrid (see the table below, a snapshot of CIC usage for one recent quarter). What any individual institution provides to its own researchers should be understood as complementary to what is available through these national resources.

TeraGrid Usage, 2nd Quarter 2009
Institution        CPU usage   Users
Illinois           632 M       234
Indiana            389 M       19
Penn State         92 M        15
Wisconsin          51 M        26
Michigan State     42 M        22
Minnesota          33 M        18
Chicago            28 M        61
Michigan           23 M        12
Purdue             21 M        43
Northwestern       14 M        37
UI-Chicago         9 M         16
Iowa               8 M         16
Ohio State         3 M         1

The CIC campuses are rich in computational and communications resources, housing many of the above-campus resources that support scholarship worldwide (NCSA, TeraGrid, Nanohub, StarLight, Internet2 headquarters, the Global Research Network Operations Center (the "NOC" for both Internet2 and National LambdaRail), and many others) as well as resources dedicated to their own researchers. Above-campus resources are critically important, especially in defining the cutting edge, but they do not meet all needs even in the relatively narrow realm of high performance computing. NSF's Cyberinfrastructure Vision for 21st Century Discovery points toward federal investment in leadership-class resources (typically resulting in an above-campus service) and makes clear that individual research institutions are expected to complement these resources with as much additional capacity as their researchers need. And this capacity should be architecturally diverse, because not all scientific problems lend themselves to the same computational methods.

The accompanying diagram (courtesy of Illinois computer science professor Michael Heath) illustrates one of the most crucial architectural distinctions among HPC systems: whether memory is associated with each processor ("distributed memory"), or globally available to all processors ("shared memory"). The clusters common on all our campuses today are distributed-memory systems, often created by assembling commodity computers and connecting them with a dedicated very high-speed network. Distributed-memory architectures work well for a wide range of algorithms, in which a part of the algorithm, accompanied by the needed data, is handed off to each processor. However, not all algorithms map easily to a distributed-memory architecture. For example, algorithms whose data-access patterns are random or not easily predicted work best on a shared-memory architecture. Because shared-memory systems are much more expensive than distributed-memory systems, they tend to be national, not campus-specific, resources, even though they may be located on a campus. Distributed-memory systems are less expensive, and they can also be incrementally expanded, so they can be purchased with a small investment and then expanded as needs and resources warrant. Distributed-memory clusters remain the norm at American research universities, and we should plan for this to be so for some time to come, partly because of their cost-effectiveness, but partly for operating advantages associated with not having to share (that is, with being able to command full use of the resource for extended periods of time, without waiting in a queue). This said, there are better and worse ways of managing computing cluster inventory for a campus, a point to which we will return shortly.

"High performance" is often contrasted with "high throughput." High throughput computing (HTC) includes many clusters, but also includes other ingenious ways of expanding total capacity, notably those that aggregate surplus cycles from many heterogeneous and geographically distributed systems. Wisconsin researchers created Condor (http://www.cs.wisc.edu/condor/htc.html), one of the most widely used software programs for aggregating cycles from large numbers of individual systems. A researcher with the right kind of problem can submit jobs to a Condor pool, just as jobs can be submitted to any other system. Because these systems typically lack dedicated network connections and cannot guarantee availability of cycles for scheduling of jobs, they tend to be most usable where the problems involve loosely coupled operations. While Condor pools and other forms of grid computing are important ways to increase capacity, they are not equally well-adapted for all problem types. They do not eliminate the need for investment in the underlying systems, but they do offer the possibility of getting more cycles from any level of investment with little added cost.
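To make the point about matching architecture to problem structure concrete, here is a minimal, hypothetical sketch in Python (not drawn from any campus system). The first function is loosely coupled work that decomposes cleanly across the nodes of a cluster or an HTC pool; the second touches memory in an unpredictable pattern, the kind of access that favors a shared-memory system.

from multiprocessing import Pool

def partial_sum(chunk):
    # Loosely coupled: each task needs only its own slice of the data,
    # so the work maps naturally onto distributed-memory or HTC resources.
    return sum(x * x for x in chunk)

def random_access_walk(data, steps=100_000):
    # Tightly coupled: every step reads an unpredictable location in the
    # whole dataset, so each worker needs fast access to all of the data,
    # which is the pattern that suits a shared-memory architecture.
    total, i = 0, 0
    for _ in range(steps):
        i = (i * 1103515245 + data[i]) % len(data)
        total += data[i]
    return total

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i::8] for i in range(8)]        # independent slices, nothing shared
    with Pool(8) as pool:
        print(sum(pool.map(partial_sum, chunks)))  # scales out across many nodes
    print(random_access_walk(data))                # hard to partition; wants shared memory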
Most likely, any one campus will have significant numbers of users for each of these various forms of HPC and HTC; it is inadvisable to structure decisions about HPC and HTC around majority rule, because this is not a mere matter of taste or brand 9|Page A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers preference but a matter of matching architecture to problem structure. In sum, every institution should begin its planning for campus cyberinfrastructure investments with the most comprehensive picture possible of the national and global cyberinfrastructure to which any new resource will add, and with the richest possible understanding of problem diversity within the institution’s own mix of research programs. While a campus should not assume that all needs can be served by abovecampus resources, neither should a campus duplicate the capabilities of these resources. For HPC in particular, the goal should not be to have the biggest or fastest supercomputer, but to extend the capabilities of the institution's researchers, building campus resources that complement and connect nicely to above-campus resources. What Best Practices Can be Identified for Campus Cyberinfrastructure? Five Good Campus Practices for Managing Cyberinfrastructure 1. Plan. 2. Share (at the highest level possible). 3. Design funding models that promote scholarship and stewardship. 4. Rely on user governance. 5. Conduct cyberinfrastructure impact analysis. Attempting to describe best practices for managing campus cyberinfrastructure is a little like attempting to describe best practices for faculty recruitment or retention: There is a changing competitive landscape that is also highly variable by discipline, and specific tactics that give the first adopter a competitive edge may become huge liabilities for higher education as a whole once everyone tries to use the same (formerly advantageous) tactic. The best current thinking in cyberinfrastructure centers on the growing importance of coordination and cooperation, and on the crippling cost of trying to gain competitive advantage through accumulation of material. We recommend five practices we consider to be well-justified, even though they lack the experience base to be described as industry best practices. Good Practice 1: Plan. Maintaining a viable campus cyberinfrastructure is an ongoing process of responding to the co-evolution of technology and the scholarship it enables. This requires context-sensitive planning. If no other planning framework has yet taken hold on the campus, planning for cyberinfrastructure can be modeled loosely on the processes many institutions use for creating long-term facilities master plans. This planning model is appropriate because it does not outline actions to take in a predetermined 10 | P a g e A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers order but focuses on keeping many independently motivated projects aligned with an overall direction for the institution. For facilities, individual building projects constantly amend the campus master plan. The role of the master plan is to assure that, in addition to being evaluated programmatically, new buildings are evaluated for their fit with an overall aim and for their impact on other future possibilities. We can think of a cyberinfrastructure master plan as a methodology a campus uses for planning to meet the ever-changing IT needs of its researchers. 
This methodology would include continuous projection of technology directions and disciplinary trends (analogous to projection of shifts in student enrollment and student demographics), and it would include continuous updating of a campus cyberinfrastructure "map" to track significant changes in the environment and expose the preparation needed to accommodate these changes. The campus CIO should be the steward of this cyberinfrastructure master plan as the CFO is normally the steward of the facilities master plan. Some universities have such plans, but we know from in-depth conversations with peers across the nation that many research universities do not plan their cyberinfrastructure the way they plan for other institutional assets; they simply leave it to individual researchers to meet their own needs with little constraint, little commitment, and little or no review from the campus. Researchers with high-end computational needs are assumed to be capable of attracting external funding, and when they do so, they are often left to manage the computational resources within their own labs, often without prior assessment of the adequacy of power and communications infrastructure and without review of the financial impact on the campus. This prevailing practice is becoming less viable over time: letting cyberinfrastructure grow primarily from successful funding proposals enriches our campuses, but it also places stress on other elements of campus IT and adds needless operating cost that can accumulate to very large sums in any university with significant research activity.

Good Practice 2: Share.

Share at the highest possible level, especially for high performance computing. As we have pointed out, coherent connection to national and global cyberinfrastructure is far more important today than is having individual control over resources. Sharing is not only a good way to control costs, but also a way to improve the resource itself. Resources that become more valuable as more people adopt them include communication platforms (where increased adoption means that each individual can reach more people), software (where large communities of coders and users can lead to much more rapid and sustainable development), and large community-curated datasets (where the more people using the dataset, the more layers of information build around the data). Some resources actually lose value if they become too locally inflected; for example, value is diminished for datasets, applications, and services that use idiosyncratic methods of controlling access.

One resource type particularly receptive to sharing is cluster computing. Clusters maintained independently within individual labs are very common, and federal agencies encourage the proliferation of clusters (inadvertently, through how they make awards for computing equipment). Campus leadership also encourages researchers to build individual clusters through common practices in the awarding of startup funds. The clusters themselves are quite affordable at the level of individual labs, but they can have huge unanticipated costs to institutions if allowed to proliferate all over campus. When located in a researcher's lab space rather than in purpose-built data centers, computing clusters not only take up more volume but also tend to increase energy costs and other costs of operation.
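A back-of-the-envelope sketch illustrates the scale of the energy concern. All figures below (IT load, PUE values, electricity price) are illustrative assumptions for a hypothetical cluster, not measurements from any CIC campus.

def annual_energy_cost(it_load_kw, pue, dollars_per_kwh=0.10):
    # Yearly electricity cost for a cluster drawing it_load_kw of IT power,
    # where PUE (power usage effectiveness) captures cooling and overhead.
    hours_per_year = 24 * 365
    return it_load_kw * pue * hours_per_year * dollars_per_kwh

lab_closet  = annual_energy_cost(it_load_kw=20, pue=2.5)  # ad hoc room cooling
data_center = annual_energy_cost(it_load_kw=20, pue=1.5)  # purpose-built facility
print(f"lab closet:  ${lab_closet:,.0f} per year")
print(f"data center: ${data_center:,.0f} per year")
print(f"difference:  ${lab_closet - data_center:,.0f} per year for one 20 kW cluster")

Under these assumed numbers, a single modest cluster housed in unsuitable space costs tens of thousands of dollars more per year to power and cool; multiplied across dozens of labs, the avoidable expense becomes substantial.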
In response to these issues, new models of cluster computing have emerged in which computational nodes are contributed to large shared systems by many individual researchers. In these new models, each researcher has direct control over his or her own resources, but when the resources are not being used by their “owner” they revert to a pool that can be used by others. This is not only beneficial for the contributors, but also far more benign for the institution. Purdue has been particularly forward in developing models for shared cluster computing, and reception by faculty has been very positive. Earlier we introduced a distinction between distributed-memory architectures and shared-memory architectures. Shared-memory systems are, as we pointed out earlier, much more expensive than clusters relative to their performance. While many institutions still invest in such systems, the largest and most powerful sharedmemory systems are (and should be) shared across institutions at national or consortial supercomputing centers. A good institutional strategy will include many forms of sharing with many levels of aggregation. HPC facilities may be shared by a few researchers, perhaps doing related work, or may be shared at the institutional level, or at the multi-institutional level among several cooperating universities, or at national or even international levels. Later we will make a specific recommendation for creating shared clusters at the campus or near-above-campus level, for any institution that has not already done so, and we will make another recommendation for looking at cloud computing as a form of demand aggregation. Institution-level investments can also make it easier for researchers to use national-level resources, and one of these is preparing to manage login credentials in a less cumbersome way. Good Practice 3: Design Funding Models That Promote Both Scholarship and Stewardship. While most research universities now have funding models that work adequately for maintaining their data networks, fewer have well-worked-out models for other components of campus cyberinfrastructure such as HPC, data storage and management, and specialized technical support. This is a serious concern, because a funding model is not just a way to pull in the money needed to operate the resource, but also a system of constraints within which individual researchers must reason about how best to spend. Some funding models encourage spending that serves all interests well, and other funding models encourage spending with negative 12 | P a g e A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers unintended consequences. Most university CIOs now believe that legacy funding models for HPC are responsible for significant fragmentation of campus cyberinfrastructure. Traditionally, NSF and other funding entities have provided ready support to investigators to purchase their own HPC resources, and campus leadership has willingly allocated space for these resources. Besides leading to proliferation of small HPC facilities tucked into spaces that lack the power and cooling necessary to support HPC systems, this way of doing business also leads to fragmentation of computing operations, to ineffective management of research support workforce, and to higher risks in security and management of research data. 
Very commonly, project-specific computing resources use graduate students or post-docs as system administrators, diverting them from the science they should be doing. From an institutional perspective, this is undesirable for an additional reason: It works at cross-purposes with developing a stable professional competence in managing HPC resources. The Coalition for Academic Scientific Computing and the Educause Campus Cyberinfrastructure Working Group issued a joint report in 2009 urging that research universities and federal funding agencies work toward policies that encourage sharing to replace policies that encourage fragmentation.xii One approach that is already known not to work very well is to fund HPC resources centrally and charge fees for service. For one thing, this model limits usage to disciplines with ready access to research funds, closing out disciplines with few funding sources. Because the true cost of operating lab-level resources is often concealed from the researchers by hidden subsidies, a service offered at a fee set to recover true cost will look too expensive to researchers (even though it is in fact less expensive to the institution), and this appearance of costliness discourages use of the resource. Still worse, charging by use may have a dampening effect on innovation and on risk-taking with grand-challenge problems. Charging some users (those with grants) and not others (those without) runs afoul of research budget compliance and creates resentment over inequity. Bluntly, fee-for-service models do not encourage sharing; they would have financial advantages for the institution if researchers chose to use them, but researchers avoid them either because they have no discretionary funds or because they perceive the service as too expensive relative to what they could do on their own. Providing centrally funded HPC resources without fees for service solves the problem of providing for researchers without discretionary funds. It does not solve the problem of what to do with additional resources brought to the campus through grant funding. Institutions with centrally funded HPC systems often have large numbers of researcher-operated clusters as well. Buy-in approaches have been piloted at a few institutions (notably Purdue) with generally good results.xiii In a buy-in program, investigators may invest their personal grant funds for HPC in a shared HPC facility, perhaps purchasing a set of computational nodes for which they receive top priority of use (but which may be used by others when the “owners” are not using them). These programs have many 13 | P a g e A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers advantages for both the institution and the individual researcher. For the researcher, advantages may include system administration resources provided by the HPC center instead of (as is quite common) by a graduate student or post-doc; better security and data protection; access to the HPC network fabric and storage facilities (which researchers may not include when they price their grant budgets); access to shared software made available through the HPC center; access to specialized technical support personnel; and participation in a broader community-of-practice of the HPC users. For the institution, the major advantages are conservation of space and conservation of energy. 
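The allocation policy behind these buy-in programs can be summarized in a brief sketch. The following Python fragment is a simplified, hypothetical model of owner priority with backfill of idle nodes; the names and numbers are invented for illustration, and this is not a description of Purdue's actual scheduler.

from dataclasses import dataclass, field

@dataclass
class SharedCluster:
    owned_nodes: dict            # e.g. {"smith_lab": 32, "jones_lab": 16}
    base_nodes: int              # institutionally funded, open to all users
    in_use: dict = field(default_factory=dict)   # user -> nodes currently allocated

    def free_nodes(self):
        total = self.base_nodes + sum(self.owned_nodes.values())
        return total - sum(self.in_use.values())

    def request(self, user, nodes):
        # Grant a request when idle capacity allows. An owner asking for no
        # more than the share they purchased would, in a real scheduler,
        # preempt backfill jobs rather than wait; that path is only signaled here.
        if nodes <= self.free_nodes():
            self.in_use[user] = self.in_use.get(user, 0) + nodes
            return "granted"
        if nodes <= self.owned_nodes.get(user, 0):
            return "granted after preempting backfill"   # owner priority honored
        return "queued"                                   # everyone else waits

cluster = SharedCluster(owned_nodes={"smith_lab": 32, "jones_lab": 16}, base_nodes=64)
print(cluster.request("grad_student", 96))   # backfills idle owned nodes plus base nodes
print(cluster.request("smith_lab", 32))      # purchased share remains available on demand

In production, this policy lives in the batch scheduler's queue and preemption rules rather than in application code, but the essential bargain is the same: owners never lose access to what they bought, and the institution gains use of every idle cycle.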
Importantly, buy-in models can be flexibly managed to include both researcherpurchased resources and campus- or department-funded resources. A base system subsidized from institutional funds may be freely available to all users, while researcher-purchased resources are automatically allocated to their owners on demand. Any buy-in program must be structured to comply with federal research audit requirements regarding the use of grant funds, but there are already several working models at successful research institutions that have passed muster with funding agencies. Good Practice 4: Rely on User Governance. IT Governance refers to “decision rights and accountability framework for encouraging desirable behaviors in the use of IT.”xiv As noted earlier in this paper, many research universities leave spending decisions to anyone with money and hold no one accountable for the overall quality of campus cyberinfrastructure. While it may seem perfectly reasonable to allow researchers to spend their own grant funds as they choose, these expenditures may encumber the campus as a whole with greater expense (e.g., greater energy expense from location of computing clusters in unsuitable academic space). Some form of accountability now seems appropriate to weigh impact on campus against benefit to individual projects. But expecting researchers to give up individual control over resources carries with it an obligation to protect their interests through strong governance methods. Setting priorities among users is an especially critical governance task. This is well understood for shared HPC resources, where decisions have to be made not only about which computational architectures to provide but also about how to assign computing jobs to positions in a queue. Institutional values, goals, and purposes will affect governance in complex ways. For example, an institution may invest in shared HPC systems to increase their availability to all disciplines, or it may do so to support specific research thrusts, and these contrasting purposes will give rise to different ways of prioritizing use. Our sense is that virtually all academic disciplines are now finding HPC to be a useful tool for scholarship, and that most institutions will want to find ways to balance an interest in supporting research across the spectrum of disciplines with an interest in providing adequate support for high end users. Governance and queue/schedule management will need to address both sets of needs. 14 | P a g e A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers Cyberinfrastructure elements most in need of governance include the data network (including prioritization of investment), the HPC environment (including selection of computational architectures and institutional arrangements for funding and managing them), policies for balancing the massive parallelism needed by some researchers and the throughput needed by the entire community of users, and data stewardship (including curation). Any resource that must be shared among many users must also have some form of user governance to balance competing priorities and reconcile conflicting interests. Good Practice 5: Conduct Cyberinfrastructure Impact Analysis. 
Cyberinfrastructure impact analysis is to research expenditure what environmental impact analysis is to large development projects: an effort to understand how any proposed investment will affect its context.xv As explained earlier in this paper, the global research and education cyberinfrastructure is complexly layered, with heavy interconnection of computing, data storage, and data communications. For example, a PI’s decision to build a computing facility within an individual department or center may force later investments by the campus in upgraded building networks or even the campus backbone. Cyberinfrastructure impact analysis can be incorporated into many common decision workflows, including creation of new programs, construction of new facilities, establishment of new centers, or acceptance of major grants. The earlier this takes place, the better the opportunities for the institution to control costs. For example, by routinely conducting cyberinfrastructure impact analysis on grant proposals, a campus may be able to produce lower impact ways of accomplishing a researcher’s goal. But at a minimum, this form of analysis allows greater predictability and financial preparation. What Specific Actions and Investments are Timely Now? Six High Priority Investments in Cyberinfrastructure 1. Preparing for “federated identity management” and other enabling technologies for virtual organizations. 2. Maintaining state-of-the-art communication networks. 3. Providing institutional stewardship of research data. 4. Consolidating computing resources while maintaining diverse architectures. 5. Expanding cyberinfrastructure support to all disciplines. 6. Exploring cloud computing to manage long-term financial commitments. Meeting the needs of researchers and the expectations of funding agencies will require near-term attention to six high-priority enhancements of campus 15 | P a g e A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers cyberinfrastructure: Enhancement 1: Federated Identity Management and Other Enablers of Virtual Organizations. Our researchers, and our students, interact with people and resources all over the world, and they are thus often required to identify themselves through some form of login. Federated identity management is technology for passing credentials from one organization to another, using a trust federation to which the organizations subscribe. To a student or faculty member, federated identity management means that they can log in to remote resources using their home institution credentials. All CIC institutions belong to the trust federation known as InCommon, and all have implemented the technology that enables us to identify our students and faculty to remote resources. We have active projects underway to make our own systems more accessible to researchers and students coming in from other institutions, and we should take additional steps to make our resources highly permeable to people outside the institutions with whom we wish to collaborate. Enhancement 2: State-of-the-Art Communication Networks. Bandwidth is not our only concern; network technology continues to evolve, and although we must be planning years ahead for gradual replacement of old technologies with new technologies, we must also be prepared to respond instantly to the first demands for these technologies. 
There is still a very long tail in the distribution of need for high-end network resources; who is in the long tail at any point in time is highly unpredictable, so campus networks must be comfortable managing exceptions to whatever their current standards may be. Although we may one day consider outsourcing our data networks to commercial providers, our nearterm sourcing strategy for data networking is entwined with regional and national networks dedicated to research and education, to which we must be active contributors. Enhancement 3: Institutional Stewardship of Research Data. Data stewardship means storing and maintaining data for as long as it may be needed and controlling access to it with appropriate policy. We are all accustomed to doing this task for business data, but not for research data. The emerging need for stewardship of research data involves not only storage that is professionally managed, but also curation (things like metadata, collection management, and finding aids). Funding agencies are beginning to require researchers to present plans for long-term data availability. It is expected that data management plans will be required for all proposals NSF-wide before the end of 2010.xvi Leaving this unfamiliar task to individual researchers guarantees that the solutions will be heterogeneous, high-risk, and hard to sustain over time. 16 | P a g e A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers Enhancement 4: Consolidating Computing Resources. Against a goal of producing as much science as possible for every dollar of investment, shared clusters are far more beneficial than individually operated clusters. As explained above, funding models play a critical role in how broadly cluster resources are shared, and Provosts can exert considerable influence, especially through how faculty startup packages are structured. The common practice of including funds for computing resources in startup packages can have long-term impacts, positive or negative, on the campus as a whole. We recommend looking closely at strategies that guarantee the faculty member access to resources needed for success but also contribute to a sustainable campus resource. For example, instead of committing funds for an independently maintained compute cluster, a startup package could be structured as a commitment of a certain number of cores on a shared system, similar to what Purdue has done to aggregate many researchers’ resources into a shared cluster. It is common on research university campuses for individual researchers to maintain their own computing resources in their own space. This is extremely wasteful of scientific capacity and extremely expensive for the campus. Avoiding needless operating expense means providing consolidated data center space; increasing the yield from all resources is possible through consolidating PI-specific clusters into shared clusters. Absent some compelling scientific rationale for physically distributed computing, all campuses have a compelling financial interest in physical consolidation. Enhancement 5: Expansion of Cyberinfrastructure Support to All Disciplines. We began this primer with two exemplars of 21st Century cyberinfrastructureenabled work: one an artist, the other a humanist. 
We chose these examples from many other stellar examples on the CIC campuses to make a point: there are no longer any noncomputational disciplines, just disciplines that have not yet solved the resourcing problems created by scholars who are attempting things that were not possible previously. The launch event for eDream, for example, required network capabilities unprecedented to that point on Illinois' campus: a network connection good enough to allow physically separated musicians to respond to one another improvisationally in real time. Other disciplines in the arts, the humanities, and the professions are demanding new forms of instrumentation, new computational resources, new access to datasets and collections, and new visualization tools. Investment in cyberinfrastructure is now necessary for all forms of scholarly work.

Enhancement 6: Exploration of "Cloud Computing" to Manage Long-Term Cost Commitments.

As noted earlier, sharing of cyberinfrastructure resources can occur at the above-campus level, and today's "cloud computing" resources may fill this role. Cloud computing is computing that happens "in the Internet cloud," an expression that is generally applied to on-demand resources or services accessed via the Internet. Typically the user of cloud services gets storage or computing time, but has no knowledge of the total size, configuration, or location of the computing facility being used. One of the earliest providers of HPC cloud services has been Amazon, with its Elastic Compute Cloud service. Specialty providers of HPC cloud services (such as R Systems, an NCSA spinoff located in Illinois' Research Park) cater to even very high-end users with services that can be bought "by the drink."

One opportunity that we believe is worth evaluating is the use of cloud services to manage the long-term financial impact of faculty startup packages. At present, it is common to structure startup packages to include funds for purchase of computer equipment, which generally leads to additional unbudgeted costs (for space and for ongoing operation). Including equivalent dollar value to spend on commercial providers of HPC not only provides a faster startup but also allows continuous access over a period of years to state-of-the-art resources instead of rapidly aging resources. This strategy works equally well with an internal cloud or a cloud shared within the CIC.

Cloud HPC services may offer good alternatives for satisfying HPC needs, but attention will have to be paid to a number of contract issues that should not be left to one-off negotiation by individual faculty. These contract issues include assurances by the service provider regarding ownership and security of the data or services; the full cost of the service (considering not just the compute cycles, which are very inexpensive, but also collateral charges such as transport for data, which can be very expensive); and geographical location of the actual service being used (e.g., use of cloud services located outside the U.S. may be a problem for ITAR-subject research data or algorithms). Although due diligence is required for commercial cloud services, we find the general concept attractive enough to be worth an investment in well-conditioned contracts.

How Can CIC be Leveraged for Better Value from Cyberinfrastructure Investment?
CIC Collaborative Opportunities in Cyberinfrastructure Requiring Provost Action 1. Build interest among other Provosts in federated identity management. 2. Review recruiting and retention practices that affect campus cyberinfrastructure. 3. Continue supporting competition as a consortium. 4. Influence federal spending rules. Every research institution needs its own cyberinfrastructure strategy to promote its researchers’ ability to contribute at the leading edges of the disciplines. We conclude this paper by pointing out opportunities to leverage the strength of CIC to increase the value of investments we make on each campus. Given that federal funding agencies count on every institution to make significant campus-level investments 18 | P a g e A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers over the shared national cyberinfrastructure, the CIOs believe that we should be looking for institutional and interinstitutional strategies that make the best possible use of every dollar spent. We already regularly benefit from the following practices within the CIO group: Rapid spread of innovation within the CIC. Strategies that succeed at one CIC school tend to spread rapidly throughout the consortium, not only through the CIOs, but also through the main Technology Collaboration workgroups that comprise a dense social network among our campuses. The shared cluster model is one example; based on Purdue’s success, some other members have already adopted this innovation and others are preparing to do so. Shared investment in shared infrastructure. CIC already has a well-established practice of shared investment in cyberinfrastructure. We jointly own and operate the OmniPoP, a connector to Internet2 and other R&E networks. An important current initiative is an effort to implement shared distributed storage infrastructure. Shared contracts and shared contract language. Especially as our campuses explore cloud services, we may benefit from joint negotiation of both purchase price and contract terms. All higher education institutions share a core of concerns around privacy of personal data, around ownership of intellectual property, and around federal compliance. Our interests are better served by addressing the commercial sector as a unified market segment rather than by attempting to negotiate terms one institution and one vendor at a time. Influence as a consortium. The CIC CIOs have on occasion been successful in exerting influence within organizations dedicated to cyberinfrastructure (such as the Internet2 organization). Our most noteworthy success has been in raising the visibility of InCommon and federated identity management. Coordinated support for research and education. Leveraging previous collaborative effort around networking, the CIC CIOs were able to rapidly coordinate the videoconferencing technologies needed to run the Virtual School for Computational Science and Engineering. Several of the recommendations made earlier in this paper would benefit from the direct support of the Provosts. Build interest among other Provosts in federated identity management. Anecdotally, we know that some IT organizations feel that their campuses are not yet demanding federated identity management. The real benefit from this new technology is in none of us having to manage credentials in ad hoc ways. 
The CIC Provosts, all of whom enjoy the benefit of being able to log in to a growing list of resources using home campus credentials, can exert positive peer pressure on others through organizations like the AAU and the APLU. The CIOs are very 19 | P a g e A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers committed to this enabler of interinstitutional collaboration and would happily develop some talking points. Review recruiting and retention practices that affect campus cyberinfrastructure. We have argued that including funds for computing clusters in startup packages has negative unintended consequences for each campus, but we are aware that these practices persist because academic leadership believes that they are necessary for competitiveness in recruitment. A deep and candid discussion among an influential group of Provosts might open opportunity for rethinking this wasteful practice. The Provosts could also consider joint statements of support for change in federal agency funding practices, another point of advocacy on which the CIOs could provide content. Continue supporting competition as a consortium. One recent and dramatic success has been winning the petascale computing award for the Great Lakes Consortium (CIC plus an expanded set of collaborators). The proposal leading to this award required commitment from all institutions, not from the CIOs but from the academic leadership. We suggest that a natural extension of this would be to pool advanced technical support resources through some form of virtual help desk for researchers, taking advantage of the varied strengths of the individual campuses. A practical step toward this would be a CIC-wide summit meeting, sponsored by the Provosts, to consider whether and how we could pool advanced research technology expertise, engaging the research faculty to identify needs and resources and engaging campus CIOs to plan the enabling structures for sharing. Influence federal spending rules. Although NSF has great interest in how to coordinate campus investments with agency investments, certain funding rules and directorate-specific practices work against coordination at the campus level— creating perverse incentives that encourage wasteful use of resources. Campusspecific rules for charging overhead, and for distributing recovered overhead, may create additional perverse incentives at the lab level. Problems in our current model are broadly acknowledged, but so far, no one has been ambitious enough to try to solve this problem on behalf of the whole community. The CIOs recommend developing a CIC posture on the coordination required between institutional and national investment and using this posture to exert influence at the federal level. The long-term trust relationships among our campuses and well-worn paths to collaboration around technology benefit our researchers directly. We urge the Provosts to consider how we can leverage this strength to amplify our separate and joint influence on the shape, and the overall quality, of national and global cyberinfrastructure. 20 | P a g e A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers Notes i Donna J. Cox is the Michael Aiken Chair and Director of the eDream Institute at University of Illinois at Urbana-Champaign. To read more about her work at NCSA, see http://avl.ncsa.illinois.edu/. 
To read more about eDream (Emerging Digital Research and Education in Arts Media), see http://eDream.illinois.edu/. ii Ruth Stone is the Laura Boulton Professor of Ethnomusicology at Indiana University. To read more about her work in folklore and ethnomusicology, see http://www.indiana.edu/~alldrp/members/stone.html. iii See also Klara Jelinkova et al., “Creating a Five-Minute Conversation about Cyberinfrastructure,” Educause Quarterly, Vol. 31, no. 2, 2008. Available online at http://www.educause.edu/EDUCAUSE+Quarterly/EDUCAUSEQuarterlyMagazineVolu m/CreatingaFiveMinuteConversatio/162881. iv Paul Gandel et al., “More People, Not Just More Stuff: Developing a New Vision for Research Cyberinfrastructure,” ECAR Bulletin, issue 2, 2009. Available online at http://www.educause.edu/ECAR/MorePeopleNotJustMoreStuffDeve/163667. v “Building Cyberinfrastructure into a Campus Culture,” (March 30, 2010). See http://net.educause.edu/LIVE108. vi HathiTrust is a consortium of universities, including all of the CIC, formed to digitize library holdings for accessibility and preservation. See http://www.hathitrust.org. vii The Large Synoptic Survey Telescope is a multi-institutional, multi-national resource that aims to gather unprecedented amounts of astronomical data that will enable new science (such as 3D modeling of the universe). The data will be housed at NSCA, in the National Petascale Computing Facility. See http://lsst.org/. viii The Virtual School offers several courses each summer with multiple sites participating by high definition videoconferencing. In 2009, 110 students from 12 CIC institutions attended one or more courses. See http://www.vscse.org/. ix The same point is made by Gandel et al. (note 4). x From Thomas J. Hacker and Bradley Wheeler, “Making Cyberinfrastructure a Strategic Choice,” Educause Quarterly, Vol. 30, no. 1, 2007. Available online at http://www.educause.edu/EDUCAUSE+Quarterly/EDUCAUSEQuarterlyMagazineVolu m/MakingResearchCyberinfrastruct/157439. xi The NSF-wide Advisory Committee on Cyberinfrastructure is currently sponsoring a Campus Bridging Task Force (https://nsf.sharepointspace.com/acci_public/) aimed at understanding how campus cyberinfrastructure can seamlessly connect to national resources. Educause has a working group on campus cyberinfrastructure (http://www.educause.edu/CCI) that has coherence among layers as a standing concern. Internet2 has a history of interest in this area expressed through the work of a Campus Expectations Task Force (http://www.internet2.edu/cetf/). xii “Developing a Coherent Cyberinfrastructure from Local Campus to National Facilities: Challenges and Strategies”, report from a joint workshop of the Educause Campus Cyberinfrastructure Working Group and the Coalition for Academic Scientific Computing hosted by Indiana University in Summer 2008. Report available online at http://net.educause.edu/ir/library/pdf/EPO0906.pdf. xiii For a description of Purdue’s model and a business perspective on its advantages, see “Condo Computing” at 21 | P a g e A Research Cyberinfrastructure Strategy for the CIC: Advice to the Provosts from the Chief Information Officers http://www.nacubo.org/Business_Officer_Magazine/Magazine_Archives/JuneJuly_2009/Condo_Computing.html. xiv This well-known definition comes from Peter Weill and Jeanne Ross, “IT Governance on One Page,” MIT Sloan Working Paper No. 4517-04 (November 2004). Available online at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=664612. 
xv A tool for conducting cyberinfrastructure impact analysis on grant proposals has been developed and tested at Illinois, but not yet implemented at any major research university. Testing the tool on several hundred proposals showed significant opportunities for improving the project or reducing the collateral costs to the institution, or both.
xvi From remarks at the Spring 2010 Internet2 Member Meeting: Jennifer Schopf, "Update on NSF Programmatic Network Funding," 26 April 2010; see also http://news.sciencemag.org/scienceinsider/2010/05/nsf-to-ask-every-grantapplicant.html.