Transformation of Science through Cyberinfrastructure
Manish Parashar, Program Director, Office of Cyberinfrastructure, National Science Foundation, mparasha@nsf.gov
(Based on presentations by E. Seidel and J. Munoz)

Data-Driven Multiscale Collaborations* for Complexity: Great Challenges of the 21st Century
Multiscale collaborations
• General Relativity, Particles, Geosciences, Bio, Social...
• And all combinations...
Science and society being transformed by CI and data
• Completely new methodologies
• “The End of Science” (as we know it)
CI plays a central role
• No community can attack these challenges alone
• Technical, CS, and social issues to solve
• Places requirements on computing, software, networks, tools, etc.
*Small groups still important!

Cyberinfrastructure => Cyber-Ecosystems
21st century science and engineering: new paradigms & practices
• Fundamentally collaborative
• Fundamentally data-driven
Unprecedented opportunities for science/engineering
• Addressing applications in an end-to-end manner!
• Opportunistically combine computations, experiments, observations, and data to manage, control, predict, adapt, optimize, …
• Knowledge-based, information/data-driven, context/content-aware, computationally intensive, pervasive applications
• Crisis management, monitoring and predicting natural phenomena, monitoring and managing engineered systems, optimizing business processes
New paradigms and practices in science and engineering?
• How can it benefit current applications? How can it enable new thinking in science?

Unprecedented Challenges
Information
• Availability, resolution, quality of information
• Trust in data, data models
• Semantics
System
• Very large scales
• Device capability, operation, calibration
• Heterogeneity: capability, connectivity, reliability, guarantees
• Dynamics: dynamic behaviors, ad hoc structures, failure
• Lack of guarantees, common/complete knowledge, …
• Emerging concerns: power, resilience, …
• Emergent rather than by design
Application
• Space-time adaptivity
• Dynamic and complex couplings
• Dynamic and complex (opportunistic) interactions
• Software/systems engineering issues
Disruptive trends (many/multi-cores, accelerators, clouds) must be addressed at multiple levels:
– Algorithms/application formulations: asynchronous/chaotic, failure tolerant, …
– Abstractions/programming systems: adaptive, application/system aware, proactive, …
– Infrastructure/systems: dynamic distributed systems that are self-managing, resilient, decoupled, …

The Challenge: “a right hand turn”
• Over half of the central processing units (CPUs) that Intel shipped in the fourth quarter of 2007 contained two or more cores
• (Figure: single-thread performance trend; die shots of the AMD Phenom, Intel Woodcrest, and IBM Cell Broadband Engine, spanning HPC and PC)
• http://domino.research.ibm.com/comm/research.nsf/pages/r.arch.innovation.html?open
• http://www.amd.com/us-en/assets/content_type/DigitalMedia/43264A_hi_res.jpg
• http://www.intelstartyourengines.com/images/Woodcrest%20Die%20Shot%202.jpg
• “right hand turn” ascribed to P. Otellini, Intel

“New” approaches in hardware
• nVidia Tesla: GPGPU
• IBM Cell B.E. (also used in PlayStation 3)
• SGI Altix 350: FPGA TeraGrid resource

Energy/Power Efficiency is Critical
• Power consumption of HPC systems is reaching the limit of power available to them
• Japan's Earth Simulator, with 5,120 processors, consumes 11.9 MW
• ORNL's Cray XT5 Jaguar supercomputer, with 182,000 processing cores, consumes 7 MW of power
• Next generation > 10 MW
• The cost of running such HPC systems runs into millions of dollars
• According to LLNL, for every 1 W that IBM BlueGene/L consumes, 0.7 W is required to cool it
• Empirical data show that every 10°C rise in temperature results in a doubling of the system failure rate
(A rough worked example of these two rules of thumb follows the Cyberinfrastructure Vision slide below.)

Data Crisis: Information Big Bang
• NSB Report: Long-Lived Digital Data Collections Enabling Research and Education in the 21st Century
• PCAST Digital Data
• Industry: Storage Networking Industry Association (SNIA) 100 Year Archive Requirements Survey Report; 80% of respondents: >50 yrs; 68%: >100 yrs
• NSF Experts Study: “there is a pending crisis in archiving… we have to create long-term methods for preserving information, for making it available for analysis in the future.”
• “Data generation == 4 x Moore's Law” (Wired, Nature)

Data Deluge: WSJ, Aug 28, 2009
• Never have so many people generated so much digital data, or been able to lose so much of it so quickly, experts at the San Diego Supercomputer Center say
• Computer users world-wide generate enough digital data every 15 minutes to fill the U.S. Library of Congress
• More technical data have been collected in the past year alone than in all previous years since science began, says Johns Hopkins astrophysicist Alexander Szalay
• The problem is forcing historians to become scientists, and scientists to become archivists and curators

Software Crisis
Computers are exceedingly complex
• Desktops with hundreds of cores
• Supercomputers with millions of cores
• They last 3-4 years...
Software systems and applications
• Science apps have 10^3 to 10^6+ lines, and have bugs
• Applications may take decades to develop
• We spend at least 10x as much on hardware
• GC communities place requirements on software for complex CI (not just HPC!)
We have a crisis in software
• We don't know how to write it!
• Is our science reproducible? If not... not science!
• Toolkit for complex CI?

Crises We're Facing
Computing
• Technology: computers last ~4 years; multicore, programming models, fault tolerance, new models (clouds, grids), etc.
• We don't know how to use it!
Software
• Complex applications and tools needed; modern apps: 10^6+ lines, take decades
• We don't know how to write it!
Collaboration
• “Computational science serves to advance all of science... inadequate and outmoded structures within the Federal government and the academy today do not effectively support this critical multidisciplinary field”
• We don't know how to organize it!

NSF Vision for Cyberinfrastructure
“National-level, integrated system of hardware, software, data resources & services... to enable new paradigms of science”
http://www.nsf.gov/pubs/2007/nsf0728/index.jsp

The Cyberinfrastructure Vision
“Cyberinfrastructure integrates hardware for computing, data and networks, digitally-enabled sensors, observatories and experimental facilities, and an interoperable suite of software and middleware services and tools…” - NSF's Cyberinfrastructure Vision for 21st Century Discovery
A global phenomenon; several LARGE deployments: Cybera, WestGrid, TeraGrid, Open Science Grid (OSG), EGEE, UK National Grid Service (NGS), DEISA, etc.
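As a minimal back-of-the-envelope sketch (referenced from the Energy/Power Efficiency slide above), the two quoted rules of thumb can be written as formulas. Only the 0.7 W-per-watt cooling ratio and the doubling-per-10°C rule come from the slides; the symbols P_total, P_compute, λ, T_0 and the 7 MW value plugged in (the Jaguar figure, reused purely for scale) are illustrative assumptions.

```latex
% Illustrative restatement of the two figures quoted on the Energy/Power Efficiency slide.
% P_compute: power drawn by the machine itself; P_total: machine plus cooling.
% lambda(T): system failure rate at operating temperature T (T_0 is a reference point).
\begin{align*}
  P_{\mathrm{total}} &\approx (1 + 0.7)\,P_{\mathrm{compute}}
      &&\text{e.g. } 1.7 \times 7\ \mathrm{MW} \approx 11.9\ \mathrm{MW}\\
  \lambda(T) &\approx \lambda(T_0)\cdot 2^{(T - T_0)/10\,^{\circ}\mathrm{C}}
      &&\text{e.g. a } 20\,^{\circ}\mathrm{C}\text{ rise} \Rightarrow 4\times \text{ the failure rate}
\end{align*}
```

Under the same cooling ratio, a next-generation system drawing more than 10 MW would imply on the order of 17 MW at the facility level, which is why power efficiency is framed above as a hard limit.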
New capabilities for computational science and engineering
• Seamless access: resources, services, data, information, expertise, …
• Seamless aggregation
• Seamless (opportunistic) interactions/couplings

Office of Cyberinfrastructure (OCI)
• Development of collaborative computational science
• Research and development of comprehensive CI
• Application of CI to solve complex problems in science and engineering
• Provides stewardship for computational science at NSF, in strong collaboration with other offices, directorates, and agencies
• Supports the preparation and training of current and future generations of researchers and educators to use cyberinfrastructure to further research and education goals

OCI Program Areas
• High Performance Computing; Software; Data/Visualization; Virtual Organizations; Networking/CyberSecurity; Learning and Workforce Development
• Program staff (initial|lastname@nsf.gov): Dr. J. L. Muñoz, Alan Blatecky, Dr. Manish Parashar, Dr. Rob Pennington, Dr. Susan Winter, Dr. Phil Bogden, Jon Stoffel, Dr. Barry Schneider, Dr. Jennifer Schopf, Dr. Abani Patra

OCI organization: Office Director; Deputy Office Director; High Performance Computing; Data; Grand Challenge Communities & Virtual Organizations; Workforce Development; Networking and Campus Bridging; Cross-NSF Activities/Other Activities; Software

NSF Vision for Cyberinfrastructure: High End Computing
“Modeling, simulation, and knowledge from data collections … is increasingly essential to scientific and engineering … sharing among academic institutions to optimize the accessibility and use of HPC assets deployed and supported at the campus level”
• Sustained petascale-capable systems; going beyond HPC!
• HPC software and tools
• Necessary scalable applications
• Toolkit for complex CI? AMR on a million processors?

HPC @ NSF: Evolution
• 1985-97, Supercomputer Centers: five separate centers: Pittsburgh, NCSA, SDSC, Princeton, Cornell
• 1997-2004, PACI: two “leading-edge sites”, NCSA & SDSC, partner with other campus/regional research & computational centers; bundled support for hardware, staff, outreach, sub-awards to partners
• 2000-2004, Extensible Terascale Facility (ETF) construction phase: NCSA, SDSC, Argonne National Laboratory, and the Center for Advanced Computing Research (CACR) at the California Institute of Technology, followed by PSC, IU/Purdue and ORNL
• 2004-2010, TeraGrid Resource Providers and GIG operational phase: 11 resource providers providing HPC, storage, visualization, networking, support, outreach
• 2005-2007, Core HPC Support: SDSC & NCSA

TeraGrid Partnership

TeraGrid
To enable broad use by researchers and educators requires:
• Access to digital resources at unprecedented scales: high-end computing, high-end storage, and high-end data analysis tools
• A user environment that makes it straightforward to use these resources for complex scientific work
• Consulting support and training
• Advanced software: system software, middleware, and application software

High Performance Computing - Track 1
• NSF seeks to deploy/support a world-class HPC machine of unprecedented capability to empower the U.S. academic research community
• The machine is called “Blue Waters” and will be located at NCSA at the University of Illinois at Urbana-Champaign
• Award of $207M, effective October 1, 2007, for 5 years
• Blue Waters will be completed/operational in 2011
• Available for use on “Grand Challenge” projects, with users selected via a competitive process

Blue Waters Petascale System (2011)
Blue Waters general characteristics
• Based on IBM PERCS
• 1 petaflops sustained performance on real applications
Blue Waters system characteristics
• > 200,000 cores using multicore POWER7 processors
• > 32 gigabytes of main memory per SMP
• > 10 petabytes of user disk storage
• > 100 Gbps external connectivity (initial)
• Fortran, Co-Array Fortran, C/C++, UPC, MPI/MPI2, OpenMP, Cactus, Charm++
Blue Waters interim systems at NCSA
• POWER5+ and POWER6 software and application development testbeds
Blue Waters system training and support

High Performance Computing - Track 2
• NSF solicitation to deploy/support HPC machines for a wide range of researchers nationwide
• Two machines completed, with two more planned
• Systems are used in various research simulation & modeling projects
• Machine operating costs/maintenance/user support: $7.5M/yr-$9M/yr

High Performance Computing - Track 2
Kraken (Cray XT5), UTK ($65 million, 2007/2009)
• Peak performance of more than 607 teraflops (now 1 PF)
• 8,256 compute nodes, 66,048 computational cores
• More than 100 terabytes of memory
• 2,300 trillion bytes of disk space
Ranger, TACC ($59 million, 2006/2008)
• Peak performance: 579 TFLOPS
• Over 60,000 processing cores
• 125 TB memory
• 1.7 PB disk

Track 2D Update
Three Track 2 awards were made in 2009; experimental (~2 yrs) -> production cyberinfrastructure
• $20M, Data Intensive, SDSC/UCSD: Flash Gordon project, solid state disk (huge SSD)
• $12M, Experimental HPC, GaTech: Keeneland project, GPGPU computing
• $10.1M, Experimental Grid, Indiana U: FutureGrid project, grid/cloud computing

TeraGrid Phase III – eXtreme Digital (XD)
New infrastructure to deliver next-generation high-end national digital services
Goals:
• Advance science and engineering
• Provide researchers and educators with the capability to work with extremely large amounts of digitally represented information
• Make it easy to move between local and national resources
• Anticipate researchers working with a much larger range of digital artifacts, including digital text, digitized physical samples, real-time data streams, …

PetaApps
• Develop the future simulation, optimization and analysis tools that use emerging petascale computing
• Will advance frontiers of research in science and engineering with a high likelihood of enabling transformative research
• Areas examined include: climate change, earthquake dynamics, storm surge models, supernovae simulations
• http://nsf.gov/pubs/2008/nsf08592/nsf08592.pdf

Data, Data Analysis, and Visualization
“Any cogent plan must address the phenomenal growth of data in all dimensions”
Goals are to:
• Catalyze the development of a system of science and engineering data collections that is open, extensible, and evolvable
• Support development of a new generation of tools and services for data discovery, integration, visualization, analysis and preservation
“The resulting national digital data framework will be an integral component in national CI”

$100M DataNet Program (5 Years) (Sustainable Digital Data Preservation & Access Network Partners)
Goals:
• Catalyze development of multi-disciplinary science & engineering data collections: open, extensible & evolvable, sustainable over 50+ years
• Support development of a new generation of tools & services facilitating data acquisition, mining, integration, analysis, visualization
• (Diagram: a user-centric, multi-sector network of partners: university, state, college, federal, non-profit, commercial, local, international; sustainable, extensible, evolvable, nimble, reliable)
Status: UNM, JHU awards; Round 2 being competed

First Two DataNet Awards
Data Conservancy: Johns Hopkins University
• Initial focus on observational data about astronomy, turbulence, biodiversity and environmental science
• Especially suited to terabyte-scale data sets, but with a strong focus on “the long tail of small science”
DataNetOne: University of New Mexico
• Designed to enable long-term access to and use of preserved earth observation data
• Example: spread of diseases, the impact of human behavior on the oceans, relationships among human population density and greenhouse gas production
[Up to] 3 new awards in 2010

Community-based Data Interoperability Networks (INTEROP)
Re-purposing data
• Using it in innovative ways & combinations not envisioned by its creators
• Requires finding & understanding data of many types & from many sources: community building!
Interoperability
• Ability of two or more systems or components to exchange and use information
Status
• 7 awards made in the first competition
• 2008 was cancelled
• 2009 proposals were due July 23, 2009
http://www.nsf.gov/pubs/2007/nsf07565/nsf07565.htm

Software …
• Software is the modality for computational science/thinking in the 21st century! Clearly more has to be done here…
• System software as first-class entities
• Crosscutting issues will be critical: sustainability, repeatability, manageability, energy efficiency, ...
Strategic Technologies for Cyberinfrastructure (STCI)
• Support work leading to the development and/or demonstration of innovative cyberinfrastructure services
Software Development for Cyberinfrastructure (SDCI)
• HPC, Data, Networking and Middleware target areas
• Crosscutting issues: sustainability, manageability, energy efficiency
Cyberinfrastructure Reuse
• A venture fund set up by OCI to promote reuse of CI elements including software, data collections, and other computer/data/networking-based entities

Virtual Organizations for Distributed Communities
“A VO functions as a coherent unit... through the use of end-to-end CI systems, providing shared access to resources and services, often in real time
Technological framework... experimental facilities, instruments and sensors, applications, tools, middleware, remote access
Operational framework from campus level to international scale...”
Specific interpretation: next-generation Grand Challenge communities for science, engineering, humanities...

Grand Challenge Communities: The Next Level Up
• Complex problems require many disciplines, all scales of collaboration, advanced CI
• Individuals, groups, teams, communities
Multiscale collaborations: beyond teams
• The old GC team notion is extended by VOs
• Grand Challenge Communities assemble dynamically
• Emergency forecasting: flu, hurricane, tornado...
• Gamma-ray bursts, supernovae, the human brain, metagenomics
• Place requirements on CI: software, networks, collaborative environments, data, sharing, computing, etc.
• Scientific culture, Open Access, university structures

Virtual Organizations as Socio-technical Systems (VOSS)
• What constitutes effective virtual organizations?
• How do they enhance research and education production and innovation?
• Supports scientific research directed at advancing the understanding of what constitutes effective virtual organizations
Multi-disciplinary
• Anthropology, complexity sciences, CS, decision and management sciences, economics, engineering, organization theory, organizational behavior, social and industrial psychology, public administration, sociology
Broad variety of qualitative and quantitative methods
• Ethnographies, surveys, simulation studies, experiments, comparative case studies, network analyses
Grounded in theory, rooted in empirical methods
http://www.nsf.gov/pubs/2009/nsf09540/nsf09540.htm

Open Science Grid as a Model “Campus Bridge”
• NSF is very interested in creating “bridges” from campuses to national CI
• OSG is a national CI, locally deployed... a good model
We are very interested in...
• Exploring ways to integrate campuses better with national centers and instruments
• TeraGrid-OSG cooperation, driven by applications!
• Understanding example science communities that can benefit from, and drive, this: GC Communities will require it
• Related international cooperation: EGEE/EGI, etc.

Learning and Workforce Development
“NSF will:
• Identify and address the barriers to utilization of cyberinfrastructure tools, services, and resources
• Promote the training of faculty, educators, students, researchers
• Encourage programs that will explore and exploit cyberinfrastructure, including taking advantage of the international connectivity it provides...”

Cyberinfrastructure Training, Education, Advancement, and Mentoring for Our 21st Century Workforce (CI-TEAM)
• Preparation of an S&E workforce with the skills to create, advance, and take advantage of CI over the long term
• Prepare current and future generations of scientists, engineers, and educators to use, support, deploy, develop, and design CI
• Helps build cross-institutional networks of faculty and students through the use of collaboratories, with the express purpose of collectively addressing a common research question
• Implements data sensors across distributed networks of researchers to collect and analyze knowledge production and scientific argumentation under different conditions

What about Campuses?
• Collaborative environments will need unprecedented levels of sophistication for compute, data and collaboration
• Can barely do low-end video conferencing today! HD, OptIPortal-level environments needed
• DNA sequencers generate terabytes of data
• Multidisciplinary computational science is supported at very few places
• We need to seriously rethink our campus environments and how they can support new data-driven modalities of research, collaboration, and education
(Figure: iHDTV, 1500 Mbits/sec, Calit2 to UW Research Channel over NLR/CENIC/PW)

Finding a Foundation for CS&E: The Third Pillar Needs a Place to Stand
OCI can help create this foundation in NSF
• NSF supports all disciplines, all universities
• OCI can be a neutral catalyst across NSF for computational science
Create a new CI/CSE research agenda
• Software, networks, compute systems (not just HPC!)
• Support those who prototype and use CI for next-generation science?
• These people are central to our future, peripheral to their home departments!
Education and workforce development
• CAREER awards, curriculum development, grad students, postdocs, etc.
Universities need to address this! Curriculum, best practices, rewards...
Need help to organize the community!
Critical Lessons
• A comprehensive approach to CI is needed to address the complex problem solving of the 21st century
• All elements have to be addressed, not just a few, or else we cannot even start to address the real problem
• The CI itself is extraordinarily complex
• Must educate the next generation: collaborative & CI-savvy, for science and society
• New organizational structures are needed
• Find a home for computational science
• We can use CI to begin to address these problems

Next Priorities
• Integrating all activities into a much more comprehensive cyberinfrastructure… a cyberinfrastructure framework
• New programs in software: life-cycle, all layers
• Creating deeper partnerships with DoE and other agencies; international partnerships
• Cyber-learning
• Creating a computational science research agenda that crosses NSF and reaches out to other agencies and other countries

Task Forces
• Campus Bridging; Data & Viz; Software; HPC (Clouds, Grids); Education/Workforce; Grand Challenge VOs
• Led by the NSF Advisory Committee on Cyberinfrastructure
• Timelines: 12-18 months; workshop(s); recommendations; we then go back and develop programs

Summary
• Excellent vision already in place: Atkins report, creation of OCI, initial steps all good
• The challenges are many; comprehensive, balanced, integrated, national high performance cyberinfrastructure is needed!
• Many parts are underdeveloped; all are needed for complex problem solving
• OCI is about supporting the people and apps that drive this
• Clearly, computational science plays an integral role in the scientific method… along with experimentation/observation and theory
• Computational science needs a home, in agencies and academia! Need help!
OPPORTUNITIES
• The rewards will be limited only by our IMAGINATION
• Changes coming are disruptive!
• OCI wants to partner with you to prepare