The Future of Scientific Computing
Microsoft Faculty Summit, 2005
Dr. Francine Berman
Director, San Diego Supercomputer Center
Professor and HPC Endowed Chair, UC San Diego

SAN DIEGO SUPERCOMPUTER CENTER Fran Berman UNIVERSITY OF CALIFORNIA, SAN DIEGO

Dan's Questions
1. What are the key technical issues your centers currently face?
2. What research problems do you encounter in supporting scientists?
3. What will be the top challenges to Computational Sciences in 5-10 years?

Scientific Computing Enabling Technologies: Today's version of the Branscomb Pyramid
[Figure: two pyramids; the tier labels below are as recoverable from the slide, top to bottom.]
Branscomb Pyramid, circa 1993:
• TF class
• Center supercomputers
• Mid-range parallel processors and networked workstations
• High performance workstations
Today's version:
• 100+ TF class
• Center supercomputers
• Mid- to large-scale clusters and parallel machines
• Personal devices, home and desktop computers, high performance workstations
Key function of the NSF Supercomputer Centers: Provide facilities over and above what can be found in the typical campus/lab environment

There's More to the Story …
[Figure: chart with vertical axis "Data (more BYTES)" and horizontal axis "Compute (more FLOPS)". Regions: Data-oriented Science and Engineering Environment; Home, Lab, Campus, Desktop; Traditional HPC environment.]

Today's scientific applications span the spectrum of usage and requirements
[Figure: the same chart, vertical axis "Data (more BYTES)" and horizontal axis "Compute (more FLOPS)", with applications placed on it.]
Regions: Data Mgt. Envt.; Extreme I/O Environment; Data-oriented Science and Engineering Environment; Home, Lab, Campus, Desktop; Traditional HPC environment
Applications shown: EOL, Climate, SCEC Simulation, SCEC Visualization, ENZO simulation, ENZO Visualization, NVO, Turbulence field, Turbulence Reattachment length, GridSAT, CiPres, Seti@Home, CFD, MCell, Protein Folding/MD, CPMD, QCD, GAMESS, EverQuest
Legend:
• Lends itself to Grid
• Could be targeted efficiently on Grid
• Difficult to target efficiently on Grid

SDSC: Using Data as a Driver
SDSC in a nutshell
• National Cyberinfrastructure Center
• UCSD Organized Research Unit
• Staff includes 400 multidisciplinary IT professionals, applied researchers, technologists, and students
Data-oriented HPC and storage: DataStar (IBM Power 4), IntimiData (Blue Gene Data)
Data-oriented scientific applications: ENZO (astrophysics), TeraShake (geosciences)
Sensor Data: HPWREN, NVO
Community Databases and Data Collections: Protein Data Bank (life sciences)
Data-oriented Software and Services
Projects include NEES IT Center (earthquake engineering), CAIDA (internet data analysis), GEON (geosciences), TeraGrid (Grid Computing) ++

Today's Challenges for SDSC

Integration and Coordination
• Today's "computer" is an integrated and coordinated set of hardware, software, and services.
[Figure: only the label "Internet" survives.]

Integration and Coordination Challenges: Sensors to Supercomputers
• Computational scientists and engineers need to integrate resources at all scales to support "end-to-end" applications
SDSC's Notebook Project: Enables scientists to integrate data management, collaboration, and computation environments in a digital laboratory notebook
Graph courtesy of Henri Casanova

Incorporating the "ilities"
Scalability, Predictability
• Software engineering is fundamental for modern scientific codes: managing distribution, accommodating multiple users and web interfaces, and integrating with complex SW environments are key
• Predictable performance models are key to execution optimization for scientific codes
Usability, Interoperability, Flexibility
Graphs courtesy of Jenny Schopf

Incorporating the "ilities": Reliability and Sustainability
• Computational scientists and engineers increasingly rely on persistent and valuable community data collections
• Extreme data curation for 100 years or more involves long-term planning and a strategic approach to support
• Two approaches used by the preservation community:
  1. Make lots of copies
  2. Make copies in heterogeneous SW environments

Data Reliability and Sustainability: What can go wrong
Entity at risk | What can go wrong | Frequency
File | Corrupted media, disk failure | 1 year
Tape | + Simultaneous failure of 2 copies | 5 years
System | + Systemic errors in vendor SW, or malicious user, or operator error that deletes multiple copies | 15 years
Archive | + Natural disaster, obsolescence of standards | 50-100 years

Tomorrow's Challenges for SDSC

Better Application Performance through Adaptivity
Everyware [Wolski et al., 1999]: a highly adaptive (non-embarrassingly parallel) Grid application that investigated solutions to the Ramsey Number Problem:
What is the smallest complete undirected graph, of size R(m,n), such that every two-coloring ("red" and "green") of its edges contains a red clique of size m or a green clique of size n?
[Figure: "Program Performance by Infrastructure Type, 5 Minute Averages". Vertical axis: Integer Ops. Per Second, log scale from 1.00E+03 to 1.00E+10. Series: NT, Unix, Legion, Condor, Globus, Java, Netsolve.]
Graph courtesy of Rich Wolski

The Dynamics of Sharing
• As scientific applications require more and more components, with more and more sophisticated interactions, models for effective group dynamics, policy, distribution of control, and other "social" issues will become critical for success.
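
Note on the Ramsey Number Problem from the adaptivity slide: a minimal Python sketch, not the Everyware code, that brute-forces the smallest nontrivial case and confirms the well-known value R(3,3) = 6. The function names (`has_mono_clique`, `every_coloring_forced`, `ramsey_brute_force`) and the `max_n` cutoff are assumptions made for illustration only.

```python
from itertools import combinations, product


def has_mono_clique(n, coloring, m, color):
    """Return True if some m-vertex clique of K_n has all edges of `color`."""
    for clique in combinations(range(n), m):
        if all(coloring[edge] == color for edge in combinations(clique, 2)):
            return True
    return False


def every_coloring_forced(n, m_red, m_green):
    """Check whether every red/green edge coloring of K_n contains a red
    clique of size m_red or a green clique of size m_green."""
    edges = list(combinations(range(n), 2))
    for colors in product(("red", "green"), repeat=len(edges)):
        coloring = dict(zip(edges, colors))
        if not (has_mono_clique(n, coloring, m_red, "red")
                or has_mono_clique(n, coloring, m_green, "green")):
            return False  # this coloring avoids both cliques
    return True


def ramsey_brute_force(m_red, m_green, max_n=6):
    """Smallest n <= max_n for which every coloring is forced, else None.
    max_n is a hypothetical safety cutoff for this illustration."""
    for n in range(1, max_n + 1):
        if every_coloring_forced(n, m_red, m_green):
            return n
    return None


if __name__ == "__main__":
    print(ramsey_brute_force(3, 3))  # prints 6
```

The exhaustive search visits 2^(n(n-1)/2) colorings of K_n (32,768 for n = 6), so this approach is only practical for tiny cases; that growth is what makes larger Ramsey numbers a demanding search problem.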
Group Dynamics: Innovation from the Commercial Sector
• IM: group communication
• RPG: group entertainment
• eBay: the group dynamics of shopping

The Next Generation of Scientists, Consumers, and Leaders is Tech-savvy
• Assume
  • That everything is available online and everything is free (the Web)
  • That none of the resources have to be where you are
  • That you can communicate with anyone anytime (email, IM)
  • That you can adapt to things in real time (RPG)
  • Some rudimentary level of competence with "business models" (Sim environments)
• Expect traditional ways of doing things to change …
That's the baseline …

Thank You
www.sdsc.edu