Large-scale Data Management Challenges of the Southern California Earthquake Center (SCEC)
Philip J. Maechling (maechlin@usc.edu)
Information Technology Architect, Southern California Earthquake Center
Research Data Access and Preservation Summit, Phoenix, Arizona, 9 April 2010

Consider the Digital Data Life Cycle
• Digital data life cycle model from the Interagency Working Group on Digital Data (2009)
• Can we validate this life cycle model against digital data life cycle observations?
• Digital data life cycle origination – Jan 2009
• Digital data life cycle completion – Jan 2010

Notable Earthquakes in 2010

The SCEC Partnership
• National partners, international partners, core institutions, and participating institutions

SCEC Member Institutions (November 1, 2009)
Core Institutions (16): California Institute of Technology; Columbia University; Harvard University; Massachusetts Institute of Technology; San Diego State University; Stanford University; U.S. Geological Survey, Golden; U.S. Geological Survey, Menlo Park; U.S. Geological Survey, Pasadena; University of California, Los Angeles; University of California, Riverside; University of California, San Diego; University of California, Santa Barbara; University of California, Santa Cruz; University of Nevada, Reno; University of Southern California (lead)
Participating Institutions (53): Appalachian State University; Arizona State University; Berkeley Geochron Center; Boston University; Brown University; Cal-Poly, Pomona; Cal-State, Long Beach; Cal-State, Fullerton; Cal-State, Northridge; Cal-State, San Bernardino; California Geological Survey; Carnegie Mellon University; Case Western Reserve University; CICESE (Mexico); Cornell University; Disaster Prevention Research Institute, Kyoto University (Japan); ETH (Switzerland); Georgia Tech; Institute of Earth Sciences of Academia Sinica (Taiwan); Earthquake Research Institute, University of Tokyo (Japan); Indiana University; Institute of Geological and Nuclear Sciences (New Zealand); Jet Propulsion Laboratory; Los Alamos National Laboratory; Lawrence Livermore National Laboratory; National Taiwan University (Taiwan); National Central University (Taiwan); Ohio State University; Oregon State University; Pennsylvania State University; Princeton University; Purdue University; Texas A&M University; University of Arizona; UC, Berkeley; UC, Davis; UC, Irvine; University of British Columbia (Canada); University of Cincinnati; University of Colorado; University of Massachusetts; University of Miami; University of Missouri-Columbia; University of Oklahoma; University of Oregon; University of Texas-El Paso; University of Utah; University of Western Ontario (Canada); University of Wisconsin; University of Wyoming; URS Corporation; Utah State University; Woods Hole Oceanographic Institution

Southern California Earthquake Center
• Involves more than 600 experts at over 60 institutions worldwide
• Focuses on earthquake system science using Southern California as a natural laboratory
• Translates basic research into practical products for earthquake risk reduction, contributing to NEHRP

SCEC Earthquake System Models & Focus Groups
[Diagram: the SCEC earthquake system model, linking Lithospheric Architecture & Dynamics, Tectonic Evolution & Boundary Conditions, Crustal Deformation Modeling, Fault & Rupture Mechanics, Earthquake Forecasting & Prediction, Ground Motion Prediction, and Seismic Hazard & Risk Analysis through shared products: Fault Models, Deformation Models, Block Models, a Unified Structural Representation, Anelastic Structures, Earthquake Rupture Models, Earthquake Rupture Forecasts, Ground Motion Simulations, Attenuation Relationships, Seismic Hazard Products, and Risk Mitigation Products.]
SCEC Leadership
• Focus groups, leadership teams, Board of Directors, Planning Committee, and staff

Earthquakes are system-level phenomena…
• They emerge from complex, long-term interactions within active fault systems that are opaque – thus they are difficult to observe
• They cascade as chaotic chain reactions through the natural and built environments – thus they are difficult to predict
[Timeline diagram: the earthquake cascade – tectonic loading, stress accumulation, slow slip transients, foreshocks, and nucleation (anticipation times from centuries down to days); fault rupture, seismic shaking, surface faulting, seafloor deformation, and dynamic triggering at the origin time; then stress transfer, aftershocks, landslides, liquefaction, tsunami, fires, structural and nonstructural damage to the built environment, human casualties, disease, and socioeconomic aftereffects (response times from minutes to decades).]

[Diagram: research pathway from individual projects to prospective forecast testing]
• Individual research project – development of new computational, data, and physical models
• Collaborative research project – computational codes, structural models, and simulation results versioned with associated tests
• Engineering and interdisciplinary research – automated retrospective testing of forecast models using community-defined validation problems
• Collaborative forecast testing center – automated prospective performance evaluation of forecast models over time

CME Platform and Data Administration System (CME Platform and Data Management TAG)
• CME (Community Modeling Environment) cyberinfrastructure supports a broad range of research computing with computational and data resources
• Programmable interfaces connect the platform to real-time earthquake monitoring, HPC resource providers, public and governmental forecasts, seismic data centers, and external seismic/tsunami models
• Contribution and annotation of digital artifacts; discovery of and access to digital artifacts

Future of Solid Earth Computational Science

Echo Cliffs PBR
• The Echo Cliffs precariously balanced rock (PBR) in the Santa Monica Mountains is >14 m high and has a 3-4 s free period
• This rock withstood ground motions estimated at 0.2 g and 12 cm/s during the Northridge earthquake
• Such fragile geologic features give important constraints on probabilistic seismic hazard analysis (PSHA)

Simulate Observed Earthquakes
• Then validate the simulation model by comparing simulation results against observational data recorded by seismic sensors (red – simulation results, black – observed data)

Simulate Potential Future Earthquakes

SCEC Roadmap to Petascale Earthquake Computing (2004–2012)
• TeraShake 1.x – First large wave propagation simulations of Mw 7.7 earthquakes on the southern San Andreas fault, with a maximum frequency of 0.5 Hz, run using kinematic source descriptions based on the Denali earthquake. 240 SDSC DataStar cores used; 53 TB of outputs, the largest simulation outputs recorded; the most-read article of the year.
• TeraShake 2.x – Simulations of Mw 7.7 earthquakes in 2005–2006 using source descriptions generated by dynamic rupture simulations. The dynamic rupture simulations were based on Landers initial stress conditions and used 1,024 NCSA TeraGrid cores.
• ShakeOut 1.x – Simulations of Mw 7.8 earthquakes with a maximum frequency of 1.0 Hz, run using kinematic source descriptions based on geological observations. 1,920 TACC Lonestar cores used.
• ShakeOut 2.x – Simulations of Mw 7.8 earthquakes up to 1.0 Hz using source descriptions generated by SGSN dynamic rupture simulations, constructed to produce final surface slip equivalent to the ShakeOut 1.x kinematic sources. 32K TACC Ranger cores used.
• Chino Hills 1.x – Comparison of simulated and recorded ground motions for the 2008 Mw 5.4 Chino Hills earthquake; two simulations were conducted using meshes extracted from the CMU eTree database for CVM4 and CVM-H. 64K NICS Kraken cores used.
• Wall-to-Wall (W2W) – Simulations of an Mw 8.0 scenario on the San Andreas fault from the Salton Sea to Parkfield, up to 1.0 Hz. The source description was generated by combining several Mw 7.8 dynamic source descriptions ('ShakeOut-D'). 96K NICS Kraken cores used.
• M8 1.x / 2.x – 40-m spacing and 435 billion mesh points; M8 2.x to run on 230K NCCS Jaguar cores, the world's most powerful machine.
• M8 3.1 / 3.2 – Dynamic rupture simulation, dx = 5 m (50 x 25 x 25 km): improve earthquake source descriptions by integrating more realistic friction laws into dynamic rupture simulations and computing at scales spanning the inner scale of friction processes and the outer scale of large faults.
• "Big 10" (planned) – Wave propagation simulation with improved source descriptions: dx = 25 m, Mw 8.0, 2 Hz, 2,048 billion mesh points, 256x bigger than current runs. Also planned: simulation of an M9.0 megaquake in the Pacific Northwest. A new model is under development to deal with complex geometry, topography, and non-planar fault surfaces.
• Also on the roadmap: ShakeOut verification with 3 models; 96% parallel efficiency on 40K IBM T.J. Watson BG/L (BGW) cores; TeraGrid Viz Award; SciDAC OASCR Award; ALCF BG/P INCITE allocations; 15 million SUs awarded, the largest NSF TeraGrid allocation.
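The problem sizes above follow directly from the simulation volume and grid spacing. The sketch below reproduces the M8 mesh-point count as a back-of-the-envelope check; the 810 x 405 x 85 km domain dimensions and the bytes-per-point figure are assumptions for illustration (only the 40 m spacing and ~435 billion mesh points appear in this deck).

```python
# Back-of-the-envelope sizing for a uniform-grid wave propagation run.
# The 810 x 405 x 85 km box and the bytes-per-point layout are assumed
# values for illustration; only the 40 m spacing and ~435 billion mesh
# points come from the roadmap above.

def mesh_points(x_km, y_km, z_km, dx_m):
    """Number of grid points for an x_km * y_km * z_km box at spacing dx_m."""
    per_km = 1000.0 / dx_m
    return int(x_km * per_km) * int(y_km * per_km) * int(z_km * per_km)

def size_tb(n_points, bytes_per_point):
    """Rough storage for one volumetric field, in terabytes."""
    return n_points * bytes_per_point / 1e12

if __name__ == "__main__":
    n = mesh_points(810, 405, 85, dx_m=40)        # M8-like box (dims assumed)
    print(f"mesh points: {n:.3e}")                # ~4.36e11, i.e. ~435 billion
    # e.g. three 4-byte material properties (vp, vs, rho) per point -- assumed layout
    print(f"velocity model at 12 B/point: {size_tb(n, 12):.1f} TB")
```

The same arithmetic explains why halving the grid spacing (needed to double the maximum frequency) multiplies the mesh size by roughly eight, which is the pressure driving the roadmap toward petascale machines.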
Panel Questions
• What technical solutions exist that meet your academic project requirements?
• What requirements are unique to the academic environment?
• Are there common approaches for managing large-scale collections?

Simulation Results Versus Data
• The context of this workshop is research data management.
  – I would like to communicate characteristics of the data management needed to perform seismic hazard computational research.
• I will refer to our simulation results as "data".
  – Some groups distinguish observational data from simulation results.
  – This distinction becomes more difficult as observations and simulation results are combined.
• For today's presentation, I will focus on management of SCEC simulation results, which may include both observational data and simulation results.

SCEC Storage Volume by Type
[Chart: estimated SCEC data archives – total current archives ~1.4 PB]

SCEC Storage Elements (Files, Rows) by Type
[Chart: estimated SCEC data archives – total current archives ~100 M files, 600 M rows]

Consider the Digital Data Life Cycle
[Chart: estimated SCEC simulation archives in terabytes by storage location]

[Slide: goals, sources and receivers, simulation parameters, and costs for a planned simulation series]
Goal:
• 1 Hz body waves
• Up to 0.5 Hz surface waves
Sources & receivers:
• 150 three-component stations [Nr]
• 200 earthquakes [Ns]
Simulation parameters:
• 200 m spacing, 1,872 M mesh points
• 2-minute time series, 12,000 time steps
Costs:
• 2 TB per SWF
• 6 TB per RGT
• 2 hours per run
• 10.4 M CPU-hours (650 runs, 3.6 months on 4,000 cores)
• 400–600 TB
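As a rough cross-check of the budget above, the short sketch below recomputes the CPU-hour total from the stated wall-clock duration and core count, and relates the 400–600 TB estimate to the ~1.4 PB current archive; the 30-day month and the ~500 TB midpoint are the only assumptions.

```python
# Quick consistency check of the compute and storage budget quoted above.
# Only the slide's figures (3.6 months, 4,000 cores, 650 runs, 400-600 TB,
# ~1.4 PB current archive) are used; a 30-day month is assumed.

HOURS_PER_MONTH = 30 * 24   # assumed calendar month

def cpu_hours(wall_months, cores):
    """Total core-hours for a campaign keeping `cores` busy for `wall_months`."""
    return wall_months * HOURS_PER_MONTH * cores

total = cpu_hours(3.6, 4000)
print(f"total CPU-hours: {total / 1e6:.1f} M")            # ~10.4 M, matching the slide
print(f"average retained output per run: {500.0 / 650:.2f} TB "
      f"(assuming ~500 TB total over 650 runs)")
print(f"share added to the current ~1.4 PB archive: {400/1400:.0%} to {600/1400:.0%}")
```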
Data Management Context for SCEC
• Academic research groups responding to NSF proposals
  – Aggressive, large-scale, and collaborative, with a need for transformative, innovative, original research (bigger, larger, faster)
• Data management tools and processes managed by heavily burdened academic staff

Data Management Context for SCEC
• Academic research is very cost-sensitive about new technologies
• HPC capabilities are largely based on integrating existing cyberinfrastructure (CI), not on new CI development
• Largely based on the use of other people's computers and storage systems (resulting in widely distributed archives)

Panel Questions
• What technical solutions exist that meet your academic project requirements?
• What requirements are unique to the academic environment?
• Are there common approaches for managing large-scale collections?

SCEC Milestone Capability Runs
Run | Machine | Outer scale (km) | Inner scale (m) | Max freq (Hz) | Min surface vel (m/s) | Mesh points | Time steps | Vel. model input (TB) | Storage w/o ckpt (TB) | Cores used | Wall-clock time (hrs) | Sustained TFlop/s
TS1 | SDSC DataStar | 600 | 200 | 0.5 | 500 | 1.8E+09 | 22,768 | 0.05 | 53.0 | 240 | 66.8 | 0.04
TS2 | SDSC DataStar | 600 | 200 | 0.5 | 500 | 1.8E+09 | 22,768 | 0.05 | 10.0 | 1,920 | 6.7 | 0.43
DS2 | NCSA IA-64 | 299 | 100 | 1.0 | 500 | 9.6E+08 | 13,637 | 0.03 | 9.5 | 1,024 | 35.2 | 0.68
SO1 | TACC Lonestar | 600 | 100 | 1.0 | 500 | 1.4E+10 | 45,456 | 0.42 | 0.5 | 1,920 | 32.0 | 1.44
SO2 | TACC Ranger | 600 | 100 | 1.0 | 500 | 1.4E+10 | 50,000 | 0.42 | 0.5 | 32,000 | 6.9 | 7.29
CH50m | NICS Kraken | 180 | 50 | 2.0 | 500 | 1.1E+10 | 80,000 | 0.31 | 1.9 | 64,000 | 2.3 | 26.86
W2W-1 | NICS Kraken | 800 | 100 | 1.0 | 500 | 3.1E+10 | 60,346 | 0.89 | 0.3 | 96,000 | 2.5 | 50.00
CH15m* | NICS Kraken | 183 | 15 | 3.3 | 250 | 3.0E+11 | 100,000 | 6.87 | 66.4 | 96,000 | 24 | 87.00
M8 | NCCS Jaguar | 810 | 40 | 1.0 | 200 | 4.4E+11 | 120,000 | 12.68 | 39.9 | 223,080 | 21.2 | 174.00
W2W-3** | NCSA Blue Waters | 800 | 25 | 2.0 | 250 | 2.0E+12 | 320,000 | 59.60 | 400.0 | 320K** | 45** | 1,000**
* benchmarked, ** estimated

Data Transfer, Archive and Management
• Input/output data transfer between SDSC disk/HPSS and Ranger disk at transfer rates up to 450 MB/s using Globus GridFTP
• 90k–120k files per simulation; 150 TB generated on Ranger, organized as a separate sub-collection in iRODS
• Direct data transfer using iRODS from Ranger to SDSC SAM-QFS at up to 177 MB/s using our data ingestion tool PIPUT
• Sub-collections published through the SCEC digital library (168 TB in size), integrated through the SCEC portal into seismic-oriented interaction environments (Zhou et al., CSO'10)

CyberShake Data Management Numbers
• CyberShake
  – 8.5 TB staged in (~700k files) to TACC's Ranger
  – 2.1 TB staged out (~36k files) to SCEC storage
  – 190 million jobs executed on the grid
  – 750,000 files stored in RLS
[Figure: CyberShake hazard map]

CyberShake Production Run – 2009
• Run from 4/16/09 to 6/10/09
• 223 sites – a hazard curve produced every 5.4 hrs
• 1,207 hrs (92% uptime)
  – 4,420 cores on average; 14,540 peak (23% of Ranger)
• 192 million tasks – 44 tasks/sec – 3.8 million Condor jobs
• 192 million files – 11 TB output, 165 TB temporary

Challenge: Millions of Tasks
• Automation is key
  – Workflows with clustering: include all executions, staging, and notification
  – Job submission
• Data management
  – Millions of data files
  – Pegasus provides staging
  – Automated checks: correct number of files; NaN and zero-value checks; MD5 checksums (see the sketch later in this deck)

What is a DAG Workflow?
• Jobs with dependencies organized in Directed Acyclic Graphs (DAGs)
• A large number of similar DAGs make up a workflow
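To make the DAG idea concrete, here is a minimal sketch, in Python, of a few dependent jobs emitted in the JOB / PARENT ... CHILD syntax used by Condor DAGMan. The job names and submit-file names are hypothetical; a real CyberShake workflow is generated by Pegasus rather than written by hand like this.

```python
# Minimal sketch of a DAG of dependent jobs, written out in Condor DAGMan's
# JOB / PARENT ... CHILD syntax. Job names and *.sub files are hypothetical
# placeholders, not SCEC's actual workflow definitions.

from collections import OrderedDict

# job name -> (submit file, list of parent jobs)
jobs = OrderedDict([
    ("extract_mesh", ("extract_mesh.sub", [])),
    ("sgt_gen_x",    ("sgt_gen.sub",      ["extract_mesh"])),
    ("sgt_gen_y",    ("sgt_gen.sub",      ["extract_mesh"])),
    ("synth_seis",   ("synth_seis.sub",   ["sgt_gen_x", "sgt_gen_y"])),
    ("hazard_curve", ("hazard_curve.sub", ["synth_seis"])),
])

def write_dag(jobs, path="site_workflow.dag"):
    """Emit one DAGMan file describing the jobs and their dependencies."""
    with open(path, "w") as f:
        for name, (subfile, _) in jobs.items():
            f.write(f"JOB {name} {subfile}\n")
        for name, (_, parents) in jobs.items():
            for p in parents:
                f.write(f"PARENT {p} CHILD {name}\n")

write_dag(jobs)
print(open("site_workflow.dag").read())
```

Many such per-site DAGs, clustered and planned onto grid resources by Pegasus, make up a full production workflow such as the 2009 CyberShake run described above.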
GriPhyN Virtual Data System (GlobusWORLD 2003)
• Virtual data language
  – Users define desired transformations
  – Logical names for data and transformations
• Virtual data catalog
  – Stores information about transformations, derivations, and logical inputs/outputs
• Query tool
  – Retrieves the necessary transformations given a description of them
  – Gives an abstract workflow
• Pegasus
  – Tool for executing abstract workflows on the grid
• Virtual Data Toolkit (VDT): part of the GriPhyN and iVDGL projects
  – Includes existing technology (Globus, Condor) and experimental software (Chimera, Pegasus)

The Globus View of Data Architecture (GlobusWORLD 2003)
[Diagram: virtual data applications use the Chimera Virtual Data Language through a VDL API/CLI (manipulating derivations and transformations); an XML virtual data catalog implements the Chimera virtual data schema; task graphs (compute and data-movement tasks, with dependencies) execute on data grid resources (distributed execution and data management) via the GriPhyN VDT, Replica Catalog, DAGMan, the Globus Toolkit, etc.]

Functional View of Grid Data Management (GlobusWORLD 2003)
[Diagram: an application consults a metadata service (location based on data attributes), a replica location service (location of one or more physical replicas), and information services (state of grid resources, performance measurements and predictions); a planner handles data location, replica selection, and selection of compute and storage nodes, subject to security and policy; an executor initiates data transfers and computations across data movement, data access, compute resources, and storage resources.]

Panel Questions
• What technical solutions exist that meet your academic project requirements?
• What requirements are unique to the academic environment?
• Are there common approaches for managing large-scale collections?

Treat Simulation Data as a Depreciating Asset
• Simulation results differ from observational data:
  – They tend to be larger
  – They can often be recomputed
  – They often decrease in value with time
  – They have less well-defined metadata

Collaborate with Existing Data Centers
• Avoid re-inventing data management centers
  – (Re)train observational data centers to manage simulation data
  – Change the culture so that deleting data is acceptable

Simulation Data as a Depreciating Asset
• Manage simulation results as a depreciating asset (see the retention sketch below):
  – Unique persistent IDs for all data sets
  – Track the cost to produce, and the cost to re-generate, every data set

Simulation Data as a Depreciating Asset
• Responsibilities of researchers who want a lot of storage:
  – Default storage lifetime is always limited
  – Longer-term storage is based on community use, community value, and readiness for use by the community
  – The burden on researchers for long-term storage is more time spent adding metadata

Remove the Compute/Data Distinction
• Compute models should always have associated verification and validation results, and data sets should always have codes demonstrating access and usage.
• Apply automated acceptance tests for all codes, and automated access/retrieval checks for all data sets.
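As an illustration of what such automated acceptance checks might look like, combining the file-count, NaN/zero-value, and MD5 checks listed under "Challenge: Millions of Tasks" above, here is a minimal Python sketch; the directory layout, expected file count, and flat float32 seismogram format are hypothetical, not SCEC's actual conventions.

```python
# Minimal sketch of automated acceptance checks for a batch of simulation
# outputs: expected file count, NaN / all-zero detection, and MD5 checksums.
# The directory layout and binary format (flat float32 seismograms) are
# hypothetical placeholders.

import glob
import hashlib
import numpy as np

def check_outputs(output_dir, expected_count, manifest=None):
    """Return a list of problems found in *.seis files under output_dir."""
    problems = []
    paths = sorted(glob.glob(f"{output_dir}/*.seis"))

    # 1. Correct number of files
    if len(paths) != expected_count:
        problems.append(f"expected {expected_count} files, found {len(paths)}")

    for path in paths:
        data = np.fromfile(path, dtype=np.float32)

        # 2. NaN and zero-value checks
        if np.isnan(data).any():
            problems.append(f"{path}: contains NaN")
        if data.size == 0 or not np.any(data):
            problems.append(f"{path}: empty or all zeros")

        # 3. MD5 checksum against a manifest, if one is provided
        if manifest is not None:
            md5 = hashlib.md5(open(path, "rb").read()).hexdigest()
            if manifest.get(path) != md5:
                problems.append(f"{path}: checksum mismatch")

    return problems

if __name__ == "__main__":
    for issue in check_outputs("run_0001/output", expected_count=36000):
        print("FAIL:", issue)
```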
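The "depreciating asset" idea above can be made operational with a small amount of bookkeeping per data set. The sketch below shows one possible shape for such a record and a naive retention rule (keep a data set only while re-generating it would cost more than continuing to store it); the field names, cost model, thresholds, and example values are illustrative assumptions, not an existing SCEC policy or tool.

```python
# Sketch of per-data-set bookkeeping for treating simulation results as a
# depreciating asset. Field names, the cost model, and the retention rule
# are illustrative assumptions, not an existing SCEC policy.

from dataclasses import dataclass
from datetime import date

@dataclass
class SimulationDataset:
    persistent_id: str                  # unique persistent ID for the data set
    size_tb: float
    cost_to_produce_cpu_hrs: float
    cost_to_regenerate_cpu_hrs: float   # often lower once codes and inputs are versioned
    last_access: date
    storage_expires: date               # default storage lifetime is always limited

def keep(ds, today, tb_month_cost=5.0, cpu_hr_cost=0.05, horizon_months=12):
    """Keep the data set if regenerating it later would cost more than storing
    it over the planning horizon (all costs in arbitrary, assumed units)."""
    storage_cost = ds.size_tb * tb_month_cost * horizon_months
    regen_cost = ds.cost_to_regenerate_cpu_hrs * cpu_hr_cost
    stale = (today - ds.last_access).days > 365
    return regen_cost > storage_cost and not stale

# Placeholder record with made-up values, for illustration only.
example = SimulationDataset("scec:example:0001", 40.0, 5e6, 5e6,
                            date(2010, 3, 1), date(2011, 1, 1))
print(keep(example, date(2010, 4, 9)))
```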
Data Storage Entropy Resistance
• Data sets will grow to fill storage
  – We recognize the need to encourage efficient storage practices as routine

Data Storage Entropy Resistance
• We are looking for data management tools that let project management administer simulation results project-wide by providing information such as:
  – Total project and per-user storage in use
  – Time since last access for data
  – An understanding of backups and replicas

Metadata Strategies
• Development of simulation metadata led to extended effort with minimal value to geoscientists:
  – Ontology development as a basis for metadata has not (yet?) shown significant value in the field
  – The difficulty stems from the need to anticipate all possible future uses

Controlled Vocabulary Tools
• Controlled vocabulary management is based on community wiki systems, with subjects and terms used as tags in simulation data descriptions:
  – We need tools for converting wiki labels and entries into relational database entries (see the sketch at the end of this deck)
  – We need smooth integration between the relational database (storing metadata) and the wiki system

Metadata Strategies
• Current simulation metadata is based on practical use cases:
  – Metadata saved to support reproduction of the data analysis described in publications
  – Metadata saved as needed to re-run a simulation
  – Unanticipated future uses of simulation data are often not supported
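One way the wiki-to-database conversion mentioned above could work is sketched here: wiki term pages are parsed into (term, category, definition) rows in a relational table that the metadata system can join its tags against. The page format, table schema, and use of SQLite are assumptions for illustration, not a description of SCEC's actual tooling.

```python
# Sketch of converting controlled-vocabulary terms maintained in a wiki into
# relational database rows. The wiki export format, table schema, and SQLite
# choice are illustrative assumptions.

import re
import sqlite3

# Hypothetical wiki export: one "== Term ==" heading per vocabulary entry,
# with a "Category:" line and a free-text definition underneath.
WIKI_EXPORT = """
== Ground Motion Simulation ==
Category: simulation-type
Synthetic seismograms computed by a wave propagation code.

== CVM-H ==
Category: velocity-model
A community velocity model for Southern California.
"""

def iter_terms(text):
    """Yield (term, category, definition) tuples parsed from the wiki export."""
    parts = re.split(r"^== (.+?) ==$", text, flags=re.M)
    for term, body in zip(parts[1::2], parts[2::2]):
        category = re.search(r"Category:\s*(.+)", body)
        definition = re.sub(r"Category:.*", "", body).strip()
        yield term.strip(), (category.group(1).strip() if category else ""), definition

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vocabulary (term TEXT PRIMARY KEY, category TEXT, definition TEXT)")
conn.executemany("INSERT INTO vocabulary VALUES (?, ?, ?)", iter_terms(WIKI_EXPORT))

# Simulation data descriptions can now join their tags against this table.
for row in conn.execute("SELECT term, category FROM vocabulary ORDER BY term"):
    print(row)
```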