Large-scale Data Management Challenges of the
Southern California Earthquake Center (SCEC)
Philip J. Maechling (maechlin@usc.edu)
Information Technology Architect
Southern California Earthquake Center
Research Data Access and Preservation Summit
Phoenix, Arizona
9 April 2010
Interagency Working Group on Digital Data (2009)
Consider the Digital Data Life Cycle
Can we Validate this Life Cycle Model against
Digital Data Life Cycle Observations?
Digital Data Life Cycle Origination – Jan 2009
Digital Data Life Cycle Completion – Jan 2010
Notable Earthquakes in 2010
The SCEC Partnership
National Partners, International Partners, Core Institutions, Participating Institutions
SCEC Member Institutions (November 1, 2009)
Core Institutions (16)
California Institute of Technology
Columbia University
Harvard University
Massachusetts Institute of Technology
San Diego State University
Stanford University
U.S. Geological Survey, Golden
U.S. Geological Survey, Menlo Park
U.S. Geological Survey, Pasadena
University of California, Los Angeles
University of California, Riverside
University of California, San Diego
University of California, Santa Barbara
University of California, Santa Cruz
University of Nevada, Reno
University of Southern California (lead)
Participating Institutions (53)
Appalachian State University; Arizona State University; Berkeley Geochron
Center; Boston University; Brown University; Cal-Poly, Pomona; Cal-State,
Long Beach; Cal-State, Fullerton; Cal-State, Northridge; Cal-State, San
Bernardino; California Geological Survey; Carnegie Mellon University; Case
Western Reserve University; CICESE (Mexico); Cornell University; Disaster
Prevention Research Institute, Kyoto University (Japan); ETH (Switzerland);
Georgia Tech; Institute of Earth Sciences of Academia Sinica (Taiwan);
Earthquake Research Institute, University of Tokyo (Japan); Indiana
University; Institute of Geological and Nuclear Sciences (New Zealand); Jet
Propulsion Laboratory; Los Alamos National Laboratory; Lawrence
Livermore National Laboratory; National Taiwan University (Taiwan);
National Central University (Taiwan); Ohio State University; Oregon State
University; Pennsylvania State University; Princeton University; Purdue
University; Texas A&M University; University of Arizona; UC, Berkeley;
UC, Davis; UC, Irvine; University of British Columbia (Canada); University
of Cincinnati; University of Colorado; University of Massachusetts;
University of Miami; University of Missouri-Columbia; University of
Oklahoma; University of Oregon; University of Texas-El Paso; University of
Utah; University of Western Ontario (Canada); University of Wisconsin;
University of Wyoming; URS Corporation; Utah State University; Woods
Hole Oceanographic Institution
Southern California Earthquake Center
• Involves more than 600 experts at over 60 institutions worldwide
• Focuses on earthquake system science, using Southern California as a natural laboratory
• Translates basic research into practical products for earthquake risk reduction, contributing to NEHRP
SCEC Earthquake System Models & Focus Groups
[Diagram: disciplinary focus groups (Lithospheric Architecture & Dynamics; Tectonic Evolution & B.C.s; Fault & Rupture Mechanics; Crustal Deformation Modeling; Earthquake Forecasting & Prediction; Ground Motion Prediction; Seismic Hazard & Risk Analysis) connected through shared system models: Fault Models, Deformation Models, Block Models, Earthquake Rupture Models, a Unified Structural Representation, Anelastic Structures, Earthquake Rupture Forecasts, Ground Motion Simulations, Attenuation Relationships, Seismic Hazard Products, and Risk Mitigation Products.]
SCEC Leadership Teams
Board of Directors
Planning Committee
Staff
Earthquakes are system-level phenomena…
• They emerge from complex, long-term interactions within active fault systems that are opaque, and thus are difficult to observe
• They cascade as chaotic chain reactions through the natural and built environments, and thus are difficult to predict
[Timeline diagram: the earthquake process runs from tectonic loading, stress accumulation, slow slip transients, and foreshocks (anticipation times spanning century, decade, year, month, week, and day) through nucleation, fault rupture, seismic shaking, surface faulting, seafloor deformation, and stress transfer at the origin time, to aftershocks, landslides, liquefaction, tsunami, fires, dynamic triggering, structural and nonstructural damage to the built environment, human casualties, disease, and socioeconomic aftereffects (response times spanning minute, hour, day, year, and decade).]
[Diagram: levels of research supported by the CME Platform. Individual Research Project: development of new computational, data, and physical models; computational codes, structural models, and simulation results versioned with associated tests. Collaborative Research Project: automated retrospective testing of forecast models using community-defined validation problems. Engineering and Interdisciplinary Research: automated prospective performance evaluation of forecast models over time within a collaborative forecast testing center. Administered through the CME Platform and Data Administration System and the CME Platform and Data Management TAG.]
CME cyberinfrastructure supports a broad range of research computing with computational and data resources.
Programmable Interfaces
[Diagram: CME programmable interfaces connect to real-time earthquake monitoring, HPC resource providers, public and governmental forecasts, seismic data centers, and external seismic/tsunami models, supporting both contribution and annotation of digital artifacts and discovery and access to digital artifacts.]
Future of solid earth computational science
Echo Cliffs PBR
Echo Cliffs PBR in the Santa Monica Mountains is more than 14 m high and has a 3-4 s free period. This rock withstood ground motions estimated at 0.2 g and 12 cm/s during the Northridge earthquake. Such fragile geologic features provide important constraints on probabilistic seismic hazard analysis (PSHA).
Simulate Observed Earthquakes
Then validate the simulation model by comparing simulation results against observational data recorded by seismic sensors (red: simulation results; black: observed data).
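As a simple illustration of this kind of validation step (not SCEC's actual goodness-of-fit procedure), the sketch below compares a simulated and a recorded seismogram using a normalized RMS misfit and a correlation coefficient; the input traces are synthetic placeholders.

```python
import numpy as np

def misfit(simulated, observed):
    """Return normalized RMS misfit and correlation of two seismograms.

    Both inputs are assumed to be equal-length, time-aligned traces sampled
    at the same rate; real comparisons would also band-pass filter first.
    """
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    rms = np.sqrt(np.mean((simulated - observed) ** 2))
    nrms = rms / np.sqrt(np.mean(observed ** 2))      # misfit relative to data energy
    corr = np.corrcoef(simulated, observed)[0, 1]     # waveform similarity
    return nrms, corr

# Placeholder traces standing in for simulated (red) and observed (black) data.
t = np.linspace(0, 60, 6000)
observed = np.sin(2 * np.pi * 0.2 * t) * np.exp(-t / 30)
simulated = 0.9 * np.sin(2 * np.pi * 0.2 * t + 0.1) * np.exp(-t / 28)
print(misfit(simulated, observed))
```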
Simulate Potential Future Earthquakes
SCEC Roadmap to Petascale Earthquake Computing (2004-2012)
• TeraShake 1.x (2004): First large wave propagation simulations of Mw 7.7 earthquakes on the southern San Andreas fault with a maximum frequency of 0.5 Hz, run using kinematic source descriptions based on the Denali earthquake. 240 SDSC DataStar cores used; 53 TB of outputs, the largest simulation outputs recorded to that point. The most-read article of the year.
• TeraShake 2.x (2005-2006): Simulations of Mw 7.7 earthquakes using source descriptions generated by dynamic rupture simulations. The dynamic rupture simulations were based on Landers initial stress conditions and used 1,024 NCSA TeraGrid cores. TeraGrid Viz Award.
• ShakeOut 1.x: Simulations of Mw 7.8 with a maximum frequency of 1.0 Hz, run using kinematic source descriptions based on geological observations. 1,920 TACC Lonestar cores used.
• ShakeOut 2.x: Simulations of Mw 7.8 earthquakes with a maximum frequency of 1.0 Hz using source descriptions generated by SGSN dynamic rupture simulations, constructed to produce final surface slip equivalent to the ShakeOut 1.x kinematic sources. 32K TACC Ranger cores used.
• Wall-to-Wall: Simulations of an Mw 8.0 scenario on the San Andreas fault from the Salton Sea to Parkfield ("Wall-to-Wall"), up to 1.0 Hz. The source description was generated by combining several Mw 7.8 dynamic source descriptions ("ShakeOut-D"). 96K NICS Kraken cores used.
• Chino Hills 1.x: Comparison of simulated and recorded ground motions for the 2008 Mw 5.4 Chino Hills earthquake. Two simulations were conducted using meshes extracted from the CMU eTree database for CVM4 and CVM-H; 64K NICS Kraken cores used.
• M8 1.x: Improved source descriptions.
• M8 2.x (2010): 40-m spacing and 435 billion mesh points; M8 2.x to run on 230K NCCS Jaguar cores, the world's most powerful machine.
• M8 3.1: Dynamic rupture simulation, dx = 5 m (50 x 25 x 25 km). Improve earthquake source descriptions by integrating more realistic friction laws into dynamic rupture simulations and computing at large scales, including the inner scale of friction processes and the outer scale of large faults.
• M8 3.2: Wave propagation simulation with dx = 25 m, Mw 8.0, 2 Hz, 2,048 billion mesh points, 256x bigger than current runs.
• A new model is under development to deal with complex geometry, topography, and non-planar fault surfaces.
• Looking ahead (2012): "Big 10"; simulation of an Mw 9.0 megaquake in the Pacific Northwest.
• Along the way: 96% parallel efficiency on 40K T.J. Watson BG/L cores (BGW); ShakeOut verification with 3 models; ALCF BG/P INCITE allocations; 15 million SUs awarded, the largest NSF TeraGrid allocation; SciDAC OASCR Award.
Panel Questions
• What technical solutions exist that meet your
academic project requirements?
• What requirements are unique to the
academic environment?
• Are there common approaches for managing
large-scale collections?
Simulation Results Versus Data
• The context of this workshop is Research Data Management.
– I would like to communicate characteristics of the data management required to perform seismic hazard computational research.
• I will refer to our simulation results as "data."
– Some groups distinguish observational data from simulation results.
– This distinction becomes more difficult as observational data and simulation results are combined.
• For today's presentation, I will focus on the management of SCEC simulation results, which may include both observational data and simulation results.
SCEC Storage Volume by Type
Estimated SCEC Data Archives (Total Current Archives ~ 1.4 PB)
SCEC Storage Elements (Files, Rows) by Type
Estimated SCEC Data Archives (Total Current Archives ~100M files, 600M rows)
Consider the Digital Data Life Cycle
Estimated SCEC Simulation Archives in Terabytes by Storage Location
Goal:
• 1 Hz body waves
• Up to 0.5 Hz surface waves
Sources & Receivers:
• 150 three-component stations [Nr]
• 200 earthquakes [Ns]
Simulation parameters:
• 200 m spacing, 1,872 M mesh points
• 2 min time series, 12,000 time steps
Costs:
• 2 TB per SWF
• 6 TB per RGT
• 2 hr per run
• 10.4 M CPU-hrs (650 runs, 3.6 months on 4,000 cores)
• 400-600 TB
Data Management Context for SCEC
• Academic research groups respond to NSF proposals: aggressive, large-scale, collaborative work with a need for transformative, innovative, original research (bigger, larger, faster)
• Data management tools and processes are managed by heavily burdened academic staff
Data Management Context for SCEC
• Academic research is very cost-sensitive when adopting new technologies
• HPC capabilities are largely based on integrating existing cyberinfrastructure (CI), not on new CI development
• Work is largely based on the use of other people's computers and storage systems, resulting in widely distributed archives
Panel Questions
• What technical solutions exist that meet your
academic project requirements?
• What requirements are unique to the
academic environment?
• Are there common approaches for managing
large-scale collections?
SCEC Milestone Capability Runs

Run | Machine | Outer scale (km) | Inner scale (m) | Max freq (Hz) | Min surface vel (m/s) | Mesh points | Time steps | Vel. model input (TB) | Storage w/o ckpt (TB) | Cores used | Wall-clock time (hrs) | Sustained TFlop/s
TS1 | SDSC DataStar | 600 | 200 | 0.5 | 500 | 1.8E+09 | 22,768 | 0.05 | 53.0 | 240 | 66.8 | 0.04
TS2 | SDSC DataStar | 600 | 200 | 0.5 | 500 | 1.8E+09 | 22,768 | 0.05 | 10.0 | 1,920 | 6.7 | 0.43
DS2 | NCSA IA-64 | 299 | 100 | 1.0 | 500 | 9.6E+08 | 13,637 | 0.03 | 9.5 | 1,024 | 35.2 | 0.68
SO1 | TACC Lonestar | 600 | 100 | 1.0 | 500 | 1.4E+10 | 45,456 | 0.42 | 0.5 | 1,920 | 32.0 | 1.44
SO2 | TACC Ranger | 600 | 100 | 1.0 | 500 | 1.4E+10 | 50,000 | 0.42 | 0.5 | 32,000 | 6.9 | 7.29
CH50m | NICS Kraken | 180 | 50 | 2.0 | 500 | 1.1E+10 | 80,000 | 0.31 | 1.9 | 64,000 | 2.3 | 26.86
W2W-1 | NICS Kraken | 800 | 100 | 1.0 | 500 | 3.1E+10 | 60,346 | 0.89 | 0.3 | 96,000 | 2.5 | 50.00
CH15m* | NICS Kraken | 183 | 15 | 3.3 | 250 | 3.0E+11 | 100,000 | 6.87 | 66.4 | 96,000 | 24 | 87.00
M8 | NCCS Jaguar | 810 | 40 | 1.0 | 200 | 4.4E+11 | 120,000 | 12.68 | 39.9 | 223,080 | 21.2 | 174.00
W2W-3** | NCSA Blue Waters | 800 | 25 | 2.0 | 250 | 2.0E+12 | 320,000 | 59.60 | 400.0 | 320K** | 45** | 1,000**

* benchmarked, ** estimated
Data Transfer, Archive and Management
• Input/output data transfer between SDSC disk/HPSS and Ranger disk at transfer rates of up to 450 MB/s using Globus GridFTP (a staging sketch follows below)
• 90k-120k files per simulation; 150 TB generated on Ranger, organized as a separate sub-collection in iRODS
• Direct data transfer using iRODS from Ranger to SDSC SAM-QFS at up to 177 MB/s using our data ingestion tool PIPUT
• Sub-collections published through the SCEC digital library (168 TB in size)
• Integrated through the SCEC portal into seismic-oriented interaction environments (Zhou et al., CSO'10)
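As a rough illustration of the staging step described above, this sketch wraps the standard globus-url-copy client in Python. The endpoints, paths, and parallelism setting are hypothetical placeholders, not SCEC's actual configuration.

```python
import subprocess

def stage_with_gridftp(src_url, dest_url, parallel_streams=8):
    """Copy one file between GridFTP endpoints using the globus-url-copy CLI.

    Assumes a valid grid proxy already exists (e.g., via grid-proxy-init) and
    that globus-url-copy is on PATH. URLs below are illustrative only.
    """
    cmd = [
        "globus-url-copy",
        "-p", str(parallel_streams),   # parallel TCP streams for higher throughput
        "-vb",                         # print transfer performance while running
        src_url,
        dest_url,
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    # Hypothetical endpoints: an SDSC archive volume to a Ranger scratch directory.
    stage_with_gridftp(
        "gsiftp://gridftp.sdsc.example.org/archive/scec/run001/velocity_mesh.bin",
        "gsiftp://gridftp.ranger.example.org/scratch/scec/run001/velocity_mesh.bin",
    )
```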
CyberShake Data Management Numbers
• CyberShake
– 8.5 TB staged in (~700k
files) to TACC’s Ranger
– 2.1 TB staged out (~36k
files) to SCEC storage
– 190 million jobs
executed on the grid
– 750,000 files stored in
RLS
CyberShake map
CyberShake Production Run - 2009
• Run from 4/16/09 – 6/10/09
• 223 sites
– Curve produced every 5.4 hrs
• 1207 hrs (92% uptime)
– 4,420 cores on average
– 14,540 peak (23% of Ranger)
• 192 million tasks
– 44 tasks/sec
– 3.8 million Condor jobs
• 192 million files
– 11 TB output, 165 TB temp
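The throughput figures above are internally consistent; a quick back-of-the-envelope check, using only the numbers on this slide, is sketched below.

```python
# Back-of-the-envelope check of the 2009 CyberShake production-run numbers.
sites = 223
run_hours = 1207            # wall-clock hours for the production run
tasks = 192_000_000         # total tasks executed

hours_per_curve = run_hours / sites             # ~5.4 hours per hazard curve
tasks_per_second = tasks / (run_hours * 3600)   # ~44 tasks per second

print(f"{hours_per_curve:.1f} hours per hazard curve")
print(f"{tasks_per_second:.0f} tasks per second")
```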
Challenge: Millions of tasks
• Automation is key
– Workflows with clustering
• Include all executions, staging, notification
– Job submission
• Data management
– Millions of data files
– Pegasus provides staging
– Automated checks
• Correct number of files
• NaN, zero-value checks
• MD5 checksums
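A minimal sketch of the kind of automated post-run checks listed above (file counts, NaN and zero-value screening, MD5 checksums). The directory layout, file naming, and expected count are assumptions for illustration, not the SCEC production setup.

```python
import hashlib
from pathlib import Path

import numpy as np

def check_run_outputs(run_dir, expected_count, pattern="*.grm"):
    """Verify a run directory: file count, NaN/zero content, and MD5 checksums.

    run_dir, expected_count, and the '*.grm' seismogram naming are
    hypothetical; adapt them to the actual workflow outputs.
    """
    files = sorted(Path(run_dir).glob(pattern))
    assert len(files) == expected_count, (
        f"expected {expected_count} files, found {len(files)}")

    checksums = {}
    for f in files:
        data = np.fromfile(f, dtype=np.float32)        # raw float32 time series
        assert not np.isnan(data).any(), f"NaN values in {f.name}"
        assert np.abs(data).max() > 0.0, f"all-zero seismogram in {f.name}"
        checksums[f.name] = hashlib.md5(f.read_bytes()).hexdigest()
    return checksums

# Example: check_run_outputs("/scratch/scec/site_USC", expected_count=36000)
```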
What is a DAG Workflow?
• Jobs with dependencies are organized in Directed Acyclic Graphs (DAGs); a small sketch follows below
• A large number of similar DAGs make up a workflow
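To make the DAG idea concrete, here is a small self-contained sketch that represents jobs and their dependencies as a directed acyclic graph and derives a valid execution order with a topological sort. The job names are invented for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each job maps to the set of jobs it depends on (its parents in the DAG).
workflow = {
    "create_mesh":          set(),
    "run_sgt_x":            {"create_mesh"},
    "run_sgt_y":            {"create_mesh"},
    "extract_seismograms":  {"run_sgt_x", "run_sgt_y"},
    "compute_hazard_curve": {"extract_seismograms"},
}

# A topological order is one valid serial execution order; a workflow engine
# such as DAGMan or Pegasus would instead run independent jobs concurrently.
for job in TopologicalSorter(workflow).static_order():
    print("run:", job)
```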
GriPhyN Virtual Data System
• Virtual data language
– Users define desired transformations
– Logical names for data and transformations
• Virtual data catalog
– Stores information about transformations, derivations, and logical inputs/outputs
• Query tool
– Retrieves necessary transformations given a description of them
– Gives an abstract workflow
• Pegasus
– Tool for executing abstract workflows on the grid
• Virtual Data Toolkit (VDT): part of the GriPhyN and iVDGL projects
– Includes existing technology (Globus, Condor) and experimental software (Chimera, Pegasus)
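As a conceptual sketch only, and not the actual Chimera Virtual Data Language syntax, the snippet below shows the flavor of recording transformations and derivations under logical names so a planner can later decide whether to reuse existing data or re-derive it. All names and logical file identifiers are invented.

```python
from dataclasses import dataclass, field

@dataclass
class Transformation:
    name: str            # logical name of the program, e.g. "extract_seismogram"
    inputs: list         # logical names of required inputs
    outputs: list        # logical names of produced outputs

@dataclass
class Derivation:
    transformation: str  # which transformation produced the data
    bindings: dict       # logical parameter name -> logical file name

@dataclass
class VirtualDataCatalog:
    transformations: dict = field(default_factory=dict)
    derivations: list = field(default_factory=list)

    def derivations_for(self, logical_file):
        """Return derivations that can (re)produce a given logical file."""
        return [d for d in self.derivations
                if logical_file in d.bindings.values()]

catalog = VirtualDataCatalog()
catalog.transformations["extract_seismogram"] = Transformation(
    "extract_seismogram", inputs=["sgt_volume"], outputs=["seismogram"])
catalog.derivations.append(Derivation(
    "extract_seismogram",
    {"sgt_volume": "lfn://scec/USC_sgt_x", "seismogram": "lfn://scec/USC_seis_1"}))
print(catalog.derivations_for("lfn://scec/USC_seis_1"))
```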
The Globus View of Data Architecture
[Diagram (GlobusWORLD 2003): virtual data applications use the Chimera Virtual Data Language through a VDL API/CLI to manipulate derivations and transformations; an XML Virtual Data Catalog implements the Chimera Virtual Data Schema; task graphs (compute and data movement tasks, with dependencies) execute on data grid resources (distributed execution and data management); the GriPhyN VDT layer includes the Replica Catalog, DAGMan, the Globus Toolkit, etc.]
Functional View of Grid Data Management
[Diagram (GlobusWORLD 2003): an application consults a Metadata Service (location based on data attributes), a Replica Location Service (location of one or more physical replicas), and Information Services (state of grid resources, performance measurements and predictions). A Planner selects data locations, replicas, and compute and storage nodes; an Executor initiates data transfers and computations across data movement, data access, compute, and storage resources, all subject to security and policy.]
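A toy sketch of the planner/executor split in the diagram above: look up logical files by attribute in a metadata service, resolve physical replicas through a replica location service, and pick one to transfer. The services here are plain dictionaries standing in for the real Globus components, and all names and URLs are invented.

```python
# Toy stand-ins for the grid data-management services in the diagram.
metadata_service = {
    # attribute query results: logical file names matching ("site", "USC")
    ("site", "USC"): ["lfn://scec/USC_seis_1", "lfn://scec/USC_seis_2"],
}
replica_location_service = {
    "lfn://scec/USC_seis_1": ["gsiftp://sdsc.example.org/a/USC_seis_1",
                              "gsiftp://ranger.example.org/b/USC_seis_1"],
    "lfn://scec/USC_seis_2": ["gsiftp://sdsc.example.org/a/USC_seis_2"],
}

def plan(attribute, value):
    """Planner: map a data attribute query to concrete replicas to fetch."""
    transfer_plan = []
    for lfn in metadata_service.get((attribute, value), []):
        replicas = replica_location_service.get(lfn, [])
        if replicas:
            transfer_plan.append((lfn, replicas[0]))   # naive choice: first replica
    return transfer_plan

def execute(transfer_plan):
    """Executor: would initiate the transfers (here we just print them)."""
    for lfn, physical in transfer_plan:
        print(f"transfer {physical} -> local copy of {lfn}")

execute(plan("site", "USC"))
```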
Panel Questions
• What technical solutions exist that meet your
academic project requirements?
• What requirements are unique to the
academic environment?
• Are there common approaches for managing
large-scale collections?
Treat Simulation Data as a Depreciating Asset
Simulation results differ from observational data:
- They tend to be larger
- They can often be recomputed
- They often decrease in value over time
- They have less well-defined metadata
Collaborate with Existing Data Centers
Avoid re-inventing data management centers:
- (Re)train observational data centers to manage simulation data
- Change the culture so that deleting data is acceptable
Simulation Data as a Depreciating Asset
Manage simulation results as a depreciating asset:
- Unique persistent IDs for all data sets
- Track the cost to produce, and the cost to re-generate, every data set (a bookkeeping sketch follows below)
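A minimal sketch of the bookkeeping this suggests: each data set carries a persistent ID plus its production and regeneration costs, so a project can later compare regeneration cost against ongoing storage cost. The ID scheme, cost fields, example numbers, and decision rule are illustrative assumptions, not a SCEC policy.

```python
import uuid
from dataclasses import dataclass

@dataclass
class SimulationDataset:
    persistent_id: str             # unique ID assigned at creation time
    size_tb: float                 # archive footprint
    production_cpu_hours: float    # what it originally cost to compute
    regeneration_cpu_hours: float  # what it would cost to recompute today

    def keep(self, storage_cost_per_tb_year, cpu_hour_cost, years=1.0):
        """Keep the data only if storing it is cheaper than regenerating it."""
        storage_cost = self.size_tb * storage_cost_per_tb_year * years
        regen_cost = self.regeneration_cpu_hours * cpu_hour_cost
        return storage_cost < regen_cost

ds = SimulationDataset(
    persistent_id=str(uuid.uuid4()),
    size_tb=53.0, production_cpu_hours=16_000, regeneration_cpu_hours=2_000)
print("retain" if ds.keep(storage_cost_per_tb_year=100.0, cpu_hour_cost=0.05)
      else "schedule for deletion")
```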
Simulation Data as a Depreciating Asset
Responsibilities of researchers who want a lot of storage:
- The default storage lifetime is always limited
- Longer-term storage is based on community use, community value, and readiness for use by the community
- The burden on researchers seeking long-term storage is additional time spent adding metadata
Remove the Compute/Data Distinction
Compute models should always have associated verification and validation results, and data sets should always have codes demonstrating access and usage. Apply automated acceptance tests to all codes, and access/retrieval codes to all data sets (a sketch follows below).
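One way to read this recommendation in code: every archived data set ships with a small access routine and an automated acceptance test that exercises it. The file format, array shapes, and checksum here are hypothetical.

```python
import hashlib

import numpy as np

def read_seismogram(path, n_timesteps=12000, n_components=3):
    """Access code archived alongside the data set (the format is assumed)."""
    data = np.fromfile(path, dtype=np.float32)
    return data.reshape(n_timesteps, n_components)

def test_dataset_acceptance(path, expected_md5):
    """Automated acceptance test: the data is unchanged, readable, and sane."""
    with open(path, "rb") as f:
        assert hashlib.md5(f.read()).hexdigest() == expected_md5
    seis = read_seismogram(path)
    assert seis.shape == (12000, 3)
    assert np.isfinite(seis).all()
```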
Data Storage Entropy Resistance
Data sets will grow to fill the available storage:
- We recognize the need to make efficient storage practices routine
Data Storage Entropy Resistance
We are looking for data management tools that allow project management to administer simulation results project-wide by providing information such as (see the sketch below):
- Total project and per-user storage in use
- Time since data were last accessed
- An understanding of backups and replicas
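A small sketch of the kind of project-wide reporting described above, built from nothing more than a filesystem walk. Real deployments would query iRODS or the archive system rather than os.stat, and the project path is a placeholder.

```python
import time
from collections import defaultdict
from pathlib import Path

def storage_report(project_root):
    """Report per-owner storage in use and days since last access."""
    usage_bytes = defaultdict(int)
    last_access = defaultdict(float)
    for path in Path(project_root).rglob("*"):
        if path.is_file():
            st = path.stat()
            owner = st.st_uid                  # numeric owner; map to names as needed
            usage_bytes[owner] += st.st_size
            last_access[owner] = max(last_access[owner], st.st_atime)
    now = time.time()
    for owner, nbytes in sorted(usage_bytes.items(), key=lambda kv: -kv[1]):
        idle_days = (now - last_access[owner]) / 86400
        print(f"uid {owner}: {nbytes / 1e12:.2f} TB, idle {idle_days:.0f} days")

# Example: storage_report("/archive/scec")
```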
Metadata Strategies
Development of simulation metadata can lead to extended effort with minimal value to geoscientists:
- Ontology development as a basis for metadata has not (yet?) shown significant value in the field
- The difficulty stems from the need to anticipate all possible future uses
Controlled Vocabulary Tools
Controlled vocabulary management is based on community wiki systems, with subjects and terms used as tags in simulation data descriptions (a conversion sketch follows below):
- We need tools for converting wiki labels and entries to relational database entries
- We need smooth integration between the relational database (storing metadata) and the wiki system
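A toy sketch of the wiki-to-relational-database bridge mentioned above: parse simple "term : definition" wiki lines and load them into a SQLite table that the metadata system can join against. The wiki markup convention and the example terms are assumptions, not the SCEC wiki's actual format.

```python
import sqlite3

WIKI_PAGE = """
; peak ground velocity : Maximum ground velocity observed or simulated at a site.
; rupture surface : The fault area over which slip occurs in an earthquake.
"""

def load_vocabulary(wiki_text, db_path=":memory:"):
    """Parse '; term : definition' lines and store them as controlled vocabulary."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS vocabulary "
                 "(term TEXT PRIMARY KEY, definition TEXT)")
    for line in wiki_text.splitlines():
        if line.startswith(";") and ":" in line:
            term, definition = line[1:].split(":", 1)
            conn.execute("INSERT OR REPLACE INTO vocabulary VALUES (?, ?)",
                         (term.strip(), definition.strip()))
    conn.commit()
    return conn

conn = load_vocabulary(WIKI_PAGE)
print(conn.execute("SELECT term FROM vocabulary").fetchall())
```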
Metadata Strategies
Current simulation metadata are based on practical use cases (an example record follows below):
- Metadata are saved to support reproduction of the data analysis described in publications
- Metadata needed to re-run a simulation are saved
- Unanticipated future uses of simulation data are often not supported
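A minimal example of the "practical use case" metadata described above: just enough of a record, serialized as JSON alongside the outputs, to reproduce a published analysis or re-run the simulation. The specific fields are illustrative; the machine and core counts are simply taken from the ShakeOut 2.x figures earlier in this talk, and identifiers such as the dataset ID and code revision are placeholders.

```python
import json
from datetime import datetime, timezone

run_metadata = {
    "dataset_id": "scec-sim-example-001",            # hypothetical persistent ID
    "code": {"name": "wave_propagation_code", "version": "v1.0", "revision": "r1234"},
    "inputs": {
        "velocity_model": "CVM4",
        "source_description": "ShakeOut kinematic source",
        "mesh_spacing_m": 100,
        "max_frequency_hz": 1.0,
    },
    "execution": {
        "machine": "TACC Ranger", "cores": 32000, "wall_clock_hours": 6.9,
    },
    "publication": {"doi": None, "figures_reproduced": ["Fig. 3", "Fig. 5"]},
    "created": datetime.now(timezone.utc).isoformat(),
}

with open("run_metadata.json", "w") as f:
    json.dump(run_metadata, f, indent=2)
```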
End