INCITE Project Closeout Report 2014

High Frequency Physics-Based Earthquake System Simulations (Year 1 of 1)

PI and Co-PI(s): Thomas H. Jordan, Jacobo Bielak, Steven Day, Kim Olsen, Yifeng Cui, Po Chen, Ricardo Taborda, Philip Maechling
Scientific Discipline: Earth Science: Geological Sciences
INCITE Allocation: 112,200,000 processor hours
Site: Argonne National Laboratory
  Machine (Allocation): IBM Blue Gene/Q (64,200,000 processor hours)
  ALCF Project Name: Equake_SS
Site: Oak Ridge National Laboratory
  Machine (Allocation): Cray XT (48,000,000 processor hours)
  OLCF Project Name: GEO015

1. Provide a brief 1-3 paragraph summary or abstract of your INCITE project that covers the duration of this project.

During this INCITE allocation, researchers from the Southern California Earthquake Center (SCEC) used computing time and storage space on INCITE computational resources to carry out research on high frequency physics-based earthquake system simulations. Our objective was to reproduce earthquake physics, wave propagation, site response, and ground motions at frequencies up to 10 Hz using deterministic modeling approaches. While this 2014 closeout report is requested by OLCF, we describe the overall progress in our INCITE-supported research program, so we mention accomplishments using both OLCF and ALCF resources. Where possible, we identify which INCITE resources were used to produce specific results.

In 2014, SCEC used INCITE resources to investigate a broad range of scientific issues. We processed recent ground motion observations into improved earth structure models. We introduced more realistic physics into our computational models. We integrated our improved earth structure models with our improved computational models, and validated our computational results against observations. We then used our improved computational methods to calculate broad-impact seismic hazard estimates.
SCEC's 2014 INCITE computational research activities are a critical part of SCEC's overall, collaborative earthquake system science research program, which includes academic scientists, US government agencies including NSF and USGS, and large utility companies, all working together to develop, improve, evaluate, and apply seismic hazard modeling methods for use in public seismic hazard estimates and engineering applications.

2. Summarize the major accomplishments of your INCITE project for the entire duration of your project. Please focus on what you consider to be true accomplishments through the INCITE award. Reiterate the milestones of your proposal and discuss the accomplishments (planned or unplanned) achieved relative to those milestones. Identify the reasons these achievements required the unique, world-class compute and data capabilities of LCF. (3-page limit)

Most Significant Accomplishments in 2014:

Below is a short list of our most significant INCITE-enabled research accomplishments in 2014. After that, we provide a brief discussion of each of the milestones we defined in our 2014 INCITE allocation request. We believe SCEC's 2014 INCITE computational activities are among the largest-scale and most scientifically advanced earthquake ground motion and seismic hazard calculations being performed anywhere in the world. Use of INCITE computing resources has enabled us to develop and apply advanced computational techniques, such as full 3D tomography, to practical problems of seismic hazard in California. INCITE computational resources have made it possible to perform deterministic ground motion simulations up to 10 Hz, which was considered out of reach for any open-science computing facility until the current generation of leadership-class systems became available.
SCEC's large, collaborative, interdisciplinary computational research is the type of team-based, system-oriented research that can produce broad-impact results using leadership-class systems. As we report in our impact section below, we believe that SCEC's INCITE-based results are now being integrated into the plans of broad-impact users.

1. Produced a new, and arguably the most accurate, southern California velocity model using full 3D tomography. Released the CVM-S4.26 southern California velocity model to the research community (Mira). Defined a starting 3D velocity model for Central California, and performed two full 3D tomographic iterations for the Central California velocity model (Mira). Defined a starting state-wide 3D velocity model, and performed one full 3D tomographic iteration for this state-wide model (Mira).

2. Performed dynamic rupture and wave propagation simulations using small-scale heterogeneities. Performed dynamic rupture simulations on rough faults at frequencies up to 8 Hz (Mira). Integrated new physics into wave propagation simulation software to model plastic yielding (Titan).

3. Calculated a higher frequency, physics-based probabilistic seismic hazard curve. Prototyped a computational technique for running scientific workflows on Titan (Titan). Developed, calculated, and verified the first ever physics-based 1 Hz CyberShake hazard curve (Mira, Titan, Blue Waters).

Discussion of Milestones:

We identified the following seven milestones [M1 - M7] in our original allocation request. In retrospect, we would have defined some milestones somewhat differently, such as M1, which we would divide into two separate goals. We would also modify some of the target simulation frequencies. But overall, these milestones served as reasonable goals that helped keep our activities focused.

[M1] Velocity model improvements for Southern California used in 5 Hz validation simulation.
Produce and distribute an updated CVM-S model using improvements from full 3D tomography, use the improved velocity model to run a 5 Hz simulation of a well-observed historic earthquake, and calculate selected goodness-of-fit measurements.

P. Chen and E. Lee used Mira to develop an improved southern California 3D velocity model called CVM-S4.26. This model has been publicly released and represents, arguably, the best available 3D velocity model for southern California. This updated velocity model is a critical input to SCEC's CyberShake physics-based hazard models for southern California, and CVM-S4.26 was used as the best available model during SCEC collaborations with civil engineers in 2014. Full 3D tomography work on Mira includes development of a state-wide 3D model. Project Reference [1]: Lee et al. (2014)

[M2] Hazard-scale dynamic rupture simulations. Perform 10 Hz+ dynamic rupture simulations to replicate one or more historical California earthquakes using SORD.

Z. Shi and K. Olsen used ALCF-supported SORD software to run 10 Hz Northridge rupture simulations. These dynamic rupture simulations show how rough fault geometries contribute high-frequency ground motions to earthquake ruptures.

[M3] High frequency forward simulations using alternative CVMs. Compare the impact of CVMs and soft-soil deposits (or geotechnical layers) on 5 Hz+ simulations by simulating forward events using alternative velocity models and comparing the results.

J. Bielak and R. Taborda used Titan to run large-scale, high frequency, deterministic wave propagation simulations using the SCEC Hercules GPU code. Project Reference [5]: Taborda et al. (2014)

[M4] High frequency forward simulations using alternative plasticity models. Compare the impact of alternative plasticity and attenuation models on 5 Hz+ simulations by simulating forward events using alternative velocity models and comparing the results.

K.
Olsen's SDSU research group used Titan to run large-scale scenario simulations of the ShakeOut earthquake in southern California to evaluate the effects of plastic yielding on peak ground motions. Project Reference [3]: Roten et al. (2014)

[M5] Simulate a historical earthquake using CVMs with, and without, small-scale heterogeneities. Simulate a historical California earthquake at 5 Hz+ using CVMs with, and without, small-scale heterogeneities, and evaluate the results using selected goodness-of-fit measurements to identify the velocity model to use for 10 Hz simulations.

J. Bielak and R. Taborda used Titan to run simulations of the Chino Hills earthquake with alternative velocity models at high frequencies, and validated the simulation results against observed ground motions for this well-observed earthquake. Project Reference [5]: Taborda et al. (2014); Project Reference [2]: Lee et al. (2014)

[M6] Run a 10 Hz historical California earthquake simulation. Integrate our best available configuration (code, velocity model, source description), simulate a well-observed California earthquake at 10 Hz, and calculate selected goodness-of-fit measurements.

We did not complete this milestone under our 2014 allocation. Results of other milestones indicated that we need to combine methodology and results from several of this year's advances, including dynamic rupture simulations on rough faults [M2] and plastic yielding during strong motions [M4], before we can perform and validate simulations of historical earthquakes at 10 Hz. We have moved this milestone into our goals for 2015-2016 INCITE research.

[M7] Calculate a 1.5 Hz CyberShake hazard curve. Use updated CVMs, source models, and codes to calculate a higher frequency CyberShake hazard curve.

During our 2014 INCITE allocation, we completed the first 1 Hz CyberShake hazard curve using CVM-S4.26. This is part of our CyberShake verification work before we begin ensemble CyberShake runs that produce a regional seismic hazard map.
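The probabilistic combination step behind a hazard curve like the one described above can be sketched in a few lines. The snippet below is only an illustration of the general physics-based PSHA recipe (annual rupture rates combined with empirical exceedance fractions from simulated intensity measures, under a Poisson assumption); the input format, rates, and intensity values are hypothetical and are not CyberShake's actual data model.

```python
import math

def hazard_curve(ruptures, im_levels, t_years=1.0):
    """Sketch of a physics-based hazard curve: for each intensity level,
    sum rate * P(exceedance | rupture) over ruptures, then convert the
    annual exceedance rate to a probability via the Poisson assumption."""
    curve = []
    for x in im_levels:
        annual_rate = 0.0
        for rate, ims in ruptures:
            # empirical exceedance probability over the rupture's simulated
            # intensity measures (e.g. spectral accelerations)
            p_exceed = sum(im > x for im in ims) / len(ims)
            annual_rate += rate * p_exceed
        # probability of at least one exceedance in t_years
        curve.append(1.0 - math.exp(-annual_rate * t_years))
    return curve

# hypothetical inputs: (annual rate in 1/yr, simulated IMs in g)
ruptures = [(0.01, [0.2, 0.35, 0.5]),
            (0.002, [0.6, 0.8, 1.1])]
probs = hazard_curve(ruptures, im_levels=[0.1, 0.4, 1.0])
```

The hazard curve is then the list of exceedance probabilities versus the intensity levels; in a full CyberShake run the empirical fractions come from hundreds of thousands of simulated seismograms per site.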
CyberShake is an integration point for many of our research results. In our CyberShake calculations, we use the best available input earth models, the best available wave propagation codes, and the highest performance computational software. CyberShake is also a distributed computational research calculation. Our 2014 1 Hz CyberShake results made use of ALCF and OLCF resources, as well as NCSA Blue Waters resources. The input seismic velocity model used in these calculations, called CVM-S4.26, was developed using full 3D tomography on Mira. The parallel strain Green's tensor (SGT) calculations were done using our high-performance GPU codes on OLCF Titan. We used NCSA Blue Waters to perform the high-throughput post-processing for these hazard curves. These 1 Hz evaluation hazard curves help us verify the computational methods and software needed to work at the higher 1 Hz frequency. In the past, SCEC has calculated 0.5 Hz and lower frequency CyberShake hazard models, while engineering users of CyberShake are asking us to increase our maximum simulated frequency to 1.5 Hz.

3. What is the impact of this work to date? Try to link it to the specific accomplishments of this project and the impact this project could have on the research community and/or the community at large, to the sponsor, scientific discovery, and/or translation of outcome to new areas.

We believe SCEC's 2014 INCITE results have reached an unprecedented level of acceptance by broad-impact users of SCEC computational-based seismic hazard information. Several highly influential users are evaluating how they can use SCEC computational results in the scientific, public, and economic decisions they make. These users include [1] a California utility (Pacific Gas and Electric), [2] building code recommendation developers from the American Society of Civil Engineers (ASCE), and [3] the USGS, the official source of seismic hazard information in the United States.
PG&E, a California utility company, is interested in using SCEC's INCITE-based research results in the seismic hazard evaluations of its nuclear power plant (Diablo Canyon) and the more than 50 hydroelectric dams it operates in Northern California. SCEC has submitted a proposal to PG&E for a computational research project that would extend the full 3D tomography work, currently being done on Mira, to additional regions of California. SCEC has initiated work on developing a Central California 3D velocity model, including the Diablo Canyon area, as a prototype for the proposed work. Assuming good progress on this prototype activity, SCEC results produced using INCITE resources may contribute to the seismic hazard evaluations performed by this utility, for both Central and Northern California, with potential impact on millions of California residents.

Current engineering users of seismic hazard estimates, including an American Society of Civil Engineers (ASCE) working group, have requested 1 Hz (and later 1.5 Hz) CyberShake hazard maps for southern California. The interested ASCE working group produces building code recommendations that are often adopted by California cities and other governmental agencies. This ASCE working group wants to use SCEC's CyberShake simulations as part of the seismic hazard estimates it uses in the next building code recommendation update in California. Since such building code updates impact billions of dollars of construction projects, SCEC's INCITE-based computational research is positioned to make a large impact on national infrastructure development, as well as on large private construction activities.

The United States Geological Survey is the official source of seismic hazard information in the United States. SCEC is partially funded through USGS support, together with NSF support, and SCEC is well positioned to contribute new research tools and techniques that influence USGS seismic hazard methods.
USGS currently updates national seismic hazard models on a five-year interval, publishing seismic hazard maps at a national, but low-resolution, scale. SCEC's CyberShake seismic hazard models, which to date have been calculated only for the Los Angeles region, are under consideration for use as urban seismic hazard models for southern California. The USGS is developing urban seismic hazard maps as high-resolution seismic hazard models that integrate specialized local information, to augment existing seismic hazard models in high-density urban areas. USGS is currently evaluating SCEC's existing CyberShake hazard model of the Los Angeles region for use as a Los Angeles urban seismic hazard model, and has asked SCEC to extend the CyberShake methodology to other urban areas in California. Adoption of the CyberShake methodology and results as a California urban seismic hazard model would impact USGS official public statements about seismic hazards in California, and it would represent another way that SCEC's INCITE-based computational research is having a broad public impact. We can provide additional information on any of these impact statements. We anticipate that all three of these activities will continue throughout SCEC's 2015-2016 INCITE allocation.

4. List all peer-reviewed journal or conference articles published, submitted, and/or accepted that resulted from the use of OLCF resources for this specific INCITE project.

Published Publications
Accepted Publications
Submitted Publications

[1] Lee, E.-J., P. Chen, T. H. Jordan, P. J. Maechling, M. A. M. Denolle, and G. C. Beroza (2014), Full-3-D tomography for crustal structure in Southern California based on the scattering-integral and the adjoint-wavefield methods, J. Geophys. Res. Solid Earth, 119(8), 6421-6451, doi:10.1002/2014JB011346.

[2] Lee, E., Chen, P., and Jordan, T. H.
(2014), Testing Waveform Predictions of 3D Velocity Models Against Two Recent Los Angeles Earthquakes, Seismological Research Letters, 85(5), 1274-1284.

[3] Roten, D., K.B. Olsen, S.M. Day, Y. Cui, and D. Faeh (2014), Expected seismic shaking in Los Angeles reduced by San Andreas fault zone plasticity, Geophysical Research Letters, 41(8), 2769-2777, doi:10.1002/2014GL059411.

[4] Deelman, E., K. Vahi, G. Juve, M. Rynge, S. Callaghan, P. J. Maechling, R. Mayani, W. Chen, R. Ferreira da Silva, M. Livny, and K. Wenger (2014), Pegasus: A Workflow Management System for Science Automation, Future Generation Computer Systems, to appear.

[5] Taborda, R. and Bielak, J. (2014), Ground-Motion Simulation and Validation of the 2008 Chino Hills, California, Earthquake Using Different Velocity Models, Bulletin of the Seismological Society of America, submitted for publication.

[6] Poyraz, E., H. Xu, and Y. Cui (2014), Application-specific I/O optimizations on petascale supercomputers, Procedia Computer Science, 29, 910-923.

5. List other accomplishments related to your INCITE project such as:
- Invited talks
- Non-peer reviewed publications
- Other presentations about your INCITE work
- Formal software releases generated as a result of this INCITE project
- Students graduated or postdocs deployed
- Technical accomplishments, such as development of reusable code resulting in a new tool, new algorithm design ideas or programming methodologies, formal software releases, etc.
- Patents (filed or received)
- Invention disclosures
- Other

[1] SCEC computational research presented at Northridge 20 Symposium: The 1994 Northridge Earthquake: Impacts, Outcomes and Next Steps, January 16-17, 2014, UCLA.

[2] Lee, E., Chen, P., Jordan, T. H., Maechling, P. J., Denolle, M., and Beroza, G. C.
(2014), Full-3D waveform tomography for crustal structure in Southern California using earthquake recordings and ambient-noise Green's functions based on the adjoint and the scattering-integral methods, Seismological Society of America 2014 Annual Meeting, 30 April - 2 May 2014, Anchorage, Alaska.

[3] Maechling, P. J., SCEC HPC research, IRIS Data Management Service Meeting, University of Southern California, March 2014 [http://scec.usc.edu/scecpedia/IRIS_DMS_Meeting_March_2014].

[4] Shaw, J., Unified Structural Representation of the Southern California Crust and Upper Mantle, SCEC-ERI Summer School for Earthquake Science, September 28 - October 2, 2014, Oxnard, CA.

[5] Callaghan, S., Description of Calculations of Risk-Targeted Maximum Considered Earthquake Response Spectra (MCER) at 14 Southern California Sites, 3rd SCEC UGMS Committee Meeting, November 3, 2014, University of Southern California, Los Angeles.

[6] Withers, K., K. Olsen, Z. Shi, and S. Day (2014), High-Complexity Deterministic Q(f) Simulation of the 1994 Northridge Mw 6.7 Earthquake, SCEC (Southern California Earthquake Center) Annual Meeting, Palm Springs, CA, September 2014.

[7] Roten, D., K.B. Olsen, S.M. Day, Y. Cui, and D. Fah (2014), The ShakeOut earthquake scenario with plasticity, Seism. Res. Lett. 85:2, 470.

[8] Savran, W.H., K.B. Olsen, and B.H. Jacobsen (2014), Unique amplification patterns generated by models of small-scale crustal heterogeneities, American Geophysical Union Annual Meeting, San Francisco, December 15-19, Poster #S41E-06.

[9] Withers, K., K.B. Olsen, Z. Shi, and S.M. Day (2014), High-Complexity Deterministic Q(f) Simulation of the 1994 Northridge Mw 6.7 Earthquake, American Geophysical Union Annual Meeting, San Francisco, December 15-19, Poster #S31C-4433.

[10] Withers, K., K.B. Olsen, and S.M. Day (2013).
Deterministic High-Frequency Ground Motions using Dynamic Rupture along Rough Faults, Small-Scale Media Heterogeneities and Frequency-Dependent Attenuation, American Geophysical Union Annual Meeting, San Francisco, December 2013.

[11] Withers, K., K.B. Olsen, Z. Shi, and S.M. Day (2013), Deterministic High-frequency Ground Motion Using Dynamic Rupture Along Rough Faults, Small-scale Media Heterogeneities, and Frequency-dependent Attenuation, SCEC (Southern California Earthquake Center) Annual Meeting, Palm Springs, CA, September 2013.

[12] Savran, W.H. and K.B. Olsen (2014), Deterministic simulation of the Mw 5.4 Chino Hills event with frequency-dependent attenuation, heterogeneous velocity structure and realistic source model, Seism. Res. Lett. 85:2, 498.

[13] Withers, K., K.B. Olsen, Z. Shi, and S.M. Day (2014), High-Complexity Deterministic Q(f) Simulation of the 1994 Northridge Mw 6.7 Earthquake, Seism. Res. Lett. 85:2, 471.

[14] Withers, K.W., K.B. Olsen, and S.M. Day (2014), Memory-Efficient Simulation of Frequency Dependent Q, Bull. Seism. Soc. Am.

6. Application Performance: Summarize the performance (e.g., percent of peak, scalability) of your project's simulation codes used in the allocations for this project. What progress was made in improving the application's performance on LCF architecture during this project? What challenges (if any) remain? List the technical risks and challenges that were confronted by your project (overcome or not) this year. Were they anticipated?

SCEC makes extensive use of scientific workflows to run our CyberShake model calculations. CyberShake production runs are our largest heterogeneous ensemble simulations and depend heavily on workflow technology. The scale and complexity of the CyberShake calculation requires the automation and reproducibility provided by workflows. We would like to run our workflows on both Mira and Titan. During SCEC's 2014 allocation, we made significant progress implementing our workflows on Titan.
SCEC's existing CyberShake workflow implementation is based on an NSF-developed software stack that includes Condor, Condor DAGMan, and Pegasus-WMS. In the past, SCEC has not been able to run our workflows on INCITE resources, primarily because our workflow solution requires remote job submission. In our current workflow-computing model, an external computer at SCEC runs a Condor DAGMan job queue, and jobs are submitted to the resource provider's job queue at a reasonable rate. HPC providers, including TACC Stampede and NCSA Blue Waters, have supported this workflow-computing model with Globus GRAM-based job submission. This year, SCEC worked with the USC Pegasus-WMS workflow tool developers and Titan's technical group to develop a solution that will enable SCEC to run our workflows on Titan. Titan provides the large GPU count needed for the compute intensive part of our CyberShake calculations. We are working to develop a technical solution that will enable us to run the high-throughput part of the workflow on Titan. Once this is working on Titan, we will work to implement the solution on Mira as well.

The solution we have developed in collaboration with the Titan technical group makes use of the special characteristics of the Titan service nodes. Specifically, we make use of the fact that jobs running on Titan service nodes can make outgoing connections to external resources. In the prototype workflow implementation we developed this year on Titan, a Condor pilot job runs on a Titan service node and establishes a connection back to SCEC's Condor DAGMan. DAGMan then feeds jobs to the pilot job running on the Titan service node, and the service node schedules jobs via aprun on Titan compute nodes [Figure 1]. As proof of concept, we developed this prototype solution and ran a CyberShake workflow end-to-end on Titan resources.
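The pull-based pilot-job pattern described above can be sketched schematically. The toy below illustrates only the control flow: an external queue (standing in for DAGMan feeding work to the pilot) drives a long-running worker that launches each task as a local process (standing in for the service node invoking aprun on compute nodes). It is not the actual Condor/Pegasus-WMS implementation, and the commands are placeholders.

```python
import queue
import subprocess
import threading

def pilot(task_queue, results):
    """Toy pilot job: repeatedly pull tasks from an external queue and launch
    each one as a local subprocess. In the real system a Titan service node
    would prefix each command with `aprun -n <N>` to reach compute nodes."""
    while True:
        cmd = task_queue.get()
        if cmd is None:          # sentinel: no more work from the manager
            break
        out = subprocess.run(cmd, capture_output=True, text=True)
        results.append(out.stdout.strip())

tasks = queue.Queue()
results = []
worker = threading.Thread(target=pilot, args=(tasks, results))
worker.start()

# the workflow manager feeding jobs to the pilot (placeholder commands)
for i in range(3):
    tasks.put(["echo", f"task-{i}"])
tasks.put(None)                  # tell the pilot to shut down
worker.join()
```

The key property this pattern exploits, as described above, is that the pilot makes the outgoing connection, so the external manager never needs to submit jobs directly into the facility's batch system.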
To accomplish this, we submit Condor pilot jobs on Titan that communicate with the Condor collector on our workflow submission host, enabling the host to schedule work to the pilot jobs on Titan. In this way we were able to run our entire CyberShake workflow on Titan and verify the results. Some additional work and refinements are needed before we put this solution to use in SCEC CyberShake production runs. We need to evaluate more carefully the performance of the workflow system and the impact of our workflows on Titan resources. We will continue to work with the Titan technical group as we develop this workflow-computing approach. A usable technical solution for running our very large (hundreds of millions of tasks) workflows will provide a great benefit to our ability to complete important research using INCITE resources.

Figure 1: SCEC and the Titan technical group have developed this Pegasus-WMS and Condor DAGMan based approach to running scientific workflows on Titan.

7. What is the contribution made by LCF staff and/or programs (liaison/catalyst collaboration, vendor support, user assistance, other staff, training, etc.)?

We added several new users to our allocation during this year, and OLCF user assistance supported the addition of these new users. SCEC's OLCF liaison helped coordinate technical discussions between our workflow development group and OLCF technical groups in support of our workflow development.

8. Please include a science image that we can associate with your closeout report, a caption for the image, and proper accreditation of the image if desired.

During SCEC's 2014 INCITE allocation, we carried out work on Titan related to the scattering effects of small-scale heterogeneities, modeled using Von Karman autocorrelation functions. Our studies consistently show the importance of including such crustal small-scale heterogeneity in order to generate realistic high-frequency ground motion estimates.
An interesting finding from 3D visualization of the waves propagating in models with small-scale heterogeneities is the tendency for bands of larger-amplitude waves to develop from the source into the medium [Figure 2]. These bands of larger-amplitude (typically SH) waves tend to follow paths of lower S-velocity regions in the model, radially emanating from the source. One important implication of these bands is that the distribution of ground motion amplification may be significantly different for directions parallel and perpendicular to the strike of a range-bounding fault, such as the San Andreas fault along the San Bernardino basin in southern California.

Figure 2: Isosurface (magenta) of the waves emitted from a point source located at x = 12.5 km, y = 6.25 km, and 10 km depth, into a medium with small-scale heterogeneities (sliced background; axes show depth in km, Y distance in km, and Vs in m/s). We used a Von Karman model of small-scale heterogeneities with a Hurst exponent of 0.05, a correlation length of 250 m, a standard deviation of the perturbations with respect to the background model of 5%, and a horizontal-to-vertical anisotropy of 5. Notice the generation of bands of larger-amplitude (typically SH) waves that tend to follow paths of lower S-velocity regions in the model, radially emanating from the source. (Image credit: Kyle Withers and Kim Olsen, San Diego State University (SDSU))
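A medium of this kind can be generated by spectral filtering of white noise with a Von Karman power spectrum. The sketch below is a simplified 2D (x, z) illustration using the parameters quoted in the Figure 2 caption; it is not the code used in the simulations, and the grid size and spacing are illustrative. For a d-dimensional Von Karman medium the spectral decay exponent is H + d/2, so the 2D case uses H + 1.

```python
import numpy as np

def von_karman_perturbations(n, dx, a_h, a_v, hurst, sigma, seed=0):
    """Generate a 2D random field with a Von Karman power spectrum by
    filtering white noise in the Fourier domain.

    a_h, a_v : horizontal and vertical correlation lengths (m)
    hurst    : Hurst exponent H
    sigma    : target standard deviation (as a fraction of background Vs)
    """
    rng = np.random.default_rng(seed)
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)
    kx, kz = np.meshgrid(k, k, indexing="ij")
    # 2D Von Karman PSD: P(k) ~ (1 + kx^2 a_h^2 + kz^2 a_v^2)^-(H + 1)
    psd = (1.0 + (kx * a_h) ** 2 + (kz * a_v) ** 2) ** (-(hurst + 1.0))
    spec = np.fft.fft2(rng.standard_normal((n, n))) * np.sqrt(psd)
    field = np.fft.ifft2(spec).real
    # rescale to zero mean and the requested standard deviation
    return (field - field.mean()) / field.std() * sigma

# parameters from the Figure 2 caption: H = 0.05, correlation length 250 m,
# 5% standard deviation, horizontal-to-vertical anisotropy of 5
dvs = von_karman_perturbations(n=256, dx=50.0, a_h=5 * 250.0, a_v=250.0,
                               hurst=0.05, sigma=0.05)
```

The resulting field of fractional Vs perturbations would be multiplied onto a background velocity model; the anisotropy is expressed here simply as a longer horizontal correlation length.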