AuScope: a new initiative to build an Australian Earth Science Grid
Lesley Wyborn
Geoscience Australia and AuScope
2007-2011

Outline
• The context in which AuScope was funded
• The components of AuScope
• The geoinformatics component of AuScope
• The proposed Australian e-Research Initiative
• Discuss mechanisms for potential international collaboration

The context in which AuScope was developed
• As part of the National Collaborative Research Infrastructure Strategy (NCRIS)
• ~$500M committed for FY06-FY11
• 15 “Capability Areas” identified for potential funding, plus one on Systemic ICT Infrastructure
• The research communities in each capability area were asked to develop a single Investment Plan

9 Proposals Approved in November 2007
5.1 Evolving biomolecular platforms and informatics
5.2 Integrated biological systems
5.3 Characterisation
5.4 Fabrication
5.5 Biotechnology products
5.8 Networked biosecurity framework
5.10 Optical and radio astronomy
5.12 Integrated marine observing system
5.13 Structure & evolution of the Australian continent (AuScope)
5.16 Platforms for collaboration (due March 30 2007)

Context in which the NCRIS plans were developed
• Australia
– e-Research Coordinating Committee Report
  http://www.dest.gov.au/sectors/research_sector/policies_issues_reviews/key_issues/e_research_consult/discussion_paper.htm
– PMSEIC Data For Science Working Group
  http://www.dest.gov.au/sectors/science_innovation/publications_resources/profiles/Presentation_Data_for_Science.htm
• International
– NSF: Revolutionising Science and Engineering through Cyberinfrastructure
  http://www.pfc.org.au/twiki/pub/Main/Documents/USCyberReport.pdf
– US Science Board: Long Lived Data Collections: Enabling Research and Education into the 21st Century
  http://www.pfc.org.au/twiki/pub/Main/Documents/LongLiveddata-NSB.pdf
– High Performance Computing Requirements for the Computational Solid Earth Sciences
  http://www.geo-prose.com/computational_SES.html
– The Data Deluge: Tony Hey and Anne Trefethen
  http://www.rcuk.ac.uk/cmsweb/downloads/rcuk/research/esci/datadeluge.pdf

Source: The March 2006 2020 Science Report
• The use of computer science concepts and tools in science forms a third, and vital, component of enabling a ‘golden triangle’
• Noted that an important challenge is end-to-end scientific data management, from data acquisition and data integration to data treatment, provenance and persistence
• Noted the need for widespread encoding of scientific knowledge: bioinformatics to commence a 5 year codification program
1) http://research.microsoft.com/towards2020science/background_overview.htm
2) http://www.nature.com/nature/journal/v440/n7083/index.html

Toys, Trauma, Thrills, Treasure
• AuScope Infrastructure System
• A toolkit for Geoscience Research and Geoscience Applications: minerals, energy, groundwater discovery and management; hazard prediction; environmental monitoring and management
[Diagram labels: AuScope; Other]

AuScope Investment Summary
                          AuScope NCRIS   Co-investment   Totals
Earth Imaging             $8.37M          $3.53M          $11.90M
Composition & Age         $3.00M          $2.80M          $5.80M
Virtual Core Library      $2.88M          $8.03M          $10.91M
Geospatial                $16.99M         $48.33M         $65.31M
Simulation & Modeling     $8.00M          $11.29M         $19.29M
AuScope Grid              $6.38M          $3.10M          $9.48M
AuScope Administration    $1.00M          $1.50M          $2.50M
TOTALS                    $46.62M         $78.58M         $126.20M

Data 1: Transects Program

Data 2: Geochemical Instruments
Cameca 1280 Ion Probe

Data 3: Virtual Core Library
[Figure: core-logging instruments. Labelled components: linescan camera, spectrometer, control computer, telescope, profilometer, fibre optic cable, quartz halogen lamps, ASD spectrometer, cooler, robotic x-y table, chip tray, chip tray carrier. © CSIRO 2003]

Data 4: GPS, Geodesy
[Figure labels: viscous torque, ocean currents, plate tectonics]

Thrills: Simulation & Modeling

Trauma: The Geoinformatics Tetrahedron
• Content (Data, Information, Knowledge)
• High Performance Computing
• Bandwidth
• Tools

AGU definition
“a distributed, integrated digital information system and working environments that are interactive and functionally complete for research communities in terms of people, data, information, tools, and instruments.”

The four components of the geoinformatics tetrahedron are linked through Interoperability
• Interoperability is the ability to transfer and use information in a uniform and efficient manner across multiple organizations and information technology systems
– Australian Government Information Management Office (AGIMO)
• Lesley’s definition: My stuff will operate with your stuff and I don’t give a damn where it is, how it works and what the format is (where stuff = digital computers, programs, data etc)

[Diagram: interoperability demonstrator]
• APPLICATIONS: Map Service, Perth; Report Service, Canberra; Desktop
• Common Interface Binding – GML/XMML
• DATA SERVICES: WA WFS, GA WFS, NSW WFS, SA WFS, QLD WFS, TAS WFS, NT WFS, VIC WFS
• DATA SOURCES: Western Australia, GA, Sth Australia, Tasmania, Queensland, Victoria, NSW, Nth Territory
Our demonstrator proved that interoperability was feasible; all that was required was that the organisation serving the data could map to a standardised interface

XML
GML/XMML + NADM = GeoSciML

AuScope Grid
[Diagram: labels recovered from the figure]
• Computing Services
– Software: Finite Element Solvers, Reactive Transport Code, Fluid Flow Modelling, Mechanical Modelling
– Hardware: APAC, IVEC, Terrawulf, TPAC, AC3, ACcESS, VPAC, CSIRO
• Data and Knowledge Services
– Visualisation Tools, 3D Modelling Tools
– Govt Data & Knowledge Gateways: GA, SA, WA, NT, VIC, TAS, NSW, QLD
– APAC Data & Knowledge Gateways: APAC, SAPAC, IVEC, VPAC, TPAC, AC3, QPSF
– Industry Data & Knowledge Gateways: Insurance, Environment, Petroleum Industry, Minerals Industry
• AuScope Users: a variety of interfaces to suit user capabilities
– Expert Scientist, Non-Expert Scientist, University Student, General Public, School Students

HPC Hardware
• Dedicated to earth sciences
– ACcESS (QLD): 1.1 Teraflop
– Terrawulf (ANU): 800 Gigaflop
– GA cluster: 140 Gigaflop
• Other
– APAC (ANU; no. 71 in the Top 500): 8.9 Teraflop
• It is recognised that Solid Earth computation often requires specific configuration, in particular significant communication amongst processors
• The ACcESS computer has been specifically configured for earth sciences

Power of HPC: Single pass 150 m cell differentially reduced to the pole magnetic anomaly map
• Previously done on individual tiles: a technically inferior solution
• Done on the ACcESS 1.1 teraflop supercomputer in Brisbane as a single pass
• Done to a grid cell size of 150 m for the whole continent: 654 million cells
• Took 33 mins 11 secs to process
• Used 12% of CPU and 22% of memory on the ACcESS supercomputer
Thanks to Hugh Tassell, Ole Nielson, Peter Milligan & Lutz Grotz

AuScope Modelling Framework
[Diagram: labels recovered from the figure; the grouping below is approximate]
• Layers: Societal Need, Modelling Services, Modelling Codes, Numerical Basis, Base Scientific Concepts
• Societal needs and modelling services: Natural Hazards (Tsunami, Storm Surge, Earthquake, Inundation, Vulnerability Modelling); Resources & Mining (Predictive ore deposition, Mine design, Mine Waste Disposal); Sustainable Energy; Geodynamics (Mantle Convection, Slab subduction, Crustal Deformation, Surface Processes)
• Modelling codes: Finite Element Solver (Fastflo); Chemistry (Gibbs); Chemistry (Log K); Geodynamics modelling code Underworld; ESys; Visualisation (gLucifer); Scripting environment for finite element simulations (E-Script); Finite Element Solver (Finley); Surface Modeling Package (SPModel); Finite Element Solver (StGFEM); Mantle Convection Code (CitcomS)
• Numerical basis and scientific concepts: Mesh, Interpolation, Finite Elements, Finite Volumes, Finite Difference, Particle in Cell, Reaction Kinetics, Chemical Reactions, Reactive Transport, Coupled Mechanics & Reactive Transport, Coupled Processes, Rock Mechanics, Fluid dynamics, Fluid Flow

The thrill part: AuScope Modelling Framework

Relationship between NCRIS 5.13 and NCRIS 5.16
• NCRIS 5.13 AuScope Working Groups: Community-specific Knowledge Environments and Networks for Research and Education
– Customised for discipline- and project-specific applications, e.g. Geophysics, Geochemistry, Geochronology, Geospatial, Geology
• NCRIS 5.13 AuScope Access & Interoperability: e-Science and e-Geoscience Layer
– Data and Information Infrastructure, Data and Knowledge Portals, Visualisation, 3-D models, Application Portals
• NCRIS 5.16 Platforms for Collaboration: Base Computing Technologies
– Networks, Communications, High performance computing, High Volume Storage, Middleware Architecture

NCRIS 5.16: Platforms for Collaboration
Production grade foundation services & facilities for e-Research
– Australian Access Federation: authentication, security, accounting
– Australian Research and Education Network (AREN)
– Australian National Data Service (ANDS)
– A National Compute Infrastructure
– Registration service
– Interoperability
– User Support

Geoinformatics Tetrahedron in Australia
• High Performance Computing: currently we have more HPC than we can use
• Bandwidth: not a major issue for the Research Networks (1 Gbps min)
• Tools: we are starting to get tools
• Content (Data, Information, Knowledge): the amount of coherent data we can access in machine-readable form is the limiting factor!

The greatest barrier to AuScope
• The social one
• Right now we need the barriers between competition and collaboration to move forever in favour of collaboration
• It is only through global collaboration that we will develop and continually evolve the required semantics and ontologies, as well as technical standards

Next Steps
• Collaborate internationally on a plan to develop standards for Geoscience Data, Information and Knowledge
• Coordinate the evolution of these standards at the international level through IUGS (as per GeoSciML), or Geounions
• Coordinate Geounions with CODATA or ICSU for interaction with chemistry, physics, biology etc
• Above all, encourage community agreement, adoption and evolution
• Adopt and develop a central website for this to happen and to better coordinate standards development
• Key Links for this presentation
– www.AuScope.org
– www.seegrid.csiro.au
– www.pfc.org.au
– www.onegeology.org
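
Sketch: the common-interface idea in code
The interoperability demonstrator described earlier relies on every data provider exposing the same WFS binding, so an application can send one identical request to each state service. The following is a minimal sketch of that idea, not the demonstrator's actual code. The endpoint URLs are placeholders and the feature type name `gsml:MappedFeature` is assumed for illustration; the query parameters are the standard key-value ones defined for an OGC WFS 1.1.0 GetFeature request.

```python
from urllib.parse import urlencode

# Hypothetical endpoints: placeholders only, one per data provider.
ENDPOINTS = {
    "WA": "https://wfs.example.org/wa",
    "SA": "https://wfs.example.org/sa",
    "GA": "https://wfs.example.org/ga",
}


def get_feature_url(base_url, type_name, max_features=10):
    """Build a WFS 1.1.0 GetFeature request URL (key-value encoding)."""
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,        # assumed feature type, for illustration
        "maxFeatures": max_features,  # cap the size of each response
    }
    return base_url + "?" + urlencode(params)


def build_requests(endpoints, type_name):
    """Build the same request for every provider; only the host differs."""
    return {
        name: get_feature_url(url, type_name)
        for name, url in endpoints.items()
    }


if __name__ == "__main__":
    for name, url in build_requests(ENDPOINTS, "gsml:MappedFeature").items():
        print(name, url)
```

Because the query string is identical for every provider, an application built on this interface needs no provider-specific logic; each organisation only has to map its own data store onto the shared binding.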