Mudge IEEE eScience

Evolving inversion methods in Geophysics with Cloud Computing – a case study of an eScience collaboration
Mudge, Chandrasekhar, Heinson, Thiel
Prof J Craig Mudge FTSE
University of Adelaide
Australia
School of Computer Science / School of Earth Sciences
7th IEEE eScience Conference, Stockholm, December 2011
Two South Australian successes in geology
1. Hot rocks for geothermal energy - 95% of investment is in South Australia
2. Olympic Dam - BHP Billiton
-- world's fourth largest copper deposit, fifth largest gold
deposit and the largest uranium deposit.
craig.mudge@adelaide.edu.au
IEEE eScience 2011
Outline
1. Cloud computing
2. Collaborative Cloud Computing Lab (C3L)
3. Inversion in magnetotelluric processing
4. Geothermal – EGS in South Australia
5. Results and Lessons learned
6. Future work
Cloud service provider
owns and operates the infrastructure
and innovates to
keep technology leading edge,
handle software upgrades, and
steadily reduce energy costs
Google, The Dalles, Oregon
Microsoft Azure, Chicago
Massive scale of data centres delivers 4 – 7X
cost reduction and energy efficiency
[Figure: data centre air flow]
A no-machines Lab
eScience enabled by
cloud computing
Seed funding from
-- Department of Mines www.pir.sa.gov.au
-- Microsoft Research Jim Gray Seed Grant
Started June 2010
Our three cloud service providers
1. Amazon Web Services
2. Microsoft Azure
Now adding government-funded eResearch clouds, which will run OpenStack (NASA and Rackspace)
Magnetotelluric (MT) imaging
1. Using the magnetic and electric fields of the earth, MT imaging determines the resistivity structure of a sub-surface area of interest.
2. It goes deeper (a hundred or so km) than seismic (<2 km) but does not have the same resolution.
3. Applications
   1. mineral exploration,
   2. water management in mining,
   3. geothermal exploration,
   4. carbon storage,
   5. aquifer research and management,
   6. earthquake and volcano studies.
(Heinson and Mudge, 2010)
CO2 in depleted gas field
Electrical resistivity
Electromagnetic methods
Data logging by University of Adelaide
Geophysics, on a geothermal site – Paralana, SA,
Australia
MT Processing steps
Inversion
Searching the solution space
[Flowchart: inversion iterations, from start to finish]
Steps: compute sensitivity matrix; compute model's MT response; compare model response to observed data; locally improve model misfit; locally improve model smoothness.
Decisions (yes/no): can locally improve misfit? required misfit? can locally improve smoothness? smooth enough? > max iterations?
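One plausible reading of the flowchart's decision boxes is a two-phase loop: first reduce the misfit until the required level is reached, then improve smoothness while staying within that misfit, all under an iteration cap. The sketch below illustrates that reading on a toy problem. It is not the MT inversion code used in the project: the linear forward operator `G`, every function name, the step size and the smoothing weight are hypothetical, and a real MT forward model is non-linear.

```python
def forward(G, m):
    """Toy linear forward model (assumption): predicted data = G @ m."""
    return [sum(g * x for g, x in zip(row, m)) for row in G]

def rms_misfit(G, m, d_obs):
    """Root-mean-square difference between model response and observed data."""
    pred = forward(G, m)
    return (sum((p - o) ** 2 for p, o in zip(pred, d_obs)) / len(d_obs)) ** 0.5

def roughness(m):
    """Sum of squared differences between neighbouring model cells."""
    return sum((b - a) ** 2 for a, b in zip(m, m[1:]))

def misfit_step(G, m, d_obs, step):
    """One gradient step on the squared misfit; G plays the sensitivity matrix."""
    resid = [p - o for p, o in zip(forward(G, m), d_obs)]
    grad = [sum(G[i][j] * resid[i] for i in range(len(G))) for j in range(len(m))]
    return [x - step * g for x, g in zip(m, grad)]

def smooth_step(m, weight):
    """Pull each interior cell toward the mean of its two neighbours."""
    out = list(m)
    for j in range(1, len(m) - 1):
        out[j] = (1 - weight) * m[j] + weight * 0.5 * (m[j - 1] + m[j + 1])
    return out

def invert(G, d_obs, m0, target_misfit, max_iterations=500, step=0.1, weight=0.2):
    """Two-phase loop: reach the required misfit, then smooth within it."""
    m = list(m0)
    for _ in range(max_iterations):
        current = rms_misfit(G, m, d_obs)
        if current > target_misfit:
            candidate = misfit_step(G, m, d_obs, step)
            if rms_misfit(G, candidate, d_obs) >= current:
                break  # cannot locally improve misfit
            m = candidate
        else:
            candidate = smooth_step(m, weight)
            smoother = roughness(candidate) < roughness(m)
            if not smoother or rms_misfit(G, candidate, d_obs) > target_misfit:
                break  # cannot improve smoothness within the required misfit
            m = candidate
    return m
```

Calling `invert(G, d_obs, m0, target_misfit=0.2)` returns the last accepted model. Because a smoothing step is accepted only while the misfit stays at or below the target, the second phase never undoes the first.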
Setting up a new inversion – part 1
Setting up a new inversion – part 2
Dashboard
Results and Lessons learned
Speedup
[Figure: sequential vs parallel]
Performance analysis beyond speedup
[Figure: sequential vs parallel]
Examples of recent performance analysis
1. The effect of the FORTRAN compiler and its optimisation settings has been worth exploring: a factor of 3X speedup from the Intel Visual Fortran Composer XE 2011 for Windows.
2. “Steal time”, the time lost due to the hypervisor's management of a virtual machine. Netflix has analysed its Amazon experience extensively.
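On a Linux guest, steal time can be read from the aggregate `cpu` line of `/proc/stat`, where it is the eighth numeric field (user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice). The sketch below assumes a Linux guest with a kernel that reports the steal field. The function names are hypothetical and this is not the measurement method used in the project.

```python
def parse_cpu_line(line):
    """Return the numeric fields of the aggregate 'cpu' line from /proc/stat."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(v) for v in parts[1:]]

def steal_fraction(before, after):
    """Fraction of elapsed CPU ticks reported as steal between two samples."""
    b, a = parse_cpu_line(before), parse_cpu_line(after)
    deltas = [y - x for x, y in zip(b, a)]
    # Fields: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already included in user and nice, so only
    # the first eight fields are summed for the total.
    total = sum(deltas[:8])
    return deltas[7] / total if total else 0.0

def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate 'cpu' line."""
    with open(path) as f:
        return f.readline()
```

Sampling `read_cpu_line()` at the start and end of a run and passing both samples to `steal_fraction` gives the share of that interval lost to the hypervisor.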
Results and learnings
1. “No-machines” works
2. Speedup has led to 100% adoption in MT research
3. First results of monitoring fluid injection in EGS
Reservoirs using magnetotellurics (MT) – promising
since seismic does not indicate fluid flow, and MT is
low cost
4. Taking chunks of FORTRAN is achievable in a timely
manner
5. Capability building – a true eScience partnership
6. Our Web Services user interactions took the same amount of programming effort as the parallelisation
eScience in the cloud
- observations of a veteran of the
computer industry (but not my co-authors
in this eScience paper)
1. Web Services (giving interoperability of historic proportion between disparate services) could have been adopted faster in eScience
(Mudge, 2002)
(Mudge, 2002)
2. Cloud computing will speed up the use of web services, because cloud makes it natural to interact using web services (service orientation, discovery, interoperability)
Lessons learned – HPC programming
1. MapReduce (Hadoop) is the programming model that best matches the data centre as the computer. However, because it requires existing programs to be rewritten, the first wave of benefits comes from simpler parallelism: parameter sweeps, Monte Carlo simulation, job-level parallelism, etc.
2. Second wave of benefits will be new algorithms and
rewrites using MapReduce
3. Nevertheless, the first wave in cloud-based
bioinformatics (matching short reads against
reference genome) did use MapReduce
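The "simpler parallelism" of item 1 can be illustrated with a parameter sweep, in which every run is independent and the existing program needs no rewrite. This is a minimal sketch under stated assumptions: `run_inversion`, its two parameters and its dummy score are hypothetical stand-ins, and in a cloud setting each call would be dispatched to a separate virtual machine rather than a local thread.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def run_inversion(params):
    """Hypothetical stand-in for one independent run of an existing program.

    A real sweep would launch the unmodified executable here (for example
    via subprocess.run) and read back its final misfit.
    """
    smoothing, start_resistivity = params
    # Dummy score so the sketch runs end to end; it is not a real misfit.
    return (smoothing - 3.0) ** 2 + (start_resistivity - 100.0) ** 2 / 1e4

def sweep(smoothing_values, resistivity_values, max_workers=4):
    """Run every parameter combination independently and return the best one."""
    grid = list(product(smoothing_values, resistivity_values))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(run_inversion, grid))  # results keep grid order
    best_score, best_params = min(zip(scores, grid))
    return best_params, best_score
```

Because no run depends on another, the work divides cleanly across as many workers as there are parameter combinations.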
Lessons learned - Azure
1. Why was Azure much harder to migrate to than
predicted?
Answer:
- We came from a non-.NET environment
- Azure younger than Amazon (2 years)
- Virtual Machine in Beta
- Deployment times of 20 minutes vs 20 seconds slow debugging
- Azure designed for long-running applications, e.g., ecommerce, more than for scientific applications
2. However, we persist.
- Warehouse-sized data centre – operating system is
robust and rich, e.g., hot swap of patches
- Benefits of PaaS
Future work
Future work
1 of 2
1. Inversion on demand, available to colleagues
and explorers world-wide, wrapped in
workflow (persistence, provenance, partial
runs, ...)
2. National/international collaboration building
on a national Geophysics Virtual Lab
- access to disparate data (seismic, borehole images,
gravity, magnetic, ...) built by AuScope using
results of GeoSciML Interoperability Working
Group
[Figure: diagram of integrated virtual laboratories. Attribution on slide: Dr Robert Woodcock and Dr Lesley Wyborn.]
Labels recovered from the diagram:
- Societal Need: Sustainable Energy Policy, Environment
- Integrated Virtual Labs: Energy Exploration Integrated Virtual Laboratory
- Virtual Laboratories: Virtual Geophysical Laboratory, National Borehole Laboratory, Virtual Geodesy Laboratory, Virtual Earth Observation Laboratory, Virtual Oceans Laboratory
- Within the laboratories: Modelling & analytic tools, Processing Services, Data
- Virtual Libraries (Data): Geophysics, Borehole, Geodesy, Land cover, Marine
Future work
2 of 2
3. Explore statistical machine learning to detect
interesting patterns
4. Explore the solution space using Evolutionary Algorithms implemented on thousands of processors in the cloud (Brad Alexander)
5. Promulgate security best practices
6. Following the success of speedup, model size
has become the limiter for our geophysicists
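The evolutionary search of item 4 can be sketched in a few lines. This is a generic truncation-selection loop with Gaussian mutation, offered only to show why the approach suits thousands of cloud processors: each fitness evaluation is independent. The function name, population size, mutation width and initial range are hypothetical, and nothing here reflects the algorithm actually planned.

```python
import random

def evolve(fitness, n_params, pop_size=20, generations=50, sigma=0.3, seed=0):
    """Minimise `fitness` with a simple elitist evolutionary loop."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(-5, 5) for _ in range(n_params)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Each fitness call is independent of the others, so this ranking
        # step is the part that could be spread over many cloud workers.
        ranked = sorted(population, key=fitness)
        parents = ranked[: pop_size // 2]  # keep the better half unchanged
        children = [
            [gene + rng.gauss(0, sigma) for gene in rng.choice(parents)]
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return min(population, key=fitness)
```

In an inversion setting, `fitness` would be the misfit of a candidate model, so one generation amounts to `pop_size` independent forward runs. Keeping the better half unchanged means the best candidate found never gets worse from one generation to the next.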
Acknowledgements
Brad Alexander
Gordon Bell
Pinaki Chandrasekhar
Dennis Gannon
Graham Heinson
Tony Hey
Ed Lazowska
Stephan Thiel
Summary
1. Cloud computing
2. Collaborative Cloud Computing Lab (C3L)
3. Inversion in magnetotelluric processing
4. Geothermal – EGS in South Australia
5. Lessons learned
6. Future work
Thanks and questions
craig.mudge@adelaide.edu.au
www.cloudinnovation.com.au
+61 417 679 266
+1 650 224 2111
Security best practices
1. Certifications
2. Physical security
3. Secure services
4. Data privacy via encryption
5. Backups
6. Constant monitoring
7. External review
8. Compare yours with Google, Amazon, Azure