Synchrony, Parallelism and Multimodeling

Louis J. Gross
The Institute for Environmental Modeling
Departments of Ecology and Evolutionary
Biology and Mathematics
University of Tennessee
Overview
• Synchrony and concurrency in ecology
• Everglades restoration and ATLSS - Tales from the
Real World: Mathematics and Computing Meets
Greed, Politics, Lawyers and the Army Corps of
Engineers
• Computational ecology and some parallelization
results
• Distributed, grid-based computing for ecological
modeling, providing stakeholders with capability to
investigate their own hypotheses
• Spatial control - some examples
• Some educational issues in computational science
Key Points
• The availability of parallel computing in its
many forms offers opportunities to rethink how
to model many systems. Accounting for
concurrency and the possibility of synchronous
processes arising has great potential to change
the way that many biological and social systems
are modeled, going beyond the serial mindset
that underlies much of applied science today.
Developing the capability for this will require
computational scientists with insight into the
phenomena being modeled as well as a deep
understanding of parallelism.
• A central question in science is what
macroscopic properties arise from the properties
of the entities which make up the system and
how these are affected by modifications in the
properties of the entities themselves and the
interactions between the entities. Parallel
computation naturally provides a means to
investigate these issues outside of any
constraints arising from a limited set of
available mathematical approaches.
• Realistic modeling of natural systems requires
multiple linked approaches (multimodeling), and
new methods are needed to develop and
analyze these. Such multimodels, which utilize a
mixture of different underlying mathematical or
computational approaches (and are sometimes
called hybrid models), are a reasonable way to
analyze multiscale phenomena and present
problems appropriate for coarse parallelization.
• Much of applied ecology deals with problems of
spatial control – what to do, where to do it, when to
do it, and how to monitor it – and these problems
are not easily solved, opening up many new,
fascinating problems in applied mathematics and
computational science. These offer the opportunity
to tie simulation methods with one of the most
pervasive technological tools in environmental
analysis, geographic information systems. Though
readily accepted throughout applied ecology, GIS
has had little connection to the system dynamics
methods needed for decision support in resource
management.
This report notes
that Simulation-Based
Engineering Science
is central to advances
in numerous fields
including biomedicine,
nanomanufacturing,
and energy and
environmental sciences
This report comments that:
• Methods are needed for linking models at various scales and
simulating multiphysics phenomena
• The US must be in the forefront of methods to make
simulation easier and more reliable
• The simulation methods available are for only limited
ranges of spatial and temporal scales and the principal
physics governing events typically changes with scale so the
models themselves must change in structure as the
ramifications of events pass from one scale to another
• Simulation-based decision making gives rise to complex
optimization problems, which are governed by large-scale
simulations.
• Despite arguing that new thinking on how to model events at
multiple scales is required to alleviate the “tyranny of scales”,
the report says nothing about the potential for parallelization
methods in this.
Why is there not more emphasis on the role of
parallelization in driving innovative modeling
approaches?
“there are so many important computational problems
that are, besides the issue of efficiency, much more
elegantly solved in parallel, and in particular can be
naturally mapped onto an architecture with multiple
processors. However, it is chiefly the performance,
and among various performance metrics, primarily the
speed of execution, that have been the dominant
driving forces behind the quest for massive
parallelism” Predrag T. Tosic, ACM 2004
So parallelization is still viewed as primarily for
speedup, not for reconceptualizing the
underlying model
This used spatial grid partitioning and MPI. A key lesson from this
effort was the need to rethink the rules of movement and interaction
of deer and panther from the serial implementation to allow for
concurrent actions. The results were NOT the same as those for the
serial implementation and could be argued as being more realistic as
the parallel implementation accounted for interactions within a
model time step in a way better according with field biology.
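The difference between serial and concurrent update rules can be illustrated with a toy example. This is a minimal sketch, not the ATLSS deer and panther model: the one-dimensional row of cells, the forage values, and the function names `best_cell`, `step_serial` and `step_synchronous` are all hypothetical. In the serial version each agent sees the forage already consumed by agents processed before it in the same time step; in the synchronous version all agents decide from the same snapshot and their consumption is applied afterwards.

```python
def best_cell(pos, forage):
    """Return the cell among pos-1, pos, pos+1 with the most forage."""
    candidates = [c for c in (pos - 1, pos, pos + 1) if 0 <= c < len(forage)]
    # max() returns the first maximal candidate, so ties go to the lowest index.
    return max(candidates, key=lambda c: forage[c])


def step_serial(agents, forage):
    """Agents move one after another; each sees earlier agents' consumption."""
    forage = list(forage)
    new_positions = []
    for pos in agents:
        target = best_cell(pos, forage)
        forage[target] = max(0.0, forage[target] - 1.0)  # eat one unit now
        new_positions.append(target)
    return new_positions, forage


def step_synchronous(agents, forage):
    """All agents choose from the same snapshot; consumption is applied after."""
    snapshot = list(forage)
    targets = [best_cell(pos, snapshot) for pos in agents]
    forage = list(forage)
    for cell in set(targets):
        eaten = 1.0 * targets.count(cell)  # one unit per agent in the cell
        forage[cell] = max(0.0, forage[cell] - eaten)
    return targets, forage


if __name__ == "__main__":
    start_agents = [0, 2]
    start_forage = [1.0, 3.0, 2.5, 1.0]
    print("serial:     ", step_serial(start_agents, start_forage))
    print("synchronous:", step_synchronous(start_agents, start_forage))
```

With these inputs the two rules place the second agent in different cells, so the resulting forage maps differ after a single step. Which rule is closer to field biology is a modeling decision that has to be made explicitly once concurrent actions are allowed.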
Wet Season:
May-October
Dry Season:
November-April
Photos: South Florida Water Management District
Collaborators:
Don DeAngelis
Rene Salinas
Holly Gaff
Jon Cline
Mark Palmer
Michael Peek
Scott Duke-Sylvester
Jane Comiskey
Eric Carr
Paul Wetzel
Brian Beckage
Numerous field biologists
Computational facilities provided by NSF-supported
Scalable Intracampus Research Grid (SInRG)
www.cs.utk.edu/sinrg
www.tiem.utk.edu
Everglades natural system management requires
short-term decisions about what water flows to
allow and where, and, over longer planning
horizons, decisions on how to modify the control
structures so that appropriate controls can be applied.
This is very difficult!
• The control objectives are unclear and
differ with different stakeholders.
• Natural system components are
poorly understood.
• The scales of operation of the physical
system models are coarse.
So what have we done?
Developed a multimodel (ATLSS - Across Trophic
Level System Simulation) to link the physical and
biotic components.
Compare the dynamic impacts of alternative
hydrologic plans on various biotic components
spatially.
Let different stakeholders make their own
assessments of the appropriate ranking of
alternatives.
http://atlss.org
Individual-Based
Models
Age/Size Structured
Models
Cape Sable
Seaside Sparrow
Snail Kite
White-tailed Deer
Wading Birds
Florida Panther
Fish Functional Groups
Alligators
Radio-telemetry
Tracking Tools
Reptiles and Amphibians
Linked Cell
Models
Lower Trophic Level Components
Vegetation
Process Models
Spatially-Explicit
Species Index Models
Cape Sable
Seaside Sparrow
Long-legged
Wading Birds
Short-legged
Wading Birds
Snail Kite
Abiotic Conditions
Models
High Resolution Topography High Resolution Hydrology
White-tailed Deer
Alligators
Disturbance
© TIEM / University of Tennessee 1999
ATLSS High Resolution Topography
* The High Resolution
Topography model provides
more detail about local
variation in elevation.
* The detail captures variation
in elevation due to important
features such as tree islands.
High Resolution Topography
Water Management Model Topography
ATLSS High Resolution Hydrology
* With the High Resolution
Topography, High Resolution
Hydrology values can be created from
the SFWMD hydrology.
High Resolution Hydrology
* Hydrology values created in this way
provide the spatial variation and
resolution required to model the
dynamics of many animal populations
in South Florida.
4 miles
SFWMD Hydrology
4 miles
How High Resolution Topography Is Made.
Habitat cover map, provided by the
Florida GAP analysis
4 miles
At each location in the Florida GAP map, the model predicts a ground surface which is higher or
lower than the base ground surface, derived from the hydroperiod of the cell, as given by the SFWMD
hydrology data, and the estimated hydroperiod for the habitat type at that location.
The total volume of water predicted by the SFWMD model in each grid cell is preserved in the
High Resolution Hydrology Model.
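One way to picture the volume-preserving step is the sketch below. It is an illustration under stated assumptions, not the ATLSS algorithm: it assumes a flat water surface within each coarse cell, equal-area fine cells, and a coarse depth given as the mean depth over the coarse cell. The function names `water_level` and `fine_depths` are hypothetical. The water-surface elevation is found by bisection so that the water stored above the fine-scale ground surface equals the coarse-cell volume.

```python
def water_level(elevations, volume, cell_area=1.0, tol=1e-9):
    """Find a flat water-surface elevation h such that
    cell_area * sum(max(0, h - z)) over the fine cells equals volume."""
    if volume <= 0:
        return min(elevations)
    lo = min(elevations)
    # At this upper bound every fine cell is flooded deeply enough
    # that the stored volume is at least the requested volume.
    hi = max(elevations) + volume / (cell_area * len(elevations))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        stored = cell_area * sum(max(0.0, mid - z) for z in elevations)
        if stored < volume:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fine_depths(elevations, coarse_depth, cell_area=1.0):
    """Redistribute the water of one coarse cell over its fine cells.

    coarse_depth is assumed to be the mean water depth over the coarse cell,
    so the volume to preserve is coarse_depth * cell_area * number of cells.
    """
    volume = max(0.0, coarse_depth) * cell_area * len(elevations)
    h = water_level(elevations, volume, cell_area)
    return [max(0.0, h - z) for z in elevations]


if __name__ == "__main__":
    # Three low marsh cells and one raised cell (for example a tree island).
    depths = fine_depths([0.0, 0.0, 0.0, 1.0], coarse_depth=0.3)
    print(depths, sum(depths))
```

In the example the raised cell stays dry and the three low cells are deeper than the coarse mean, while the total volume is unchanged.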
Estimates of hydroperiod for each habitat
type in the Florida GAP analysis map.
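[Table: hydroperiod estimates by habitat class, with columns including Class and MinHp. Row structure was lost in extraction. Values as extracted: 0, 365, 45, 180, 30, 15, 40, 45, 0, 365, 10, 60, 0, 0, 10, 0, ….]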
A hydroperiod curve for each location on the
map showing the number of days the water
surface was at or above each elevation. This
curve is generated from the
Calibration/Validation run of the SFWMD
hydrology model.
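A hydroperiod curve of this kind can be computed directly from a daily water-surface series. The sketch below is illustrative only; the function names and the sample numbers are hypothetical and do not come from the SFWMD Calibration/Validation run.

```python
def hydroperiod_curve(daily_stage, elevations):
    """For each elevation, count the days on which the water surface
    was at or above that elevation."""
    return [sum(1 for stage in daily_stage if stage >= z) for z in elevations]


def highest_elevation_with_hydroperiod(daily_stage, elevations, min_days):
    """Return the highest listed elevation flooded on at least min_days days,
    or None if no listed elevation qualifies."""
    curve = hydroperiod_curve(daily_stage, elevations)
    flooded_enough = [z for z, days in zip(elevations, curve) if days >= min_days]
    return max(flooded_enough) if flooded_enough else None


if __name__ == "__main__":
    stage = [0.2, 0.5, 0.9, 0.4, 0.1]   # water-surface elevation per day
    levels = [0.0, 0.3, 0.6, 1.0]       # candidate ground elevations
    print(hydroperiod_curve(stage, levels))
    print(highest_elevation_with_hydroperiod(stage, levels, min_days=3))
```

Reading the curve in the inverse direction, as the second function does, gives a ground elevation consistent with a target hydroperiod, which is the kind of lookup needed when a habitat type's estimated hydroperiod is used to adjust the local ground surface.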
[Table or figure residue: label "Max HP" followed by the values 0 to 7.]
Spatially-Explicit Species Index (SESI)
Models
These are designed as extensions of habitat
suitability index models, to provide yearly
assessments of the effects of within and
between year hydrology variation on basic
requirements for foraging and breeding in a
spatially-explicit manner. They allow
comparisons of alternative scenarios, and
allow different stakeholders to focus on their
own criteria.
ATLSS SESI Models
Implement and Execute the Models for a Hydrology Scenario
Objectives: Integrate SESI components into a cohesive computational
framework and apply the models to a hydrology scenario.
Hydrology Scenario
Daily Water Depth
Distribute water over high resolution
topography
High Resolution Hydrology
SESI Models
Cape Sable
Seaside Sparrow
Are the nests
flooded during
egg incubation?
Snail
Kite
Are conditions
favorable for the
apple snails they
depend on?
Wading Birds
Are water depths in
the correct range for
the fish they eat?
Standard Output Generation/Visualization Tools
White-tailed
Deer
Is breeding
disrupted by
high water
levels?
American
Alligator
Is there high ground
to build a nest on?
SESI Output for Long-Legged Wading Birds in N. Taylor Slough: For 1993
[Figure: Long-Legged Wading Bird SESI Index - WCA-2B Subregion, comparing F2050 (blue) with D13R (red). Vertical axis: Foraging Index (0.0000 to 0.3500). Horizontal axis: Year (from 1965), 1 to 33.]
ATLSS Fish Functional Group Dynamics Model
Fish biomass is one of the most important components of the
Everglades system. To produce projections of fish biomass,
ATLSS uses a spatially explicit, size-structured dynamic
simulation model, ALFISH.
ALFISH simulates the number, size-structure and biomass
densities of “small fish” and “large fish” functional groups in the
freshwater marsh on 5-day time steps.
This represents the temporally and spatially varying food base for
wading birds.
ALFISH has been evaluated through comparisons to some sites in
Shark Slough and WCA3.
ALFISH
Objectives
• Provide estimates of the effects of alternative water
management scenarios on the spatial and temporal
distribution of food resources for upper trophic
level consumers (wading birds).
• Provide a method to evaluate the hypothesized impact of
hydrologic changes on fish community composition.
ATLSS Landscape Fish Model
Holly Gaff, Rene Salinas, Louis Gross, Don
DeAngelis, Joel Trexler, Bill Loftus and John
Chick
Approach
A size-structured population model for fish functional
groups (large and small fish) that operates on a spatial
cell basis with movement between cells and between
habitats within cells.
ALFISH FLOW CHART
Fish Cell Layout
Example of Small Fish
Least Killifish
Heterandria formosa
Female
Male
Pond areas assumed permanently
wet, marsh areas periodically dry
Landscape Layout and Movement
Fish as Prey
Fish provide the prey-base for
endangered wading bird species such
as Great Egret (Casmerodius albus)
White - movement from low water to high water areas
Red - movement from high fish density to low density areas
ALFISH MODEL EXAMPLE RESULTS - Alt D13r4 compared to F2050Base
Fish Available as Prey during
a Typical Rainfall Year
Fish Available as Prey during
a High Rainfall Year
Average Fish Available as
Prey from 1965 - 1995
Fish Available as Prey during
a Low Rainfall Year
Distribution of Sizes for Fish
in WCA 3A
Total Fish Densities through
31-year Model Run
Average Fish Available as
Prey during Breeding Season
for Wading Birds
Total Fish Densities for
Certain Years in Given Areas
Parallelizations for Everglades
Fish model investigated
• Comparison of serial version to grid partitioning by
region to analyze impacts of compartmentalization
• Comparisons of MPI methods on clusters and SMP
• Analysis of dynamic load balancing with row-stripe
partitioning on SMP
• Comparison of alternative MPI and multithread
(Pthread) implementations.
• Comparisons of parallelization by component
structure (age classes) to spatial grid partitioning
using MPI and Pthread implementations.
• Multiple model implementation combining Fish
model with Wading Bird model
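Row-stripe partitioning with load balancing can be sketched as follows. This is an illustration, not the authors' implementation: it assumes an estimate of work per grid row is available (for example, the number of wet cells in the row), and the function name `row_stripes` is hypothetical. Contiguous blocks of rows are assigned to workers so that each block carries roughly the same total work; recomputing the stripes as the work estimates change between time steps is one simple way to make the balancing dynamic.

```python
def row_stripes(row_work, n_workers):
    """Split rows into n_workers contiguous stripes of roughly equal work.

    Returns a list of (start, stop) half-open row ranges, one per worker.
    """
    if not 1 <= n_workers <= len(row_work):
        raise ValueError("need between 1 and len(row_work) workers")
    total = float(sum(row_work))
    stripes, start, acc = [], 0, 0.0
    for i, work in enumerate(row_work):
        acc += work
        k = len(stripes) + 1                 # index of the stripe being filled
        rows_left = len(row_work) - (i + 1)
        workers_left = n_workers - k         # workers after the current one
        reached_share = acc >= total * k / n_workers
        must_cut = rows_left == workers_left  # keep one row per later worker
        if k < n_workers and (reached_share or must_cut):
            stripes.append((start, i + 1))
            start = i + 1
    stripes.append((start, len(row_work)))   # last worker takes the remainder
    return stripes


if __name__ == "__main__":
    work = [4, 4, 1, 1, 1, 1, 2, 2]   # hypothetical work estimate per row
    for n in (2, 4):
        parts = row_stripes(work, n)
        print(n, parts, [sum(work[a:b]) for a, b in parts])
```

For the sample work vector, an even split by row count would give the two workers 10 and 6 units of work, while the work-based stripes give 8 and 8.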
ATLSS grid-service module
ATLSS Model Interface
Wang et al. 2005. A grid service module for natural resource
managers. IEEE Internet Computing 9:35-41
Information Analysis/Control and
Data Representation Layer
Ecological Models
Components Layer
Biotic component
Ecological Modeling Oriented Data Assimilation
Layer
Spatial Information (GIS,
topology, etc)
External Models
(hydrology, climate,
etc)
Abiotic
component
GEM: Grid-based Ecological Modeling, http://www.tiem.utk.edu/gem
Wang, D., M. W. Berry, E. A. Carr, L. J. Gross. Towards Ecosystem
Modeling on Computing Grids, Computing in Science and Engineering
Vol. 13, No. 1, pp55-76, 2005
If space is the final frontier
then spatial control theory sets our course
to apply our ecological understanding of
spatial effects to many practical problems
in applied ecology.
Supported by NSF Awards DMS-0110920, DEB-0219269 and
IIS-0427471
What is spatial control?
What do we do?
How do we do it?
Where do we do it?
How do we assess/monitor to
determine success?
Why is spatial control important?
Much of applied ecology involves
questions for which spatial control
is required.
•Harvesting
•Reserve Design
•Water planning
•Intercropping
These problems offer us the opportunity to demonstrate
the utility of computing in very practical situations, and
link together models with GIS and decision support tools
that natural system managers and policy-makers need.
Example problems in spatial control
– The ATLSS project and Everglades restoration
– Black bears (Salinas, Lenhart)
• Metapopulation approach and human-bear interactions
• Reserves and individual-based models
– Invasives - Lygodium microphyllum (Duke-Sylvester)
– Invasive control of foci vs outliers - (Whittle, Lenhart)
– Control of integro-difference equation models (Lenhart, Joshi,
Gaff, Whittle)
– Fisheries harvesting (Ding, Lenhart)
– Tick-borne disease control (Gaff)
– Control theory and intercropping (Lenhart, Joshi)
– Managing antibiotic resistance (Duke-Sylvester)
– Wildfire control and optimization (Bains, Berry, Shaw)
American Black Bear (Ursus americanus)
Rene Salinas and
Suzanne Lenhart
Salinas, R., S. Lenhart and L. Gross. 2005. Control of a metapopulation
harvesting model for black bears. Natural Resource Modeling 18:307-321
Current Black Bear Distribution
Southeastern U.S.
Source: Pelton and van Manen (1994)
Current Issues
• The human population surrounding the Great Smoky Mountains
National Park (GSMNP) has also grown over the last 70 years.
• Nuisance bear activity is a major problem all
along the Appalachian range.
• With the increase in bear-human encounters, the
likelihood of harmful encounters also increases.
BASE scenario during a
good mast year.
BASE scenario during a
poor mast year.
ALT2 scenario during
the same poor mast
year.
Spatial treatment for control of an
invasive - Detection, Mapping &
Prediction of Spread of Lygodium
microphyllum in Loxahatchee NWR
(Scott Duke-Sylvester)
Background About Lygodium
• Old World climbing fern
• Ranges from Africa to SE Asia/Australia (Pemberton et al.)
• Introduced to South Florida prior to 1958 (Nauman and Austin, 1978)
• Negatively impacts both flora and fauna
SRF Data
2000
2002
Goals of Modeling
• Provide a method to collect all available
data and suggest additional data
requirements
• Provide a means to assess the impacts of
alternative possible control schemes
• Provide guidance to managers regarding
economics of control
Spatial Model Dynamics
\[
\begin{aligned}
P_{x,y}(t+1) &= P_{x,y}(t) - a_0 \sum_{x',y'} k(x,y,x',y')\, I_{x',y'}(t)\, P_{x,y}(t) + r\, R_{x,y}(t) \\
I_{x,y}(t+1) &= I_{x,y}(t) + a_0 \sum_{x',y'} k(x,y,x',y')\, I_{x',y'}(t)\, P_{x,y}(t) + a_1 \sum_{x',y'} k(x,y,x',y')\, I_{x',y'}(t)\, R_{x,y}(t) - T_{x,y}(t) \\
R_{x,y}(t+1) &= R_{x,y}(t) - a_1 \sum_{x',y'} k(x,y,x',y')\, I_{x',y'}(t)\, R_{x,y}(t) + T_{x,y}(t) - r\, R_{x,y}(t) \\
P_{x,y}(0) &= P^0_{x,y}, \quad I_{x,y}(0) = I^0_{x,y}, \quad R_{x,y}(0) = R^0_{x,y} \\
P_{x,y}(t) + I_{x,y}(t) + R_{x,y}(t) &= 1 \quad \forall\, x, y, t
\end{aligned}
\]
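One time step of a P, I, R update of this form can be sketched in code. The sketch rests on assumptions that the slide does not state: it reads the equations as a forward update from t to t+1, takes the infested fraction inside each sum at the source cell (x', y'), and uses an exponential distance kernel purely as a placeholder for k. The names `pir_step` and `exp_kernel` are hypothetical, and no clamping to the interval [0, 1] is applied.

```python
import math


def exp_kernel(x, y, xs, ys):
    """Placeholder dispersal kernel: decays with distance from source (xs, ys)."""
    return math.exp(-math.hypot(x - xs, y - ys))


def pir_step(P, I, R, T, k, a0, a1, r):
    """Advance the P, I, R grids (lists of rows) by one time step.

    T is the treatment applied in each cell, k the kernel function,
    a0 and a1 the rates multiplying the kernel sums, r the rate at
    which R returns to P.
    """
    ny, nx = len(P), len(P[0])
    P2 = [[0.0] * nx for _ in range(ny)]
    I2 = [[0.0] * nx for _ in range(ny)]
    R2 = [[0.0] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            # Kernel-weighted sum of the infested fraction over source cells.
            pressure = sum(k(x, y, xs, ys) * I[ys][xs]
                           for ys in range(ny) for xs in range(nx))
            from_p = a0 * pressure * P[y][x]   # flow from P to I
            from_r = a1 * pressure * R[y][x]   # flow from R to I
            back = r * R[y][x]                 # flow from R to P
            treated = T[y][x]                  # flow from I to R
            P2[y][x] = P[y][x] - from_p + back
            I2[y][x] = I[y][x] + from_p + from_r - treated
            R2[y][x] = R[y][x] - from_r + treated - back
    return P2, I2, R2


if __name__ == "__main__":
    n = 3
    P = [[1.0] * n for _ in range(n)]
    I = [[0.0] * n for _ in range(n)]
    R = [[0.0] * n for _ in range(n)]
    T = [[0.0] * n for _ in range(n)]
    P[1][1], I[1][1], T[1][1] = 0.5, 0.5, 0.1   # infested, treated centre cell
    P2, I2, R2 = pir_step(P, I, R, T, exp_kernel, a0=0.1, a1=0.05, r=0.2)
    print(I2)
```

Because every flow leaves one compartment and enters another, the sum P + I + R in each cell is unchanged by the step, which matches the constraint that the three fractions sum to 1.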
Results
• Optimal control with limited resources
0%
0%
91-100%
91-100%
Infected
Treated
Results
• Optimal control with limited resources
Total Treatment Effort
Some thoughts on educational issues:
• Collaborations between disciplines can be effective at providing a
common language for interdisciplinary computational science problems,
but cannot be effectively established in a single class or workshop;
sustained effort is required for effective collaboration
• The move of computer science programs to Engineering colleges may be
effective at encouraging students to develop skills beyond coding, but it
is far from clear that computer science units are the most effective home
for new computational science programs
• Far greater exposure of undergraduate science students to
simulation methods is necessary, given the importance of simulation
across science. This implies a change from a curriculum focusing on
scientific computing (e.g. numerical analysis) for these students to one
containing scientific simulation. Many good products are available,
but many are little used
• The development of curricula which encourage new uses of parallel
computing beyond simply its potential for speedup should be supported
• Enhanced graduate student use of computational science can arise from
development of more of an “outreach”, service orientation from those
units at universities focused on high performance computing