Current and Future Challenges in Mathematical and Computational Biology

Louis J. Gross
The Institute for Environmental
Modeling
Departments of Ecology and
Evolutionary Biology and Mathematics
University of Tennessee
Overview
• Complex systems – examples and a game
• Biology – opportunities for quantitative
approaches
• Computational ecology – panthers, bears
and gators, oh my!
• Everglades restoration - Tales from
the Real World: Mathematics
and Computing Meets Greed,
Politics, Lawyers and the Army
Corps of Engineers
What is biocomplexity?
Bio – pertaining to life, so must include some
aspect of this.
Complexity – A complex system is one in
which a well-trained scientist knowledgeable
about the system of concern cannot rapidly
intuit how the system will behave. By "rapidly"
I mean within a few minutes, with the aid of
simple computational tools (pencil and paper,
computer, calculator, etc.).
Routes to complexity
Complexity should include systems that may
be quite easily described, but have
underlying complicated responses (e.g. the
chaotic dynamics models of single or multiple
populations) that one cannot intuit easily, as
well as systems that most of us would agree
are complicated due to multiple interacting
factors (food webs with many components,
ecosystems with dynamic and spatial
responses on multiple scales).
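A minimal sketch of the first route to complexity, assuming the discrete logistic map x(t+1) = r x(t) (1 - x(t)) as the single-population model (the slides do not name a specific model, so the map, the value r = 4 and the starting densities are illustrative choices). The model is one line long, yet two populations that start almost identically soon behave very differently.

```python
def logistic_trajectory(r, x0, steps):
    """Iterate x_{t+1} = r * x_t * (1 - x_t) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two starting densities that differ by one part in a million.
a = logistic_trajectory(r=4.0, x0=0.200000, steps=50)
b = logistic_trajectory(r=4.0, x0=0.200001, steps=50)
gap = [abs(x - y) for x, y in zip(a, b)]

print("initial gap:", gap[0])
print("largest gap over 50 steps:", max(gap))
```

With a smaller growth parameter such as r = 2.5 the same map settles to a fixed point, which is the kind of behavior one can intuit quickly.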
www.clearwater.org
Hudson River System –
questions that might be asked:
• What are the projected changes in fish
populations in the next few years?
• What is the impact of salt water dynamics
along the river on the biota present?
• How are pollutants such as PCB
transported through the natural system?
• What are the effects of PCB on the
organisms present?
How does math relate to this?
• Models in the form of systems of
differential equations describing the levels
of toxicants in different “compartments”,
such as reaches, sediments, and species
populations, carry out a mass balance,
tracking the changes in amounts of
toxicants throughout the natural system.
Fate Models
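A minimal sketch of a mass-balance fate model, assuming a closed system of three compartments with first-order transfers between them. The compartment names, the rate constants and the forward-Euler integration are hypothetical choices for illustration and are not taken from any Hudson River model.

```python
def euler_step(mass, rates, dt):
    """One forward-Euler step of a closed linear compartment model.

    mass:  dict compartment -> toxicant mass
    rates: dict (source, sink) -> first-order transfer rate (per day)
    """
    change = {name: 0.0 for name in mass}
    for (source, sink), k in rates.items():
        moved = k * mass[source] * dt      # mass leaving source, entering sink
        change[source] -= moved
        change[sink] += moved
    return {name: mass[name] + change[name] for name in mass}

def simulate(mass, rates, dt, n_steps):
    """Return the list of compartment states, starting with the initial one."""
    history = [dict(mass)]
    for _ in range(n_steps):
        mass = euler_step(mass, rates, dt)
        history.append(mass)
    return history

# Hypothetical compartments and rates, chosen only for illustration.
start = {"water": 100.0, "sediment": 0.0, "fish": 0.0}
rates = {
    ("water", "sediment"): 0.05,
    ("sediment", "water"): 0.01,
    ("water", "fish"): 0.02,
    ("fish", "water"): 0.03,
}
history = simulate(start, rates, dt=0.1, n_steps=1000)
print({name: round(value, 2) for name, value in history[-1].items()})
```

Because every transfer is subtracted from one compartment and added to another, the total toxicant mass is conserved, which is the mass-balance property the slide describes.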
These mass balance models are elaborated
to account for the changes in body burden
of various toxicants through the river
system, particularly in fish. This requires
models for the effects of toxicants on
physiology, behavior and reproduction.
Effects Models.
Effects models for various species taking
account of the interactions across the food
webs in the river are combined with
physical models for salinity, hydrology
and nutrients to provide an assessment of
system response to alternative
management (e.g. dredging).
Ecosystem models.
Some of these models are done
for the Hudson, but not all.
• It takes a great deal of data and iterative
modeling to make this work. There’s a lot
left for you to do here.
• For now though, let’s go to something
“simple”.
Music courtesy of Steve Kaufman, 3-time
National Flatpicking champion
What is going on here?
• This is an example of a Polya Urn scheme
• If you think of what is happening in one
particular cup, this corresponds to the sample
path or trajectory of a stochastic process (a
collection of random variables Xn, which give
the number of dark beans in the cup at time n).
• Each cup (or sample path) is independent of the
others, and we describe the whole process by
looking at the collection of sample paths at each
time, and calculate the fraction of these having a
particular value – this gives a histogram that
describes the probability distribution at each
time.
• We can simulate this and we see that each
sample path approaches a particular fraction of
beans in the cup of a particular color.
• When we look across all cups after a long period
of time we see that the histogram approaches a
particular distribution – in the case we did it is
the Uniform distribution – each possible fraction
between 0 and 1 is equally likely.
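The cup experiment can be simulated in a few lines. This sketch assumes each cup starts with one dark and one light bean and that a drawn bean is returned together with one more bean of the same color; the function names, the number of cups and the bin count are illustrative.

```python
import random

def polya_path(n_draws, rng):
    """Fraction of dark beans after each draw, starting from 1 dark + 1 light."""
    dark, total = 1, 2
    fractions = []
    for _ in range(n_draws):
        if rng.random() < dark / total:    # drew a dark bean
            dark += 1                      # return it plus one more dark bean
        total += 1                         # the cup always gains one bean
        fractions.append(dark / total)
    return fractions

def histogram_of_final_fractions(n_cups, n_draws, n_bins, seed=0):
    """Share of cups whose final dark fraction falls in each equal-width bin."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    for _ in range(n_cups):
        z = polya_path(n_draws, rng)[-1]
        counts[min(int(z * n_bins), n_bins - 1)] += 1
    return [c / n_cups for c in counts]

hist = histogram_of_final_fractions(n_cups=2000, n_draws=200, n_bins=5)
print([round(h, 3) for h in hist])
```

Each cup settles near its own fraction, while the bins across cups come out roughly equal, which is what a Uniform limiting distribution looks like in a histogram.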
Why does this happen?
• If we let Zn = Xn / (n+2) be the fraction of
dark beans in the cup after we draw and
add a bean n times, then we can show

\[
E\left[ Z_n \,\middle|\, Z_{n-1} = \frac{a}{n+1} \right]
  = \frac{a}{n+1} \cdot \frac{a+1}{n+2}
  + \left( 1 - \frac{a}{n+1} \right) \frac{a}{n+2}
  = \frac{a}{n+1} = Z_{n-1}
\]

so Zn is a martingale.
A bounded martingale such as Zn can be proved
to converge, so that

\[
\lim_{n \to \infty} Z_n = Z
\]

where Z is the limiting random
variable and its distribution is called
the limiting distribution – this is the
Uniform distribution in our cup
experiment.
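The martingale property and the Uniform limit can both be checked with exact rational arithmetic. This is a sketch under the assumption that the cup starts with one dark and one light bean, so that it holds n + 1 beans just before draw n; the function names are illustrative.

```python
from fractions import Fraction

def conditional_mean(a, n):
    """E[Z_n | X_{n-1} = a], where the cup holds n + 1 beans before draw n."""
    p_dark = Fraction(a, n + 1)
    return p_dark * Fraction(a + 1, n + 2) + (1 - p_dark) * Fraction(a, n + 2)

def exact_distribution(n):
    """Exact P(X_n = k), starting from one dark and one light bean."""
    probs = {1: Fraction(1)}               # X_0 = 1 with certainty
    for draw in range(1, n + 1):
        total = draw + 1                   # beans in the cup before this draw
        nxt = {}
        for k, p in probs.items():
            # dark bean drawn with probability k / total
            nxt[k + 1] = nxt.get(k + 1, 0) + p * Fraction(k, total)
            # light bean drawn with probability (total - k) / total
            nxt[k] = nxt.get(k, 0) + p * Fraction(total - k, total)
        probs = nxt
    return probs

print(conditional_mean(a=3, n=7))          # equals 3/8, the previous fraction
print(exact_distribution(4))               # five outcomes, each with probability 1/5
```

The second function shows that after n draws every count from 1 to n + 1 is equally likely, so the fraction of dark beans is spread evenly over its possible values at every time, not only in the limit.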
Urn Schemes and Null Models:
Cohen, Joel E. 1976. Irreproducible Results and
the Breeding of Pigs (or Nondegenerate Limit
Random Variables in Biology). BioScience
26:391-394.
For an ecological example of extensions of these
ideas see The Unified Neutral Theory of
Biodiversity and Biogeography. Stephen P.
Hubbell. Princeton University Press. 2001.
For an evolutionary theory example of extensions
see Gavrilets, S., R. Acton and J. Gravner. 2000.
Dynamics of speciation and diversification in a
metapopulation. Evolution 54:1493-1501.
What is computational ecology?
An interdisciplinary field devoted to the
quantitative description and analysis of
ecological systems using empirical data,
mathematical models (including statistical
models), and computational technology.
Focus includes: Data Management,
Modeling, and Visualization (Helly et al.,
1995)
Computational Ecology
Theory development
• How do population and
community properties arise
from individual behaviors?
• How does spatial and
temporal heterogeneity
affect population and
communities?
• How are distribution and
abundance of species
linked to spatio-temporal
patterns of evolution?
Applications
• How do we link dynamic
models with spatial data
to aid natural system
monitoring and
management, including
reserve design, water
flow control, and
harvesting schedules?
• How do we include
socio-economic analysis
with ecological models?
Environmental Modeling
• Data sources: animal telemetry; GIS map layers (vegetation,
hydrology, elevation); weather; roads; species densities
• Monitoring: species densities; physical conditions
• Models: statistical; differential equations; matrix; agent-based
• Management input: harvest regulation; water control; reserve design
• Analysis: visualization, corroboration, sensitivity, uncertainty
• Simulation: Matlab, C++, distributed, parallel
Overview
• Everglades natural history
• History of hydrology in South Florida
• Restoration planning
• Computational ecology
– Multimodeling
– The ATLSS project and Everglades restoration
• Problems in spatial control
– Reserves and individual-based models
– Managing antibiotic resistance
– Theory and intercropping
• What are some future challenges?
Key Points
• Realistic modeling of natural systems requires
multiple linked approaches (multimodeling), and
new methods are needed to develop and
analyze these.
• Much of applied ecology deals with problems of
spatial control – what to do, where to do it, when
to do it, and how to monitor it – and these
problems are not easily solved, opening up many
new, fascinating problems in applied
mathematics and computational science.
• It can be very rewarding for mathematicians to
get involved in “big” multidisciplinary problems
Wet Season:
May-October
Dry Season:
November-April
Photos: South Florida Water Management District
Everglades Restoration
The Everglades and Big Cypress Swamp
of South Florida are characterized by
complex patterns of spatial
heterogeneity and temporal
variability, with water flow being
the major factor controlling the
trophic dynamics of the system. A key
objective of modeling studies for
these systems is to compare the
future effects of alternative
hydrologic scenarios on the biotic
components of the systems.
Recent History of Everglades Restoration
C&SF Project facilities developed since the 1940s include 30 pumping
stations, 212 control and diversion structures, 990 miles of levees,
978 miles of canals, 25 navigation locks, and 56 railroad bridges.
1992 - Congress authorizes Comprehensive Review Study
(Restudy) of the C&SF Project to develop modifications to
restore the Everglades and Florida Bay ecosystems while
providing for the other water-related needs of the region.
1999 - Restudy Plan submitted to Congress on July 1.
Restudy Objective:
Develop a comprehensive plan for implementing changes needed
to meet water supply needs through 2050 and restore over 2.4
million acres of the greater Everglades ecosystem
Agencies involved in Restudy:
U.S. Army Corps of Engineers
Environmental Protection Agency
National Park Service
National Marine Fisheries Service
Natural Resources Conservation Service
U.S. Fish and Wildlife Service
Florida Department of Agriculture and Consumer
Services
Florida Department of Environmental Protection
Florida Game and Fresh Water Fish Commission
South Florida Water Management District
Miccosukee Tribe
Seminole Tribe
plus input from numerous NGOs and individuals.
Plan includes:
Reconnecting over 80 percent of the remaining
Everglades by removing over 240 miles of internal
levees and canals.
Reducing the average of 1.7 billion gallons of water
wasted every day through discharges to the ocean
Additional land purchases of 47,000 acres as an
addition to ENP
Approximate cost: $7.8 Billion over 20 years
What is computationally
challenging in this?
• Space-time linkages
• GIS very limited at dynamic modeling
• Different components operate on different scales
(resolution required differs between components)
• Model data can be huge
• Models are complex
• Large state variable dynamical systems
• Large numbers of interconnected agents
• Models are not independent - multimodeling
Everglades natural system management requires
decisions over short time periods about what water
flows to allow and where, and decisions over longer
planning horizons about how to modify the control
structures so that appropriate controls can be applied.
This is very difficult!
• The control objectives are unclear and
differ with different stakeholders.
• Natural system components are
poorly understood.
• The scales of operation of the physical
system models are coarse.
So what have we done?
Developed a multimodel (ATLSS - Across Trophic
Level System Simulation) to link the physical and
biotic components.
Compare the dynamic impacts of alternative
hydrologic plans on various biotic components
spatially.
Let different stakeholders make their own
assessments of the appropriate ranking of
alternatives.
http://atlss.org
ATLSS (Across Trophic Level
System Simulation)
ATLSS is structured as a multimodel, a
mixture of modeling approaches based upon
the inherent temporal scales and spatial
extent of various trophic components,
linked together by spatially-explicit
information on underlying environmental
(e.g. water, soil structure, etc.), biotic
(e.g. vegetation), and anthropogenic
factors (e.g. land-use). The approaches
currently involved include static
spatially-explicit indices, compartment
analysis, differential equations for
structured populations and communities, and
individual-based models.
What ATLSS attempts to do
• Provide a general methodology for
regional assessment of natural systems by
coupling physical and biotic processes in
space and time using a mixture of
modeling approaches.
• Utilize the best available science and
intuition of many biologists with extensive
field experience to construct models for
particular system components and link
these at appropriate spatial and temporal
resolutions
What ATLSS attempts to do (cont’d)
• Provide a method to compare the relative
impacts of alternative management of the
region on the natural systems, so different
stakeholders can focus on sub-regions,
species, or conditions of particular interest
to them.
• Ensure that the structure of the multimodel
is extensible so that as new models, data
and monitoring information become
available, they may be efficiently utilized.
• Individual-Based Models: Cape Sable Seaside Sparrow, Snail Kite,
White-tailed Deer, Wading Birds, Florida Panther,
Radio-telemetry Tracking Tools
• Age/Size Structured Models: Fish Functional Groups, Alligators,
Reptiles and Amphibians
• Linked Cell Models and Process Models: Lower Trophic Level
Components, Vegetation
• Spatially-Explicit Species Index Models: Cape Sable Seaside Sparrow,
Long-legged Wading Birds, Short-legged Wading Birds, Snail Kite,
White-tailed Deer, Alligators
• Abiotic Conditions Models: High Resolution Topography,
High Resolution Hydrology, Disturbance
© TIEM / University of Tennessee 1999
ATLSS High Resolution Topography
* The High Resolution
Topography model provides
more detail about local
variation in elevation.
* The detail captures variation
in elevation due to important
features such as tree islands.
High Resolution Topography
Water Management Model Topography
ATLSS High Resolution Hydrology
* With the High Resolution
Topography, High Resolution
Hydrology values can be created from
the SFWMD hydrology.
High Resolution Hydrology
* Hydrology values created in this way
provide the spatial variation and
resolution required to model the
dynamics of many animal populations
in South Florida.
SFWMD Hydrology
(Scale bars: 4 miles)
How High Resolution Topography Is Made.
Habitat cover map, provided by the
Florida GAP analysis (scale bar: 4 miles)
At each location in the Florida GAP map, the model predicts a ground surface which is higher or
lower than the base ground surface, derived from the hydroperiod of the cell, as given by the SFWMD
hydrology data, and the estimated hydroperiod for the habitat type at that location.
The total volume of water predicted by the SFWMD model in each grid cell is preserved in the
High Resolution Hydrology Model.
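One way to picture the volume-preserving step is the sketch below. It assumes a flat water surface within one coarse hydrology cell, equal-area sub-cells, and a bisection search for the surface elevation; these are illustrative assumptions, not the documented ATLSS procedure, and the elevations are made up.

```python
def water_surface(elevations, mean_depth, tol=1e-9):
    """Flat water-surface elevation h with mean(max(h - z, 0)) == mean_depth."""
    if mean_depth <= 0:
        return min(elevations)
    lo = min(elevations)
    hi = max(elevations) + mean_depth      # here every sub-cell is deep enough
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        depth = sum(max(mid - z, 0.0) for z in elevations) / len(elevations)
        if depth < mean_depth:
            lo = mid                       # too little water, raise the surface
        else:
            hi = mid
    return 0.5 * (lo + hi)

def distribute(elevations, mean_depth):
    """Per-sub-cell water depths that preserve the coarse cell's water volume."""
    h = water_surface(elevations, mean_depth)
    return [max(h - z, 0.0) for z in elevations]

# Hypothetical sub-cell ground elevations (m) inside one coarse cell.
ground = [0.0, 0.1, 0.3, 0.6]
depths = distribute(ground, mean_depth=0.2)
print([round(d, 3) for d in depths])
```

In this example the highest sub-cell, standing in for something like a tree island, stays dry while the low sub-cells hold more than the coarse-cell average depth.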
Estimates of hydroperiod for each habitat
type in the Florida GAP analysis map.
Class | MinHp | MaxHp
(Rows for classes 0 to 7 are shown, with further rows elided. The cell
values were flattened in extraction and survive only as the sequence
0, 365, 45, 180, 30, 15, 40, 45, 0, 365, 10, 60, 0, 0, 10, 0.)
A hydroperiod curve for each location on the
map showing the number of days the water
surface was at or above each elevation. This
curve is generated from the
Calibration/Validation run of the SFWMD
hydrology model.
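A hydroperiod curve of the kind described here, the number of days the water surface was at or above each elevation, can be computed directly from a daily stage series. This is a minimal sketch; the function name and the sample values are hypothetical.

```python
def hydroperiod_curve(daily_stage, elevations):
    """For each elevation, count the days with water surface at or above it."""
    return [sum(1 for stage in daily_stage if stage >= z) for z in elevations]

# Hypothetical four-day stage record (m) and candidate ground elevations (m).
stage = [0.1, 0.5, 0.3, 0.0]
levels = [0.0, 0.2, 0.4, 0.6]
print(hydroperiod_curve(stage, levels))    # [4, 2, 1, 0]
```

The curve is non-increasing in elevation, so a target hydroperiod for a habitat type corresponds to a range of ground elevations that can be read off it.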
Spatially-Explicit Species Index (SESI)
Models
The simplest of the ATLSS models, they are
designed as extensions of habitat suitability
index models, to provide yearly assessments
of the effects of within and between year
hydrology variation on basic requirements for
foraging and breeding in a spatially-explicit
manner. They allow comparisons of
alternative scenarios, and allow different
stakeholders to focus on their own criteria.
ATLSS SESI Models
Implement and Execute the Models for a Hydrology Scenario
Objectives: Integrate SESI components into a cohesive computational
framework and apply the models to a hydrology scenario.
Hydrology Scenario
Daily Water Depth
Distribute water over high resolution
topography
High Resolution Hydrology
SESI Models
Cape Sable
Seaside Sparrow
Are the nests
flooded during
egg incubation?
Snail
Kite
Are conditions
favorable for the
apple snails they
depend on?
Wading Birds
Are water depths in
the correct range for
the fish they eat?
Standard Output Generation/Visualization Tools
White-tailed
Deer
Is breeding
disrupted by
high water
levels?
American
Alligator
Is there high ground
to build a nest on?
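To make the idea of a spatially explicit index concrete, here is a purely hypothetical nesting index in the spirit of the sparrow question above. It scores each cell between 0 and 1 by the longest run of dry days in the breeding window relative to the days a nest needs. The functional form, the threshold and the numbers are assumptions for illustration and are not the ATLSS SESI formulas.

```python
def longest_dry_run(depths):
    """Longest run of consecutive days with no standing water (depth <= 0)."""
    best = run = 0
    for d in depths:
        run = run + 1 if d <= 0.0 else 0
        best = max(best, run)
    return best

def nesting_index(depths, days_needed):
    """Hypothetical index in [0, 1]: 1 if a full nesting period stays dry."""
    return min(1.0, longest_dry_run(depths) / days_needed)

def index_map(depth_by_cell, days_needed):
    """Apply the index to every cell of a scenario."""
    return {cell: nesting_index(d, days_needed)
            for cell, d in depth_by_cell.items()}

# Two hypothetical scenarios for two cells over a seven-day window.
base = {(0, 0): [0.1, 0, 0, 0, 0.2, 0, 0], (0, 1): [0.3] * 7}
alt = {(0, 0): [0, 0, 0, 0, 0, 0, 0], (0, 1): [0.3, 0, 0, 0.1, 0, 0, 0]}
base_map = index_map(base, days_needed=6)
alt_map = index_map(alt, days_needed=6)
difference = {cell: alt_map[cell] - base_map[cell] for cell in base_map}
print(difference)
```

The difference map is the kind of output that lets a stakeholder compare two hydrology scenarios cell by cell for the species they care about.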
ATLSS Fish Functional Group Dynamics Model
Fish biomass is one of the most important components of the
Everglades system. To produce projections of fish biomass,
ATLSS uses a spatially explicit, size-structured dynamic
simulation model, ALFISH.
ALFISH simulates the number, size-structure and biomass
densities of “small fish” and “large fish” functional groups in the
freshwater marsh on 5-day time steps.
This represents the temporally and spatially varying food base for
wading birds.
ALFISH has been evaluated through comparisons to some sites in
Shark Slough and WCA3.
ALFISH
Objectives
• Provide estimates of effects of alternative water
management scenarios on spatial and temporal
distribution of food resources for upper trophic
level consumers (wading birds).
• Provide a method to evaluate the hypothesized impact of
hydrologic changes on fish community composition.
ATLSS Landscape Fish Model
Holly Gaff, René Salinas, Louis Gross, Don
DeAngelis, Joel Trexler, Bill Loftus and John
Chick
Approach
A size-structured population model for fish functional
groups (large and small fish) that operates on a spatial
cell basis with movement between cells and between
habitats within cells.
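The approach can be sketched as a toy model, assuming a one-dimensional row of cells, three size classes, and one call per 5-day time step. The survival, growth and movement rules and all parameter values are hypothetical and far simpler than ALFISH itself.

```python
def grow(classes, survival, advance):
    """Apply survival, then move a fraction of each class up one size class."""
    n = len(classes)
    out = [0.0] * n
    for i, fish in enumerate(classes):
        alive = fish * survival
        if i + 1 < n:
            out[i] += alive * (1.0 - advance)
            out[i + 1] += alive * advance
        else:
            out[i] += alive                # largest class only accumulates
    return out

def move(cells, rate):
    """Shift fish between neighboring cells, from higher to lower density."""
    new = [list(c) for c in cells]
    for i in range(len(cells) - 1):
        for k in range(len(cells[0])):
            flow = rate * (cells[i][k] - cells[i + 1][k])   # positive: i to i+1
            new[i][k] -= flow
            new[i + 1][k] += flow
    return new

def step(cells, survival=0.95, advance=0.2, rate=0.1):
    """One 5-day time step: growth and mortality in each cell, then movement."""
    return move([grow(c, survival, advance) for c in cells], rate)

def total(cells):
    return sum(sum(c) for c in cells)

cells = [[100.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # all small fish in the first cell
for _ in range(6):                              # roughly one month
    cells = step(cells)
print([[round(x, 1) for x in c] for c in cells])
```

A fuller model would make survival and movement depend on water depth, for example by pushing fish out of marsh cells that are drying.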
ALFISH FLOW CHART
Fish Cell Layout
Example of Small Fish
Least Killifish
Heterandria formosa
Female
Male
Pond areas assumed permanently
wet, marsh areas periodically dry
Landscape Layout and Movement
Fish as Prey
Fish provide the prey-base for
endangered wading bird species such
as Great Egret (Casmerodius albus)
White - movement from low water to high water areas
Red - movement from high fish density to low density areas
ALFISH MODEL EXAMPLE RESULTS - Alt D13r4 compared to F2050Base
Fish Available as Prey during
a Typical Rainfall Year
Fish Available as Prey during
a High Rainfall Year
Average Fish Available as
Prey from 1965 - 1995
Fish Available as Prey during
a Low Rainfall Year
Distribution of Sizes for Fish
in WCA 3A
Total Fish Densities through
31-year Model Run
Average Fish Available as
Prey during Breeding Season
for Wading Birds
Total Fish Densities for
Certain Years in Given Areas
ATLSS Individual-Based Demographic Models
The ATLSS SESI models can provide considerable information
about spatial and temporal patterns of habitat conditions
affecting breeding and foraging. They can indicate how one
scenario differs from another, but no demographics are included.
To include demographics and thus project population-level
dynamics, ATLSS uses spatially explicit individual-based (SEIB)
demographic models for the snail kite, the Cape Sable seaside
sparrow, and the Florida panther/white-tailed deer.
These contain life cycle and behavioral information and they
allow the user to simulate population levels, structure and
growth.
SIMSPAR Flow Diagram
Projected Population Size
Fledgling
Productivity Maps
Take-Home Messages
• Realistic modeling of natural systems requires
multiple linked approaches – multimodeling.
• ATLSS has been successful in providing a
flexible structure in which new models can be
included, and new data taken into account to
modify existing models
• ATLSS has provided a rational approach, based
upon the best available science, for providing
multiple stakeholders with some of the tools
they need to have input into regional planning
Collaborations in ATLSS
In addition to various Federal and State cooperators, ATLSS
has involved researchers at
Florida International University
Southwestern Louisiana University
University of Florida
University of Maryland
University of Miami
University of Tennessee
University of Washington
National Wetland Research Center (USGS)
The Institute for Bird Populations
Everglades Research Group
Netherlands Institute of Ecology
Some other spatial control problems
• Bears and hunting preserves – an
application of individual-based
modeling (René Salinas)
• Controlling antibiotic resistance –
numerically intensive spatial modeling
(Scott Duke-Sylvester)
• Intercropping – a theoretical approach
coupling reaction-diffusion and ODE
models (Suzanne Lenhart)
Spatial Control and Individual-Based Models
• Spatial components
• Reserve design
• Habitat conditions
• Resource availability
• Individual-Based Models
• Model space explicitly
• Account for differences between individuals
• Model movement explicitly
• Test demographic forcing
American Black Bear (Ursus americanus)
Current Black Bear Distribution
Southeastern U.S.
Source: Pelton and van Manen (1994)
Model Description
• Individual-Based
• Time
• Daily time step
• Length of run is user defined
• Area
• 450m x 450m Cells
• 279 km x 175.5 km (effective area is smaller)
• Allows for variation in various spatial components.
• Sanctuaries
• Park and forest boundaries
State Variables
• Age
• Sex
• Location
• Denning
• Estrus
• Mating Status
• Cubs
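The state variables map naturally onto a record per individual. The sketch below assumes a grid of 450 m cells (a 279 km by 175.5 km extent corresponds to 620 by 390 cells, with the row and column orientation chosen arbitrarily), a daily time step, and hypothetical rules: a one-cell random walk for bears that are not denning and a fixed daily survival probability. The actual movement, food and mortality rules of the bear model are not given in the slides.

```python
import random
from dataclasses import dataclass

@dataclass
class Bear:
    age_days: int
    sex: str                   # "F" or "M"
    cell: tuple                # (row, col) on the 450 m grid
    denning: bool = False
    in_estrus: bool = False
    mated: bool = False
    cubs: int = 0

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]

def daily_step(bears, n_rows, n_cols, daily_survival, rng):
    """Advance every bear by one day and return the survivors."""
    survivors = []
    for bear in bears:
        if not bear.denning:               # denning bears stay where they are
            dr, dc = rng.choice(MOVES)
            row = min(max(bear.cell[0] + dr, 0), n_rows - 1)
            col = min(max(bear.cell[1] + dc, 0), n_cols - 1)
            bear.cell = (row, col)
        bear.age_days += 1
        if rng.random() < daily_survival:  # hypothetical constant survival
            survivors.append(bear)
    return survivors

rng = random.Random(0)
population = [Bear(age_days=1500, sex="F", cell=(310, 195)) for _ in range(100)]
for _ in range(365):                       # run length is user defined
    population = daily_step(population, 620, 390, 0.9995, rng)
print(len(population), "bears alive after one year")
```

Harvest rules, mast-dependent food and sanctuary boundaries would enter as extra per-cell data consulted inside the daily step.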
Flow Diagram
Initialization
Set Mast Values?
No
Movement
Update Food
Mortality
Update Indices
Yes
Mast Functions
Harvesting
• Tennessee Season
• Dec. 1- Dec. 14
• No cubs or females with cubs.
• North Carolina Season
• Oct. 14 - Nov. 21 and Dec. 14 - Dec. 31
• No cubs or females with cubs.
• No harvesting in GSMNP or bear sanctuaries.
• One bear limit per calendar year
Variation in Spatial layout of
Sanctuaries
• Nantahala
• All of Nantahala National Forest is a sanctuary.
• Aside from GSMNP, there are no other sanctuaries.
• Pisgah+
• All of Pisgah National Forest is a sanctuary plus the present
sanctuaries in Nantahala.
• Aside from GSMNP, there are no other sanctuaries.
• Each has approximately the same sanctuary
area.
Spatial Control of Antibiotic Resistance
• Assumption: Limitations on the application of
certain antibiotics can be viewed in a spatial context
which, if effectively implemented, could extend the
time period of utility of a particular antibiotic
treatment.
• Question: Under what circumstances would a
spatial control policy be preferable to a policy of
uniform spatial rotation or a policy of local choice
with no overall spatial management?
Approach
Formulate the underlying problem in a discrete-time,
discrete-space, continuous state-variable,
finite control-set framework.
Assume a discrete set of spatial regions (cells),
interconnected by movements of individuals
between them, causing associated movements of
resistance to particular disease strains.
Do not track population movements between
regions, in order to reduce the complexity and
state space of the problem.
The state variables within each spatial cell are
the continuous levels of resistance to particular
disease strains.
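A toy instance of this framework, assuming a ring of cells, a single antibiotic, one continuous resistance level per cell, and a binary control (use the drug in a cell or not). The selection, decay and mixing rules and all parameter values are hypothetical. The on/off alternation used here is only one simple spatial policy and does not reproduce the rotation among several antibiotics that the question refers to.

```python
def step(resistance, use_drug, gain=0.3, loss=0.1, mixing=0.05):
    """One time step: local selection, then exchange with ring neighbors."""
    n = len(resistance)
    selected = []
    for r, used in zip(resistance, use_drug):
        if used:
            r = r + gain * r * (1.0 - r)   # drug use selects for resistance
        else:
            r = r - loss * r               # resistance declines without use
        selected.append(r)
    # Movement of individuals is not tracked; only resistance levels mix.
    return [(1.0 - 2.0 * mixing) * selected[i]
            + mixing * (selected[i - 1] + selected[(i + 1) % n])
            for i in range(n)]

def time_of_utility(policy, n_cells=10, start=0.01, threshold=0.5,
                    max_steps=1000):
    """Steps until mean resistance reaches the threshold (None if never)."""
    resistance = [start] * n_cells
    for t in range(max_steps):
        use = [policy(t, i) for i in range(n_cells)]   # finite control set
        resistance = step(resistance, use)
        if sum(resistance) / n_cells >= threshold:
            return t + 1
    return None

everywhere = time_of_utility(lambda t, i: True)
alternating = time_of_utility(lambda t, i: (t + i) % 2 == 0)
print("use everywhere:", everywhere, "steps")
print("alternate in space and time:", alternating, "steps")
```

The control problem proper is then a search over such policies, which grows quickly with the number of cells and time steps and is what makes the problem numerically intensive.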
Intercropping and Pathogen Dispersion
• Assumption: There are two crop varieties, one of
which is more resistant to a pathogen but which has
an associated higher cost or lower yield than the
crop with lower resistance.
• Question: How might we analyze the spread of the
pathogen linked to the growth of the crop and
produce optimal spatial planting patterns which
maximize yield or minimize cost?
• Approach: Develop a general theory for boundary
control of reaction-diffusion type equations linked
to ordinary differential equations for local crop
growth.
Pathogen and 2 crop model
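\[
p_t = d_1 p_{xx} + \beta_1 c_1 u p + \beta_2 c_2 v p - \mu p
\]
\[
\frac{du}{dt} = r_1 u \left( 1 - \frac{u}{K_1} \right) - c_1 u p
\]
\[
\frac{dv}{dt} = r_2 v \left( 1 - \frac{v}{K_2} \right) - c_2 v p
\]
(\beta_1, \beta_2 and \mu stand in for coefficient symbols that are not
legible in the source, as are the signs of the coupling and decay terms.)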
u(t) and v(t) are time-dependent local crop densities,
p(x,t) is pathogen density. u and v vary with x but with
no movement, while the pathogen does move.
Objective function is:
\[
J(u_0) = \max \int_0^1 \left[ (A_1 u + A_2 v)(x, T) \right] dx
\]
and we choose u0(x) + v0(x) < K, the maximum initial
planting density. Assume Dirichlet boundary conditions
for the pathogen, some initial pathogen distribution, and
a spatial domain of [0,1]. Then, with
appropriate assumptions, it is possible to develop
optimal solutions.
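The yield of one candidate planting pattern can be evaluated numerically. The sketch below uses an explicit finite-difference scheme on [0,1] with the pathogen held at zero on the boundary. It assumes the pathogen gains from both crops and decays linearly; the coefficient names b1, b2 and mu, the signs of those terms, and every parameter value are assumptions, since those symbols are not legible in the slide. It evaluates the objective for given initial plantings and does not perform the optimization.

```python
def simulate(u0, v0, p0, T=5.0, dt=0.01, d1=0.01,
             r1=1.0, r2=0.8, K1=1.0, K2=1.0,
             c1=1.0, c2=0.3, b1=0.5, b2=0.5, mu=0.2):
    """Explicit finite differences on [0, 1]; returns (u, v, p) at time T."""
    n = len(u0)
    dx = 1.0 / (n - 1)
    u, v, p = list(u0), list(v0), list(p0)
    p[0] = p[-1] = 0.0                     # Dirichlet boundary for the pathogen
    for _ in range(round(T / dt)):
        new_p = [0.0] * n                  # boundary values stay at zero
        for i in range(1, n - 1):
            p_xx = (p[i - 1] - 2.0 * p[i] + p[i + 1]) / dx ** 2
            growth = b1 * c1 * u[i] + b2 * c2 * v[i] - mu
            new_p[i] = p[i] + dt * (d1 * p_xx + growth * p[i])
        # Crops grow logistically in place and lose density to the pathogen.
        new_u = [ui + dt * (r1 * ui * (1.0 - ui / K1) - c1 * ui * pi)
                 for ui, pi in zip(u, p)]
        new_v = [vi + dt * (r2 * vi * (1.0 - vi / K2) - c2 * vi * pi)
                 for vi, pi in zip(v, p)]
        u, v, p = new_u, new_v, new_p
    return u, v, p

def objective(u, v, A1=1.0, A2=0.7):
    """Trapezoid-rule estimate of the integral of A1*u + A2*v over [0, 1]."""
    dx = 1.0 / (len(u) - 1)
    f = [A1 * a + A2 * b for a, b in zip(u, v)]
    return dx * (sum(f) - 0.5 * (f[0] + f[-1]))

n = 21
xs = [i / (n - 1) for i in range(n)]
p_init = [0.4 * x * (1.0 - x) for x in xs]          # pathogen bump, zero at ends
u_T, v_T, _ = simulate([0.5] * n, [0.0] * n, p_init)
print("yield, susceptible crop only:", round(objective(u_T, v_T), 3))
u_T, v_T, _ = simulate([0.25] * n, [0.25] * n, p_init)
print("yield, even mix of both crops:", round(objective(u_T, v_T), 3))
```

The time step is chosen so that d1 * dt / dx**2 stays well below 0.5, the stability limit of the explicit diffusion scheme.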
We are also working on spatial control for integrodifference equation models of population spread.
Take-Home Messages
• Realistic modeling of natural systems requires
multiple linked approaches (multimodeling), and
new methods are needed to develop and
analyze these.
• Much of applied ecology deals with problems of
spatial control – what to do, where to do it,
when to do it, and how to monitor it – and these
problems are not easily solved, opening up
many new, fascinating problems in applied
mathematics and computational science.
Acknowledgements
• USGS Biological Resources Division
• National Science Foundation
• UT Center for Information Technology
Research
• UT Scalable Intra-Campus Network Grid