Algorithmic Methods in Conservation Biology

Steven Phillips
AT&T Labs-Research
Vignettes: Data → Models → Policies
• Species detection: tree swallow roosts from radar
• Modeling species distributions
– Challenge 1: Presence-only data (Maxent)
– Challenge 2: Non-stationarity (STEM)
• Planning protected areas to allow dispersal
– Network flow, mixed integer programming
• Thanks to Tom Dietterich, Rebecca Hutchinson & Dan Sheldon (Oregon State University) for many slides!
Dover, DE, 10/2/2010@6:52AM
The Dream
• Automatic detection of roosts at continent scale on a daily basis
– Data gathering and repurposing
• Unprecedented view of species distribution
– Spatial coverage
– Temporal resolution
• Analyze results to learn about
– Roost biology
– Migration patterns
– Climate change
• Data archived since 1991
Source: NOAA
[Winkler, 2006]
Research by D. Sheldon & T. Dietterich (OSU) and D. Winkler (Cornell)
Progress: Machine Learning
• Challenging image recognition task!
– Primarily shape features to date; no temporal sequencing
– High precision for roosts with “perfect appearance”
– Variability in appearance is challenging → low recall
Figure panels: 100 positive examples; top 100 predicted roosts (shape features + SVM)
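One way to put numbers on the precision/recall trade-off described above is to measure both among the k highest-scoring detections. The sketch below is only illustrative: the scores and labels are invented, and `precision_recall_at_k` is a hypothetical helper, not part of the roost-detection work.

```python
def precision_recall_at_k(scored, k):
    """scored: list of (score, is_roost) pairs.

    Returns (precision, recall) for the k highest-scoring detections.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    hits = sum(1 for _, is_roost in ranked[:k] if is_roost)
    total_roosts = sum(1 for _, is_roost in scored if is_roost)
    return hits / k, hits / total_roosts


# Toy data (invented): two textbook-shaped roosts score high,
# three atypical roosts score low and are mixed in with non-roosts.
detections = [(0.95, True), (0.90, True), (0.40, False), (0.35, True),
              (0.30, False), (0.25, True), (0.20, False), (0.10, True)]

precision, recall = precision_recall_at_k(detections, k=2)
print(precision, recall)
```

With these toy numbers every top-ranked detection is a real roost, yet most roosts are missed, which is the pattern the slide describes.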
Progress: Ecology
• Locating roosts
– Identifying roosts in radar images
• Labeling efforts
– Estimate ground location within a few km
• Previously difficult task
• 15+ roosts located in 2010-2011
– Oregon, Florida, Louisiana
• Analysis of labeled data
– Understand regional patterns
– Roost growth dynamics
• Very predictable
• Potential species ID from radar!
Florida
Vignettes: Data → Models → Policies
Species Distribution Models (SDM)
SDM Challenge #1: Presence-only data
Figure: Yellow-throated Vireo occurrence points and environmental variables → predicted distribution
A solution: Maxent
Given:
• Training examples $x_1, \dots, x_n$
• Assumed to be from an unknown distribution $\pi = P(x \mid y=1)$
• Environmental variables $f_1(x), \dots, f_m(x)$
Find:
• A good estimate of $\pi$ (as a function of $f_1, \dots, f_m$) …and $P(y=1 \mid x)$
Method: L1-regularized Maxent
• Maximum entropy principle: among distributions consistent with the data, prefer one of maximum entropy (Jaynes, 1957)
• Consistency given by relaxed moment constraints:
• $\left| E_\pi[f_i] - \frac{1}{n}\sum_{j=1}^{n} f_i(x_j) \right| \le \beta_i$ for each $i = 1, \dots, m$
• E.g., “mean rainfall must be close to mean rainfall at training examples”
S. J. Phillips, R. E. Schapire and M. Dudík 2004; S. J. Phillips, R. P. Anderson and R. E. Schapire 2006
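The relaxed-constraint formulation above can be tried on a toy landscape. The sketch below is not the Maxent software: it assumes a small finite set of grid cells, relies on the standard duality result that the constrained maximum-entropy distribution is a Gibbs distribution minimizing L1-penalized log loss, and uses plain proximal gradient descent as the optimizer. The data, the function names, and the values of `step`, `iters` and `beta` are all illustrative assumptions.

```python
import math


def gibbs(lam, features):
    """Gibbs distribution over cells: q(x) proportional to exp(sum_i lam[i] * f_i(x))."""
    scores = [sum(l * f for l, f in zip(lam, fx)) for fx in features]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def feature_means(q, features):
    """Expectation of each feature under distribution q."""
    return [sum(qx * fx[i] for qx, fx in zip(q, features))
            for i in range(len(features[0]))]


def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal step for an L1 penalty)."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_maxent(features, presences, beta, step=0.5, iters=2000):
    """Minimize log loss + sum_i beta[i] * |lam[i]| by proximal gradient descent."""
    n, m = len(presences), len(features[0])
    empirical = [sum(features[j][i] for j in presences) / n for i in range(m)]
    lam = [0.0] * m
    for _ in range(iters):
        model = feature_means(gibbs(lam, features), features)
        lam = [soft_threshold(lam[i] + step * (empirical[i] - model[i]),
                              step * beta[i]) for i in range(m)]
    return lam, gibbs(lam, features), empirical


# Toy landscape (invented): 3 x 3 grid, two features scaled to [0, 1].
features = [(r / 2, c / 2) for r in range(3) for c in range(3)]  # (rainfall, temperature)
presences = [3, 6, 7, 8]  # indices of cells with occurrence records
beta = [0.05, 0.05]

lam, q, empirical = fit_maxent(features, presences, beta)
model = feature_means(q, features)
for i, name in enumerate(["rainfall", "temperature"]):
    print(name, "empirical mean:", empirical[i], "model mean:", round(model[i], 3))
```

At the fitted solution each model mean should sit within `beta[i]` of the corresponding empirical mean, which is exactly the relaxed moment constraint.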
Application: Protected area design
(a) Dracula ant (Mystrium mysticum)
(b) Grandidier’s baobab (Adansonia grandidieri)
(c) Common leaf-tailed gecko (Uroplatus fimbriatus)
(d) Indri, the largest lemur species (Indri indri)
Kremen et al., Science 320(5873), 2008, pp 222-226
Application: Invasive species
Cane toad: known occurrences
Cane toad: areas vulnerable to invasion
Elith et al., Methods in Ecology & Evolution 1, 2010, pp 330-342.
Application: guiding field surveys
Figures by Richard Pearson, AMNH
Target survey areas
Highest priority
Lower priority
Chameleons (Brookesia & Chamaeleo)
Leaf-tailed geckos (Uroplatus)
Day geckos (Phelsuma)
Results: new species of chameleon
Calumma sp. 1
Calumma sp. 2
Results: new species of iguana, snake
Oplurus sp.
Liophidium sp.
and others…
Application: Giant exploding palm
J. Dransfield et al., Botanical Journal of the Linnean Society, 2008, 156, 79-91.
SDM Challenge #2: Non-stationarity
• Problem: predictor-response relationships can change over space and time
• A solution: Spatial-Temporal Exploratory Models (STEM)
– Create ensembles with local spatial/temporal support
– Base learner = classification trees
• eBird
– Citizen Science
– Dataset publicly available for analysis
– LOTS of data!
• ~3 million observations reported this May
STEM
D. Fink et al., Ecological Applications, 2010, 20(8):2131-47
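The idea of an ensemble with local spatial/temporal support can be sketched in a few lines. This is not the STEM implementation: the slides name classification trees as the base learner, whereas the sketch swaps in a much simpler stand-in (the detection rate inside each block). Block sizes, the synthetic observations and all function names are assumptions made for illustration.

```python
import random
from statistics import mean


def in_block(block, lon, lat, day):
    """True if the point falls inside the block's space-time box."""
    lon0, lat0, day0, size, span = block
    return (lon0 <= lon < lon0 + size and
            lat0 <= lat < lat0 + size and
            day0 <= day < day0 + span)


def fit_stem(observations, extent, n_blocks, size, span, seed=0):
    """observations: (lon, lat, day, detected) tuples.

    Fits one local model per randomly placed block; the stand-in base
    learner is simply the detection rate of the observations in the block.
    """
    rng = random.Random(seed)
    (lon_lo, lon_hi), (lat_lo, lat_hi), (day_lo, day_hi) = extent
    ensemble = []
    for _ in range(n_blocks):
        block = (rng.uniform(lon_lo - size, lon_hi),
                 rng.uniform(lat_lo - size, lat_hi),
                 rng.uniform(day_lo - span, day_hi), size, span)
        local = [det for lon, lat, day, det in observations
                 if in_block(block, lon, lat, day)]
        if local:
            ensemble.append((block, mean(local)))
    return ensemble


def predict(ensemble, lon, lat, day):
    """Average the local models whose block contains the query point."""
    votes = [rate for block, rate in ensemble if in_block(block, lon, lat, day)]
    return mean(votes) if votes else None


# Synthetic, non-stationary data (invented): the species is detected in the
# north during days 90-269 and in the south during the rest of the year.
observations = [(lon, lat, day, int((lat >= 5) == (90 <= day < 270)))
                for lon in range(0, 10, 2)
                for lat in range(10)
                for day in range(0, 360, 30)]

ensemble = fit_stem(observations, ((0, 8), (0, 9), (0, 330)),
                    n_blocks=1500, size=4, span=60)
summer_north = predict(ensemble, 5, 8, 180)
winter_north = predict(ensemble, 5, 8, 0)
print(summer_north, winter_north)
```

A single global detection rate would be about 0.5 everywhere in this toy data; averaging only the blocks that cover the query point recovers the seasonal switch.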
STEM SDM: Indigo Bunting
Animation courtesy of Daniel Fink
Vignettes: Data → Models → Policies
Reserve planning for Protea Dispersal
~300 endemic species in the fynbos of the Western Cape of S. Africa
Suitable conditions will shift under climate change
Limited dispersal ability (ants, rodents…)
Modeled distributions of Protea lacticolor
Source: Hannah et al., BioScience, 2005
Shifting suitable conditions
Interpretation: a patch of suitable conditions moving slowly enough to support the species over time
Dispersal chain:
– Sequence of suitable cells (one per time slice)
– Physical distance between cells limited by dispersal ability
The goal: find disjoint dispersal chains for each species:
– At least 35 (100 km²) chains per species, if possible
Minimize the number of cells proposed for protection
– Union of all chains, not counting already protected cells
P. Williams et al., Conservation Biology 19(4), pp. 1063-1074, 2005
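The dispersal-chain definition above translates directly into a validity check. In this sketch cells are represented by the (x, y) coordinates of their centres in km, and "disjoint" is taken to mean that no two chains use the same cell in the same time slice; both choices, and the function names, are assumptions for illustration.

```python
import math


def is_dispersal_chain(chain, suitable, max_step_km):
    """chain: one (x_km, y_km) cell centre per time slice.
    suitable: list of sets of cell centres, one set per time slice.
    """
    if len(chain) != len(suitable):
        return False
    # Every cell must be suitable in its own time slice.
    if any(cell not in cells for cell, cells in zip(chain, suitable)):
        return False
    # Consecutive cells must be within the species' dispersal distance.
    return all(math.dist(a, b) <= max_step_km
               for a, b in zip(chain, chain[1:]))


def are_disjoint(chains):
    """True if no two chains use the same cell in the same time slice."""
    return all(len(set(per_slice)) == len(per_slice)
               for per_slice in zip(*chains))


# Toy example (invented coordinates), three time slices.
suitable = [{(0, 0), (0, 2)}, {(1, 0), (1, 2)}, {(2, 0), (4, 2)}]
chain_a = [(0, 0), (1, 0), (2, 0)]
chain_b = [(0, 2), (1, 2), (4, 2)]  # last hop is 3 km

print(is_dispersal_chain(chain_a, suitable, max_step_km=1.5))
print(is_dispersal_chain(chain_b, suitable, max_step_km=1.5))
print(are_disjoint([chain_a, chain_b]))
```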
Dispersal as network flow in a layered graph
Figure legend: node = cell suitable for the species in this time slice; arc = dispersal possibility
Path from source to sink = dispersal chain for one species
With unit capacity arcs, an integral flow of size 35 represents a set of 35 non-overlapping chains
S. J. Phillips et al., Ecological Applications 18(5), 2008, pp. 1200-1211
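To make the layered-graph idea concrete, the sketch below counts the maximum number of non-overlapping chains for one species with a simple augmenting-path maximum flow. It is an illustration under assumptions, not the construction from the cited paper: each (cell, slice) node is split into an "in" and an "out" copy joined by a unit-capacity arc so that no two chains share a cell in the same slice, and `can_disperse` is a hypothetical user-supplied predicate.

```python
from collections import defaultdict, deque


def max_disjoint_chains(suitable, can_disperse):
    """suitable: list of sets of cells, one set per time slice.
    can_disperse(a, b): True if the species can move from a to b in one step.
    Returns the maximum number of chains sharing no (cell, slice) node.
    """
    cap = defaultdict(int)   # residual capacity of each arc (u, v)
    adj = defaultdict(set)   # neighbours in the residual graph

    def add_arc(u, v):
        cap[(u, v)] += 1
        adj[u].add(v)
        adj[v].add(u)        # reverse arc, initially with zero capacity

    last = len(suitable) - 1
    for t, cells in enumerate(suitable):
        for c in cells:
            add_arc(("in", c, t), ("out", c, t))   # unit node capacity
            if t == 0:
                add_arc("source", ("in", c, t))
            if t == last:
                add_arc(("out", c, t), "sink")
            else:
                for d in suitable[t + 1]:
                    if can_disperse(c, d):
                        add_arc(("out", c, t), ("in", d, t + 1))

    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {"source": None}
        queue = deque(["source"])
        while queue and "sink" not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if "sink" not in parent:
            return flow
        # Push one unit of flow back along the path found.
        v = "sink"
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1


def neighbours(a, b):
    """Toy dispersal rule: move at most one grid cell in each direction."""
    return abs(a[0] - b[0]) <= 1 and abs(a[1] - b[1]) <= 1


# Toy slices (invented), cells given as (row, col).
slices = [{(0, 0), (0, 1), (0, 2)}, {(1, 0), (1, 2)}, {(2, 0), (2, 1), (2, 3)}]
bottleneck = [{(0, 0), (0, 1), (0, 2)}, {(1, 1)}, {(2, 0), (2, 1), (2, 2)}]

print(max_disjoint_chains(slices, neighbours))
print(max_disjoint_chains(bottleneck, neighbours))
```

Each unit of flow from source to sink traces one dispersal chain, so the flow value is the number of chains the landscape can support for that species.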
Solution: network flow and linear programming
• Flow conservation constraints are linear
• Integer variables: one 0/1 “preserve” variable per cell
• Exact solution of MIP:
– Minimum possible number of protected cells to achieve the conservation goal
Light grey: transformed
Green: already protected
Black: goal essential
Orange: MIP solution
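One way to write such a mixed integer program is given below. This is a sketch under assumptions rather than the formulation from the cited papers. The notation is introduced here: $C$ is the set of cells, $P \subseteq C$ the already protected cells, $p_c$ the 0/1 preserve variable for cell $c$, $x^s_a$ the flow of species $s$ on arc $a$ of its layered graph, $\delta^-(v)$ and $\delta^+(v)$ the arcs entering and leaving node $v$, $(c,t)$ the node for cell $c$ in time slice $t$, and $k_s$ the target number of chains for species $s$ (35 where achievable).

```latex
\begin{align*}
\min \quad & \sum_{c \in C \setminus P} p_c \\
\text{s.t.} \quad
& \sum_{a \in \delta^-(v)} x^s_a = \sum_{a \in \delta^+(v)} x^s_a
  && \forall s,\ \forall v \notin \{\text{source}_s, \text{sink}_s\} \\
& \sum_{a \in \delta^+(\text{source}_s)} x^s_a = k_s
  && \forall s \\
& \sum_{a \in \delta^-((c,t))} x^s_a \le p_c
  && \forall s,\ \forall (c,t) \\
& 0 \le x^s_a \le 1, \qquad p_c \in \{0,1\}, \qquad p_c = 1 \ \ \forall c \in P
\end{align*}
```

The third constraint lets a species route at most one chain through a cell in a given slice, and only if that cell is preserved. Only the $p_c$ need to be integer: once they are fixed, each species' problem is an ordinary network flow with integer capacities, so an integral flow of the same value exists.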
Questions?