Multi-scale Analysis: Options for Modeling Presence/Absence of Bird Species Kathryn M. Georgitis

Multi-scale Analysis:
Options for Modeling
Presence/Absence of Bird Species
Kathryn M. Georgitis (1), Alix I. Gitelman (1), and Nick Danz (2)
(1) Statistics Department, Oregon State University
(2) Natural Resources Research Institute, University of Minnesota-Duluth
The research described in this presentation has been funded by the U.S.
Environmental Protection Agency through the STAR Cooperative Agreement
CR82-9096-01 Program on Designs and Models for Aquatic Resource Surveys
at Oregon State University. It has not been subjected to the Agency's review
and therefore does not necessarily reflect the views of the Agency, and no
official endorsement should be inferred.
Talk Overview
• Ecological Question of Interest
• Western Great Lakes Breeding Bird Study
• Interesting Features of our Example
• Options for Modeling Species Presence/Absence
(1) Separate Models for Each Spatial Extent
(2) One Model for all Spatial Extents
(3) Model using Functionals of Explanatory Variables
(4) Graphical Model
Ecological Question of Interest
• How does the relationship between
landscape characteristics and presence of
a bird species change with scale?
• What scale is the most useful in terms of
understanding bird presence/absence?
Concentric Circle Sampling Design
[Figure: concentric circles labeled 100m, 500m, and 1000m]
Western Great Lakes Breeding Bird Study
• Response Variable:
– Presence/Absence of Pine Warbler
• Explanatory Variables:
– % land cover within 4 different spatial extents
– Ten land cover types
Interesting Features of the Data
Correlation between Explanatory Variables
Spatial    pine and oak-pine /   lowland non-forest /   n. hardwoods /
Extent     spruce-fir            n. hardwoods           aspen-birch
100m       -0.31 (0.08)          -0.08 (0.08)           -0.07 (0.08)
500m        0.03 (0.08)          -0.17 (0.08)           -0.14 (0.08)
1000m       0.11 (0.08)          -0.24 (0.08)           -0.26 (0.08)
5000m       0.21 (0.08)          -0.58 (0.06)           -0.63 (0.06)
Correlation Between Pine and Oak-Pine
Measured at Different Scales
Spatial Extent   100m   500m          1000m         5000m
100m             1      0.81 (0.05)   0.70 (0.06)   0.45 (0.07)
500m                    1             0.95 (0.03)   0.70 (0.06)
1000m                                 1             0.79 (0.05)
Relationship between Land Cover Variables and Spatial Extent
[Figure: Percentage of Pine and Oak-Pine (0 to 60) plotted against Spatial Extent (m) (0 to 5000) for Chequamegon Forest, Chippewa Forest, St. Croix Forest, and Superior Forest]
Options for Modeling
Presence/Absence of Pine Warbler
(1) Separate Models for Each Spatial Extent
(2) One Model for all Spatial Extents
(3) Model using Functionals of Explanatory Variables
(4) Bayesian Network (Graphical) Model
Option 1: Separate Models Approach
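M_1 (100m): $\log\left(\frac{p}{1-p}\right) = C_1 \beta_1$
M_5 (500m): $\log\left(\frac{p}{1-p}\right) = C_5 \beta_5$
M_10 (1000m): $\log\left(\frac{p}{1-p}\right) = C_{10} \beta_{10}$
M_50 (5000m): $\log\left(\frac{p}{1-p}\right) = C_{50} \beta_{50}$
where
Y denotes the n-length vector of binary responses with $\Pr(Y_i = 1) = p_i$,
$C_1$ denotes the matrix of explanatory variables at the 100m scale
($C_5$, $C_{10}$, and $C_{50}$ likewise at the 500m, 1000m, and 5000m scales)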
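The separate-models approach can be sketched in a few lines of Python. This is an illustration only: the data are synthetic, the sample size and the three cover columns per extent are made up, the intercept column is an assumption, and `fit_logistic` and `bic` are hypothetical helper names rather than anything used in the study.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50, tol=1e-8):
    """Fit logit(p) = [1, X] @ beta by Newton-Raphson; return (beta, log-likelihood)."""
    X = np.column_stack([np.ones(len(y)), X])  # intercept column (an assumption)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)                      # Bernoulli variance weights
        grad = X.T @ (y - p)
        hess = X.T @ (X * w[:, None]) + 1e-8 * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    eta = X @ beta
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    return beta, loglik

def bic(loglik, n_params, n_obs):
    """BIC = -2 log L + k log n."""
    return -2.0 * loglik + n_params * np.log(n_obs)

rng = np.random.default_rng(0)
n = 150  # hypothetical number of stands

# Synthetic stand-ins for C_1, C_5, C_10, C_50 (three cover percentages each).
C = {name: rng.uniform(0, 60, size=(n, 3)) for name in ("M_1", "M_5", "M_10", "M_50")}

# Synthetic presence/absence driven by the first 100m column only.
eta_true = -2.0 + 0.06 * C["M_1"][:, 0]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta_true)))

# One independent fit per spatial extent, as in Option 1.
results = {}
for name, X in C.items():
    beta, loglik = fit_logistic(X, y)
    results[name] = (beta, bic(loglik, len(beta), n))
    print(name, np.round(beta, 3), round(results[name][1], 1))
```

Each fit sees only one extent, so nothing in this setup can express a relationship between scales.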
Option 1: Separate Models Approach
Model   Significant explanatory variables selected using the BIC criterion
M_1     lowland conifer, pine and oak-pine
M_5     lowland conifer, pine and oak-pine, spruce-fir, spruce-fir:pine and oak-pine
M_10    pine and oak-pine, spruce-fir, spruce-fir:pine and oak-pine
M_50    pine and oak-pine, forest (a), forest (a):spruce-fir, spruce-fir
(a) The forest variable is an indicator for stands located in the Chequamegon national forest in Wisconsin.
Option 1: Separate Models Approach
• Disadvantages:
– does not account for possible relationships
between spatial extents
– multi-collinearity of explanatory variables
– 2^10 possible models for each spatial extent
Options for Modeling
Presence/Absence of Pine Warbler
(1) Separate Models for Each Spatial Extent
(2) One Model for all Spatial Extents
(3) Model using Functionals of Explanatory Variables
(4) Bayesian Network (Graphical) Model
Option 2: One Model for all Spatial Extents
M_all: $\log\left(\frac{p}{1-p}\right) = Z_{all} \beta_{all}$
where
Y denotes the n-length vector of binary responses with $\Pr(Y_i = 1) = p_i$,
$Z_{all} = [C_1, C_5, C_{10}]$
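As a rough illustration of how the combined design matrix is assembled, the sketch below stacks three per-extent matrices column-wise. The data are synthetic and deliberately built so that each larger extent resembles the smaller one, mimicking nested sampling circles. The sample size and the 0.8/0.2 mixing weights are assumptions, and the rows are not constrained to sum to 100 as real percentages would be.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 150, 10  # hypothetical number of stands; ten land cover types

# Synthetic cover percentages: each larger circle contains the smaller one,
# so its values are modeled as mostly the smaller-extent value plus noise.
C1 = rng.uniform(0, 100, size=(n, k))
C5 = 0.8 * C1 + 0.2 * rng.uniform(0, 100, size=(n, k))
C10 = 0.8 * C5 + 0.2 * rng.uniform(0, 100, size=(n, k))

# Z_all = [C_1, C_5, C_10]: one row per stand, 30 candidate columns.
Z_all = np.hstack([C1, C5, C10])
print(Z_all.shape)

# Correlation of one cover type with itself across two extents.
r = np.corrcoef(C1[:, 0], C5[:, 0])[0, 1]
print(round(r, 2))

# Condition number of the standardized design as a crude collinearity check.
Zs = (Z_all - Z_all.mean(axis=0)) / Z_all.std(axis=0)
cond = np.linalg.cond(Zs)
print(round(cond, 1))
```

A high cross-extent correlation between columns of the stacked matrix is the kind of multi-collinearity this option has to contend with.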
Option 2: One Model for all Spatial Extents
Spatial extent   Explanatory variables selected using BIC for M_all
100m             aspen-birch, northern hardwoods, pine and oak-pine, spruce-fir
500m             none
1000m            spruce-fir
100m:1000m       pine and oak-pine:spruce-fir
Option 2: One Model for all Spatial Extents
• Advantages:
– allows for interactions between scales
• Disadvantages:
– serious multi-collinearity problems
– 2^30 possible models
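The model counts follow from a simple subset argument, sketched here under the assumption that only main effects are counted and that each candidate variable is either in or out of the model. With ten land cover types at one extent, and thirty columns when three extents are combined:

```latex
\[
  \underbrace{2^{10}}_{\text{one extent}} = 1024,
  \qquad
  \underbrace{2^{3 \times 10}}_{\text{three extents}} = 2^{30} = 1{,}073{,}741{,}824 .
\]
```

Allowing interaction terms, such as the cross-scale interaction selected for this model, would enlarge the candidate set further.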
Options for Modeling
Presence/Absence of Pine Warbler
(1) Separate Models for Each Spatial Extent
(2) One Model for all Spatial Extents
(3) Model using Functionals of Explanatory Variables
(4) Bayesian Network (Graphical) Model
Option 3: Model using Functionals of
Explanatory Variables
• Difference Model
M_diff: $\log\left(\frac{p}{1-p}\right) = Z_{diff} \beta_{diff}$
where $Z_{diff} = C_5 - C_1$ (element-wise)
• Proportional Model
M_prop: $\log\left(\frac{p}{1-p}\right) = Z_{prop} \beta_{prop}$
where $Z_{prop} = C_5 / C_1$ (element-wise)
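A small worked example of the two functionals, using made-up cover percentages for two stands and three cover types. The slides do not say how a zero in the 100m matrix is handled in the ratio; marking those entries as NaN is an assumption made here for illustration.

```python
import numpy as np

# Made-up percentages: rows are stands, columns are land cover types.
C1 = np.array([[10.0, 0.0, 40.0],
               [25.0, 5.0, 20.0]])   # 100m extent
C5 = np.array([[15.0, 2.0, 35.0],
               [20.0, 5.0, 30.0]])   # 500m extent

# Difference functional: Z_diff = C_5 - C_1, element-wise.
Z_diff = C5 - C1

# Proportional functional: Z_prop = C_5 / C_1, element-wise.
# Entries with C_1 == 0 are left as NaN (an assumption, not from the slides).
Z_prop = np.full_like(C1, np.nan)
np.divide(C5, C1, out=Z_prop, where=C1 != 0)

print(Z_diff)
print(Z_prop)
```

The undefined ratio for a cover type that is absent at 100m is one practical reason to question whether the proportional functional is biologically meaningful.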
Option 3: Model using Functionals of
Explanatory Variables
Model    Explanatory variables selected using BIC
M_diff   pine and oak-pine_diff
M_prop   aspen-birch_prop, pine and oak-pine_prop
Option 3: Model using Functionals of
Explanatory Variables
• Advantages:
– incorporates two spatial extents
• Disadvantages:
– biologically meaningful?
– multi-collinearity
– model selection
Options for Modeling
Presence/Absence of Pine Warbler
(1) Separate Models for Each Spatial Extent
(2) One Model for all Spatial Extents
(3) Model using Functionals of Explanatory Variables
(4) Bayesian Network (Graphical) Model
Option 4: Graphical Model
- think of explanatory variables and response
holistically (i.e., as a single multivariate
observation)
[Figure: two diagrams over the nodes X1, X2, X3, X4, and Y, one labeled "Logistic Regression Model" and one labeled "Bayesian Network (Graphical) Model"]
Option 4: Graphical Model
For comparison with M_all, we use the same “explanatory” variables
[Figure: nodes pine & oak-pine 100m, aspen-birch 100m, spruce-fir 1000m, n. hardwoods 100m, spruce-fir 100m, and Pine Warbler]
Option 4: Graphical Model
Diagram of M_all
[Figure: nodes N. hardwoods 100m, aspen-birch 100m, spruce-fir 100m, spruce-fir 1000m, pine & oak-pine 100m, and Pine Warbler]
where Z = variables in M_all
$\log\left(\frac{p}{1-p}\right) = Z \beta_{all}$; fixed Z
Diagram of Bayesian M_all
[Figure: nodes N. hardwoods 100m, aspen-birch 100m, spruce-fir 100m, spruce-fir 1000m, pine & oak-pine 100m, and Pine Warbler]
$Z \sim \mathrm{Multinomial}(P, 100)$
$\log(\text{spruce-fir}_{1000}) \sim N(\mu, \sigma^2)$
$\log\left(\frac{p}{1-p}\right) = Z \beta + \beta_5 \log(\text{spruce-fir}_{1000})$
Option 4: Graphical Model
Comparison of M_all and Bayesian M_all
Land cover type variable                     M_all           Bayesian M_all
intercept                                    -3.87 (1.27)    -4.20 (1.18)
aspen-birch100                                0.02 (0.01)     0.03 (0.01)
northern hardwoods100                         0.03 (0.01)     0.03 (0.01)
pine and oak-pine100                          0.06 (0.01)     0.10 (0.02)
spruce-fir100                                 0.02 (0.01)     0.02 (0.01)
log(spruce-fir1000)                           0.3 (0.44)      0.34 (0.41)
pine and oak-pine100:log(spruce-fir1000)     -0.02 (0.008)   -0.02 (0.008)
Option 4: Graphical Model
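Bayesian M_all
[Figure: nodes N. hardwoods 100m, aspen-birch 100m, spruce-fir 100m, spruce-fir 1000m, pine & oak-pine 100m, and Pine Warbler]
where Z = variables in M_all
$Z \sim \mathrm{Multinomial}(P, 100)$
$\log(\text{spruce-fir}_{1000}) \sim N(\mu, \sigma^2)$
$\log\left(\frac{p}{1-p}\right) = Z \beta + \beta_5 \log(\text{spruce-fir}_{1000})$
Bayesian Network Model
[Figure: nodes N. hardwoods 100m, aspen-birch 100m, spruce-fir 100m, spruce-fir 1000m, pine & oak-pine 100m, and Pine Warbler]
$Z_i \sim \mathrm{Multinomial}(P_i, 100)$
$P_i = (P_{i,1}, P_{i,2}, P_{i,3}, P_{i,4}, P_{i,5})$
$\log\left(\frac{P_{i,1}}{1 - P_{i,1}}\right) = f_0 + f_1 \log(\text{spruce-fir}_{1000})$
$\log(\text{spruce-fir}_{1000}) \sim N(\mu, \sigma^2)$
$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1\,\text{(pine and oak-pine)}_{100}$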
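To make the chain of dependencies in the Bayesian network model concrete, the following sketch simulates data from it node by node. Every numeric value is invented for illustration. The slide specifies only the first multinomial probability, so two further choices here are assumptions: category 1 is treated as pine & oak-pine at 100m, and the remaining probability is split evenly over the other four categories.

```python
import numpy as np

def inv_logit(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(2)
n = 200  # hypothetical number of stands

# Made-up parameter values (not estimates from the study).
m, s = 2.0, 0.8        # mean and sd of log(spruce-fir at 1000m)
f0, f1 = -0.5, -0.4    # link from 1000m spruce-fir to the 100m probability
b0, b1 = -3.0, 0.08    # link from 100m pine & oak-pine to presence

# Node 1: log(spruce-fir_1000) ~ N(m, s^2)
log_sf1000 = rng.normal(m, s, size=n)

# Node 2: Z_i ~ Multinomial(P_i, 100), with logit(P_i1) = f0 + f1 * log(spruce-fir_1000).
P1 = inv_logit(f0 + f1 * log_sf1000)
# Assumption: the other four categories share the remaining probability equally.
P = np.column_stack([P1] + [(1.0 - P1) / 4.0] * 4)
Z = np.array([rng.multinomial(100, p_i) for p_i in P])

# Node 3: presence/absence depends only on the 100m pine & oak-pine count.
p = inv_logit(b0 + b1 * Z[:, 0])
y = rng.binomial(1, p)

print(Z[:3])
print(y[:10])
```

In this structure the 1000m variable influences the response only through the 100m land cover, which is how the network avoids placing correlated covariates side by side in one regression.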
Option 4: Graphical Model
Comparison of two Bayesian Network Models
Component     -2 log likelihood for Bayesian M_all   -2 log likelihood for Bayes Network Model
PIWA          160.9                                  179.4
100m Scale    25699.5                                24478
1000m Scale   379.4                                  379.4
Total         26239.8                                25036.8
BIC total     26354 (13)                             25062 (11)
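Because the joint likelihood of a Bayesian network factors into one term per node, the component values of -2 log likelihood add up to the total. The check below reads the slide's figures as three components per model (PIWA, 100m Scale, 1000m Scale) and confirms the reported totals. The dictionary layout and variable names are illustrative only.

```python
# Component values of -2 log likelihood as reported on the slide.
components = {
    "Bayesian M_all": {"PIWA": 160.9, "100m Scale": 25699.5, "1000m Scale": 379.4},
    "Bayes Network Model": {"PIWA": 179.4, "100m Scale": 24478.0, "1000m Scale": 379.4},
}
reported_total = {"Bayesian M_all": 26239.8, "Bayes Network Model": 25036.8}

# Sum the per-node terms and compare with the reported totals.
totals = {name: sum(parts.values()) for name, parts in components.items()}
for name, total in totals.items():
    print(name, round(total, 1), abs(total - reported_total[name]) < 0.05)

# Where the difference between the two models comes from.
diff_100m = components["Bayesian M_all"]["100m Scale"] - components["Bayes Network Model"]["100m Scale"]
diff_piwa = components["Bayes Network Model"]["PIWA"] - components["Bayesian M_all"]["PIWA"]
print(round(diff_100m, 1), round(diff_piwa, 1))
```

On these figures the network model fits the response node slightly worse (by 18.5) but the 100m land cover node better (by 1221.5), and the 1000m node is identical in both.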
Option 4: Graphical Model
• Advantages:
– considers ecological system holistically
– can eliminate multi-collinearity
– biologically meaningful
• Disadvantages:
– model selection
– implementation issues
Acknowledgements
Don Stevens, OSU
Jerry Niemi, N.R.R.I Univ. of Minn., Duluth
JoAnn Hanowski, N.R.R.I Univ. of Minn., Duluth