
Journal of Animal Ecology - 2010 - Duchesne - Mixed conditional logistic regression for habitat selection studies

Journal of Animal Ecology 2010, 79, 548–555
doi: 10.1111/j.1365-2656.2010.01670.x
Mixed conditional logistic regression for habitat
selection studies
Thierry Duchesne1*, Daniel Fortin2 and Nicolas Courbin2
1Département de Mathématiques et de Statistique, Université Laval, Sainte-Foy, QC, Canada G1V 0A6; and 2Chaire de Recherche Industrielle CRSNG-Université Laval en Sylviculture et Faune, Département de Biologie, Université Laval, Sainte-Foy, QC, Canada G1V 0A6
Summary
1. Resource selection functions (RSFs) are becoming a dominant tool in habitat selection studies.
RSF coefficients can be estimated with unconditional (standard) and conditional logistic regressions. While the advantage of mixed-effects models is recognized for standard logistic regression,
mixed conditional logistic regression remains largely overlooked in ecological studies.
2. We demonstrate the significance of mixed conditional logistic regression for habitat selection
studies. First, we use spatially explicit models to illustrate how mixed-effects RSFs can be useful in
the presence of inter-individual heterogeneity in selection and when the assumption of independence from irrelevant alternatives (IIA) is violated. The IIA hypothesis states that the strength of
preference for habitat type A over habitat type B does not depend on the other habitat types also
available. Secondly, we demonstrate the significance of mixed-effects models to evaluate habitat
selection of free-ranging bison Bison bison.
3. When movement rules were homogeneous among individuals and the IIA assumption was
respected, fixed-effects RSFs adequately described habitat selection by simulated animals. In situations violating the inter-individual homogeneity and IIA assumptions, however, RSFs were best
estimated with mixed-effects regressions, and fixed-effects models could even provide faulty conclusions.
4. Mixed-effects models indicate that bison did not select farmlands, but exhibited strong inter-individual variations in their response to farmlands. Less than half of the bison preferred farmlands over forests. Conversely, the fixed-effect model simply suggested an overall selection for
farmlands.
5. Conditional logistic regression is recognized as a powerful approach to evaluate habitat selection when resource availability changes. This regression is increasingly used in ecological studies,
but almost exclusively in the context of fixed-effects models. Fitness maximization can imply differences in trade-offs among individuals, which can yield inter-individual differences in selection
and lead to departure from IIA. These situations are best modelled with mixed-effects models.
Mixed-effects conditional logistic regression should become a valuable tool for ecological
research.
Key-words: case–control location sampling, farmland, Global Positioning System, likelihood-ratio test, mixed multinomial logit model, Prince Albert National Park, Spatially Explicit Landscape Event Simulator
*Correspondence author. E-mail: thierry.duchesne@mat.ulaval.ca
Introduction
The resource selection function (RSF) is currently one of the
dominant tools used to quantify habitat selection (McLoughlin et al. 2010). RSFs link animal distribution to spatial
patterns of habitat heterogeneity by contrasting the characteristics of animal locations with those of a set of random
locations (Manly et al. 2002). Random locations are often
drawn across home-ranges of individuals (Compton, Rhymer & McCollough 2002), in which case observed (response
variable coded as ones) and random (response variable coded
as zeros) locations are generally contrasted with unconditional logistic regressions. Under such a sampling design,
however, estimation methods must consider that a certain
2010 The Authors. Journal compilation 2010 British Ecological Society
number of random locations might have been visited, in
which case they do not all represent true absences (Keating &
Cherry 2004; Johnson et al. 2006). The use of a matched
design can then become advantageous. With a matched
design, each observed location is associated with a specific set
of random locations drawn within a limited spatial domain
(Boyce 2006), often corresponding to the distance where the
animal could have travelled during the relocation time interval (Boyce et al. 2003). Because the animal could not also be
at the random locations when its actual location was
acquired, random locations represent true absences. Furthermore, matched designs are appropriate when evaluating the
habitat selection of animals with home-ranges that are either
not well defined or large relative to the distance individuals
move between relocations (Arthur et al. 1996; Compton
et al. 2002). RSFs based on a matched design are estimated
by conditional logistic regression (Compton et al. 2002; Boyce et al. 2003; Boyce 2006; McDonald et al. 2006), an
approach that is becoming increasingly used in habitat selection analysis.
Despite difficulties in assigning a variance–covariance
structure (Craiu, Duchesne & Fortin 2008; Koper &
Manseau 2009), the value of random effects in RSFs has been
largely recognized in the case of models developed from
non-matched designs (Gillies et al. 2006; Hebblewhite &
Merrill 2008). Mixed effects should be better suited to analyse
unbalanced data sets or when selection for the different landscape attributes varies among individuals (Gillies et al. 2006).
Moreover, mixed-effects models can handle the situation
where several matched sets of locations come from the same
animal and are thus correlated.
The addition of random effects also provides these advantages in studies based on matched sampling designs, but
mixed-effects conditional logistic regressions have been largely overlooked in ecological research (but see Bruun &
Smith 2003; Fortin et al. 2009). Moreover, unlike mixed-effects conditional logistic regression, fixed-effects models
rely on assumptions that might not faithfully represent certain ecological systems. Fixed-effects models assume that the
strength of selection is homogeneous among individuals
within the population and thus estimate the population-averaged selection. Fixed-effects conditional logistic regression
also implies independence from irrelevant alternatives (IIA,
Revelt & Train 1998). The IIA hypothesis states that the
strength of preference for (i.e. the odds of choosing) habitat
type A over habitat type B does not depend on the other habitat types also available. Because behavioural decisions reflect
trade-offs among multiple competing demands, changes in
available options may alter individual preferences, thereby
violating the IIA assumption. For example, prey often make
greater use of patches located in relatively safe areas (Hay &
Fuller 1981; Morrison et al. 2004; Hochman & Kotler 2007).
The foraging efforts and selectivity of Nubian Ibex (Capra
nubiana F. Cuvier, 1825) vary with the distance from the
safety of a cliff (Hochman & Kotler 2007). In other words,
the strength of preference for a given type of food patch over
the baseline patch type depends on the presence or absence of
a cliff at close proximity, a spatial dependency that might violate the IIA hypothesis. In this context, fixed effects may yield
inappropriate conclusions, potentially leading to unfavourable management actions.
In this study, we illustrate how departures from the
assumption of homogeneous selection among individuals
due either to inter-individual variability in movement rules
or to the violation of the IIA assumption may influence the
estimation of RSF parameters under a matched sampling
design. We begin by showing how mixed effects can be
incorporated into a conditional logistic regression model. We
follow an approach based on random utility theory (Cooper
& Millspaugh 1999) because it is easily interpretable in the
resource selection context and because the exponential form
of the RSF is robust to misspecification (McFadden & Train
2000). We then explain why, unlike the fixed-effects model,
its mixed-effects counterpart remains appropriate under
some types of violation of IIA. We follow with a simulation-based investigation of the impact of departures from the
homogeneity in selection probabilities and from the IIA
assumption on the estimation of RSFs. Finally, we illustrate
the methods with an analysis of habitat selection by the free-ranging bison Bison bison (Linnaeus, 1758) of Prince Albert
National Park (Saskatchewan, Canada) during the springs
of 2005–2008. In the spring, bison occasionally leave the
park for adjacent private lands where they sometimes damage fences and crops, disturb livestock, and get killed by
hunters. It can be beneficial for management to evaluate
whether bison use of farmlands results from an active selection, and to quantify whether cross-boundary movements
are made by few individuals or whether it is a widespread
behaviour. We show that conditional mixed-effects RSFs
are better suited than marginal fixed-effects RSFs to achieve
this goal.
Materials and methods
RANDOM EFFECTS IN CONDITIONAL LOGISTIC
REGRESSION
As with ordinary (unconditional) logistic regression, random effects
can be included in conditional logistic regression models by replacing
fixed regression coefficients with random coefficients. Because of the
conditioning involved, conditional models have no intercept term
and random effects are included as random regression coefficients. In
the resulting mixed multinomial logit model (sensu Revelt & Train
1998), each animal assigns a value, termed utility (U), to all landscape
locations, and among the locations available at a given time selects
the one with the highest utility (Cooper & Millspaugh 1999; McDonald et al. 2006). Let n = 1,…,K represent the individuals, $t = 1,\ldots,t_n$ the time steps for individual n and j = 1,…,J the available locations (or a sample of all available locations, McDonald et al. 2006) for animal n at time step t. The mixed multinomial logit model considers utilities as random variables, with $U_{njt}$ being the utility that animal n assigns to the jth location available at time step t. Let $x_{njt1},\ldots,x_{njtm}$ represent the values of m covariates (e.g. habitat attributes) measured at the jth location available to animal n at time step t. Now let us assume that the utility assigned to a location depends on its attributes, viz.
$$
\begin{aligned}
U_{njt} &= \beta_1 x_{njt1} + \beta_2 x_{njt2} + \cdots + \beta_m x_{njtm} + b_{n1} z_{njt1} + \cdots + b_{nq} z_{njtq} + \varepsilon_{njt} \\
&= x_{njt}'\beta + z_{njt}'b + \varepsilon_{njt},
\end{aligned}
\tag{1}
$$
where $\beta_1,\ldots,\beta_m$ are the fixed regression coefficients, $b_{n1},\ldots,b_{nq}$ are animal-level random effects, $z_{njt1},\ldots,z_{njtq}$ are fixed values specifying the structure of the random effects (usually equal to the subset of the covariates $x_{njti}$ for which coefficients are random), $\varepsilon_{njt}$ are independent and identically distributed random error terms, $\beta = (\beta_1,\ldots,\beta_m)'$, $x_{njt} = (x_{njt1},\ldots,x_{njtm})'$, $b = (b_{n1},\ldots,b_{nq})'$ and $z_{njt} = (z_{njt1},\ldots,z_{njtq})'$.
We make the assumption that the random errors follow an extreme
value distribution, which reduces to the usual exponential RSF when
there are no random effects (see below); this assumption is mild and
the model thereby specified is very flexible (McFadden & Train 2000).
Let the random effects b be independent and identically distributed with density $f(b;\theta)$, with $\theta$ a vector of unknown parameters. The probability that an animal chooses location j within the set of J locations {1, 2,…,J}, i.e. $U_{njt} > U_{nit}$ for all $i \neq j$, is
$$
P(x_{njt}) = \int \frac{\exp(x_{njt}'\beta + z_{njt}'b)}{\sum_{i=1}^{J} \exp(x_{nit}'\beta + z_{nit}'b)}\, f(b;\theta)\, db.
\tag{2}
$$
Though the distribution of the random effects is typically chosen as
the multivariate normal distribution with mean vector 0 and variance–covariance parameters to be estimated (Gillies et al. 2006; Hebblewhite & Merrill 2008), other distributions such as the lognormal,
uniform or triangular can be used (Bhat 2001). When all $z_{njti}$ in eqn (1) take on value zero or when the variance of b is null (i.e. b is identically 0), eqn (2) simplifies to
$$
P(x_{njt}) = \frac{\exp(x_{njt}'\beta)}{\sum_{i=1}^{J} \exp(x_{nit}'\beta)},
\tag{3}
$$
and we get the ordinary (i.e. fixed effects) conditional logistic regression model (McDonald et al. 2006).
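The link between the utility formulation of eqn (1) and the exponential form of eqn (3) can be checked numerically. The following is a minimal Python sketch, not the authors' code: it assumes a single matched set of three locations with two indicator covariates and arbitrary coefficients (all values are hypothetical), computes the probabilities of eqn (3), and compares them with the share of simulated draws in which each location has the highest utility when the errors follow a standard Gumbel (extreme value) distribution.

```python
import math
import random


def conditional_logit_probs(x_rows, beta):
    """Eqn (3): choice probabilities for the locations of one matched set."""
    scores = [sum(b * x for b, x in zip(beta, row)) for row in x_rows]
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def simulated_choice_shares(x_rows, beta, n_draws, rng):
    """Share of draws in which each location has the highest utility
    U_j = x_j'beta + e_j, with e_j drawn from a standard Gumbel distribution."""
    counts = [0] * len(x_rows)
    for _ in range(n_draws):
        utilities = []
        for row in x_rows:
            u = rng.random()
            while u == 0.0:  # random() can return 0.0, which log cannot take
                u = rng.random()
            error = -math.log(-math.log(u))  # inverse-CDF Gumbel draw
            utilities.append(sum(b * x for b, x in zip(beta, row)) + error)
        counts[utilities.index(max(utilities))] += 1
    return [c / n_draws for c in counts]


# Hypothetical matched set: covariates are (food patch, cliff) indicators.
locations = [(1, 0), (0, 1), (0, 0)]
beta = (0.5, -0.3)  # hypothetical coefficients, not estimates from the study

exact = conditional_logit_probs(locations, beta)
simulated = simulated_choice_shares(locations, beta, 100_000, random.Random(1))
```

With enough draws, the simulated shares should approach the probabilities of eqn (3), which is the sense in which the extreme value assumption "reduces to the usual exponential RSF".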
RANDOM EFFECTS, HETEROGENEITY IN SELECTION
AND THE INDEPENDENCE FROM IRRELEVANT
ALTERNATIVES
The addition of individual-level random effects in RSFs relaxes the
assumption of homogeneous selection among animals. For example,
adding an animal-level random regression coefficient allows for
inter-individual variations in the response to covariate x, which
means that each individual may respond differently to changes in x.
Because the random effects are unobserved random variables that
are common to all the locations of a given individual, the mixed-effects model does not assume that the observations of that individual are uncorrelated (Revelt & Train 1998). Note that, though they
do not explicitly model the animal-level heterogeneity, fixed-effects
models estimated by methods such as generalized estimating equations can handle correlated matched sets (Craiu et al. 2008).
The mixed multinomial logit model relaxes the IIA assumption,
but only at a population level. It does so by inducing correlation over
alternatives in the stochastic portion of utility (Revelt & Train 1998;
Skrondal & Rabe-Hesketh 2003). To illustrate this, we considered a
forager, such as the Nubian Ibex (Hochman & Kotler 2007),
responding to spatial patterns of risk. Suppose that each location is of one of three types, which we code using covariates $x_{jP}$ and $x_{jC}$: location j may be a risky food patch, coded as $x_{jP} = 1$, $x_{jC} = 0$; a safe cliff, coded as $x_{jP} = 0$, $x_{jC} = 1$; or a baseline habitat that offers no food or protection, coded as $x_{jP} = 0$, $x_{jC} = 0$. We assume that J > 2 locations are available and that location j = 1 is a food patch ($x_{1P} = 1$, $x_{1C} = 0$), and location j = 2 is the baseline habitat ($x_{2P} = 0$, $x_{2C} = 0$). If we assume a fixed-effect conditional logistic regression model (McDonald et al. 2006) with RSF proportional to $\exp(\beta_P x_{jP} + \beta_C x_{jC})$, then the ratio of the probability that the animal selects location j = 1 to the probability that the same animal (or another animal chosen at random) selects location j = 2 is given by
$$
\frac{\exp(\beta_P) \Big/ \sum_{j=1}^{J} \exp(\beta_P x_{jP} + \beta_C x_{jC})}{\exp(0) \Big/ \sum_{j=1}^{J} \exp(\beta_P x_{jP} + \beta_C x_{jC})} = \frac{\exp(\beta_P)}{1} = \exp(\beta_P),
\tag{4}
$$
which does not depend on whether there is a cliff among the other available locations. Now let us assume the same model, but this time with a random slope $\beta_P + b$ for covariate $x_P$ instead of the fixed slope $\beta_P$. Because b remains fixed for all the locations of a given animal, the ratio of the probability that the animal chooses location j = 1 to the probability that it selects location j = 2 is given by
$$
\frac{\exp(\beta_P + b) \Big/ \sum_{j=1}^{J} \exp\{(\beta_P + b) x_{jP} + \beta_C x_{jC}\}}{\exp(0) \Big/ \sum_{j=1}^{J} \exp\{(\beta_P + b) x_{jP} + \beta_C x_{jC}\}} = \frac{\exp(\beta_P + b)}{1} = \exp(\beta_P + b),
$$
which still does not depend on the attributes of the alternate locations, but it depends on b, the unobserved animal-specific random effect. Now, if we consider the ratio of the probability that an animal chosen at random selects location j = 1 to the probability that another animal, again chosen at random, selects location j = 2, then we get
$$
\frac{\displaystyle\int \frac{\exp(\beta_P + b)}{\sum_{j=1}^{J} \exp\{(\beta_P + b) x_{jP} + \beta_C x_{jC}\}}\, f(b)\, db}{\displaystyle\int \frac{\exp(0)}{\sum_{j=1}^{J} \exp\{(\beta_P + b) x_{jP} + \beta_C x_{jC}\}}\, f(b)\, db}
= \exp(\beta_P) \left[ \frac{\displaystyle\int \frac{\exp(b)}{\sum_{j=1}^{J} \exp\{(\beta_P + b) x_{jP} + \beta_C x_{jC}\}}\, f(b)\, db}{\displaystyle\int \frac{1}{\sum_{j=1}^{J} \exp\{(\beta_P + b) x_{jP} + \beta_C x_{jC}\}}\, f(b)\, db} \right],
$$
where the quantity in square brackets now depends on the characteristics of all available locations (Train 2003). This model thus
relaxes the IIA assumption at the population level. In other words,
by adding random coefficients in the conditional logistic regression
model, the population-averaged probability of choosing a given habitat type depends on the local alternatives.
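The contrast between the individual-level and population-level odds can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes a normally distributed random slope b with mean 0, hypothetical values for $\beta_P$, $\beta_C$ and the SD of b, two hypothetical three-location choice sets (with and without a cliff), and a simple midpoint rule for the integrals over b. None of these values comes from the study.

```python
import math


def normal_pdf(b, sd):
    """Density of N(0, sd^2) evaluated at b."""
    return math.exp(-0.5 * (b / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def choice_probs(locations, slope_p, beta_c):
    """Conditional logit probabilities for locations coded as (xP, xC)."""
    weights = [math.exp(slope_p * xp + beta_c * xc) for xp, xc in locations]
    total = sum(weights)
    return [w / total for w in weights]


def population_odds(locations, beta_p, beta_c, sd, n_grid=2001):
    """Ratio of the population-averaged probabilities of choosing
    location 0 (food patch) and location 1 (baseline habitat)."""
    if sd == 0.0:  # no random effect: ordinary fixed-effects model
        probs = choice_probs(locations, beta_p, beta_c)
        return probs[0] / probs[1]
    lower, upper = -8.0 * sd, 8.0 * sd
    step = (upper - lower) / n_grid
    numerator = denominator = 0.0
    for k in range(n_grid):  # midpoint rule over the random effect b
        b = lower + (k + 0.5) * step
        weight = normal_pdf(b, sd) * step
        probs = choice_probs(locations, beta_p + b, beta_c)
        numerator += probs[0] * weight
        denominator += probs[1] * weight
    return numerator / denominator


# Hypothetical choice sets: food patch, baseline, and a third location that
# is either another baseline cell or a cliff.
without_cliff = [(1, 0), (0, 0), (0, 0)]
with_cliff = [(1, 0), (0, 0), (0, 1)]
beta_p, beta_c, sd = 0.5, 2.0, 1.5  # hypothetical parameter values

fixed_without = population_odds(without_cliff, beta_p, beta_c, 0.0)
fixed_with = population_odds(with_cliff, beta_p, beta_c, 0.0)
mixed_without = population_odds(without_cliff, beta_p, beta_c, sd)
mixed_with = population_odds(with_cliff, beta_p, beta_c, sd)
```

Under these assumptions the fixed-effects odds equal $\exp(\beta_P)$ in both choice sets, whereas the population-averaged odds under the random slope differ depending on whether a cliff is available.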
ESTIMATION AND INFERENCE
We now consider maximum likelihood estimation of the parameters
of the model described by eqns (1) and (2) on the basis of data
obtained with a matched sampling design. To simplify the notation
and without loss of generality, we assume that the location chosen by
animal n at time step t among the J available locations is assigned
label j = 1 (and thus the locations not chosen are assigned labels
j = 2, 3,…,J). Maximum-likelihood estimates of the RSF and random effects distribution parameters are obtained by finding the values of $\beta$ and $\theta$ maximizing:
$$
L(\beta, \theta) = \prod_{n=1}^{K} \int \prod_{t=1}^{t_n} \frac{\exp(x_{n1t}'\beta + z_{n1t}'b)}{\sum_{j=1}^{J} \exp(x_{njt}'\beta + z_{njt}'b)}\, f(b;\theta)\, db.
\tag{5}
$$
Because eqn (5) is a valid likelihood function, any likelihood-based
inference method for $\beta$, such as Wald confidence intervals based on
inverting the Hessian of the negative log-likelihood, likelihood-ratio
tests, or AIC-based model selection can be applied (McFadden &
Train 2000).
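As a small worked example of the Wald interval mentioned above, the sketch below assumes that a coefficient estimate and its standard error (from the inverse Hessian) are already available and uses the usual normal critical value of 1.96. The function name is hypothetical; the numbers are the fixed-effects H1 estimate and SE reported for scenario 1 in Table 1.

```python
def wald_ci(estimate, se, z=1.96):
    """Wald 95% confidence interval: estimate -/+ z * standard error."""
    return estimate - z * se, estimate + z * se


# Fixed-effects H1 coefficient and SE for scenario 1, as reported in Table 1.
lower, upper = wald_ci(-0.908, 0.031)
```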
According to parsimony principles, the need for random effects in
RSFs should be assessed. If random effects are not needed, then
fixed-effects conditional regression would improve estimation efficiency and model interpretability (Verbeke & Molenberghs 2000).
A fixed-effects model can be considered as a special case of a mixed-effects model where the variance and covariance parameters in $f(b;\theta)$
are zero. A likelihood-ratio test for nested models can thus be used to
evaluate the need to increase model complexity through the use of
random effects. The likelihood-ratio statistic that tests whether the
fixed-effects model is reasonable is given by $r = 2(\ell_1 - \ell_0)$, where $\ell_0$ and $\ell_1$ are the values of the maximized log-likelihoods of the fixed- and mixed-effects models, respectively. Because the value zero is on the boundary of the parameter space for variance parameters, the P-value is not simply based on the usual chi-squared distribution but rather on a mixture of chi-squared distributions, with the number of chi-squared variables in the mixture and their respective numbers of degrees of freedom depending on the structure of the variance and covariance parameters set to zero (Verbeke & Molenberghs 2000). Consider for example a mixed-effects model with a single random effect b with distribution $N(0, \sigma^2)$. The likelihood-ratio statistic to test whether b is needed follows, under the null model, a mixture of two chi-squared distributions with zero and one degree of freedom, respectively. This reduces the P-value to $0.5\,\Pr[\chi^2_1 > r]$, with $\chi^2_1$ representing a chi-squared random variable with 1 degree of freedom.
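The single-random-effect case can be computed in a few lines. This is a minimal sketch rather than the authors' implementation: it uses the identity $\Pr[\chi^2_1 > r] = \mathrm{erfc}(\sqrt{r/2})$, the function names are hypothetical, and the two log-likelihoods are the scenario 2 values reported in Table 1.

```python
import math


def chi2_1_upper_tail(r):
    """Pr[chi-squared with 1 d.f. > r], via the complementary error function."""
    return math.erfc(math.sqrt(r / 2.0))


def lrt_single_random_effect(loglik_fixed, loglik_mixed):
    """Likelihood-ratio statistic and boundary-corrected P-value for testing
    whether one random coefficient (one variance parameter) is needed."""
    r = 2.0 * (loglik_mixed - loglik_fixed)
    r = max(r, 0.0)  # guard against tiny negative values from optimizer noise
    p_value = 0.5 * chi2_1_upper_tail(r)
    return r, p_value


# Maximized log-likelihoods for scenario 2, as reported in Table 1.
r_stat, p_value = lrt_single_random_effect(-22023.359, -21999.292)
```

Halving the chi-squared tail probability reflects the fact that the variance is tested on the boundary of its parameter space; for models with several variance and covariance parameters the mixture is different and this shortcut does not apply.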
Direct numerical maximization of $L(\beta, \theta)$ given by eqn (5) can be difficult, as it involves integrals that cannot be solved analytically. The numerical maximization of the likelihood is often more likely to converge for a fixed-effects RSF than for its mixed-effects counterpart. Bhat (2001) described simulation methods based on Halton quasi-random numbers that can efficiently evaluate the likelihood function. Maximization of the likelihood from eqn (5) can be implemented with this method using the mxlmsl package (Train 2006) for MATLAB R2008a (MathWorks Inc. 2008). We provide the MATLAB code used for our bison case study in Appendix S3.
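To make the simulation idea concrete, the following Python sketch evaluates a simulated version of the log of eqn (5) for a model with a single normally distributed random coefficient. It is an illustration only and is not the mxlmsl code: the data layout, function names and example data are hypothetical, the same Halton draws are reused for every animal for simplicity, and no maximization step is included.

```python
import math
from statistics import NormalDist


def halton(index, base):
    """index-th element (index >= 1) of the Halton sequence in a given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def simulated_loglik(animals, beta, sd, n_draws=200, skip=10):
    """Simulated log-likelihood with one random coefficient b ~ N(0, sd^2).

    animals: list of animals; each animal is a list of matched sets; each
    matched set is a list of (x, z) pairs with the chosen location first,
    where x is a tuple of covariates and z the covariate with a random slope.
    """
    std_normal = NormalDist()
    # Halton points mapped to normal draws (same draws for all animals here).
    draws = [sd * std_normal.inv_cdf(halton(r, 2))
             for r in range(skip + 1, skip + n_draws + 1)]
    total = 0.0
    for matched_sets in animals:
        mean_prob = 0.0
        for b in draws:
            prob = 1.0  # product over the matched sets of this animal
            for locations in matched_sets:
                scores = [sum(bk * xk for bk, xk in zip(beta, x)) + b * z
                          for x, z in locations]
                top = max(scores)
                denom = sum(math.exp(s - top) for s in scores)
                prob *= math.exp(scores[0] - top) / denom
            mean_prob += prob / len(draws)
        total += math.log(mean_prob)
    return total


# Hypothetical data: one covariate, which also carries the random slope.
example_animals = [
    [[((1.0,), 1.0), ((0.0,), 0.0), ((0.0,), 0.0)],
     [((0.0,), 0.0), ((1.0,), 1.0), ((0.0,), 0.0)]],
    [[((1.0,), 1.0), ((0.0,), 0.0), ((1.0,), 1.0)]],
]
loglik_fixed = simulated_loglik(example_animals, (0.5,), 0.0)
loglik_mixed = simulated_loglik(example_animals, (0.5,), 1.0)
```

With the SD set to zero the function returns the ordinary fixed-effects conditional log-likelihood. For animals with many matched sets, the product of probabilities can underflow, so a production implementation would work on the log scale.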
There are other, albeit less direct, means of maximizing the likelihood from eqn (5). Chen & Kuo (2001) showed how to build a nonlinear Poisson model with random effects whose likelihood is
equivalent to a closely related multinomial formulation of eqn (5).
The required Poisson model can be fitted by maximum likelihood,
where the integrals are evaluated with adaptive Gaussian quadrature
or penalized quasi-likelihood. Bruun & Smith (2003) used the latter
approach to evaluate habitat selection by European starlings (Sturnus vulgaris Linnaeus, 1758). Mixed conditional logistic regression
models can also be fitted with Bayesian methods, but the approach
then requires specifying prior distributions (informative or not) for $\beta$ and $\theta$. R.V. Craiu, T. Duchesne, D. Fortin & S. Baillargeon (unpublished
data) propose a numerically stable and efficient two-step method
that gives accurate approximations to the maximum-likelihood estimates for mixed-effects conditional logistic regression. Perhaps
methods based on the results of the first step (i.e. separate models fitted to each animal) of such a two-step approach could help in determining whether the need for random effects arises from between-animal heterogeneity or the violation of IIA.
Example 1: Simulation of patch selection under
predation risk
We use computer simulations to investigate the effect of departures
from the assumption of homogeneous habitat selection among individuals. Deviations from the assumption were induced by imposing
inter-individual variations in movement rules and by forcing movement decisions that violated the IIA assumption. Individual-based,
spatially explicit modelling was conducted using the Spatially Explicit Landscape Event Simulator (Fall & Fall 2001). We simulated the
movements of 200 virtual foragers, with each individual starting
(time 0) at a random location within the landscape (1000 × 1000
cells), and followed for 50 consecutive moves. Landscapes comprised
four types of randomly distributed habitat patches: Patch type H1
offered the most food, followed by H2. Neither H3 nor H4 offered
any food. H1 was risky, unless located <15 cells from H3, in which
case H1 became safe. H2 was always safe.
We tested four scenarios differing in the movement rules of individuals, with distinct statistical implications. Movements for scenarios 1 and 2 were both consistent with the IIA hypothesis; scenario 1
assumed a homogeneous movement rule, whereas scenario 2
involved inter-individual variation in the rules. Scenarios 3 and 4
both led to violation of the IIA hypothesis at the individual level,
because the preference for H1 over H2 depended on whether H3
occurs within 15 cells; a homogeneous movement rule was used for
scenario 3 whereas inter-individual variation in movement rules characterized scenario 4. The movement rules as well as the landscape
used for each of the four scenarios are described in detail in Appendix S1. To assess the effect of varying patch availability on inferences, scenario 3 was applied to five additional landscapes, where the
proportions of H1 and H2 remained unchanged but those of H3 and
H4 varied according to Landscape 1: 0.01%, 69.99%, Landscape 2:
0.02%, 69.98%, Landscape 3: 0.03%, 69.97%, Landscape 4: 0.05%,
69.95%, and Landscape 5: 0.06%, 69.94%, respectively.
In all scenarios, each observed location was matched to 10 locations randomly drawn within a 30-cell radius, which was enough to
encompass all step distances (Forester, Im & Rathouz 2009). Patch
type (H1–H4) was identified at all observed and random locations.
Fixed- and mixed-effects conditional logistic regressions were used to
build RSFs. Mixed-effects RSFs allowed the coefficient of H1 to vary
among individuals according to $N(\beta_1, \sigma^2)$. In all models, H2 was used
as the baseline patch type. Models were fitted by maximizing the likelihood given by eqn (5) using a publicly available MATLAB R2008a
(MathWorks Inc. 2008) package (Train 2006).
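The matched sampling step described above (10 random locations within a 30-cell radius of each observed location) can be sketched as follows. This is an assumption-laden illustration, not the simulator's code: the function name is hypothetical, and the choices to sample with replacement, to allow the observed cell itself to be drawn, and to reject cells falling outside the landscape are ours, since the text does not specify them.

```python
import random


def sample_matched_locations(observed, radius, n_random, grid_size, rng):
    """Draw n_random cells uniformly among the grid cells lying within
    `radius` cells of `observed` and inside a grid_size x grid_size landscape."""
    x0, y0 = observed
    drawn = []
    while len(drawn) < n_random:
        dx = rng.randint(-radius, radius)
        dy = rng.randint(-radius, radius)
        if dx * dx + dy * dy > radius * radius:
            continue  # outside the circular sampling domain
        x, y = x0 + dx, y0 + dy
        if 0 <= x < grid_size and 0 <= y < grid_size:
            drawn.append((x, y))  # keep only cells inside the landscape
    return drawn


# One observed location in the middle of a 1000 x 1000 landscape,
# and one in a corner to exercise the edge handling.
matched = sample_matched_locations((500, 500), 30, 10, 1000, random.Random(0))
matched_at_edge = sample_matched_locations((0, 0), 30, 10, 1000, random.Random(0))
```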
Example 2: Habitat selection by free-ranging bison
The field study was conducted in the springs of 2005–2008 (9 March–
31 May 2005, 1 March–31 May in 2006 and 2007, and 1 March–10
March 2008) in Prince Albert National Park, where the bison population was comprised of 385 individuals. The bison range is mostly
composed of forests (85%), meadows (10%) and water bodies (5%).
The range is adjacent to farmlands, where bison are occasionally
found.
We followed 24 female bison equipped with Global Positioning
System collars (GPS collar 4400M from Lotek Engineering, Newmarket, ON, Canada) taking locations at 06:00 and 18:00 hours.
Each observed location was paired with 10 random locations sampled within a 1.6-km radius circle (>90% of all travelled distances
between relocations).
Land-cover types at observed and random locations were characterized based on classified Landsat ETM+ satellite images (Fortin
et al. 2009). Land-cover types were (i) meadow, including areas near
lakes and rivers dominated by grasses, forbs and sedges (MEADOW);
(ii) riparian areas largely comprising shrubs and located near streams
and rivers (RIPARIAN); (iii) forest consisting of deciduous, conifer
and mixed stands (FOREST); (iv) water bodies (WATER); (v) road
including the areas located <15 m from a human-made trail or a road
(ROAD); and (vi) farmlands (AGRIC). Fixed- and mixed-effects conditional logistic regressions fitted by maximum likelihood were used to
build RSFs. Random effects assuming $N(0, \sigma^2)$ were investigated for
AGRIC, with FOREST as the baseline land-cover type.
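Once a normally distributed random coefficient has been fitted, the share of individuals expected to have a positive coefficient follows directly from the normal distribution function. The sketch below is a minimal illustration with a hypothetical function name; the mean and SD are the farmland estimates reported in Table 2, and the calculation assumes that the fitted normal distribution describes the population of collared females.

```python
from statistics import NormalDist


def share_positive(mean, sd):
    """Proportion of a N(mean, sd^2) population of coefficients above zero."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(0.0)


# Mean and SD of the random farmland coefficient, as reported in Table 2.
share_farmland = share_positive(-0.275, 1.243)
```

Under these assumptions the share is about 41%, which matches the proportion of female bison with a positive farmland coefficient reported in the Results.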
Results
EXAMPLE 1: SIMULATION OF PATCH SELECTION UNDER
PREDATION RISK
Scenario 1 represented a situation where the IIA hypothesis was valid and where the movement strategy was fixed within the population of simulated foragers. As expected in such cases, the fixed- and mixed-effects RSFs yielded a similar coefficient estimate for H1 of −0.91 ± 0.03 (±SE) (Table 1), which agrees with the theoretical approximation (Appendix S2). Moreover, the standard deviation (SD) of the random coefficient associated with the mixed-effects model did not differ significantly from 0 (likelihood-ratio test: P = 0.46), indicating that a random coefficient for H1 was not required (Table 1).
The IIA assumption remained valid in scenario 2, but this time each animal had a different probability of choosing H1. In this context, the mixed-effects RSF received greater empirical support (Table 1) as its random coefficient was an important addition to the model fit (likelihood-ratio test: P < 0.0001).
We now consider a situation (scenario 3) where all individuals displayed the same movement strategy, but where the IIA assumption was violated because the odds of choosing H1 depended on whether a refuge patch H3 was at close proximity. Selection coefficients for H1 were then systematically lower when estimated by mixed-effects conditional logistic regression than by their fixed-effects counterpart (Fig. 1). Whether H1 was selected or avoided remained generally consistent with both models, with the exception of when H3 made up 0.04% of the landscape. In this case, the fixed-effects RSF suggested a significant selection for H1, whereas the better fitting (likelihood-ratio test: P < 0.0001) mixed-effects model revealed that the average simulated forager had no overall selection for H1 (Fig. 1). In the most complex scenario 4, virtual foragers not only violated the premise of IIA, but the strength of selection for H1 also differed among them. Modelling habitat selection under this scenario required, once again, the use of a random coefficient for H1 (likelihood-ratio test: P < 0.0001).

Table 1. Patch selection estimated by fixed- or mixed-effects conditional logistic regressions with normally distributed coefficients, for virtual foragers travelling in landscapes according to four scenarios. The scenarios differed depending on whether movement rules were similar among all individuals of the population and whether the assumption of independence from irrelevant alternatives (IIA) was violated. H2 was the baseline patch type in all resource selection functions

| Variable | Fixed-effects model: β | SE | 95% CI | Mixed-effects model: β | SE | 95% CI |
|---|---|---|---|---|---|---|
| Scenario 1: no inter-individual variation, IIA assumption respected | | | | | | |
| Fixed coefficient, H1 | −0.908 | 0.031 | −0.969, −0.847 | – | – | – |
| Fixed coefficient, H4 | −1.520 | 0.024 | −1.567, −1.473 | −1.520 | 0.024 | −1.567, −1.473 |
| Random coefficient, H1 | – | – | – | −0.908 | 0.031 | −0.969, −0.847 |
| SD of coefficient | – | – | | 0.000 | 0.174 | |
| Max. log likelihood | −22 030.021 | | | −22 030.020 | | |
| Scenario 2: inter-individual variation, IIA assumption respected | | | | | | |
| Fixed coefficient, H1 | −0.835 | 0.031 | −0.896, −0.774 | – | – | – |
| Fixed coefficient, H4 | −1.528 | 0.024 | −1.575, −1.481 | −1.528 | 0.024 | −1.575, −1.481 |
| Random coefficient, H1 | – | – | – | −0.873 | 0.041 | −0.953, −0.793 |
| SD of coefficient | – | – | | 0.368 | 0.041 | |
| Max. log likelihood | −22 023.359 | | | −21 999.292 | | |
| Scenario 3: no inter-individual variation, IIA assumption violated | | | | | | |
| Fixed coefficient, H1 | 0.073 | 0.027 | 0.020, 0.126 | – | – | – |
| Fixed coefficient, H3 | −0.736 | 0.528 | −1.771, 0.299 | −0.712 | 0.530 | −1.751, 0.327 |
| Fixed coefficient, H4 | −1.458 | 0.026 | −1.509, −1.407 | −1.462 | 0.027 | −1.515, −1.409 |
| Random coefficient, H1 | – | – | – | 0.006 | 0.060 | −0.112, 0.124 |
| SD of coefficient | – | – | | 0.752 | 0.046 | |
| Max. log likelihood | −21 557.935 | | | −21 273.523 | | |
| Scenario 4: inter-individual variation, IIA assumption violated | | | | | | |
| Fixed coefficient, H1 | 0.019 | 0.028 | −0.036, 0.074 | – | – | – |
| Fixed coefficient, H3 | −1.540 | 0.724 | −2.959, −0.121 | −1.461 | 0.726 | −2.884, −0.038 |
| Fixed coefficient, H4 | −1.454 | 0.026 | −1.505, −1.403 | −1.450 | 0.026 | −1.501, −1.399 |
| Random coefficient, H1 | – | – | – | −0.062 | 0.061 | −0.182, 0.058 |
| SD of coefficient | – | – | | 0.764 | 0.047 | |
| Max. log likelihood | −21 655.263 | | | −21 362.713 | | |

Fig. 1. Changes in the selection coefficient (±95% confidence intervals) for patch type H1 by simulated foragers as a function of the percentage of the landscape comprising refuge patch H3, as assessed by resource selection functions estimated from fixed- or mixed-effects conditional logistic regression. Simulations were made according to scenario 3 where the probability that a forager selects H1 compared to H2 increased when a refuge H3 was in close proximity. We also indicated the expected proportion of the population having a positive coefficient for patch type H1, based on the $N(\hat{\beta}, \hat{\sigma}^2)$ estimate of the distribution of $\beta + b$. Notice that values for fixed and random coefficients were slightly offset from one another to increase clarity.

EXAMPLE 2: HABITAT SELECTION OF FREE-RANGING
BISON
Compared to the forest matrix, female bison selected meadows, water bodies and roads, but displayed no preference for riparian areas (Table 2). The response of bison to farmlands differed depending on whether fixed- or mixed-effects RSFs were used. The population-averaged fixed-effects RSF indicated a general selection for farmlands over forest areas, whereas the mixed-effects model revealed that bison had no preference for one land-cover type over the other. The mixed-effects model provided a better depiction of bison selection than the fixed-effects RSF (likelihood-ratio test: P < 0.0001). The mixed-effects RSF revealed important heterogeneity in the response to farmlands within the population (Table 2), with 41% (N[−0.275, 1.538]) of female bison having a positive selection coefficient for farmlands.
Discussion
We used spatially explicit simulations to demonstrate how
mixed-effects conditional logistic regression can capture
inter-individual variation in selection induced by differences
in movement rules among simulated foragers and by the presence or absence of refuge patches (which led to the violation
of the IIA assumption). When the relative preference of
resource patches was the same for all individuals and IIA was
true (scenario 1), the fixed- and mixed-effects models estimated almost identical regression coefficients. Fixed-effects
RSFs then provided an accurate representation of habitat
selection within the population and were more parsimonious
than mixed-effects RSFs. In contrast, when habitat selection
probabilities varied among individuals but the IIA was still a
valid assumption (scenario 2), the likelihood-ratio test indicated that the selection for H1 varied significantly within the
population, thereby rejecting the fixed-effects model. These
conclusions for scenario 2 also held under scenarios 3 (no
inter-individual variation in selection and violation of the
IIA assumption) and 4 (inter-individual variability in selection and violation of the IIA assumption). In these cases,
RSFs that include random effects gave a more accurate representation of habitat selection in the population.
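The likelihood-ratio comparison between fixed- and mixed-effects RSFs can be sketched in a few lines. The example below uses the two maximized log-likelihoods reported in Table 2 for the bison data. It assumes that the mixed model adds a single parameter (the SD of the random coefficient) and that a chi-square distribution with 1 degree of freedom is an acceptable reference distribution. The function name `lr_test_1df` is hypothetical and is not taken from the authors' code.

```python
import math

def lr_test_1df(loglik_null, loglik_alt):
    """Likelihood-ratio statistic and p-value for one extra parameter.

    Assumes a chi-square reference distribution with 1 degree of freedom.
    For 1 df the survival function has the closed form erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (loglik_alt - loglik_null)
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value

# Maximized log-likelihoods as reported in Table 2.
loglik_fixed = -5947.846   # fixed-effects model
loglik_mixed = -5930.033   # mixed-effects model (random farmland coefficient)

lr_stat, lr_p = lr_test_1df(loglik_fixed, loglik_mixed)
print(f"LR statistic = {lr_stat:.3f}, p = {lr_p:.2e}")
```

Because the null hypothesis places the SD on the boundary of its parameter space, a 50:50 mixture of chi-square distributions with 0 and 1 degrees of freedom is sometimes used instead; that choice would halve the p-value and would not change the conclusion here.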
Table 2. Resource selection functions for radiocollared female bison in Prince Albert National Park during the springs of 2005–2008, as estimated with fixed- or mixed-effects conditional logistic regressions, with normally distributed coefficients

| Variable | Fixed-effects model: β | Fixed-effects model: SE | Fixed-effects model: 95% CI | Mixed-effects model: β | Mixed-effects model: SE | Mixed-effects model: 95% CI |
|---|---|---|---|---|---|---|
| Fixed coefficient | | | | | | |
| Meadow | 2.024 | 0.046 | 1.934, 2.114 | 2.024 | 0.046 | 1.934, 2.114 |
| Water | 0.399 | 0.094 | 0.215, 0.583 | 0.401 | 0.094 | 0.217, 0.585 |
| Riparian area | −0.315 | 0.163 | −0.635, 0.005 | −0.301 | 0.163 | −0.620, 0.018 |
| Road | 0.942 | 0.143 | 0.663, 1.222 | 0.953 | 0.143 | 0.673, 1.233 |
| Farmlands | 0.348 | 0.118 | 0.117, 0.579 | – | – | – |
| Random coefficient | | | | | | |
| Farmlands | – | – | – | −0.275 | 0.377 | −1.014, 0.464 |
| H4 | −1.520 | 0.024 | −1.567, −1.473 | −1.520 | 0.024 | −1.567, −1.473 |
| SD of coefficient | – | – | | 1.243 | 0.344 | |
| Max. log likelihood | −5947.846 | | | −5930.033 | | |
| Likelihood-ratio test | | | | P < 0.0001 | | |

The simulation study also demonstrated that individual-level heterogeneity can be identified and taken into account
in RSFs, even when data are collected under a matched sampling design. Furthermore, the simulations (i.e. scenario 3)
stress that inter-individual variations in movement rules are
only one potential source of heterogeneity which may entail
the use of mixed-effects conditional logistic regression to analyse habitat selection data gathered from matched sampling
designs. The trade-offs between food intake and predator
avoidance can shape movement decisions, potentially leading
to the violation of the IIA assumption. A faulty assumption
of IIA may introduce sufficient heterogeneity in the response
of animals to their habitat for random effects to be needed to
adequately model animal distribution in response to spatial
heterogeneity. Situations where the observed selection
violates the IIA assumption can still be modelled with
fixed-effects models when animals are homogeneous in their
landscape preference. In this situation, the strength of preference for one habitat type over another depends on available
alternatives and this dependence has to be modelled correctly
and explicitly in the RSF using proper interaction terms. This
precise knowledge is likely to be missing a priori in many
studies, and mixed-effects models offer a robust safeguard in
such cases.
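The IIA property discussed above is built into the fixed-effects conditional logit: the ratio of the choice probabilities of two habitat types equals exp(u_A − u_B), whatever else is available. The sketch below illustrates this with hypothetical utilities for three habitat types (A, B and a refuge-like type C). The numbers and the function name `choice_probs` are illustrative assumptions, not values from the study.

```python
import math

def choice_probs(utilities):
    """Conditional logit choice probabilities for one choice set.

    `utilities` maps each available habitat type to its linear predictor.
    """
    top = max(utilities.values())                      # subtract max for stability
    weights = {k: math.exp(u - top) for k, u in utilities.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# Hypothetical utilities; C is only available in the second choice set.
without_c = choice_probs({"A": 1.0, "B": 0.2})
with_c = choice_probs({"A": 1.0, "B": 0.2, "C": 2.5})

ratio_without_c = without_c["A"] / without_c["B"]
ratio_with_c = with_c["A"] / with_c["B"]
print(ratio_without_c, ratio_with_c)   # both equal exp(1.0 - 0.2)
```

Adding C lowers the probabilities of both A and B but leaves their ratio unchanged. If animals behave differently (for example, if the presence of a refuge alters the relative use of A and B), a fixed-effects model without suitable interaction terms cannot reproduce that pattern.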
Findings from the simulations imply that the heterogeneity
in selection for farmlands expressed by the female bison of
Prince Albert National Park can be due to several factors,
including inter-individual variations in movement decisions
and the violation of the IIA assumption. Mixed-effects logistic regression can conveniently handle both sources of heterogeneity and thereby provide a robust framework for
ecological inference. We concurrently modelled the response
of bison to multiple habitat attributes before drawing conclusions about their response to farmlands. For example, we
found that bison selected roads, as well as meadows where
individuals can find large quantities of high-quality food
(Fortin, Fryxell & Pilote 2002; Craiu et al. 2008; Fortin et al.
2009). Fixed- and mixed-effects RSFs then revealed distinct responses of bison to farmlands. Fixed-effects models
implied that bison generally made selective use of farmlands,
whereas the mixed-effects RSFs refuted this assessment by
revealing heterogeneous selection for farmlands. A likelihood-ratio test revealed that the mixed-effects RSF was superior to its fixed-effects counterpart. We thus conclude that
the problem of cross-boundary movements is linked to a subset, though a fairly large one, of individuals within the population, with c. 40% of female bison making selective use of
farmlands. The mixed-effects RSF thus gives park managers
a very different picture from the general selection for farmlands that was implied by the population-averaged fixed-effects RSF. Solving human-wildlife conflicts may depend on
whether the problem originates from a restricted number of
individuals. In this case, the translocation of ‘problematic’
individuals can be the solution (Sukumar 1991; Jones &
Nealson 2003). On the other hand, this management
approach might not be as effective when all members of the
population adopt an ‘unacceptable’ behaviour. Management
or conservation actions should be tailored to the nature
of the problem, and mixed-effects models are often better
suited than fixed-effects models to evaluate adequately the
situation.
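One way to arrive at a proportion of this size is to treat the individual farmland coefficients as normally distributed with the mean and SD reported in Table 2, and to compute the share of that distribution lying above zero. This is a sketch under that assumption, equating "selective use" with a positive coefficient. It is not necessarily the authors' exact calculation.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Mean and SD of the random farmland coefficient, as reported in Table 2.
mean_coef = -0.275
sd_coef = 1.243

# Share of individuals whose coefficient is positive (selection for farmlands).
prop_selecting = 1.0 - normal_cdf((0.0 - mean_coef) / sd_coef)
print(f"Proportion with a positive farmland coefficient: {prop_selecting:.2f}")
```

The result is about 0.41, in line with the c. 40% quoted in the text.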
Our study stressed how drawing robust inference from
RSFs may require the use of random effects in conditional
logistic regression models. We demonstrated how fixed and
mixed conditional logistic regression can lead to different
conclusions about animal–habitat interactions. Our simulations illustrated that in some situations models with random
coefficients, which yield individual-specific inferences, can
provide a more accurate assessment of resource selection by
animals compared with fixed-effects models that provide
population-averaged inference (Fieberg et al. 2009; Koper &
Manseau 2009). We found that the selection for agricultural
lands by the population of free-ranging bison of Prince
Albert National Park can differ depending on whether random coefficients are used or not. Such differences could have
important management and conservation implications.
Indeed, habitat selection is commonly used to identify critical
resources (Arthur et al. 1996), suitable habitat (Fortin et al.
2008), response to anthropogenic disturbances (Hebblewhite
& Merrill 2008) and ecological consequences of species reintroduction (Whittaker & Lindzey 2004; Mao et al. 2005). A
biased assessment of habitat selection may therefore result in
inadequate management or conservation actions. Matched
sampling designs and conditional logistic regressions are
increasingly used in ecological research (e.g. for RSFs, Boyce
2006; for step selection functions, Fortin et al. 2005), and
fixed-effects models may lead to mistaken inferences about
selection whenever hypotheses such as IIA or homogeneous
strength of selection among animals are not respected. We
suggest that mixed-effects conditional logistic regression
should become a valuable, and sometimes necessary, statistical tool for valid inference in ecological research.
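For readers who want to see how a random coefficient enters a conditional logistic likelihood, the following is a minimal sketch of a simulated log-likelihood. It is not the authors' MATLAB code. It assumes one normally distributed random coefficient, drawn once per animal and shared across that animal's strata, and approximates the integral over that coefficient by plain Monte Carlo averaging. All names (`stratum_prob`, `simulated_loglik`, `toy_animals`) and the toy data are hypothetical.

```python
import math
import random

def stratum_prob(beta, used, available):
    """Conditional logit probability of the used location within one stratum."""
    def score(x):
        return sum(b * xi for b, xi in zip(beta, x))
    scores = [score(used)] + [score(x) for x in available]
    top = max(scores)                                  # subtract max for stability
    denom = sum(math.exp(s - top) for s in scores)
    return math.exp(scores[0] - top) / denom

def simulated_loglik(fixed_beta, mu, sigma, animals, n_draws=500, seed=1):
    """Simulated log-likelihood of a mixed conditional logit.

    The last covariate of every location carries the random coefficient,
    assumed N(mu, sigma^2). `animals` is a list of animals; each animal is
    a list of strata given as (used, available) pairs.
    """
    rng = random.Random(seed)
    total = 0.0
    for strata in animals:
        acc = 0.0
        for _ in range(n_draws):
            b_i = rng.gauss(mu, sigma)                 # one draw per animal
            beta = list(fixed_beta) + [b_i]
            p = 1.0
            for used, available in strata:
                # A plain product can underflow with many strata; real data
                # would call for sums of logs and a log-sum-exp over draws.
                p *= stratum_prob(beta, used, available)
            acc += p
        total += math.log(acc / n_draws)
    return total

# Toy data: covariates are [fixed covariate, covariate with random coefficient].
toy_animals = [
    [([1.0, 0.0], [[0.0, 1.0], [0.0, 0.0]]),
     ([0.0, 1.0], [[1.0, 0.0], [0.0, 0.0]])],
    [([0.0, 1.0], [[1.0, 1.0], [0.0, 0.0]])],
]
ll_no_heterogeneity = simulated_loglik([0.5], -0.3, 0.0, toy_animals, n_draws=50)
ll_heterogeneous = simulated_loglik([0.5], -0.3, 1.2, toy_animals, n_draws=2000)
print(ll_no_heterogeneity, ll_heterogeneous)
```

With `sigma` set to zero every draw equals `mu`, so the function reduces to the ordinary fixed-effects conditional log-likelihood. In practice the function would be maximized over `fixed_beta`, `mu` and `sigma` with a numerical optimizer, keeping the same random draws across iterations.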
Acknowledgements
Funding for this study was provided by Parks Canada Species at Risks Recovery Action and Education Fund, a program supported by the National Strategy for the Protection of Species at Risk, Natural Sciences and Engineering
Research Council of Canada, Canada Foundation for Innovation, and l’Université Laval. We are grateful to L. O’Brodovich and D. Frandsen, M.-E. Fortin, K. Dancose and S. Courant for their assistance in the field, and to Pierre
Racine for his help with SELES, and James Hodson for his editorial comments
on the study.
References
Arthur, S.M., Manly, B.F.J., McDonald, L.L. & Garner, G.W. (1996) Assessing habitat selection when availability changes. Ecology, 77, 215–227.
Bhat, C.R. (2001) Quasi-random maximum simulated likelihood estimation of
the mixed multinomial logit model. Transportation Research Part B-Methodological, 35, 677–693.
Boyce, M.S. (2006) Scale for resource selection functions. Diversity and Distributions, 12, 269–276.
Boyce, M.S., Mao, J.S., Merrill, E.H., Fortin, D., Turner, M.G., Fryxell, J. &
Turchin, P. (2003) Scale and heterogeneity in habitat selection by elk in Yellowstone National Park. Ecoscience, 10, 421–431.
Bruun, M. & Smith, H.G. (2003) Landscape composition affects habitat use
and foraging flight distances in breeding European starlings. Biological Conservation, 114, 179–187.
Chen, Z. & Kuo, L. (2001) A note on the estimation of the multinomial logit
model with random effects. The American Statistician, 55, 89–95.
Compton, B.W., Rhymer, J.M. & McCollough, M. (2002) Habitat selection by
wood turtles (Clemmys insculpta): an application of paired logistic regression. Ecology, 83, 833–843.
Cooper, A.B. & Millspaugh, J.J. (1999) The application of discrete choice models to wildlife resource selection studies. Ecology, 80, 566–575.
Craiu, R.V., Duchesne, T. & Fortin, D. (2008) Inference methods for the conditional logistic regression model with longitudinal data. Biometrical Journal,
50, 97–109.
Fall, A. & Fall, J. (2001) A domain-specific language for models of landscape
dynamics. Ecological Modelling, 141, 1–18.
Fieberg, J., Rieger, R.H., Zicus, M.C. & Schildcrout, J.S. (2009) Regression
modelling of correlated data in ecology: subject-specific and population
averaged response patterns. Journal of Applied Ecology, 46, 1018–1025.
Forester, J.D., Im, H.K. & Rathouz, P.J. (2009) Accounting for animal movement in estimation of resource selection functions: sampling and data analysis. Ecology, 90, 3554–3565.
Fortin, D., Fryxell, J.M. & Pilote, R. (2002) The temporal scale of foraging
decisions in bison. Ecology, 83, 970–982.
Fortin, D., Beyer, H.L., Boyce, M.S., Smith, D.W., Duchesne, T. & Mao, J.S.
(2005) Wolves influence elk movements: behavior shapes a trophic cascade
in Yellowstone National Park. Ecology, 86, 1320–1330.
Fortin, D., Courtois, R., Etcheverry, P., Dussault, C. & Gingras, A. (2008)
Winter selection of landscapes by woodland caribou: behavioural response
to geographical gradients in habitat attributes. Journal of Applied Ecology,
45, 1392–1400.
Fortin, D., Fortin, M.E., Beyer, H.L., Duchesne, T., Courant, S. & Dancose,
K. (2009) Group-size-mediated habitat selection and group fusion-fission
dynamics of bison under predation risk. Ecology, 90, 2480–2490.
Gillies, C.S., Hebblewhite, M., Nielsen, S.E., Krawchuk, M.A., Aldridge, C.L.,
Frair, J.L., Saher, D.J., Stevens, C.E. & Jerde, C.L. (2006) Application of
random effects to the study of resource selection by animals. Journal of Animal Ecology, 75, 887–898.
Hay, M.E. & Fuller, P.J. (1981) Seed escape from heteromyid rodents – the
importance of microhabitat and seed preference. Ecology, 62, 1395–1399.
Hebblewhite, M. & Merrill, E. (2008) Modelling wildlife–human relationships
for social species with mixed-effects resource selection models. Journal of
Applied Ecology, 45, 834–844.
Hochman, V. & Kotler, B.P. (2007) Patch use, apprehension, and vigilance
behavior of Nubian Ibex under perceived risk of predation. Behavioral Ecology, 18, 368–374.
Johnson, C.J., Nielsen, S.E., Merrill, E.H., McDonald, T.L. & Boyce, M.S.
(2006) Resource selection functions based on use-availability data: theoretical motivation and evaluation methods. Journal of Wildlife Management, 70,
347–357.
Jones, N.D. & Nealson, T. (2003) Management of aggressive Australian magpies by translocation. Wildlife Research, 30, 167–177.
Keating, K.A. & Cherry, S. (2004) Use and interpretation of logistic regression in habitat selection studies. Journal of Wildlife Management, 68,
774–789.
Koper, N. & Manseau, M. (2009) Generalized estimating equations and generalized linear mixed-effects models for modelling resource selection. Journal
of Applied Ecology, 46, 590–599.
Manly, B.F.J., McDonald, L.L., Thomas, D.L., McDonald, T.L. & Erickson,
W.P. (2002) Resource Selection by Animals: Statistical Design and Analysis
for Field Studies, 2nd edn. Kluwer Academic, Dordrecht.
Mao, J.S., Boyce, M.S., Smith, D.W., Singer, F.J., Vales, D.J., Vore, J.M. &
Merrill, E.H. (2005) Habitat selection by elk before and after wolf reintroduction in Yellowstone National Park. Journal of Wildlife Management, 69,
1691–1707.
MathWorks Inc. (2008) MATLAB Software: The Language of Technical
Computing, Version R2008a. MathWorks Inc., Natick, MA, USA.
McDonald, T.L., Manly, B.F.J., Nielson, R.M. & Diller, L.V. (2006) Discrete-choice modelling in wildlife studies exemplified by Northern Spotted
Owl nighttime habitat selection. Journal of Wildlife Management, 70,
375–383.
McFadden, D. & Train, K. (2000) Mixed MNL models for discrete response.
Journal of Applied Econometrics, 15, 447–470.
McLoughlin, P.D., Morris, D.W., Fortin, D., Vander Wal, E. & Contasti, A.L.
(2010) Considering ecological dynamics in resource selection functions. Journal of Animal Ecology, 79, 4–12.
Morrison, S., Barton, L., Caputa, P. & Hik, D.S. (2004) Forage selection by
collared pikas, Ochotona collaris, under varying degrees of predation risk.
Canadian Journal of Zoology, 82, 533–540.
Revelt, D. & Train, K. (1998) Mixed logit with repeated choices: households’
choices of appliance efficiency level. Review of Economics and Statistics, 80,
647–657.
Skrondal, A. & Rabe-Hesketh, S. (2003) Multilevel logistic regression for polytomous data and rankings. Psychometrika, 68, 267–287.
Sukumar, R. (1991) The management of large mammals in relation to male
strategies and conflict with people. Biological Conservation, 55, 93–102.
Train, K.E. (2003) Discrete Choice Methods with Simulation. Cambridge University Press, Edinburgh.
Train, K.E. (2006) Mixed Logit Estimation by Maximum Simulated Likelihood.
Matlab package. Available at: http://elsa.berkeley.edu/Software/abstracts/
train1006mxlmsl.html, accessed 3 February 2010.
Verbeke, G. & Molenberghs, G. (2000) Linear Mixed Models for Longitudinal
Data. Springer-Verlag, New York.
Whittaker, D.G. & Lindzey, F.G. (2004) Habitat use patterns of sympatric deer
species on Rocky Mountain Arsenal, Colorado. Wildlife Society Bulletin, 32,
1114–1123.
Received 26 October 2009; accepted 15 January 2010
Handling Editor: Fanie Pelletier
Supporting Information
Additional Supporting Information may be found in the online version of this article.
Appendix S1. Detailed description of the four simulation scenarios
used to assess the effect of heterogeneous habitat selection among
animals.
Appendix S2. Calculation of the long-run probabilities of being in a
given patch type and theoretical value of the RSF coefficient for
patch type H1 under simulation scenario 1.
Appendix S3. MATLAB code to estimate mixed-effects resource selection function for the free-ranging bison of Prince Albert National
Park, Saskatchewan, Canada.
As a service to our authors and readers, this journal provides supporting information supplied by the authors. Such materials may be
re-organized for online delivery, but are not copy-edited or typeset.
Technical support issues arising from supporting information (other
than missing files) should be addressed to the authors.