“First Law” of Population Dynamics (like Newton’s First Law)

Bio 292: Population Ecology
Instructor: Bill Morris
MW 2:50-4:05pm, 130 BioSci
Fall, 2011
Tentative syllabus:
Week   Topic                                                        Readings in Morris & Doak book
1-2    Density-independent models based on the total number of      Ch. 1-3
       individuals in a population under a randomly varying
       environment; introduction to programming in R
3-4    More complex models based on total population size:          Ch. 4
       negative and positive density dependence, environmental
       autocorrelation, and catastrophes/bonanzas
5-6    Deterministic projection matrix models for age- or           Ch. 6, 7
       size-structured populations
7      Sensitivity analysis for deterministic projection            Ch. 9 to p. 351
       matrix models
8-9    Stochastic projection matrix models; additional              Ch. 8
       complications in matrix models
10     Meta-population and other spatial models                     Ch. 10-11
11-12  Models of species interactions (if there is time)
13     Students present results of research projects
Course requirements:
1) Do problem sets and worksheets (not graded for points; credit/no credit only)
2) Class project:
- create and analyze a population model using your own data or data from the literature
- report results to the class in the last week of the semester
Day 1: Intro and Understanding the effects of variability
Introduction and Goals of the Course:
Definition of a population: the set of individuals of a single species in a defined area
Goals of Population Ecology
To find answers to questions such as:
- Why are there so many (or few) of a given species in a particular place?
- Why do numbers change (or not) over time or space?
- Why do we see the observed ratio of different sized or aged individuals in a population?
- How does a population persist for an indefinite period of time despite the fact that every
individual in the population will die relatively soon (that is, what keeps births and deaths
in approximate balance)?
- What is the probability that a given population will go extinct by a given future time,
AND HOW IS THAT PROBABILITY INFLUENCED BY HUMAN ACTIVITIES?
Why Population Ecology MUST be quantitative:
ALL of the preceding questions involve numbers. To answer them in anything but the
most cursory way requires quantitative tools. In applications, it is often not enough to
know what factors influence, e.g., extinction risk. We need to have RELATIVE answers:
- which factors have the most influence on populations
- which of several populations is most likely to avoid extinction
- which management strategy will reduce extinction risk more?
To provide such answers requires…
Quantitative tools:
1) mathematics
deterministic
stochastic
2) computing languages – MATLAB, C, R
We will learn about both in this course
>>Equivalent of Newton’s Law of Constant Motion:
constant PER-CAPITA growth in absence of “other forces”;
constant per-capita growth means GEOMETRIC increase (or decline)
Simplest population model:
N(t) = no. of individuals in population in year t (literally, at census point t)
Assume births immediately follow the census, then individuals may (or may not) survive
to the next census.
B = average no. of surviving newborns PER CAPITA before the next census
D = fraction of adults dying before the next census (also PER CAPITA)
N(t+1) = B N(t) + (1 – D) N(t)
N(t+1) = (B – D + 1) N(t)
N(t+1) = lambda N(t)
where lambda = (B – D + 1) measures excess of births over deaths
if B = D, lambda = 1, no change in population
if B < D, lambda < 1, population declines
if B > D, lambda > 1, population grows
Prediction: total population size grows or declines GEOMETRICALLY but at a constant
PER-CAPITA rate, lambda
Simulate growth of populations with constant lambda
FIRST, BASIC INTRO TO R
Opening R
Changing directory to a folder for the current project (I always do this first thing)
In Console window:
lam=1.1
lam
N=10
N=lam*N
N
n (error: object ‘n’ not found – R is case sensitive)
Iteration:
N=lam*N; N
alternate up-arrow and Enter
To iterate, better to write a “script” or program:
opening a script
saving a script
running a script
Script will need to:
- repeat N=lam*N tmax times
- use 3 different lams: <1, 1, >1
- plot results vs. time
To do this, we will need to:
- define constants (to make the script generic)
- use matrices: one name for many items N=matrix(c(4,3,6,2,1),1,5)
accessing items: N[3] (=6) using square brackets
- allocate memory to store numbers at each time – e.g., N1=matrix(0,tmax+1,1)
- repeat an action using a “for loop” – for(t in 1:tmax)
what does 1:tmax do?
- use a plotting function – help(matplot) or ?matplot
THEN WRITE FIRST PROGRAM IN R – lambda.const.R
5 (or 6) basic components of every program (or script):
[ clear memory - rm(list=ls(all=TRUE)) ]
define constants
allocate memory to variables ( to speed program )
assign initial values to variables
perform actions (e.g., loops)
output results
# define constants
lam1=0.9
lam2=1
lam3=1.1
tmax=20
n0=10
# allocate memory
N1=N2=N3=matrix(0,tmax+1,1) # could also use N1=N2=N3=numeric(tmax+1)
# initialize variables
N1[1]=N2[1]=N3[1]=n0
# perform iterations
for (t in 1:tmax){
N1[t+1]=lam1*N1[t]
N2[t+1]=lam2*N2[t]
N3[t+1]=lam3*N3[t]
}
# first run above and look at N1, N2, and N3;
# then add the following to plot results and run again
matplot(0:tmax,cbind(N1,N2,N3),xlab="Year, t",ylab="N(t)",main="Lambda Constant",type="b",pch=19)
NEXT:
First deviation from the first law: population growth rate is not constant
Now
N(t+1) = lambda(t) N(t)
lambda(t)>1 in years when B>D
lambda(t)<1 in years when B<D
Simulate uniform variation in population growth rate
Let lambda vary uniformly between lambar – dl and lambar + dl, so that lambar is the
arithmetic mean lambda - GRAPH OF THE PDF OF LAMBDA and how it’s affected by
lambar and dl
In-class Assignment: Write a program – really, modify lambda.const.R - to simulate
multiple population trajectories with lambda varying as described above
We will use it to explore the effect of varying dl (i.e., making lambda more variable)
First, two useful things:
1) Random number generation:
runif(npops,lmin,lmax) >>> generate npops uniform nos. between lmin and lmax
lambar=1
dl=.1
x=runif(5000,lambar-dl,lambar+dl)
hist(x, xlim=c(.5,1.5))
similar syntax for other prob. distributions; e.g. Normal:
x=rnorm(5000,.1,.1)
hist(x, xlim=c(-1,1))
2) Multiplying a row of a matrix by a row vector:
N=matrix(10,10,5)
L=runif(5,.9,1.1)
L
N[2,]=L*N[1,]
N
L=runif(5,.9,1.1)
N[3,]=L*N[2,]
N
Now work on writing your own GENERIC script:
SCRIPT (this is lambda.variable.R):
tmax=50
npops=20
lambar=1.01
dl=.05
n0=10
lmin=lambar-dl
lmax=lambar+dl
n=matrix(0,tmax+1,npops) # 2D array n will hold all npops trajectories
n[1,]=n0
for(t in 1:tmax) n[t+1,]=n[t,]*runif(npops,lmin,lmax) # EXPLAIN THIS LINE: each of the npops populations gets its own random lambda each year
matplot(0:tmax, n, type="l", xlab="Year", ylab="Population size", main="Lambda variable", ylim=c(0,1.05*max(n)))
Start with npops=1 and run a few times.
Then increase the number of populations to 50, then slowly increase dl, and see what
happens
It would be convenient to look at n(t) on the log scale
Here, use stoc.lambda.R instead of modifying the program above
Clearly, lambar doesn’t describe population growth, because most populations can
decline even if lambar>1
What is a better measure of the population growth rate?
Geometric mean as a measure of long-term growth rate
N(1) = L(0) N(0)
N(2) = L(1) N(1) = L(1) L(0) N(0)
N(t) = L(t-1) L(t-2) … L(1) L(0) N(0)
What L (call it Lg), when multiplied by itself t times, would equal L(t-1) L(t-2) … L(1)
L(0)? This is a good measure of AVERAGE ANNUAL GROWTH
Solve Lg^t = [ L(t-1) L(t-2) … L(1) L(0) ] for Lg
Answer: Lg = [ L(t-1) L(t-2) … L(1) L(0) ]^(1/t)
But, this is NOT the arithmetic mean of the L’s, which is:
(1/t) [ L(t-1) + L(t-2) + …+ L(1) + L(0) ] = lambar (or La)
Instead, it is the GEOMETRIC MEAN. That’s why we called it Lg.
The geometric mean is more strongly depressed by an L x units below the arithmetic
mean than it is increased by an L x units above the arith. mean (unlike the arith. mean
itself)
Why? Math fact: rel. small values decrease a product more than rel. large values increase
it.
For example, what if a single L(t) = 0? What is Lg? 0.
Another example showing deviations above and below La are not equal in terms of their
effects on Lg:
La = 1.1; dL = 0.2
Case 1:
even years lambda = La
odd years lambda is La plus dL
Lg=sqrt(1.1 x 1.3 ) = 1.196, which is .096 above La
Case 2:
even years lambda = La
odd years lambda is La minus dL
Lg=sqrt(1.1 x 0.9 ) = 0.995, which is .105 below La
A third way to look at Lg:
A useful approximation for the geometric mean in terms of the arithmetic mean and the
year-to-year variance of lambda, Var(lam) ( Var is average squared deviation from La )
Lg ≈ La exp{ - Var(lam)/ [ 2 La^2 ] }
If Var(lam) = 0, exp{ - Var(lam)/ [ 2 La^2 ] } = 1 and Lg = La. But whenever
Var(lam) > 0, exp{ - Var(lam)/ [ 2 La^2 ] } < 1 and Lg < La
On board: plot Lg vs Var(lam) according to approximation: exponential decline with
increasing Var(lam)
Show that Lg does better at “splitting the difference” between possible trajectories than
does La.
PROGRAM: median.vs.lamg.&.lama.R
This program uses another way to compute Lg:
if Lg = [ L(t-1) L(t-2) … L(1) L(0) ]^(1/t)
then
log Lg = log{ [ L(t-1) L(t-2) … L(1) L(0) ]^(1/t) }
log Lg = (1/t) * [ log{L(t-1)} + log{L(t-2)} + … + log{L(1)} + log{L(0)} ]
= arithmetic mean of log(L)’s
So, Lg = exp{ arithmetic mean of log(L) }
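Since Lg = exp{ arithmetic mean of log(L) }, this is a one-liner in R. A minimal numerical check (the uniform bounds here are illustrative, not from the class data):
L=runif(10000,0.9,1.3)                 # 10000 annual lambdas
La=mean(L)                             # arithmetic mean
Lg=exp(mean(log(L)))                   # geometric mean
c(La, Lg, La*exp(-var(L)/(2*La^2)))    # Lg and the approximation above nearly agree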
Important messages:
1) Environmental stochasticity, by introducing variation in the lambdas, actually
depresses long-term population growth relative to the prediction of lambar. We can even
have a population that has an (arithmetic) average growth >1 but that is virtually certain
to decline over the long run if variation is sufficiently high.
2) So when we relax the assumption of the first law that lambda is constant, not only do
we get variation in population growth from year to year, we get a lower long-term rate of
growth.
Even if Lg>1, some possible trajectories may reach low population sizes. For a sexually
reproducing species, we might consider 1 to be effective extinction (“quasi-extinction”).
Some trajectories might hit 1 even if Lg>1.
Next Big Topic
What is the probability that populations will go (quasi-)extinct by some future time?
Effects of mean and variance in annual growth rate
{{ SKIP THE FOLLOWING IF PRESSED FOR TIME:
First:
uniform variation not realistic, and can’t increase variation indefinitely because lambda
can’t be negative, but it often has a weaker constraint above than below
Alternative: normal variation, but then lambda could still be negative
Better alternative: let lambda follow a lognormal distribution
if X ~ normal(mean=mu,sd=sig)
then Y=exp(X) ~ lognormal
lognormal.demo.R
}}
POSSIBLY START with YGB as motivation for wanting to quantify Prob(quasiext.)
Note that in running stoc.lambda.R, population size was LOGNORMALLY
DISTRIBUTED
The Lognormal distribution:
mu=.1
sig=.5
x=rnorm(10000,mu,sig);
split.screen(c(2,1))
screen(1)
hist(x,breaks=50)
screen(2)
hist(exp(x),breaks=50)
This means that the LOG of N will follow a CHANGING Normal distribution
But this resembles the physical process of DIFFUSION (e.g. of molecules of a gas) in a
moving fluid
[ here and everywhere LOG means NATURAL LOG]
Fig. 3.3 from Morris&Doak (M&D)
Start from LOG of current population size
The lower boundary: “quasi-extinction threshold”
WE can set this boundary wherever we wish, based on BIOLOGICAL and POLITICAL
considerations (some disc. of this in Ch. 2)
What determines the likelihood of hitting the threshold is:
1. How fast (and in what direction) the position of the mean of the distribution
moves
2. How fast the VARIANCE of the distribution increases over time
3. How far down the population must go from current (log) size to (log) threshold
4. How long into the future we are considering
VARIANCE – mathematical definition:
average SQUARED distance between the “particles” (or pop sizes) and their (arithmetic)
mean
Eg for a set of n trajectories at time t
Var(N(t)) = (1/n) * sum(over i) of [ ( Ni(t) – Nbar(t) )^2 ]
If mean declines, CERTAIN to hit threshold (and more quickly the quicker the mean
declines)
If mean increases quickly, LESS likely to hit threshold quickly
If var increases quickly, MORE likely to hit threshold quickly
Mean(t) = mu * t
Var(t) = sigma_squared * t (sigma_squared = sig2 below)
mu and sig2 are constants that determine RATE of increase in mean and variance, resp.,
over time
MU and SIGMA – Greek letters
[ Plot Mean and Var vs t for diff. values of mu (-.1 0 .1) and sig2 (positive only) ]
The effects of mu and sig2 on likelihood of hitting threshold most easily seen by plotting
the QUASI-EXTINCTION TIME CUMULATIVE DISTRIBUTION FUNCTION or
CDF vs time in the future (which we will abbreviate as G(T) )
Fig. 3.5 in M&D
G(T) is PROBABILITY that quasi-ext. threshold has been reached AT ANY TIME
BETWEEN NOW AND FUTURE TIME T.
Because it is a probability, it must always lie between 0 and 1.
If mu > 0, G_max < 1 (because some pops go to infinity – NOT REALISTIC)
If mu < 0, G_max = 1
The equation that governs G(T) given Nc (current pop size), Nq, mu, and sig2 has been
derived by physicists.
It is eq. 3.5 in M&D
where d = log(Nc) – log(Nq)
NOTE in fig. 3.5 that for any given mu, the prob. of extinct. increases more rapidly early
on if sig2 is larger.
All of this is called the DIFFUSION APPROXIMATION for estimating extinction risk.
Advantages: easy to apply
Disadvantages: several assumptions that are often unrealistic (LATER)
NEXT:
Using the DA with real data
Intro:
The YGB
- isolated population – part of once-much-larger range
- long (44 yr.) record of counts of adult females (actually 3 yr. running sum of females
with YOY cubs – 3 yrs. between births, so sum is approx. of total no. of adult females)
- 6 more years of data than in M&D
- counts only – no info. on pop structure in these data
- counts at dumps before, and by aerial survey after, 1973
- fires in 1988 – did they change mu or sig2?
- legally protected from hunting, but contact with humans still a source of mortality;
pressure to remove protection to have a SUCCESS STORY for the law (Endangered
Species Act – delisted in 2007 and then relisted in 2010 due to political pressure from
conservation groups)
QUESTION: What is the current risk of extinction? Impt. to know before delisting.
Goal: compute extCDF for YGB, but to do so we must first estimate mu and sig2
TWO methods to do so.
But first, getting data into R (follows ygb.R, but do step by step in the command window)
Read csv data file into a DATA FRAME
data=read.csv("ygb_females_1959_2003.csv")
Cols. of data have names, Year and N, inherited from csv file
attach(data) – now can use its columns (Year and N) as variables
Use plot to plot the data
plot(Year,N,etc.)
1st, what is estimate of PER-CAPITA pop growth rate over 1 yr?
lambda(t) = N(t+1)/N(t)
can get all annual lam’s in 1 step:
n=length(N)
lam=N[-1]/N[-n]
Discuss above
NOTE: one fewer lam than censuses
Method 1:
mu – how much mean LOG pop size changes per year
therefore, (arith.) average of LOG( N(t+1) / N(t) ) is an estimate of mu
sig2 – how much VAR(log pop size) changes per year
therefore, variance of log( N(t+1) / N(t) ) is an estimate of sig2
compute mu and sig2 by “standard method”
mu=mean(log(lam)); mu
sig2=var(log(lam)); sig2
Now go to ygb.R
Using mu and sig2 to compute extCDF
writing your own functions in R – eg extcdf - stored in pva.functions.R
NOTES:
input variables must be given in correct order, or variable names must be given in
function call
[ functions can define other internal functions, but they will not work outside the function
]
storing functions in other files and “source”-ing them
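A sketch of what an extcdf-style function can look like, using the standard first-passage CDF for the diffusion approximation (the form that eq. 3.5 expresses; the name and argument order of the real function in pva.functions.R may differ):
extcdf=function(mu,sig2,d,tmax){       # d = log(Nc) - log(Nq)
  t=1:tmax
  s=sqrt(sig2*t)
  pnorm((-d-mu*t)/s) + exp(-2*mu*d/sig2)*pnorm((-d+mu*t)/s)  # G(1), ..., G(tmax)
}
# usage: G=extcdf(mu,sig2,d=log(Nc)-log(Nq),tmax=50); plot(G,type="l")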
Result: very low prob.(extinction) at t=50
BUT
using only the best estimate of mu and sig2 ignores the fact that these are only estimates,
and may miss the “true” values.
We can use the confidence intervals for mu and sig2 to put confidence limits on the CDF.
Procedure:
draw a random mu from a Normal distn. with mean mu_best and standard error
SE(mu_best) = sqrt(sig2/q); keep if within mu’s CI, repeat if not
draw a random number from a chi-squared distn. with q - 1 df and multiply it by
sig2best/(q - 1); keep if within sig2’s CI, repeat if not
compute CDF and save any values (over t) that exceed past values, both above and below
CDFbest
repeat many times, and plot limits above and below
use extprob.R
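A minimal sketch of this resampling loop (extprob.R itself is not reproduced here; the CI vectors muCI and sig2CI, the sample size q, and the extcdf sketch above are assumed):
nboot=500; tmax=50
G=matrix(0,tmax,nboot)
for(i in 1:nboot){
  repeat{ mu.i=rnorm(1,mu,sqrt(sig2/q)); if(mu.i>muCI[1] & mu.i<muCI[2]) break }
  repeat{ s2.i=sig2*rchisq(1,df=q-1)/(q-1); if(s2.i>sig2CI[1] & s2.i<sig2CI[2]) break }
  G[,i]=extcdf(mu.i,s2.i,d,tmax)
}
matplot(1:tmax, cbind(apply(G,1,min), extcdf(mu,sig2,d,tmax), apply(G,1,max)),
  type="l", xlab="Years into the future", ylab="G(t)")   # limits around best CDF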
NOTE: wide CI around best CDF
MESSAGE: ext. risk of YGB could be as high as 16% at T=50
BUT what if there were a gap in the census – mean and var should change more over the
gap than in the one-yr. intervals between other censuses – standard method inappropriate
DIFFUSION APPROXIMATION FOR PROBABILITY OF QUASI-EXTINCTION:
When censuses are taken every year, the easiest way to estimate the parameters mu and
sigma^2 for the diffusion approximation is:
mu = mean(log(N[-1]/N[-n])) = mean(diff(log(N)))
sig2 = var(log(N[-1]/N[-n])) = var(diff(log(N)))
BUT... not appropriate if some intercensus intervals are longer than others (b/c pop
should change more in such intervals)
Alternative: linear regression approach –
allows for diff. intercensus intervals
other advantages:
- easy confidence interval (at least for mu)
- can use regression tools to identify outliers
- can test for changes in mu and sigma^2 in different time periods (before/after dumps
closed; before/after fires)
Estimating mu and sig2 by regression:
Regress y = diff(log(N))/x on x = sqrt(diff(Year)) WITH ZERO INTERCEPT
Slope is estimate of mu
Mean of squared residuals around regression line is estimate of sig2
Plot what this looks like.
Doing a linear regression in R:
x=1:10
y=3+2*x+rnorm(10,0,1)         # fake data: intercept 3, slope 2, plus noise
plot(x,y)
out=lm(y~x)                   # fit the linear model
summary(out)
coef(out)                     # fitted coefficients
confint(out)                  # their confidence intervals
anova(out)
sig2=anova(out)[2,3]; sig2    # residual mean square
s2=sum((out$resid)^2)/8; s2   # same thing: SS(resid)/(10 - 2) residual df
CIsig2=(q-1)*sig2/qchisq( c(.975,.025),df=(q-1) )  # chi-squared CI for sig2 (q = sample size)
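Applied to census data with possibly unequal intervals, the whole regression method is a few lines (a sketch; Year and N are the attached YGB columns):
x=sqrt(diff(Year))             # sqrt of intercensus interval lengths
y=diff(log(N))/x               # scaled log growth rates
out=lm(y ~ 0 + x)              # zero-intercept regression
mu=coef(out)[1]                # slope = estimate of mu
q=length(y)                    # number of transitions
sig2=sum(resid(out)^2)/(q-1)   # residual mean square = estimate of sig2
confint(out)                   # direct confidence interval for mu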
IN-CLASS ASSIGNMENT, WORKING IN PAIRS:
1988 WAS THE YEAR OF THE YELLOWSTONE FIRES. IMAGINE THAT,
BECAUSE OF THE FIRES, IT WAS NOT POSSIBLE TO DO THE GRIZZLY
BEAR CENSUS THAT YEAR. USING THE YGB DATA, DELETE THE COUNT
FROM 1988, ESTIMATE MU AND SIGMA^2 BY LINEAR REGRESSION, AND
USE THOSE ESTIMATES TO PRODUCE A QUASI-EXTINCTION TIME CDF
USING THE PROGRAM YGB.R.
other advantages of regression approach:
tests for outliers
tests for changes in mu and sig2 (e.g. after 1988 fires)
confidence intervals on mu (can also be calculated directly from mu and sig2)
doing regression with YGB data: see more.ygb.stuff.R
Before running any of this prog – save standard estimates for comparison:
mu_s=mu
sig2_s=sig2
confidence limits on mu produced directly by the regression (e.g. confint)
confidence limits on sig2 computed using chi2 distn.
[possibly skip or give overview]:
Two ways to look for outliers using regression output:
dffits
rstudent (studentized residuals)
Both indicate 1983 is odd – UNUSUALLY HIGH – consequences for estimated
extinction risk?
1983 is not the year the dumps were closed, or the fire year. If a reason to discard it is
determined, could delete this lambda and estimate mu and sig2 using the remaining
lambdas.
Can also ask statistically if mu, sig2 change before/after fires or before/after dumps
closed – see details in more.ygb.stuff. R
Uses of CDF
comparison of different pops w/ diff. Nc, mu, or sig2, diff thresholds [ M&D figs.
3.9,3.10 ]
Review and tests of assumptions:
I. Parameters mu and sig2 constant
Violations:
1) density dependence could change mu (and even sig2)
mu declines as N increases
mu could decline as N decreases
2) dem. stoch could change sig2
3) environmental trends could change both mu and sig2
II. No envtal autocorrelation
Whether lam was large last year has no effect on whether lam is large or small this year
We’ll see how to test for and incorporate this next time
III. No extremely large or small values of lam (no bonanzas or catastrophes)
tests for outliers
how to include if found
IV. No observation error
- counts are assumed to be accurate – if not, they inflate sig2, making calculated ext. risk
too high
HOMEWORK:
Read Ch. 4 (skip Ceiling Model) including Appendix
Next Big Topic: More complex count-based models:
- Density-Dependence (negative and positive, i.e., Allee effects)
- Environmental Autocorrelation and its interaction with D.D.
- Catastrophes and Bonanzas
(for later: Demographic Stochasticity)
I. An overview of models with negative density dependence
[ SKIP 1. Ceiling model (eq. 4.1 in M&D) ]
2. More realistic models with continuous change in lambda as N(t) increases
The DI model assumed loglam indep. of N(t), with mean mu that doesn’t change
as N(t) changes. Var in loglam around mu is sig2, assumed to be caused by
environmental variation.
[[
Why log?
1) log ( N(t+1)/N(t) ) = log ( N(t+1) ) – log( N(t) )
errors in estimating N(t+1) and N(t) have same effects on log lambda,
but errors in estimating denominator of N(t+1)/N(t) have much stronger effect
than errors in estimating numerator
2) log ( N(t+1)/N(t) ) can go from –Inf to Inf, but N(t+1)/N(t) only goes from 0 to
Inf (if N(t)>0). Log ratio more likely to be normal, and so tools that assume
normal variation more appropriate.
]]
But more realistically, we might expect the mean log lambda (i.e., mu) to decline
as N(t) increases. This is called “NEGATIVE density dependence”, because log
lambda DECLINES.
Several patterns for this decline. Illustrated by the so-called “Theta Logistic
Model”:
(on board)
N(t+1) = N(t) * exp{ r*( 1 – [N(t)/K]^theta ) }
Take logs of both sides:
log( N(t+1)/N(t) ) = r*( 1 – [N(t)/K]^theta )
Plot log population growth rate log ( N(t+1)/N(t) ) vs N(t)
show.theta.logistic.R
Patterns:
log lambda decreases linearly with N(t) (theta=1); this model is also
known as the RICKER MODEL, widely used in fisheries modeling
log lambda decreases sharply only as N(t) approaches K (like ceiling
model); theta>1
log lambda decreases sharply at first, then changes little as N(t)->K;
theta<1
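A minimal sketch in the spirit of show.theta.logistic.R, plotting the three patterns just listed (r and K are illustrative values):
r=0.5; K=100; N=1:200
loglam=sapply(c(0.3,1,3), function(theta) r*(1-(N/K)^theta))
matplot(N, loglam, type="l", xlab="N(t)", ylab="log( N(t+1)/N(t) )")
abline(h=0, lty=3)   # log lambda = 0 at N = K for every theta
legend("bottomleft", c("theta=0.3","theta=1 (Ricker)","theta=3"), lty=1:3, col=1:3)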
KEY POINT: With any of the above patterns, population tends to decline above K
(although with envtal. stoch. it could still grow in very favorable years). K is by def. the
value of N where loglam=0. Therefore population tends to stay below K (the
“CARRYING CAPACITY”) and therefore closer to an extinction threshold than it might
with DI growth (if r>0).
FIRST,
> how do we decide which of these patterns of density dependence best describes a given
dataset?
> what are the consequences of negative density dependence for extinction risk?
To illustrate this, we will use a new data set: the Bay Checkerspot Butterfly,
Euphydryas editha bayensis, the subject of a long-term population study by Paul
Ehrlich’s laboratory at Stanford University in CA, USA.
Run checkerspot.R to observe counts (arith. and log scales) and loglam vs Nt and logNt
II. Testing for density dependence using maximum likelihood and AIC
[ Probably skip to *****************, except these parts
Review of “maximum likelihood” parameter estimation with Normal errors (from Ch. 4
appendix of M&D)
Example: fitting the theta logistic function:
log( N(t+1)/N(t) ) = r*( 1 – [N(t)/K]^theta )
^^ Like a regression equation:
dep. variable: y[t] = log( N(t+1)/N(t) )
indep. var.: N(t)
params. or coefficients: r, K, theta
Rewrite eq. as
y[t] = f(p,N[t])
where y[t] is log growth rate in year t, the DEPENDENT VARIABLE, p=[r, K, theta] is
a vector of parameter values, and f(p,N[t]) = p[1]*( 1 – ( N(t)/p[2] )^p[3] ) is the theta
logistic function, where N[t] is the INDEPENDENT VARIABLE
What we are trying to do in maximum likelihood parameter estimation is to find the
values of the parameters (or the “value” of the parameter vector p) that maximize the
probability of observing the y[t]’s given the N[t]’s (and, of course, given the particular
form of the model, f(p,N[t]) )
The Normal probability of seeing log growth rate y[t] at time t given a value of p and N[t]
is

Pr{ y[t] | p, N[t] } = [ 1/sqrt(2*pi*Vr) ] * exp{ -( y[t] - f(p,N[t]) )^2 / (2*Vr) }

(use show.normal.R to plot this)
where

Vr = (1/q) * sum(t=1 to q) of ( y[t] - f(p,N[t]) )^2

is the average squared deviation between the observations and the prediction of the model
(the “residual variance”).
We want to pick p so that f(p,Nt) is close to yt, because then Pr(y|p,N) will be maximum.
But we want f(p,N) to be close to ALL yt’s given the Nt’s, so we may have to
compromise.
The OVERALL probability of seeing ALL the data is the product of these probabilities
over all times, keeping the model and the parameter values fixed as we cycle through the
pairs of dep. and indep. variables. This overall probability is the LIKELIHOOD L of the
observed data given the model equation and the value of p:
L = prod(t=1 to q) of [ 1/sqrt(2*pi*Vr) ] * exp{ -( y[t] - f(p,N[t]) )^2 / (2*Vr) }
where q is the number of pairs of values of the dep. and indep. variables (the “SAMPLE
SIZE”)
Because this is a product of many small numbers (because prob’s are between 0 and 1), it
will be very small. To prevent rounding errors, we take its log to get the LOG
LIKELIHOOD
log L = log{ prod(t=1 to q) of [ 1/sqrt(2*pi*Vr) ] * exp{ -( y[t] - f(p,N[t]) )^2 / (2*Vr) } }

= sum(t=1 to q) of log{ [ 1/sqrt(2*pi*Vr) ] * exp{ -( y[t] - f(p,N[t]) )^2 / (2*Vr) } }

= sum(t=1 to q) of [ log{ 1/sqrt(2*pi*Vr) } + log exp{ -( y[t] - f(p,N[t]) )^2 / (2*Vr) } ]

= sum(t=1 to q) of [ -(1/2)*log(2*pi*Vr) - ( y[t] - f(p,N[t]) )^2 / (2*Vr) ]

= -(1/2)*q*log(2*pi*Vr) - [ sum(t=1 to q) of ( y[t] - f(p,N[t]) )^2 ] / (2*Vr)

BUT… from the definition of Vr:

sum(t=1 to q) of ( y[t] - f(p,N[t]) )^2 = q*Vr
************************************
Therefore
log L = -(1/2)*q*log(2*pi*Vr) - (1/2)*q = -(1/2)*q*[ log(2*pi*Vr) + 1 ]
“THE LOG LIKELIHOOD FUNCTION”
So, all we need to compute the log likelihood function with Normal errors is the sample
size q and the residual variance Vr , but as seen in the eq. for Vr , we need p and N[t] to
compute it.
So what we do is to use a search algorithm (a minimization routine) to find the value of p,
given the sequence of N’s and therefore the y’s (the log lambdas computed from the N’s)
that MAXIMIZES THE LOG LIKELIHOOD. Because most routines are set up to find
the minimum of a function, we will search for the value of p that MINIMIZES THE
NEGATIVE LOG LIKELIHOOD. Because the NLL for normally distributed errors
contains only Vr (and q, which is fixed), minimizing NLL is equivalent to minimizing
Vr, or minimizing the sum of squared deviations between the data points and the model.
For this reason, maximum likelihood fitting with Normal errors is equivalent to LEAST
SQUARES PARAMETER ESTIMATION.
Now, to ask if there is negative DD in the data, and if so what form it takes, we need to fit
several models AND TO CHOOSE WHICH MODEL IS BEST. Using maximum
likelihood estimation and AIC, there is a natural way to do this.
We’ll fit 3 models (which happen to be “nested” – simpler ones can be obtained by
constraining parameters in more complex ones to particular values):
Model        f(p,N[t])              No. parameters (incl. Vr)
DI           r                      2
Ricker       r(1 - N/K)             3
theta-log    r(1 - (N/K)^theta)     4
Easiest fitting methods differ (although we could use the most complex method for all
models):
For DI:
the best r is simply the mean log lambda. Vr is the (biased) variance of the log lambdas
(around r=mu)
Because we only need Vr and q to compute logL, we can do so directly once we have the
variance of the log lambdas
using Nt from checkerspot.R:
loglamt=diff(log(Count))     # the observed log lambdas
r=mean(loglamt)
Vr=mean( (loglamt - r)^2 )   # biased (divide-by-q) variance
# NOTE: can also use Vr=(q-1)*var(loglamt)/q
For Ricker:
loglam is a linear function of N, so we can estimate r and K with a linear regression,
where Vr is the residual variance (the mean squared deviation between the points and the
linear regression line).
[ students can write code to do this following the syntax for “lm” in ygb.R ]
Finally, the theta-logistic model is nonlinear, so we must use a nonlinear fitting procedure
to estimate its parameters. We use nlm (non-linear minimization):
Note: we could also use the R function “optim” to minimize the neg. log likelihood
function directly
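A minimal sketch of that direct approach (the starting values in the optim call are illustrative guesses; Count is the checkerspot census vector):
y=diff(log(Count))               # dependent variable: log growth rates
N=Count[-length(Count)]          # independent variable: N(t)
q=length(y)
nll=function(p){                 # p = c(r, K, theta)
  Vr=mean((y - p[1]*(1-(N/p[2])^p[3]))^2)
  0.5*q*(log(2*pi*Vr)+1)         # the negative log likelihood derived above
}
out=optim(c(0.3,500,1), nll)     # minimize NLL over (r, K, theta)
out$par                          # maximum likelihood estimates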
Finally, having obtained Vr for all 3 models, we can compute AICc (the corrected Akaike
Information Criterion, where “correction” is for small sample size) for all three models.
The best model has the smallest AICc
AICc = 2*logL + 2* (p * q ) / (q – p – 1 )
logL is negative. The better the fit of a model, the larger the log likelihood (i.e., the less
negative it is, so the smaller is the first term in AICc. In general, more parameters
(higher p) should increase logL, and so decrease this first term. However, as p increases,
the second term increases.
[ as q gets large relative to (p+1), the 2nd term becomes 2*p*q/q = 2*p, and AICc
converges to AIC = -2*logL + 2*p ]
Thus AICc (and AIC) attempts to achieve a balance between goodness-of-fit (measured
by logL) and the number of parameters used to achieve that fit.
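As a quick check, AICc is a one-liner; q = 26 transitions is inferred here because it reproduces the AICc values in the table below from the logL values:
AICc=function(logL,p,q) -2*logL + 2*p*q/(q-p-1)
AICc(-41.26566,2,26)   # 87.05 (DI)
AICc(-37.79878,3,26)   # 82.69 (Ricker)
AICc(-37.10478,4,26)   # 84.11 (theta-logistic)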
For JRC population, the AICc values are:
Model    p    logL         AICc
DI       2    -41.26566    87.05306
Ricker   3    -37.79878    82.68847
TLog     4    -37.10478    84.11433
SO, even though the TLog model has the highest (least negative) logL, it uses 1 more
parameter to achieve this only-slightly-better fit than does the Ricker, so it has a higher
AICc. In contrast, the Ricker has a much higher logL than the DI, and has the lowest AICc
of all models. Therefore we conclude that the model with linear negative DD (the
Ricker) is the best model.
NEXT…
Simulating ext. prob. using the Ricker
We’ll need best estimate of r, K, Vr, and the last population size and the QET.
Using the script theta.logistic.R
> uses rnorm
> if N falls below nx, it is set to 0, because it will then always remain below nx
Results:
* tight overlap of lines means 50K trajectories is sufficient to characterize ext. risk FOR
A GIVEN SET OF PARAMETERS – we have not incorporated parameter uncertainty,
but could do so as we did using extprob.R (but we would then have to grapple with the
fact that estimates of r and K are not independent)
* high risk of extinction for the best-fit parameters
INDEED, this population went extinct 10 yrs. after the last census. The high value of Vr
may be why
NOTE: unlike in the DI model, where Pr(ultimate extinction)<1 if mu>0, with a DD model it
is always 1 (the population is bounded near or below K, so it must eventually hit the threshold).
Consequences of negative DD in discrete time
Ricker fitted to checkerspot data shows high extinction risk.
But this is probably because the estimate of Vr is high, not because the pop. is predicted
to oscillate (due to the 1 big peak in plot of fitted Ricker) – we might want to treat that as
an outlier.
But with neg. density dependence, there can be another distinct extinction risk:
population oscillations causing pop size to occasionally visit low numbers
Period-doubling bifurcations in the Ricker model
observe how trajectory and recruitment curve changes as r changes, using ricker.R
r = 1.9, 2.4, 2.6, 2.7, 3
Ricker model has “overcompensatory” (neg) DD –
- when above K, pop can decline below it in 1 time step
- when below K, pop can climb above it in 1 time step
depending on steepness of recruitment curve
Then use bifurc8.R to produce a bifurcation diagram
1. Positive density dependence: Allee effects
Models above assume that loglam only declines with increasing Nt. But it could also
decline with DECREASING Nt, a phenomenon known as an
ALLEE EFFECT (after Warder Clyde Allee)
Potential causes:
- declining birth due to difficulties finding mates at low densities
- declining survival due to failure of group defense (including against abiotic conditions,
as in conspecific nurse effects) or group foraging
A model with declining birth at low density (e.g. due to reduced mating) and declining
survival at high density (due to resource limitation)
Birth = B(N) = a + (ra) N/(A+N) [ Note: per-capita ]
if N=0, B=a
if N=Inf, B=a+(ra) = r
when N=A, B=a + .5r - .5a = (a+r)/2 (birth is half way between its minimum and
maximum values when N=A; A is the “half-saturation constant”)
Survival = S(N) =exp(-b*N)
if N=0, S=1
if N=Inf, S=0
Assume univoltine or annual life cycle:
Lambda = B * S = [a + (r-a) N/(A+N) ] exp(-bN)
(this is equivalent to eq. 4.12 in M&D if a=0, r=exp(r), and b=beta)
Plot B, S, Lambda, and N(t+1) vs N(t)
a = L(N=0)
if a>1, L(0) > 1
if a<1, L(0) < 1
PROGRAM allee.R
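A minimal sketch of the curves allee.R explores (parameter values are illustrative; with a < 1 this produces a strong Allee effect):
a=0.5; r=3; A=20; b=0.01         # A = half-saturation constant, as above
N=0:400
B=a+(r-a)*N/(A+N)                # per-capita birth rises with N
S=exp(-b*N)                      # survival falls with N
lam=B*S
plot(N, lam, type="l", xlab="N(t)", ylab="lambda(N)")
abline(h=1, lty=3)               # crossings mark the Allee threshold Na and K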
“WEAK” Allee effect
“STRONG” Allee effect
with a strong AE, there is an ALLEE THRESHOLD, Na
if population falls below Na, it will decline to extinction (in a deterministic world – in a
stochastic world, it may bounce back, but there is still a strong tendency to decline below
Na)
Even a weak AE can slow the rate at which a population climbs out of a period of low
numbers
Next Topic: other factors in count-based models
2. Environmental autocorrelation – correlations between years in the ENVT.
KEY: stochastic envt. effect is seen as the DEVIATIONS in (log) growth rate once the
(deterministic) effect of density has been taken into account.
Correlations could be:
Positive – if this year was above average, next year is likely to be too
Negative - if this year was above average, next year is likely to be below avg.
Computing the deviations:
loglam2=diff(log(Count))               # observed log growth rates
loglamp=br[1]*(1-Count[-tmax]/br[2])   # Ricker predictions (br = best-fit r and K)
dr=loglam2-loglamp                     # deviations = the environmental “noise”
Viewing “auto-correlation” two ways:
plot(dr,type='b',xlab='Year',ylab='Deviation')
windows()
plot(dr[-length(dr)],dr[-1],xlab='Deviation(t) ',ylab='Deviation(t+1) ')
Testing for environmental autocorrelation:
- compute DEVIATIONS between lambda ea. year and prediction of BEST (dens. indep.
or dens. dep.) model, accounting for starting density ea. year
- make two vectors:
d1: deviations in years 1 to tmax – 1
d2: deviations in years 2 to tmax
- compute correlation between these 2 vectors and test its significance
e.g. using the following code from checkerspot.R
rho=cor.test(dr[-length(dr)],dr[-1],method="pearson")
rho$estimate
rho$p.value
If there is a significant correlation between successive deviations, how do we include it?
Use show.corr.R to demonstrate a method for generating correlated random variables
with a specified mean, variance, and autocorrelation coefficient (pos. or neg.)
next year’s deviation = rho * this year’s deviation + sqrt(1-rho^2)*new random Normal
deviate with mean zero and desired SD
Demonstrate with show.corr.R that we get the correct mean, SD, and rho for
rho_true = 0, -.7, .7
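A minimal sketch of that recursion (mean 0 and SD sd.d are assumed target values):
rho=0.7; sd.d=0.1; tmax=10000
d=numeric(tmax)
d[1]=rnorm(1,0,sd.d)
for(t in 2:tmax) d[t]=rho*d[t-1] + sqrt(1-rho^2)*rnorm(1,0,sd.d)
c(mean(d), sd(d), cor(d[-tmax],d[-1]))   # approx. 0, sd.d, and rho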
Long strings of similar envt’al conditions will cause positively autocorrelated deviations
from the expected lambda.
Harder to identify reasons why env’tally driven deviations would be negatively correlated
(if a DI model used, neg. autocorr. could be caused by underlying neg. density
dependence)
In DI case, pos. autocorr. increases ext. risk (M&D fig 4.8)
Use fig. 4.9 in M&D to show effect of autocorrelation on extinction risk with over- vs
under-compensatory (neg) density dep.
Use ricker.corr2.R to demonstrate how to simulate extinction risk when there is
autocorrelation.
3. Catastrophes and bonanzas
- Testing for outliers (high or low)
linear regression approach for density-independent model (or perhaps Ricker)
compute standard deviations for deviations between observed lambda and values
predicted by density dep. models (e.g. the outlier for JR
checkerspot population) –
identify as outliers values > 2 SDs above/below mean
- Using extremes.R to incorporate outliers
replace “run-of-the-mill” values with extremes, with freq. determined from data
Issues:
* match single pos. (or neg.) outliers with complements?
* may need to modify extremes.R if there are BOTH outliers and density dependence
(and other factors), as in checkerspot
Next Topic: Structured populations
Rationale:
We will begin by going backwards and removing several things we added to count-based
models (environmental stochasticity and density dependence). After we see how the
dynamics of structured populations behave without these complications, we will put
them back in. In addition, we will consider the effects of another force, demographic
stochasticity.
Basic reason why structure matters:
Individuals don’t contribute equally to population growth
Example: semi-palmated sandpiper - PHOTO
REFERENCE: Hitchcock and Gratto-Trevor (Ecology 78:522-533, 1997)
Biological background:
individuals can live >3 yrs, can begin breeding at age 1, migrate to N. Canada, produce 1
nest per year. Common, but H&GT studied a declining population to identify causes of
decline
Data on average vital rates
s0=.1293    # juvenile survival – prob(survival) from hatching to age 1
s1=.2543    # yr 1 survival – prob(survival) from age 1 to age 2
s2=.593     # adult survival – 1 yr. prob(surv) for all individuals age 2 or more
b1=.25      # prob(breed) at age 1
b2=.875     # prob(breed) at age 2
b3=.95      # prob(breed) at ages 3 and above
c=1.8625    # chicks per nest
Clearly, age strongly affects survival and likelihood of breeding.
Therefore separating individuals into age classes and keeping track of their separate
contributions to next year’s population should improve our ability to predict/understand
population growth rate ( a pop. of all juveniles will grow differently than a pop. of all
adults)
Building a density-indep. model. As for the unstructured case, we’ll follow population
growth over discrete 1 yr. intervals.
First, consider when we are censusing the population. That will determine what age (or
size) classes we see.
Hitchcock and Gratto-Trevor censused just BEFORE nests produced, after birds returned
to N. Canada on Spring migration – A PRE-BREEDING CENSUS
So they see 1YOs (born just after last census), 2YOs, and 3+YOs. Just after the census,
new juveniles are born. The picture is this:
[ Diagram: pre-breeding census. At census t we see 1, 2, and 3+ year olds. Just after
the census, each class produces juveniles at per-capita rates b1 c, b2 c, and b3 c;
juveniles survive to become 1 year olds at census t+1 with prob. s0; 1 year olds become
2 year olds with s1; 2 year olds and 3+ year olds become 3+ year olds with s2. ]
Arrows are PER-CAPITA rates.
We could write equations for the change in ea. age class over 1 yr:
Let N1(t), N2(t), N3(t) == no. of 1, 2 and 3+ YOs, resp., in the population in yr. t
Sum up products of arrows times Ni(t):
N1(t+1) = b1*c*s0*N1(t) + b2*c*s0*N2(t) + b3*c*s0*N3(t)
N2(t+1) = s1*N1(t)
N3(t+1) = s2*N2(t) + s2*N3(t)
Using linear algebra, we can write this more concisely by separating the N terms from the
rest of the r.h.s.:
n(t+1) = A * n(t)
where n(t) = [ N1(t) N2(t) N3(t)]’ and
A= [ b1 c s0 b2 c s0 b3 c s0; s1 0 0; 0 s2 s2]
Note: A contains all the PER-CAPITA effects, and n contains the numbers in ea. class
Using A as a table
cols = class this year
rows = class next year
elements: PER-CAPITA contributions from this year’s to next year’s population
Reviewing right mult of a matrix by a column vector - numerical example with starting
vector [1 1 1]’
If break between classes, use life-cycle diagram to (re)illustrate the matrix/life history
events – timing of annual census determines life stages “seen”:
[ Life-cycle diagram: three nodes (1 yo, 2 yo, 3+ yo). Reproduction arrows loop back
to 1 yo with per-capita rates b1 c s0, b2 c s0, and b3 c s0; survival arrows run
1 yo -> 2 yo (s1), 2 yo -> 3+ yo (s2), and 3+ yo -> 3+ yo (s2). ]
6 arrows above correspond to the 6 terms in the matrix, and represent one-year transitions
Put numbers in matrix (“POPULATION PROJECTION MATRIX”), and interpret matrix
cols = from life stage in yr. t
rows = to life stage in yr. t+1
entries are PER-CAPITA RATES
Rules for right-multiplying a matrix by a column vector: the number of rows of the
vector must equal the number of columns of the matrix
RESULT: another col. vec. of same dimension as the original vec (for a square matrix)
If vector is [n1(t); n2(t); n3(t)], show that this gives us back the separate recursion
equations for ea. life stage
Go to R to write a program to predict what the population will do when we repeatedly
multiply the matrix A by the (changing) vector n
First, use sandpiper.matrix.R to show 3 ways to construct a proj. matrix in R
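One such construction (a sketch built from the vital rates listed earlier; cc stands in for c so R’s c() function name is not reused):
s0=.1293; s1=.2543; s2=.593
b1=.25; b2=.875; b3=.95; cc=1.8625
A=matrix(c(b1*cc*s0, b2*cc*s0, b3*cc*s0,
           s1,       0,        0,
           0,        s2,       s2), nrow=3, byrow=TRUE)
A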
Do some matrix/vector multiplication in the command window
n=c(1,1,1)
OR n = matrix(1,3,1)
n=A %*% n; n << repeat several times – what’s happening to N(t), N(t+1)/N(t), and
STRUCTURE?
STRUCTURE = fraction of pop in ea. life stage
First, what is the total population size in year t? sum(nt)
What is the one-year population growth rate in year t?
lambda(t) = sum ( n(t+1) ) / sum ( n(t) )
population structure = n/sum(n)
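A minimal iteration sketch tying these together (A is the sandpiper matrix built above; the starting vector is arbitrary):
n=matrix(1,3,1)
for(t in 1:20){
  n1=A %*% n
  cat(t, sum(n1)/sum(n), round(n1/sum(n1),3), "\n")   # year, lambda(t), structure
  n=n1
}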
Now use sandpiper.R to examine the process of CONVERGENCE
then have students manipulate initial vector to ask:
** does convergence occur at the same time for lambda and structure?
** does asymptotic structure and lambda depend on initial vector?
notes: convergence is
- indep. of starting vector (but initial lambda is not) – use 2 starting vecs. to show
- simultaneous in structure and lambda – only when structure is stable is lambda stable
Because lambda and the population structure converge to same values regardless of
starting vector, they must BOTH be properties of the MATRIX, NOT THE INITIAL
VECTOR.
Indeed, they are the DOMINANT EIGENVALUE and DOMINANT RIGHT
EIGENVECTOR of the proj matrix
EIGEN = SELF in German
Computing Evals and Rt. Evecs:
On board using a simple 2x2 matrix
After convergence, all classes change by the same multiplier lambda
Therefore we have 2 equal expressions:
w(t+1) = A w(t) = L w(t) (where L is lambda) –
Above, WE USE w INSTEAD OF n TO EMPHASIZE THAT IT IS A SPECIAL TYPE
OF VECTOR, POST-CONVERGENCE
ie when w(t+1)/sum(w(t+1)) = w(t)/sum(w(t))
– ONLY THEN IS THE ABOVE EQ. TRUE
vector-scalar multiplication: L*w=[L*w1; L*w2]
Note:
[L 0; 0 L] = L [ 1 0; 0 1] = L I
define identity matrix I
But
L I w = [L 0; 0 L] w = [L w1 + 0; 0 + L w2] = L w
Therefore
A w = L I w
or
(A – L I) w = 0
if A = [ a b; c d], A – L I = [ a – L b; c d – L]
Note: We require that there be infinitely many vectors that satisfy
A w = L w or (A – L I) w = 0
(because any multiple of an eigenvector w is an eigenvector)
The mathematical condition for infinitely many solutions is
det (A – L I) = 0 << this is the “CHARACTERISTIC EQUATION”, used to find L
using only A
det = “determinant”
det of a 2 x 2 matrix:
prod of diag minus prod of “anti”diag.: det [ a b; c d] = a d – b c
So if A = [ a11 a12; a21 a22]
A – L I = [ a11 – L a12; a21 a22 – L ]
so the char. eq. is
det (A – L I) = (a11 – L)(a22 – L) – a12 a21 = 0
or L^2 – ( a11 + a22) L + a11 a22 – a12 a21 = 0
or L^2 + B L + C = 0
where B = – ( a11 + a22) and C = a11 a22 – a12 a21
[B=–Tr(A), C=det(A)]
So, for a 2 x 2 matrix, the char. eq. is a quadratic eq.
More generally, for an n x n matrix, the char. eq. will be an nth order polynomial and so
there will be n eigenvalues (roots of the polynomial)
Solution:
L1,L2 = –B/2 +– sqrt(B^2 – 4C)/2
Two solutions because of +–
Work example with A = [.7 .3; .1 .9] (2 stages: “juveniles” and “adults”)
fraction of adults surviving one year?
fraction of juveniles surviving one year?
fraction of surviving juveniles maturing in one year?
no. of juveniles produced per adult per year?
B = -(.7+.9)=-1.6
C = .7*.9-.1*.3 = .6
L1,L2 = 1.6/2 +- sqrt(2.56 - 2.4)/2 = .8 +- sqrt(.16)/2 = .8 +- .2
L1 = 1 <<< DOMINANT EIGENVALUE
L2 = .6
[[ alternative:
(partially biennial plant model)
A = [g s0 f1 s1 f2; g s0 0]
postbreeding census
stages: seeds on ground, 1y.o. plants
A=[F1 F2; S0 0]
A = [ 1 2; ½ 0]
]]
One eigenvalue is larger than the other in absolute value (more generally, in magnitude,
because some eigenvalues can be complex numbers).
This is called the DOMINANT EIGENVALUE
(with only a few exceptions, for realistic population projection matrices, 1 e’val will
always be larger than the others)
To EACH eigenvalue corresponds a RIGHT EIGENVECTOR
Can get DOMINANT RIGHT EIGENVECTOR w1 (which has 2 entries, w1_1 and
w1_2) by:
plugging L1 and w1=[1; w1_2] into (A-L1 I)*w1=0, solving for w1_2, and
rescaling as w1=[1; w1_2]/(1+w1_2);
( the logic here is that we only care about the relative sizes of the elements in w1, so we
can arbitrarily choose the first element to be 1 and then solve for the second)
Using above example:
A = [.7 .3; .1 .9]
L1=1
A-L1 I = [-.3 .3; .1 -.1]
w1=[1; w1_2]
First eq. of (A-L1 I)*w1=0 is
-.3 + .3 w1_2 = 0 so w1_2 = 1 (2nd eq. yields same result – as it must)
So w1 = [1; 1] and rescaling it by dividing each term by its sum yields
w1 = [.5; .5]
which says that after convergence, this population will have an equal number of
“juveniles” and “adults”
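Checking the 2x2 example numerically (note that matrix() fills by column):
A2=matrix(c(.7,.1,.3,.9),2,2)
eigen(A2)$values              # 1.0 and 0.6, as found on the board
w1=eigen(A2)$vectors[,1]      # dominant right eigenvector
w1/sum(w1)                    # rescaled to proportions: 0.5, 0.5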
In fact, as we have 2 lambdas (L1 and L2) we can get two w’s, w1 and w2
(get w2 by repeating the process above, but using L2 instead of L1)
EXERCISE: Compute w2 for the 2x2 matrix A above
We could also use algebra to compute directly the e’vals and e’vecs for the 3x3 sandpiper
matrix. But char. eq. will now be a cubic eq. (with an L^3 term) which has a MUCH
more cumbersome solution than a quadratic. For quintic and higher-order char. eqs.
(with an L^5 term or above) there is no general closed-form solution.
So in practice, numerical methods are used to compute e-vals and e-vecs.
Return to sandpiper.R and activate Section 1 to show how to calculate e-vals and e-vecs
in R
NOTES:
- For sandpiper matrix, “subdominant” evals are a pair of complex numbers
- The magnitude of these is < that of the dom. eval.
[ mag(L2) = mag(L3) = sqrt( (-0.0160)^2 + (0.0610)^2 ), using Pythagorean theorem ]
- eigen produces a matrix with e’vecs as its columns corresponding to the order of
lambdas, but they are not necessarily scaled to represent proportions
- but because c w1 is also an e’vec, we can rescale by c=1/sum(w1) to get proportions
Return to simulation results (fig. 1 from sandpiper.R):
asymptotic population growth rate equals numerically calculated L1
asymptotic population structure equals numerically calculated w1
But numerical procedure only used A, showing asymptotic growth and structure are
functions of the matrix alone.
Why the e’vals and e’vecs are important for understanding the process of convergence:
SOLUTION OF THE MATRIX PROJECTION EQUATION n(t+1) = A %*% n(t)
In the scalar model
N(t+1) = lambda N(t)
the equation is a recursion relationship, not a solution. Its solution is:
N(t) = lambda^t N(0)
and allows us to predict any future population size knowing lambda and the initial
population size.
The equivalent solution of a 2-stage matrix equation is:
n(t) = c1 L1^ t w1 + c2 L2^t w2
where c1 and c2 are (scalar) constants determined by initial vector n(0)
[ because n(0) = c1*w1 + c2*w2, a system of 2 eq’s w/ 2 unknowns]
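A sketch of finding c1 and c2 for the 2x2 example and verifying the solution (the initial vector is arbitrary):
A2=matrix(c(.7,.1,.3,.9),2,2)
e=eigen(A2)
n0=c(10,0)
co=solve(e$vectors, n0)             # solves n0 = c1*w1 + c2*w2 for (c1, c2)
t=5
e$vectors %*% (co*e$values^t)       # n(t) = c1*L1^t*w1 + c2*L2^t*w2
n=n0; for(i in 1:t) n=A2 %*% n; n   # same answer by direct projection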
Soln. explains why there is convergence:
with t sufficiently large, L1^t becomes much larger than L2^t and “drags” w1 along with
it.
Key result: for biologically realistic projection matrices, L1 will be positive and real
Effect of raising Evals to higher powers
lam>1 but real: lam^t grows exponentially
0<lam<1 but real: lam^t declines exponentially
so for 2 real evals, the larger one comes to be much greater over time, even if both are >1
lam<0 but real: oscillations
damped if -1<lam<0
growing amplitude if lam<-1
But even in the latter case, a pos. lam with greater magnitude will dominate over time
What about complex eigenvalues?
Sandpiper has 3 eigenvals, L1 real and L2,L3 a pair of complex eigenvalues
The solution is now:
n(t) = c1 L1^ t w1 + c2 L2^t w2 + c3 L3^t w3
Use R to show effect of raising complex e’vals for sandpiper to higher and higher
powers:
L2 = -0.016 + 0.061i;
L3 = -0.016 - 0.061i;
Lt=L2
Lt=Lt*L2; Lt
repeat last command to iterate
Sequence produced:
-0.0160 + 0.0610i
-0.0035 - 0.0019i
1.7403e-04 - 1.8041e-04i
8.2287e-06 + 1.3494e-05i
-9.5440e-07 + 2.8668e-07i
Result:
real (and imag.) part oscillates between pos. and neg., but because mag(L2) < 1, Lt -> 0 +
0i as t increases
if mag(L_complex) is >1, real part will oscillate and grow, but more slowly than do
powers of the dominant e’val.
[ note: for all t, imag. parts of c2 L2^t w2 cancel with those of c3 L3^t w3 (c2 and c3
are complex conjugates), so the solution is purely real ]
The closer L1 and L2 are in magnitude (the smaller the “damping ratio”), the longer
convergence will take to occur.
NEXT BIG TOPIC:
For conservation, we want to know:
How will changing MATRIX ELEMENTS and UNDERLYING VITAL RATES affect
lambda?
Can envision changes in MEs, but they are caused by changes in VRs (see the sandpiper
matrix as a function of its vital rates). Still, to understand how changing VRs will change
lambda, since lambda is a function of the matrix and its elements, we need to first
understand how changing matrix elements changes lambda.
First, explore how lam1 changes as we vary two vital rates (or matrix elements),
KEEPING ALL OTHERS CONSTANT
b1 = .25, a_11=.06
So if b1 can vary from 0 to 1, a_11 can go from 0 to .24
s3+ (really s2 in matrix, but let’s only consider survival at age 3+) = a_33 is .593, but can
go from 0 to 1
CLASS ASSIGNMENT:
Vary a11 over 100 values from 0 to .24, keeping all other elements constant, and
vary a33 over 100 points from 0 to 1, keeping all other elements constant.
For ea. value of a11 or a33, compute lam1 for the corresponding projection matrix, and
then plot lam1 vs a11 and lam1 vs a33 over their entire ranges of feasible values.
( resulting script should be like sandpiper.lam1.vs.a11.&.a33.R )
RESULTS:
- lam1 is a nonlinear function of a11 or a33
- the slope of lam1 vs a33 is generally steeper than the slope of lam1 vs a11; more ‘bang
for the buck’ for a given amount of change in a33 than in a11 (but actual bang depends
on current value of m.e.)
- it is impossible to make the sandpiper pop. grow by changing b1 alone, whereas it is
possible by changing s3+ alone
Usually we don’t know by how much we can actually change b1 or s3+ (or any other
v.r.). But we might ask: given the current values of the v.r.’s, what SMALL change
would be most effective?
Two ways to think about changes:
Additive vs multiplicative (or alternatively absolute vs proportional) changes in matrix
elements:
multiplicative accounts for different scales at which different elements are measured
- esp. in size-based matrices for plants, or age-based models for fish, for example,
fecundities can be much larger than size/age transitions (which must be <= 1)
activate Section 2 in sandpiper.R to explore additive changes
(or use sandpiper.num.sens.elast.R )
Notes:
- additive change is small: 0.01 – can’t increase age transitions to above 1
express change as ABSOLUTE CHANGE:
(new lam1 – original lam1)/absolute change in m.e. [ie, .01]
- resulting change in lambda put in corresponding position in matrix
- all changes are positive >>> increasing ME increases lambda
- largest effect of increasing adult “stasis”
- there is an effect (sometimes strong) of changing “impossible” MEs
Next, activate Section 3 in sandpiper.R to explore multiplicative changes
Notes:
- still using small (0.01) but now proportional change
express change as PROPORTIONAL:
(new lam1 – original lam1)/orig. lam1/proportional change in ME. [ie, .01]
where proportional change in ME = (ME.orig*(1+eps) – ME.orig)/ME.orig
= ME.orig*eps/ME.orig = eps
- still largest effect of increasing adult stasis; now even more dominant
- zero effect of “changing” zero MEs (because proportions of zero are zero)
- sum of all elements of Emat is approx. 1
SENSITIVITY = absolute change in lambda in response to an absolute change in a ME
ELASTICITY = proportional change in lambda in response to a multiplicative
(proportional) change in a ME
IF matrix embodies ultimate growth rate and struc of population, it must also be able to
tell us DIRECTLY how changing MEs or VRs will change lambda
Above is crude, brute force approach, which doesn’t give us much insight into WHY
changes to particular MEs have large or small effects on lam1.
The MATHEMATICALLY ELEGANT approach (which does yield insight) is to
recognize that:
Sensitivity=partial derivative of lambda with respect to a ME or VR
To compute these derivatives directly, we must first learn about LEFT EIGENVECTORS
DOMINANT LEFT EVEC contains the “reproductive values”
Repro. value = the relative contribution of a single individual in a given life-history
stage to the future size of the population
So, individuals with high RV contribute more to future population size
Use repro.value.R to explore differences in repro. value among 1yo’s, 2yo’s, and 3+yo’s
in sandpiper.
How to get left evecs:
Directly:
Solve: v1 (A – lam1 I) = 0 using v1 = [1 v1_2]
(review left multiplication of a matrix by a row vector)
Using R:
rows of V = (complex conjugate of) inverse of W (where cols of W are right evecs) are
the left eigenvecs
meaning of matrix inverse:
just as 1/x = x^-1 and x * 1/x = 1 (x scalar)
1/W = W^-1 and W * W^-1 = I (W a matrix)
that is, the inverse of W is the matrix such that, when W is rt. multiplied by this matrix,
the identity matrix results.
The left evec “associated with” lambda1 is the DOMINANT LEFT EIGENVECTOR
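A sketch of the R route (A is the sandpiper projection matrix built earlier):
e=eigen(A)
W=e$vectors              # right eigenvectors as columns
V=Conj(solve(W))         # rows of the (conjugated) inverse of W are the left evecs
v1=Re(V[1,])             # dominant left evec = reproductive values
v1/v1[1]                 # scaled so a 1 yo has reproductive value 1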
Finally, computing S and E analytically (see sandpiper.sens.elast.R)
denom = sum over all i of ( v1_i * w1_i ) – same for all matrix elements.
Sij = v1_i * w1_j /denom
So, the effect on lambda1 of changing matrix element aij depends on:
- the fraction of the population at the stable structure that is in stage j, w1_j
- the relative contribution to future population growth of ea. individual PRODUCED by
matrix element aij, which is v1_i
Eij = aij Sij/L1
computing all elements of S and E matrices simultaneously:
v1 = row vec of repro vals
w1 = col vec of stable fractions
S = v1’ * w1’ / (v1 * w1) (outer product of the two vectors, divided by their scalar product)
E = A .* S/L (where .* is element-wise multiplication)
In R:
"The sensitivity matrix"
S=(cbind(v1) %*% w1)/((v1 %*% w1)[1]); S
"The elasticity matrix"
E=A*S/lam1; E
EXAMPLE of the use of sensitivities: loggerhead sea turtles - figs in CROWDER et al.
1992
We now know how to get sens/elast of lam1 to matrix elements. What about underlying
vital rates (accounting for their possible contributions to multiple m.e.’s)
Numerical approach:
Activate sandpiper.change.vrs.R to explore changes in VRs
“Elasticity of lam1 to vital rate p”
= proportional change in lam1 / prop. change in p
~~ [ del lam1/ lam1]/ [del p /p ]
= [ (lam1new-lam1)/lam1 ]/[ (pnew-p)/p ]
= [ (lam1new-lam1)/lam1 ]/[ (p*(1+eps)-p)/p ]
= [ (lam1new-lam1)/(lam1*eps) ]
Results:
- strongest effect of changing s2 – why?
- next strongest effect of changing c or s0 – why?
- same effect of changing s0 or c – why?
- smallest effect of changing b1 – why?
Directly computing sensitivities and elasticities to underlying VRs:
The chain rule:
if f = f(g(x),h(x))
df/dx = df/dg dg/dx + df/dh dh/dx (partials)
In our case,
lam1 = lam1(aij, akl, …) = lam1( aij(vr1, vr2, …), akl(vr1, vr2, …), …)
Sv = sumi sumj Sij daij/dv
Example: compute S_s0 for vr s0 for the sandpiper mat.
S_s0 = S11 b1 c + S12 b2 c + S13 b3 c (since d a_1j / d s0 = bj c)
accounts for influence of s0 on all MEs
Including management in the matrix:
Let
s2^ = survival class 2 at current livestock abundance
s2* = survival class 2 no livestock
A = livestock abundance
A^ = current livestock abundance
Graph s2 as line with negative slope between (0, s2*) and (A^, s2^)
Equation of line:
s2 = s2* - ((s2* - s2^)/A^) * A
similarly for s1, etc.
some functions could increase with A
Next, incorporate functions directly into matrix. Now can compute a sensitivity to A, and
can vary A to calculate effect on lambda
Sandpiper example:
Assume fledgling survival is reduced by an introduced predator:
P^ = current predator abundance
s0^ = estimate of current fledgling survival
s0* = estimate of fledgling survival in absence of pred. (e.g. from pred-free sites P=0)
s0(P) = s0* + [(s0^ - s0*)/(P^ - 0)]*P
put s0(P) in matrix in place of s0
compute d.lam1/d.P using chain rule – accounts for fx. of P on all relevant matrix
elements
Question: by inspection, what is the sign of d.lam1/d.P, and how do you interpret this?
Another example: 5x5 version of loggerhead matrix
[  0     0             0             f4 s4    f5 s5 ]
[  s1    s2 (1-g2)     0             0        0     ]
[  0     s2 g2         s3 (1-g3)     0        0     ]
[  0     0             s3 g3         0        0     ]
[  0     0             0             s4       s5    ]
Question: why s4 in a_14 and s5 in a_15 ?
Question: is this an age, stage, or size based matrix ?
Make s2-s5 increasing (linear) functions of T (0 to 1), the fraction of trawlers using TEDs
e.g. s2(T) = s2^ + [(s2* - s2^)/(1-0)]*T (assumes no TEDs used when s2^ measured)
******************************************************************
NEXT TOPIC:
Incorporating envt’al stochasticity into structured population models
Using sandpiper.show.var.vrs.R, plot annual variation in vital rates
data=read.csv("sandpiper_vrs.csv")
vrs=data.matrix(data) # convert the data frame "data" to a matrix "vrs"
# NOTE: rows of "vrs" are the different vital rates, columns are years
# use the following to show that the vital rate means are the same
# as the values given in sandpiper.R:
M=rowMeans(vrs); M
examine matrix ‘vrs’ and its row means
Data from Hitchcock and Gratto-Trevor showing annual variation in 4 vital rates over 6
years (did not estimate annual variation in breeding probabilities, and note that some of
the other vr’s are replaced by means in some years)
Also note that some of the vr’s are correlated with ea. other
Notably, s0 covaries with c (perhaps b/c good years are good for both no. fledged and
their subsequent surv. – caveat: one point in this graph is the means of the 2 vrs.)
Measuring covariation: - see fig. 7.4 in M&D
definition of Cov:

Cov(x,y) = (1/n) * sum(i=1 to n) of ( x_i - xbar )*( y_i - ybar )

meaning of pos. and neg. cov.
Note similarity between formulas for cov and var:

Var(x) = (1/n) * sum(i=1 to n) of ( x_i - xbar )^2
Relationship between covariance and correlation
Corr(x,y) = Cov(x,y)/[ SD(x) SD(y) ]
Computing a matrix of covariances or correlations in R:
apply cov or cor to a matrix to get the cov or cor matrix of the COLUMNS of the matrix
( so need to transpose vrs before applying cov or cor )
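For the sandpiper data this is one line (vrs as read in above):
round(cor(t(vrs)), 2)    # correlation matrix of the vital rates across years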
Result:
large positive cor between s0 and c for sandpiper (why are corr’s NA for b1-b3?)
Can simulate envt’al stoch. by creating a proj. matrix for ea. year and choosing them at
random among years - the IID CASE (independently and identically distributed)
(can also chose some more often than others, or build in envt’al autocorrelation by
choosing sequences of matrices).
NEW TRICKS TO BE USED:
* Store annual matrices in 3-D array (depths = years)
* Use function make.sand.mat.R to make matrix from vector of vital rates
use sandpiper.stoc.sim.R
Simulate trajectories of total population size (sum of population vector)
On log scale, trajectories fall on normal curve w/ var. increasing with time, as in scalar
case
Long-term stoch. growth rate – can compute 2 ways
1) limit of t’th root of N(t)/N(0) as t goes to inf. (subject to rounding error)
2) arithmetic mean of annual log growth rates ( see program – note vectors renormalized
to 1 ea. year to prevent very small or very large numbers, because need to do this for a
long time until vectors settle into their stoch. distribution).
use sandpiper.stoc.lam.by.sim.R
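A minimal sketch of method 2 (the 3-D array Amats of annual matrices and the count nyears are assumed from the programs above):
tmax=10000
n=rep(1/3,3)                            # start with sum(n) = 1
loglams=numeric(tmax)
for(t in 1:tmax){
  n1=Amats[,,sample(nyears,1)] %*% n    # iid draw of a yearly matrix
  loglams[t]=log(sum(n1))               # annual log growth (sum(n) was 1)
  n=n1/sum(n1)                          # renormalize each year
}
stoclam=exp(mean(loglams))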
stoclam is less than lam1 computed from mean matrix
So var reduces population growth, as in scalar case, but we haven’t learned much about
how var in particular underlying vital rates or matrix elements, and cov. between them,
affects stoclam
All variability not created equal:
EXERCISE
Varying sandpiper.stoc.lam.by.sim.R:
make ONLY s0 OR ONLY s2 variable, fixing all other rates at their means, and compute
stochastic lambda numerically, comparing the results
RESULT:
letting s2 vary but not s0 depresses stoclam more (rel. to lam1) than does the opposite
^ BUT this exercise uses OBSERVED var in different rates, rather than comparing
EQUAL var. in the two.
Also, vr’s covary, as we’ve seen.
How do Var’s and Cov’s of underlying rates affect stoclam in a structured population?
Recall from scalar models:
approx. for geom. mean in terms of arith. mean and var:
Lg ≈ La exp{ - Var(lam)/ [ 2 La^2 ] }
In matrix land,
stoclam ≈ lam1bar * exp{ - Var(lam1)/ [ 2 * lam1bar ^2 ] }
But lam1=f(a11(v1,v2,..), a12(v1,v2,...), etc.)
What is the var. of a function in terms of var’s and cov’s of underlying variables?
Simple case: a function of only 1 variable:
Var(f(x)) ~ (df/dx)^2 Var(x)
interpretation using graph of f(x) vs. x
why square?
More complex case: variance of a function of 2 variables:
Var(f(x,y)) ~ (df/dx)^2 Var(x) + (df/dy)^2 Var(y) + 2 * (df/dx) * (df/dy) * Cov(x,y)
Tulja’s approx. (in terms of vital rates):
If only 1 vital rate varied:
Var( lambda(v_i) ) ~ (d lambda/d v_i)^2 Var(v_i) = ( Si )^2 Var(v_i)
But (d lambda/d v_i) = Si (the sensitivity computed earlier) SO…
For a given amount of var. in a vital rate, lambda will vary more when lambda has a
higher sensitivity to that vr.
- graphical interpretation
- Pfister prediction
Now full Tulja approx.:

log lam_s ≈ log lam1 - (1/(2*lam1^2)) * [ sum_i (Si)^2 Var(vi) + 2 * sum_{i<j} Si Sj Cov(vi,vj) ]
Note: for me’s, sensitivities (S’s) always positive
1st summation understandable from above graphical explanation
Impt. message: pos. cov. reduces lam_s more than does neg. cov. between matrix
elements (or vital rates with positive sens’s)
Above arguments can be made in terms of me’s or vr’s (though sensitivities can be
negative for vr’s, but not for me’s).
NEXT TOPIC: Incorporating d.d. into matrix models
Example: Chinook salmon (from M&D Ch. 8)
Biology:
- semelparous
- can spawn at age 3, 4, or 5, but die after spawning
Census just before spawning, so youngest class is 1 yo’s, and oldest is 5 yo’s
Let
b_i = breeding probability of age i females (i = 3, 4)
f_i = eggs per age i female (i = 3, 4, or 5)
s_0[E(t)] = survival of eggs to 1 yo’s as a function of the total eggs at time t, E(t)
s_2 to s_4 = survival of females aged 2, 3, 4
Matrix:

[  0     0     b3 f3 s0[E(t)]    b4 f4 s0[E(t)]    f5 s0[E(t)]  ]
[  s1    0     0                 0                 0            ]
[  0     s2    0                 0                 0            ]
[  0     0     s3 (1 - b3)       0                 0            ]
[  0     0     0                 s4 (1 - b4)       0            ]

where E(t) = sum(i=3 to 5) of b_i * f_i * n_i(t)

and n_i(t) is the no. of females of age i at time t, with b_5 = 1.
Notes:
- entries only on 1st row and subdiagonal – age-based (“Leslie”) matrix
- spawning reduces survival in ages 3 and 4
- all age 5 individuals spawn and die
- egg survival depends on total no. of eggs, not on age of mom, but NUMBER of eggs
does depend on age of mom and her breeding prob.
- because E(t) is a function of n_3, n_4, and n_5, we have introduced densities into the
matrix itself
Possibilities for egg survival:
Ricker function:

s0[E(t)] = s0(0) * exp[ -beta*E(t) ]

where s0(0) is survival when egg density is close to zero.
Graph this – meaning of beta
Beverton-Holt function:

s0[E(t)] = s0(0) / ( 1 + beta*E(t) )

Graph this. Params. have same basic meanings as for Ricker.
Shape of curves is similar, they make different predictions about how no. of survivors
depends on initial no. of eggs
No. survivors = E(t) * s0[E(t)]
Ricker: No. survivors = s0(0) * E(t) * exp[ -beta*E(t) ] -> 0 as E(t) goes to inf.
“Overcompensatory (negative) density dependence”
additional eggs reduce survival of all eggs
BH: No. survivors = s0(0)*E(t) / ( 1 + beta*E(t) ) -> s0(0)/beta as E(t) goes to inf.
“Compensatory (negative) density dependence”
beyond some point, all extra eggs added die but don’t reduce survival of others
Possible causes of compensatory d.d. in salmon:
later redds merely replace earlier ones
Possible causes of over-compensatory d.d. in salmon:
- more eggs -> less O2 in streams, lower survival for all eggs
- more eggs, more hatchlings, more competition for food, fewer can reach size to develop
to next stage
These 2 types of negative d.d. have very different implications for the possible population
dynamics.
Use salmon_dd_2.m to explore effect of increasing fertilities in the 2 models.
Harding et al. 2001 Cons Biol