Bio 292: Population Ecology
Instructor: Bill Morris
MW 2:50-4:05pm, 130 BioSci
Fall, 2011

Tentative syllabus (readings are in the Morris & Doak book):
Week 1-2: Density-independent models based on the total number of individuals in a population under a randomly varying environment; introduction to programming in R – Ch. 1-3
Week 3-4: More complex models based on total population size: negative and positive density dependence, environmental autocorrelation, and catastrophes/bonanzas – Ch. 4
Week 5-6: Deterministic projection matrix models for age- or size-structured populations – Ch. 6, 7
Week 7: Sensitivity analysis for deterministic projection matrix models – Ch. 9 to p. 351
Week 8-9: Stochastic projection matrix models; additional complications in matrix models – Ch. 8
Week 10: Meta-population and other spatial models – Ch. 10-11
Week 11-12: Models of species interactions (if there is time)
Week 13: Students present results of research projects

Course requirements:
1) Do problem sets and worksheets (not graded; credit/no credit only)
2) Class project:
- create and analyze a population model using your own data or data from the literature
- report results to the class in the last week of the semester

Day 1: Intro and Understanding the effects of variability

Introduction and Goals of the Course:

Definition of a population: the set of individuals of a single species in a defined area

Goals of Population Ecology
To find answers to questions such as:
- Why are there so many (or few) of a given species in a particular place?
- Why do numbers change (or not) over time or space?
- Why do we see the observed ratio of different sized or aged individuals in a population?
- How does a population persist for an indefinite period of time despite the fact that every individual in the population will die relatively soon (that is, what keeps births and deaths in approximate balance)?
- What is the probability that a given population will go extinct by a given future time, AND HOW IS THAT PROBABILITY INFLUENCED BY HUMAN ACTIVITIES?

Why Population Ecol MUST be quantitative:
ALL of the preceding questions involve numbers. To answer them in anything but the most cursory way requires quantitative tools. In applications, it is often not enough to know what factors influence, e.g., extinction risk. We need to have RELATIVE answers:
- which factors have the most influence on populations?
- which of several populations is most likely to avoid extinction?
- which management strategy will reduce extinction risk more?

To provide such answers requires quantitative tools:
1) mathematics: deterministic and stochastic
2) computing languages – MATLAB, C, R
We will learn about both in this course.

>> Equivalent of Newton's Law of Constant Motion: constant PER-CAPITA growth in the absence of "other forces"; constant per-capita growth means GEOMETRIC increase (or decline)

Simplest population model:
N(t) = no. of individuals in the population in year t (literally, at census point t)
Assume births immediately follow the census, then individuals may (or may not) survive to the next census.
B = average no. of surviving newborns PER CAPITA before the next census
D = fraction of adults dying before the next census (also PER CAPITA)

N(t+1) = B N(t) + (1 – D) N(t)
N(t+1) = (B – D + 1) N(t)
N(t+1) = lambda N(t), where lambda = (B – D + 1) measures the excess of births over deaths
if B = D, lambda = 1, no change in population
if B < D, lambda < 1, population declines
if B > D, lambda > 1, population grows

Prediction: total population size grows or declines GEOMETRICALLY, at a constant PER-CAPITA rate, lambda

Simulate growth of populations with constant lambda

FIRST, BASIC INTRO TO R
Opening R
Changing directory to a folder for the current project (I always do this first thing)
In Console window:
lam=1.1
lam
N=10
N=lam*N
N
n    # error: object 'n' not found – R is case sensitive
Iteration: N=lam*N; N    then alternate up-arrow and Enter

To iterate, better to write a "script" or program:
opening a script
saving a script
running a script

Script will need to:
- repeat N=lam*N tmax times
- use 3 different lams: <1, 1, >1
- plot results vs. time

To do this, we will need to:
- define constants (to make the script generic)
- use matrices: one name for many items
  N=matrix(c(4,3,6,2,1),1,5)
  accessing items using square brackets: N[3] (=6)
- allocate memory to store numbers at each time – e.g., N1=matrix(0,tmax+1,1)
- repeat an action using a "for loop" – for(t in 1:tmax)    what does 1:tmax do?
- use a plotting function – help(matplot) or ?matplot

THEN WRITE FIRST PROGRAM IN R – lambda.const.R

5 (or 6) basic components of every program (or script):
[ clear memory – rm(list=ls(all=TRUE)) ]
define constants
allocate memory to variables (to speed up the program)
assign initial values to variables
perform actions (e.g., loops)
output results

# define constants
lam1=0.9
lam2=1
lam3=1.1
tmax=20
n0=10
# allocate memory
N1=N2=N3=matrix(0,tmax+1,1)
# could also use N1=N2=N3=numeric(tmax+1)
# initialize variables
N1[1]=N2[1]=N3[1]=n0
# perform iterations
for (t in 1:tmax){
N1[t+1]=lam1*N1[t]
N2[t+1]=lam2*N2[t]
N3[t+1]=lam3*N3[t]
}
# first run above and look at N1, N2, and N3;
# then add the following to plot results and run again
matplot(0:tmax,cbind(N1,N2,N3),xlab="Year,t",ylab="N(t)",main="Lambda Constant",type="b",pch=19)

NEXT: First deviation from the first law: population growth rate is not constant
Now N(t+1) = lambda(t) N(t)
lambda(t)>1 in years when B>D
lambda(t)<1 in years when B<D

Simulate uniform variation in population growth rate
Let lambda vary uniformly between lambar – dl and lambar + dl, so that lambar is the arithmetic mean lambda
- GRAPH OF THE PDF OF LAMBDA and how it's affected by lambar and dl

In-class Assignment:
Write a program – really, modify lambda.const.R – to simulate multiple population trajectories with lambda varying as described above. We will use it to explore the effect of varying dl (i.e., making lambda more variable).

First, two useful things:
1) Random number generation: runif(npops,lmin,lmax) >>> generates npops uniform nos. between lmin and lmax
lambar=1
dl=.1
x=runif(5000,lambar-dl,lambar+dl)
hist(x, xlim=c(.5,1.5))
similar syntax for other prob. distributions; e.g. Normal:
x=rnorm(5000,.1,.1)
hist(x, xlim=c(-1,1))
2) Multiplying a row of a matrix by a row vector:
N=matrix(10,10,5)
L=runif(5,.9,1.1)
L
N[2,]=L*N[1,]
N
L=runif(5,.9,1.1)
N[3,]=L*N[2,]
N

Now work on writing your own GENERIC script:

SCRIPT (this is lambda.variable.R):
tmax=50
npops=20
lambar=1.01
dl=.05
n0=10
lmin=lambar-dl
lmax=lambar+dl
n=matrix(0,tmax+1,npops)   # 2D array n will hold all npops trajectories
n[1,]=n0
for(t in 1:tmax) n[t+1,]=n[t,]*runif(npops,lmin,lmax)   # EXPLAIN THIS LINE
matplot(0:tmax, n, type="l", xlab="Year", ylab="Population size", main="Lambda variable",ylim=c(0,1.05*max(n)))

Start with npops=1 and run a few times. Then increase the number of populations to 50, then slowly increase dl, and see what happens.
It would be convenient to look at n(t) on the log scale. Here, use stoc.lambda.R instead of modifying the program above.

Clearly, lambar doesn't describe population growth, because most populations can decline even if lambar>1.
What is a better measure of the population growth rate?

Geometric mean as a measure of long-term growth rate
N(1) = L(0) N(0)
N(2) = L(1) N(1) = L(1) L(0) N(0)
N(t) = L(t-1) L(t-2) … L(1) L(0) N(0)
What L (call it Lg), when multiplied by itself t times, would equal L(t-1) L(t-2) … L(1) L(0)? This is a good measure of AVERAGE ANNUAL GROWTH.
Solve Lg^t = [ L(t-1) L(t-2) … L(1) L(0) ] for Lg
Answer: Lg = [ L(t-1) L(t-2) … L(1) L(0) ]^(1/t)
But this is NOT the arithmetic mean of the L's, which is
(1/t) [ L(t-1) + L(t-2) + … + L(1) + L(0) ] = lambar (or La).
Instead, it is the GEOMETRIC MEAN. That's why we called it Lg.

The geometric mean is more strongly depressed by an L x units below the arithmetic mean than it is increased by an L x units above the arith. mean (unlike the arith. mean itself).
Why? Math fact: relatively small values decrease a product more than relatively large values increase it.
For example, what if a single L(t) = 0? What is Lg? 0.
Another example showing that deviations above and below La are not equal in terms of their effects on Lg:
La = 1.1; dL = 0.2
Case 1: even years lambda = La; odd years lambda = La plus dL
Lg = sqrt(1.1 x 1.3) = 1.196, which is .096 above La
Case 2: even years lambda = La; odd years lambda = La minus dL
Lg = sqrt(1.1 x 0.9) = 0.995, which is .105 below La

A third way to look at Lg: a useful approximation for the geometric mean in terms of the arithmetic mean and the year-to-year variance of lambda, Var(lam) (Var is the average squared deviation from La):
Lg ≈ La exp{ - Var(lam)/ [ 2 La^2 ] }
If Var(lam) = 0, exp{ - Var(lam)/ [ 2 La^2 ] } = 1 and Lg = La. But whenever Var(lam) > 0, exp{ - Var(lam)/ [ 2 La^2 ] } < 1 and Lg < La.
On board: plot Lg vs Var(lam) according to the approximation: exponential decline with increasing Var(lam)

Show that Lg does better at "splitting the difference" between possible trajectories than does La.
PROGRAM: median.vs.lamg.&.lama.R
This program uses another way to compute Lg:
if Lg = [ L(t-1) L(t-2) … L(1) L(0) ]^(1/t)
then log Lg = log{ [ L(t-1) L(t-2) … L(1) L(0) ]^(1/t) }
log Lg = (1/t) * [ log{L(t-1)} + log{L(t-2)} + … + log{L(1)} + log{L(0)} ] = arithmetic mean of the log(L)'s
So, Lg = exp{ arithmetic mean of log(L) }
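A quick check of all of this in R (a sketch; the variable names are illustrative, not from the course scripts):

# compare the arithmetic mean, the geometric mean (two ways), and the approximation
lams=runif(10000,0.8,1.2)              # a long sequence of random annual lambdas
La=mean(lams)                          # arithmetic mean
Lg=prod(lams)^(1/length(lams))         # geometric mean by the definition (can underflow for very long sequences)
Lg2=exp(mean(log(lams)))               # geometric mean via the mean log lambda (numerically safer)
Lapprox=La*exp(-var(lams)/(2*La^2))    # the approximation above
c(La,Lg,Lg2,Lapprox)                   # Lg and Lg2 agree, and both fall below La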
Important messages:
1) Environmental stochasticity, by introducing variation in the lambdas, actually depresses long-term population growth relative to the prediction of lambar. We can even have a population with an (arithmetic) average growth rate >1 that is nevertheless virtually certain to decline over the long run, if variation is sufficiently high.
2) So when we relax the assumption of the first law that lambda is constant, not only do we get variation in population growth from year to year, we get a lower long-term rate of growth.

Even if Lg>1, some possible trajectories may reach low population sizes. For a sexually reproducing species, we might consider 1 to be effective extinction ("quasi-extinction"). Some trajectories might hit 1 even if Lg>1.

Next Big Topic
What is the probability that populations will go (quasi-)extinct by some future time? Effects of the mean and variance in annual growth rate.

{{ SKIP THE FOLLOWING IF PRESSED FOR TIME:
First: uniform variation is not realistic, and we can't increase variation indefinitely because lambda can't be negative – but it often has a weaker constraint above than below.
Alternative: normal variation – but then lambda could still be negative.
Better alternative: let lambda follow a lognormal distribution:
if X ~ normal(mean=mu, sd=sig), then Y=exp(X) ~ lognormal
lognormal.demo.R }}

POSSIBLY START with the YGB (Yellowstone grizzly bear, below) as motivation for wanting to quantify Prob(quasi-extinction)

Note that in running stoc.lambda.R, population size was LOGNORMALLY DISTRIBUTED.
The Lognormal distribution:
mu=.1
sig=.5
x=rnorm(10000,mu,sig)
split.screen(c(2,1))
screen(1)
hist(x,breaks=50)
screen(2)
hist(exp(x),breaks=50)

This means that the LOG of N will follow a CHANGING Normal distribution. This resembles the physical process of DIFFUSION (e.g., of molecules of a gas) in a moving fluid.
[ here and everywhere, LOG means NATURAL LOG ]
Fig. 3.3 from Morris & Doak (M&D)
Start from the LOG of the current population size.
The lower boundary: the "quasi-extinction threshold". WE can set this boundary wherever we wish, based on BIOLOGICAL and POLITICAL considerations (some discussion of this in Ch. 2).

What determines the likelihood of hitting the threshold is:
1. How fast (and in what direction) the position of the mean of the distribution moves
2. How fast the VARIANCE of the distribution increases over time
3. How far down the population must go from current (log) size to (log) threshold
4. How long into the future we are considering

VARIANCE – mathematical definition: the average SQUARED distance between the "particles" (or pop sizes) and their (arithmetic) mean. E.g., for a set of n trajectories at time t:
Var(N(t)) = (1/n) * sum(over i) of [ ( Ni(t) – Nbar(t) )^2 ]

If the mean declines, the population is CERTAIN to hit the threshold (and more quickly the quicker the mean declines)
If the mean increases quickly, LESS likely to hit the threshold quickly
If the variance increases quickly, MORE likely to hit the threshold quickly

Mean(t) = mu * t
Var(t) = sigma_squared * t (sigma_squared = sig2 below)
mu and sig2 are constants that determine the RATE of increase in the mean and variance, resp., over time. MU and SIGMA – Greek letters.
[ Plot Mean and Var vs t for diff. values of mu (-.1, 0, .1) and sig2 (positive only) ]

The effects of mu and sig2 on the likelihood of hitting the threshold are most easily seen by plotting the QUASI-EXTINCTION TIME CUMULATIVE DISTRIBUTION FUNCTION or CDF vs time in the future (which we will abbreviate as G(T)). Fig. 3.5 in M&D.
G(T) is the PROBABILITY that the quasi-extinction threshold has been reached AT ANY TIME BETWEEN NOW AND FUTURE TIME T. Because it is a probability, it must always lie between 0 and 1.
If mu > 0, G_max < 1 (because some pops go to infinity – NOT REALISTIC)
If mu < 0, G_max = 1
The equation that governs G(T) given Nc (current pop size), Nq (the threshold), mu, and sig2 has been derived by physicists. It is eq. 3.5 in M&D, where d = log(Nc) – log(Nq).
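A minimal sketch of that equation as an R function (this is the standard first-passage formula; check it against eq. 3.5 in M&D and against the extcdf function in pva.functions.R used below before trusting it):

extcdf.sketch=function(mu,sig2,d,tmax){
  # G(T) for T = 1..tmax: prob. of hitting the threshold by time T,
  # starting a log-distance d = log(Nc) - log(Nq) above it
  T=1:tmax
  pnorm((-d-mu*T)/sqrt(sig2*T)) + exp(-2*mu*d/sig2)*pnorm((-d+mu*T)/sqrt(sig2*T))
}
# example: declining mean, moderate variance, starting 10x above the threshold
G=extcdf.sketch(mu=-0.02,sig2=0.01,d=log(10),tmax=100)
plot(1:100,G,type="l",xlab="Years into the future, T",ylab="G(T)")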
NOTE in fig. 3.5 that for any given mu, the prob. of extinction increases more rapidly early on if sig2 is larger.

All of this is called the DIFFUSION APPROXIMATION (DA) for estimating extinction risk.
Advantages: easy to apply
Disadvantages: several assumptions that are often unrealistic (LATER)

NEXT: Using the DA with real data
Intro: The YGB
- isolated population – part of a once-much-larger range
- long (44 yr.) record of counts of adult females (actually a 3-yr. running sum of females with YOY cubs – 3 yrs. between births, so the sum approximates the total no. of adult females)
- 6 more years of data than in M&D
- counts only – no info. on pop structure in these data
- counts at dumps before, and by aerial survey after, 1973
- fires in 1988 – did they change mu or sig2?
- legally protected from hunting, but contact with humans is still a source of mortality; pressure to remove protection to have a SUCCESS STORY for the law (Endangered Species Act – delisted in 2007 and then relisted in 2010 due to political pressure from conservation groups)

QUESTION: What is the current risk of extinction? Impt. to know before delisting.
Goal: compute the extinction-time CDF for the YGB, but to do so we must first estimate mu and sig2. TWO methods to do so.

But first, getting data into R (follows ygb.R, but do it step by step in the command window):
Read the csv data file into a DATA FRAME:
data=read.csv("ygb_females_1959_2003.csv")
Cols. of data have names, Year and N, inherited from the csv file.
attach(data) – now we can use the cols. as variables
Use plot to plot the data: plot(Year,N,...)

1st, what is the estimate of the PER-CAPITA pop growth rate over 1 yr? lam_t = N_(t+1)/N_t
We can get all the annual lams in 1 step:
n=length(N)
lam=N[-1]/N[-n]
Discuss the above. NOTE: one fewer lam than censuses.

Method 1 (the "standard method"):
mu – how much mean LOG pop size changes per year; therefore, the (arith.) average of LOG( N(t+1)/N(t) ) is an estimate of mu
sig2 – how much VAR(log pop size) changes per year; therefore, the variance of log( N(t+1)/N(t) ) is an estimate of sig2
mu=mean(log(lam)); mu
sig2=var(log(lam)); sig2
Now go to ygb.R

Using mu and sig2 to compute the extinction CDF:
writing your own functions in R – e.g. extcdf – stored in pva.functions.R
NOTES: input variables must be given in the correct order, or variable names must be given in the function call
[ functions can define other internal functions, but those will not work outside the function ]
storing functions in other files and "source"-ing them
Result: very low prob.(extinction) at t=50

BUT using only the best estimates of mu and sig2 ignores the fact that these are only estimates, and may miss the "true" values. We can use the confidence intervals for mu and sig2 to put confidence limits on the CDF. Procedure:
- draw a random mu from a Normal distn. with mean mu_best and s.d. SEmubest ( = sqrt(sig2best/q), where q is the no. of log growth rates ); keep it if it lies within mu's CI, redraw if not
- draw a random number from a chi-squared distn. with q – 1 df and multiply it by sig2best/(q – 1); keep it if it lies within sig2's CI, redraw if not
- compute the CDF and save any values (over t) that exceed past values, both above and below CDFbest
- repeat many times, and plot the limits above and below
use extprob.R
NOTE: wide CI around the best CDF
MESSAGE: ext. risk of the YGB could be as high as 16% at T=50
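A compressed sketch of that procedure (extprob.R does this more carefully; the threshold Nq=20 and the 1000 repetitions here are arbitrary choices, and extcdf.sketch is the function sketched earlier):

q=length(lam)                          # no. of annual growth rates
SEmu=sqrt(sig2/q)                      # standard error of mu
CImu=mu+c(-1,1)*qnorm(.975)*SEmu       # 95% CI for mu
CIsig2=(q-1)*sig2/qchisq(c(.975,.025),df=q-1)   # 95% CI for sig2
d=log(N[length(N)]/20)                 # log distance from last count to threshold Nq=20
Gbest=extcdf.sketch(mu,sig2,d,tmax=50)
Glo=Ghi=Gbest
for(i in 1:1000){
  repeat{ mui=rnorm(1,mu,SEmu); if(mui>=CImu[1]&mui<=CImu[2]) break }   # mu within its CI
  repeat{ s2i=sig2*rchisq(1,df=q-1)/(q-1)
          if(s2i>=CIsig2[1]&s2i<=CIsig2[2]) break }                     # sig2 within its CI
  Gi=extcdf.sketch(mui,s2i,d,tmax=50)
  Glo=pmin(Glo,Gi); Ghi=pmax(Ghi,Gi)   # running envelope around the best CDF
}
matplot(1:50,cbind(Glo,Gbest,Ghi),type="l",xlab="T",ylab="G(T)")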
BUT what if there were a gap in the census? The mean and variance should change more over the gap than over the one-yr. intervals between the other censuses – so the standard method is inappropriate.

DIFFUSION APPROXIMATION FOR PROBABILITY OF QUASI-EXTINCTION:
When censuses are taken every year, the easiest way to estimate the parameters mu and sigma^2 for the diffusion approximation is:
mu = mean(log(N[-1]/N[-n])) = mean(diff(log(N)))
sig2 = var(log(N[-1]/N[-n])) = var(diff(log(N)))
BUT... this is not appropriate if some intercensus intervals are longer than others (b/c the pop should change more over such intervals).

Alternative: a linear regression approach – allows for different intercensus intervals
other advantages:
- easy confidence interval (at least for mu)
- can use regression tools to identify outliers
- can test for changes in mu and sigma^2 in different time periods (before/after dumps closed; before/after fires)

Estimating mu and sig2 by regression:
Regress y = diff(log(N))/x on x = sqrt(diff(Year)), WITH ZERO INTERCEPT.
The slope is the estimate of mu.
The mean of the squared residuals around the regression line is the estimate of sig2.
Plot what this looks like.

Doing a linear regression in R:
x=1:10
y=3+2*x+rnorm(10,0,1)
plot(x,y)
out=lm(y~x)
summary(out)
coef(out)
confint(out)
anova(out)
sig2=anova(out)[2,3]; sig2      # residual mean square
s2=sum((out$resid)^2)/8; s2     # the same thing by hand (df = 10 - 2 = 8)
CIsig2=(q-1)*sig2/qchisq( c(.975,.025), df=(q-1) )

IN-CLASS ASSIGNMENT, WORKING IN PAIRS:
1988 was the year of the Yellowstone fires. Imagine that, because of the fires, it was not possible to do the grizzly bear census that year. Using the YGB data, delete the count from 1988, estimate mu and sigma^2 by linear regression, and use those estimates to produce a quasi-extinction time CDF using the program ygb.R.
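A skeleton for the regression method (and for the assignment above), assuming the YGB data frame with columns Year and N is attached; more.ygb.stuff.R is the fuller version:

x=sqrt(diff(Year))              # sqrt of interval lengths (all 1 when there are no gaps)
y=diff(log(N))/x                # scaled log growth rates
# over an interval of tau yrs, diff(log(N)) ~ Normal(mu*tau, sig2*tau), so
# y = mu*x + error with constant variance sig2 -- hence the zero intercept
out=lm(y~x-1)                   # -1 forces the regression through the origin
mu=coef(out)[1]                 # slope = estimate of mu
q=length(y)
sig2=sum(resid(out)^2)/(q-1)    # mean squared residual (df = q-1 with no intercept)
confint(out)                    # CI for mu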
Other advantages of the regression approach:
- tests for outliers
- tests for changes in mu and sig2 (e.g. after the 1988 fires)
- confidence intervals on mu (can also be calculated directly from mu and sig2)

Doing the regression with the YGB data: see more.ygb.stuff.R
Before running any of this prog, save the standard estimates for comparison:
mu_s=mu
sig2_s=sig2
Confidence limits on mu are produced directly in bint; confidence limits on sig2 are computed using the chi-squared distn.

[ possibly skip or give an overview ]: Two ways to look for outliers using regression output:
- dffits
- studentized residuals
Both indicate that 1983 is odd – UNUSUALLY HIGH – consequences for estimated extinction risk? 1983 is not the year the dumps were closed, or the fire year. If a reason to discard it is determined, we could delete this lambda and estimate mu and sig2 from the remaining lambdas.
Can also ask statistically whether mu and sig2 changed before/after the fires, or before/after the dumps were closed – see details in more.ygb.stuff.R

Uses of the CDF: comparison of different pops w/ diff. Nc, mu, or sig2, and diff. thresholds [ M&D figs. 3.9, 3.10 ]

Review and tests of assumptions:
I. Parameters mu and sig2 constant
Violations:
1) density dependence could change mu (and even sig2): mu declines as N increases; mu could also decline as N decreases
2) demographic stochasticity could change sig2
3) environmental trends could change both mu and sig2
II. No environmental autocorrelation
Whether lam was large last year has no effect on whether lam is large or small this year. We'll see how to test for and incorporate this next time.
III. No extremely large or small values of lam (no bonanzas or catastrophes)
tests for outliers; how to include them if found
IV. No observation error
Counts are assumed to be accurate – if not, they inflate sig2, making the calculated ext. risk too high.

HOMEWORK: Read Ch. 4 (skip the Ceiling Model) including the Appendix

Next Big Topic: More complex count-based models:
- Density dependence (negative and positive, i.e., Allee effects)
- Environmental autocorrelation and its interaction with D.D.
- Catastrophes and bonanzas
(for later: demographic stochasticity)

I. An overview of models with negative density dependence
[ SKIP 1. Ceiling model (eq. 4.1 in M&D) ]
2. More realistic models with continuous change in lambda as N(t) increases
The DI model assumed loglam is indep. of N(t), with a mean mu that doesn't change as N(t) changes. The var in loglam around mu is sig2, assumed to be caused by environmental variation.
[[ Why log?
1) log( N(t+1)/N(t) ) = log( N(t+1) ) – log( N(t) ): on the log scale, errors in estimating N(t+1) and N(t) have equal effects on log lambda, whereas for the raw ratio N(t+1)/N(t), errors in estimating the denominator have a much stronger effect than errors in estimating the numerator.
2) log( N(t+1)/N(t) ) can go from –Inf to Inf, but N(t+1)/N(t) only goes from 0 to Inf (if N(t)>0). The log ratio is more likely to be normal, and so tools that assume normal variation are more appropriate. ]]

But more realistically, we might expect the mean log lambda (i.e., mu) to decline as N(t) increases. This is called NEGATIVE density dependence, because log lambda DECLINES. There are several possible patterns for this decline, illustrated by the so-called "Theta Logistic Model": (on board)
N(t+1) = N(t) * exp{ r*( 1 – [N(t)/K]^theta ) }
Take logs of both sides:
log( N(t+1)/N(t) ) = r*( 1 – [N(t)/K]^theta )
Plot the log population growth rate log( N(t+1)/N(t) ) vs N(t): show.theta.logistic.R
Patterns:
- log lambda decreases linearly with N(t) (theta=1); this model is also known as the RICKER MODEL, widely used in fisheries modeling
- log lambda decreases sharply only as N(t) approaches K (like the ceiling model); theta>1
- log lambda decreases sharply at first, then changes little as N(t)->K; theta<1
KEY POINT: With any of the above patterns, the population tends to decline above K (although with envtal. stoch. it could still grow in very favorable years). K is by def. the value of N where loglam=0. Therefore the population tends to stay below K (the "CARRYING CAPACITY"), and therefore closer to an extinction threshold than it might get with DI growth (if r>0).
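A small sketch of the kind of figure show.theta.logistic.R draws for the three patterns above (r, K, and the theta values here are arbitrary):

r=0.5; K=100; N=0:150
plot(N, r*(1-(N/K)^1), type="l", ylim=c(-3,1),
     xlab="N(t)", ylab="log( N(t+1)/N(t) )")   # theta = 1: the Ricker (linear decline)
lines(N, r*(1-(N/K)^5), lty=2)                 # theta > 1: ceiling-like
lines(N, r*(1-(N/K)^0.3), lty=3)               # theta < 1: steep initial decline
abline(h=0, col="grey")                        # loglam = 0 at N = K in every case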
FIRST:
> how do we decide which of these patterns of density dependence best describes a given dataset?
> what are the consequences of negative density dependence for extinction risk?

To illustrate this, we will use a new data set: the Bay checkerspot butterfly, Euphydryas editha bayensis, the subject of a long-term population study by Paul Ehrlich's laboratory at Stanford University in CA, USA.
Run checkerspot.R to observe the counts (arith. and log scales) and loglam vs Nt and log Nt.

II. Testing for density dependence using maximum likelihood and AIC
[ Probably skip to *****************, except these parts ]
Review of "maximum likelihood" parameter estimation with Normal errors (from the Ch. 4 appendix of M&D)
Example: fitting the theta logistic function:
log( N(t+1)/N(t) ) = r*( 1 – [N(t)/K]^theta )
^^ Like a regression equation:
dep. variable: y[t] = log( N(t+1)/N(t) )
indep. var.: N(t)
params. or coefficients: r, K, theta
Rewrite the eq. as y[t] = f(p,N[t]), where y[t] is the log growth rate in year t (the DEPENDENT VARIABLE), p=[r, K, theta] is a vector of parameter values, and
f(p,N[t]) = p[1]*( 1 – ( N[t]/p[2] )^p[3] )
is the theta logistic function, where N[t] is the INDEPENDENT VARIABLE.

What we are trying to do in maximum likelihood parameter estimation is to find the values of the parameters (or the "value" of the parameter vector p) that maximize the probability of observing the y[t]'s given the N[t]'s (and, of course, given the particular form of the model, f(p,N[t]) ).

The Normal probability of seeing log growth rate y[t] at time t, given a value of p and N[t], is
Pr{ y[t] | p, N[t] } = ( 1/sqrt(2 pi Vr) ) * exp{ – ( y[t] – f(p,N[t]) )^2 / (2 Vr) }
(use show.normal.R to plot this), where
Vr = (1/q) * sum(over t=1..q) of ( y[t] – f(p,N[t]) )^2
is the average squared deviation between the observations and the predictions of the model (the "residual variance").

We want to pick p so that f(p,N[t]) is close to y[t], because then Pr{y|p,N} will be at its maximum. But we want f(p,N) to be close to ALL the y[t]'s given the N[t]'s, so we may have to compromise.
The OVERALL probability of seeing ALL the data is the product of these probabilities over all times, keeping the model and the parameter values fixed as we cycle through the pairs of dep. and indep. variables. This overall probability is the LIKELIHOOD L of the observed data given the model equation and the value of p:
L = prod(over t=1..q) of ( 1/sqrt(2 pi Vr) ) * exp{ – ( y[t] – f(p,N[t]) )^2 / (2 Vr) }
where q is the number of pairs of values of the dep. and indep. variables (the "SAMPLE SIZE").
Because this is a product of many small numbers (probs are between 0 and 1), it will be very small. To prevent rounding errors, we take its log to get the LOG LIKELIHOOD:
log L = sum(over t) of log[ ( 1/sqrt(2 pi Vr) ) * exp{ – ( y[t] – f(p,N[t]) )^2 / (2 Vr) } ]
      = sum(over t) of [ –(1/2) log(2 pi Vr) – ( y[t] – f(p,N[t]) )^2 / (2 Vr) ]
      = –(q/2) log(2 pi Vr) – [ sum(over t) of ( y[t] – f(p,N[t]) )^2 ] / (2 Vr)
BUT… from the definition of Vr:
sum(over t) of ( y[t] – f(p,N[t]) )^2 = q Vr
************************************
Therefore
log L = –(q/2) log(2 pi Vr) – q/2 = –(q/2) [ log(2 pi Vr) + 1 ]    << "THE LOG LIKELIHOOD FUNCTION"

So, all we need to compute the log likelihood function with Normal errors is the sample size q and the residual variance Vr – but as seen in the eq. for Vr, we need p and the N[t]'s to compute it. So what we do is use a search algorithm (a minimization routine) to find the value of p – given the sequence of N's, and therefore the y's (the log lambdas computed from the N's) – that MAXIMIZES THE LOG LIKELIHOOD. Because most routines are set up to find the minimum of a function, we search for the value of p that MINIMIZES THE NEGATIVE LOG LIKELIHOOD (NLL). Because the NLL for normally distributed errors contains only Vr (and q, which is fixed), minimizing the NLL is equivalent to minimizing Vr, i.e., minimizing the sum of squared deviations between the data points and the model. For this reason, maximum likelihood fitting with Normal errors is equivalent to LEAST SQUARES PARAMETER ESTIMATION.

Now, to ask if there is negative DD in the data, and if so what form it takes, we need to fit several models AND CHOOSE WHICH MODEL IS BEST. Using maximum likelihood estimation and AIC, there is a natural way to do this. We'll fit 3 models (which happen to be "nested" – the simpler ones can be obtained by constraining parameters in the more complex ones to particular values):

Model:                       DI    Ricker        theta-logistic
f(p,N[t]):                   r     r(1 – N/K)    r(1 – (N/K)^theta)
No. parameters (incl. Vr):   2     3             4
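A compact sketch of the nonlinear fit described below, minimizing the NLL derived above with nlm (the starting values are crude guesses, Count is the checkerspot census vector, and real scripts guard against the search wandering into invalid values such as a negative K):

Nt=Count[-length(Count)]          # N(t)
yt=diff(log(Count))               # y(t) = log( N(t+1)/N(t) )
q=length(yt)
nll=function(p){                  # p = c(r, K, theta)
  Vr=mean( (yt - p[1]*(1-(Nt/p[2])^p[3]))^2 )
  0.5*q*(log(2*pi*Vr)+1)          # the NLL from the derivation above
}
fit=nlm(nll, p=c(0.2,500,1))      # crude starting values
fit$estimate                      # ML estimates of r, K, theta
-fit$minimum                      # the maximized logL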
The easiest fitting methods differ among the models (although we could use the most complex method for all of them):

For DI: the best r is simply the mean log lambda. Vr is the (biased) variance of the log lambdas (around r=mu). Because we only need Vr and q to compute logL, we can do so directly once we have the variance of the log lambdas, using the count vector from checkerspot.R:
loglamt=diff(log(Count))
r=mean(loglamt)
Vr=mean( (loglamt - r)^2 )    # NOTE: can also use Vr=(q-1)*var(loglamt)/q

For the Ricker: loglam is a linear function of N, so we can estimate r and K with a linear regression, where Vr is the residual variance (the mean squared deviation between the points and the regression line). [ students can write code to do this following the syntax for "lm" in ygb.R ]

Finally, the theta-logistic model is nonlinear, so we must use a nonlinear fitting procedure to estimate its parameters. We use nlm (non-linear minimization). Note: we could also use the R function "optim" to minimize the neg. log likelihood function directly.

Finally, having obtained Vr for all 3 models, we can compute AICc (the corrected Akaike Information Criterion, where the "correction" is for small sample size) for all three models. The best model has the smallest AICc:
AICc = –2*logL + 2*p*q / (q – p – 1)
logL is negative. The better the fit of a model, the larger the log likelihood (i.e., the less negative it is), so the smaller the first term in AICc. In general, more parameters (higher p) should increase logL, and so decrease this first term. However, as p increases, the second term increases.
[ as q gets large relative to (p+1), the 2nd term approaches 2*p*q/q = 2*p, and AICc converges to AIC = –2*logL + 2*p ]
Thus AICc (and AIC) attempts to achieve a balance between goodness-of-fit (measured by logL) and the number of parameters used to achieve that fit.

For the JRC population, the AICc values are:

Model    p    logL         AICc
DI       2    -41.26566    87.05306
Ricker   3    -37.79878    82.68847
TLog     4    -37.10478    84.11433

SO, even though the TLog model has the highest (least negative) logL, it uses 1 more parameter to achieve this only-slightly-better fit than the Ricker, so it has a higher AICc. In contrast, the Ricker has a much higher logL than the DI, and the lowest AICc of all the models. Therefore we conclude that the model with linear negative DD (the Ricker) is the best model.

NEXT… Simulating ext. prob. using the Ricker
We'll need the best estimates of r, K, and Vr, plus the last population size and the QET (quasi-extinction threshold).
Using the script theta.logistic.R
> uses rnorm
> if N falls below nx, it is set to 0, because it will then always remain below nx
Results:
* tight overlap of the lines means 50K trajectories is sufficient to characterize ext. risk FOR A GIVEN SET OF PARAMETERS – we have not incorporated parameter uncertainty, but could do so as we did using extprob.R (though we would then have to grapple with the fact that the estimates of r and K are not independent)
* high risk of extinction for the best-fit parameters
INDEED, this population went extinct 10 yrs. after the last census. The high value of Vr may be why.
NOTE: unlike in the DI model, where Pr(ultimate extinction)<1 if mu>0, with a DD model it is always 1.

Consequences of negative DD in discrete time
The Ricker fitted to the checkerspot data shows high extinction risk. But this is probably because the estimate of Vr is high, not because the pop. is predicted to oscillate (due to the 1 big peak in the plot of the fitted Ricker) – we might want to treat that as an outlier.
But with neg. density dependence, there can be another distinct extinction risk: population oscillations causing pop size to occasionally visit low numbers.

Period-doubling bifurcations in the Ricker model
Observe how the trajectory and the recruitment curve change as r changes, using ricker.R: r = 1.9, 2.4, 2.6, 2.7, 3
The Ricker model has "overcompensatory" (neg.) DD:
- when above K, the pop can decline below it in 1 time step
- when below K, the pop can climb above it in 1 time step
depending on the steepness of the recruitment curve.
Then use bifurc8.R to produce a bifurcation diagram.

1. Positive density dependence: Allee effects
The models above assume that loglam only declines with increasing Nt. But it could also decline at DECREASING Nt, a phenomenon known as an ALLEE EFFECT (after Warder Clyde Allee).
Potential causes:
- declining birth rate due to difficulties finding mates at low densities
- declining survival due to failure of group defense (including against abiotic conditions, as in conspecific nurse effects) or group foraging

A model with declining birth at low density (e.g. due to reduced mating) and declining survival at high density (due to resource limitation):
Birth = B(N) = a + (r–a) N/(A+N)    [ Note: per-capita ]
if N=0, B=a
if N=Inf, B=a+(r–a) = r
when N=A, B = a + .5r – .5a = (a+r)/2 (birth is halfway between its minimum and maximum values when N=A; A is the "half-saturation constant")
Survival = S(N) = exp(–b*N)
if N=0, S=1
if N=Inf, S=0
Assume a univoltine or annual life cycle:
Lambda = B * S = [ a + (r–a) N/(A+N) ] exp(–bN)
(this is equivalent to eq. 4.12 in M&D if a=0, r=exp(r), and b=beta)
Plot B, S, Lambda, and N(t+1) vs N(t)
a = L(N=0): if a>1, L(0) > 1; if a<1, L(0) < 1
PROGRAM allee.R
"WEAK" Allee effect vs "STRONG" Allee effect:
with a strong AE, there is an ALLEE THRESHOLD, Na: if the population falls below Na, it will decline to extinction (in a deterministic world – in a stochastic world it may bounce back, but there is still a strong tendency to decline below Na)
Even a weak AE can slow the rate at which a population climbs out of a period of low numbers.

Next Topic: other factors in count-based models
2. Environmental autocorrelation – correlations between years in the ENVT.
KEY: the stochastic envt. effect is seen as the DEVIATIONS in (log) growth rate once the (deterministic) effect of density has been taken into account.
Correlations could be:
Positive – if this year was above average, next year is likely to be too
Negative – if this year was above average, next year is likely to be below avg.
Computing the deviations:
loglam2=diff(log(Count))
loglamp=br[1]*(1-Count[-tmax]/br[2])   # prediction of the fitted Ricker (br holds the fitted r and K)
dr=loglam2-loglamp
Viewing "auto-correlation" two ways:
plot(dr,type='b',xlab='Year',ylab='Deviation')
windows()
plot(dr[-length(dr)],dr[-1],xlab='Deviation(t)',ylab='Deviation(t+1)')
Testing for environmental autocorrelation:
- compute the DEVIATIONS between lambda ea. year and the prediction of the BEST (dens. indep. or dens. dep.) model, accounting for the starting density ea. year
- make two vectors: d1 = deviations in years 1 to tmax–1; d2 = deviations in years 2 to tmax
- compute the correlation between these 2 vectors and test its significance, e.g. using the following code from checkerspot.R:
rho=cor.test(dr[-length(dr)],dr[-1],method="pearson")
rho$estimate
rho$p.value
If there is a significant correlation between successive deviations, how do we include it? Use show.corr.R to demonstrate a method for generating correlated random variables with a specified mean, variance, and autocorrelation coefficient (pos. or neg.):
next year's deviation = rho * this year's deviation + sqrt(1 – rho^2) * a new random Normal deviate with mean zero and the desired SD
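That one-line recipe, written out (a sketch of what show.corr.R does; the parameter values are arbitrary):

rho=0.7; sdev=0.1; tmax=1000
d=numeric(tmax)
d[1]=rnorm(1,0,sdev)
for(t in 2:tmax) d[t]=rho*d[t-1]+sqrt(1-rho^2)*rnorm(1,0,sdev)
# the sqrt(1-rho^2) factor keeps the stationary SD equal to sdev
c(mean(d), sd(d))       # approx. 0 and sdev
cor(d[-tmax],d[-1])     # approx. rho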
Demonstrate with show.corr.R that we get the correct mean, SD, and rho for rho_true = 0, -.7, .7
Long strings of similar envt'al conditions will cause positively autocorrelated deviations from the expected lambda. It is harder to identify reasons why env'tally driven deviations would be negatively correlated (if a DI model is used, neg. autocorr. could be caused by underlying neg. density dependence).
In the DI case, pos. autocorr. increases ext. risk (M&D fig. 4.8).
Use fig. 4.9 in M&D to show the effect of autocorrelation on extinction risk with over- vs under-compensatory (neg.) density dep.
Use ricker.corr2.R to demonstrate how to simulate extinction risk when there is autocorrelation.

3. Catastrophes and bonanzas
- Testing for outliers (high or low):
linear regression approach for the density-independent model (or perhaps the Ricker)
compute standard deviations for the deviations between observed lambdas and the values predicted by density dep. models (e.g. the outlier for the JR checkerspot population) – identify as outliers values > 2 SDs above/below the mean
- Using extremes.R to incorporate outliers:
replace "run-of-the-mill" values with extremes, with a frequency determined from the data
Issues:
* match single pos. (or neg.) outliers with complements?
* may need to modify extremes.R if there are BOTH outliers and density dependence (and other factors), as in the checkerspot

Next Topic: Structured populations
Rationale: We will begin by going backwards and removing several things we added to count-based models (environmental stochasticity and density dependence). After we see how the dynamics of structured populations behave without these complications, we will put them back in. In addition, we will consider the effects of another force, demographic stochasticity.

Basic reason why structure matters: individuals don't contribute equally to population growth.
Example: semipalmated sandpiper – PHOTO
REFERENCE: Hitchcock and Gratto-Trevor (Ecology 78:522-533, 1997)
Biological background: individuals can live >3 yrs, can begin breeding at age 1, migrate to N. Canada, produce 1 nest per year. Common, but H&GT studied a declining population to identify the causes of decline.

Data on average vital rates:
s0=.1293     # juvenile survival – prob(survival) from hatching to age 1
s1=.2543     # yr-1 survival – prob(survival) from age 1 to age 2
s2=.593      # adult survival – 1-yr. prob(surv) for all individuals age 2 or more
b1=.25       # prob(breed) at age 1
b2=.875      # prob(breed) at age 2
b3=.95       # prob(breed) at ages 3 and above
c=1.8625     # chicks per nest

Clearly, age strongly affects survival and the likelihood of breeding. Therefore separating individuals into age classes, and keeping track of their separate contributions to next year's population, should improve our ability to predict/understand population growth ( a pop. of all juveniles will grow differently than a pop. of all adults).

Building a density-indep. model. As in the unstructured case, we'll follow population growth over discrete 1-yr. intervals.
First, consider when we are censusing the population. That will determine which age (or size) classes we see. Hitchcock and Gratto-Trevor censused just BEFORE nests were produced, after the birds returned to N. Canada on Spring migration – A PRE-BREEDING CENSUS. So they see 1YOs (born just after the last census), 2YOs, and 3+YOs. Just after the census, new juveniles are born.
[ Picture (on board): the classes at census t – 1, 2, and 3+ year olds – connected by arrows to the classes at census t+1. Survival arrows: 1 yo -> 2 yo (s1), 2 yo -> 3+ yo (s2), 3+ yo -> 3+ yo (s2). Reproduction arrows pass through the juveniles born just after census t: each class i contributes bi c s0 new 1 yo's at census t+1. Arrows are PER-CAPITA rates. ]

We could write equations for the change in ea. age class over 1 yr. Let N1(t), N2(t), N3(t) = no. of 1, 2, and 3+ YOs, resp., in the population in yr. t. Sum up the products of the arrows times Ni(t):
N1(t+1) = b1*c*s0*N1(t) + b2*c*s0*N2(t) + b3*c*s0*N3(t)
N2(t+1) = s1*N1(t)
N3(t+1) = s2*N2(t) + s2*N3(t)
Using linear algebra, we can write this more concisely by separating the N terms from the rest of the r.h.s.:
n(t+1) = A * n(t)
where n(t) = [ N1(t) N2(t) N3(t) ]' and
A = [ b1 c s0   b2 c s0   b3 c s0;
      s1        0         0;
      0         s2        s2 ]
Note: A contains all the PER-CAPITA effects, and n contains the numbers in ea. class.
Using A as a table:
cols = class this year
rows = class next year
elements: PER-CAPITA contributions from this year's to next year's population
Review right multiplication of a matrix by a column vector – numerical example with starting vector [1 1 1]'.

If there is a break between classes, use the life-cycle diagram to (re)illustrate the matrix/life-history events – the timing of the annual census determines the life stages "seen".
[ Life-cycle diagram: nodes 1 yo, 2 yo, 3+ yo; one-year transition arrows 1 yo -> 2 yo (s1), 2 yo -> 3+ yo (s2), 3+ yo -> 3+ yo (s2), and reproduction arrows from ea. class back to 1 yo (b1 c s0, b2 c s0, b3 c s0). The 6 arrows correspond to the 6 nonzero terms in the matrix, and represent one-year transitions. ]

Put numbers in the matrix (the "POPULATION PROJECTION MATRIX"), and interpret it:
cols = from life stage in yr. t
rows = to life stage in yr. t+1
entries are PER-CAPITA RATES
Rule for right-multiplying a matrix by a column vector: the length of the vector must equal the no. of columns of the matrix; the RESULT is another col. vec. with as many entries as the matrix has rows (the same dimension as the original vec. for a square matrix).
If the vector is [n1(t); n2(t); n3(t)], show that this gives us back the separate recursion equations for ea. life stage.

Go to R to write a program to predict what the population will do when we repeatedly multiply the matrix A by the (changing) vector n.
First, use sandpiper.matrix.R to show 3 ways to construct a proj. matrix in R.
Do some matrix/vector multiplication in the command window:
n=c(1,1,1)    OR    n=matrix(1,3,1)
n=A %*% n; n    << repeat several times – what's happening to N(t), N(t+1)/N(t), and STRUCTURE?
STRUCTURE = fraction of the pop in ea. life stage
First, what is the total population size in year t? sum(n)
What is the one-year population growth rate in year t? lambda(t) = sum( n(t+1) ) / sum( n(t) )
population structure = n/sum(n)

Now use sandpiper.R to examine the process of CONVERGENCE, then have students manipulate the initial vector to ask:
** does convergence occur at the same time for lambda and structure?
** do the asymptotic structure and lambda depend on the initial vector?
notes: convergence is
- indep. of the starting vector (but the initial lambda is not) – use 2 starting vecs. to show this
- simultaneous in structure and lambda – only when the structure is stable is lambda stable
Because lambda and the population structure converge to the same values regardless of the starting vector, they must BOTH be properties of the MATRIX, NOT THE INITIAL VECTOR.
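The whole experiment, condensed into a sketch (the matrix construction is one of the ways shown in sandpiper.matrix.R):

s0=.1293; s1=.2543; s2=.593; b1=.25; b2=.875; b3=.95; c=1.8625
A=matrix(c(b1*c*s0, b2*c*s0, b3*c*s0,
           s1,      0,       0,
           0,       s2,      s2), 3, 3, byrow=TRUE)
tmax=20
n=matrix(0,3,tmax+1); n[,1]=c(1,1,1)    # columns of n are years
for(t in 1:tmax) n[,t+1]=A %*% n[,t]
Ntot=colSums(n)                         # total population size each year
Ntot[-1]/Ntot[-(tmax+1)]                # lambda(t): converges
n[,tmax+1]/sum(n[,tmax+1])              # structure: converges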
Indeed, they are the DOMINANT EIGENVALUE and DOMINANT RIGHT EIGENVECTOR of the proj. matrix. (EIGEN = SELF in German.)

Computing e'vals and rt. e'vecs, on board, using a simple 2x2 matrix:
After convergence, all classes change by the same multiplier lambda. Therefore we have 2 equal expressions:
w(t+1) = A w(t) = L w(t)    (where L is lambda)
Above, WE USE w INSTEAD OF n TO EMPHASIZE THAT IT IS A SPECIAL TYPE OF VECTOR, POST-CONVERGENCE, i.e. when w(t+1)/sum(w(t+1)) = w(t)/sum(w(t)) – ONLY THEN IS THE ABOVE EQ. TRUE.
vector-scalar multiplication: L*w = [L*w1; L*w2]
Note: [L 0; 0 L] = L [1 0; 0 1] = L I    << define the identity matrix I
But L I w = [L 0; 0 L] w = [L w1 + 0; 0 + L w2] = L w
Therefore A w = L I w, or (A – L I) w = 0
if A = [a b; c d], A – L I = [a–L b; c d–L]
Note: We require that there be infinitely many vectors that satisfy A w = L w, or (A – L I) w = 0 (because any multiple of an eigenvector w is an eigenvector). The mathematical condition for infinitely many solutions is
det(A – L I) = 0    << this is the "CHARACTERISTIC EQUATION", used to find L using only A
det = "determinant"
det of a 2x2 matrix: product of the diagonal minus product of the "anti"-diagonal: det[a b; c d] = a d – b c
So if A = [a11 a12; a21 a22],
A – L I = [a11–L a12; a21 a22–L]
so the char. eq. is
det(A – L I) = (a11 – L)(a22 – L) – a12 a21 = 0
or L^2 – (a11 + a22) L + a11 a22 – a12 a21 = 0
or L^2 + B L + C = 0, where B = –(a11 + a22) and C = a11 a22 – a12 a21    [ B = –Tr(A), C = det(A) ]
So, for a 2x2 matrix, the char. eq. is a quadratic eq. More generally, for an n x n matrix, the char. eq. will be an nth-order polynomial, and so there will be n eigenvalues (roots of the polynomial).
Solution: L1,L2 = –B/2 +– sqrt(B^2 – 4C)/2. Two solutions because of the +–.

Work an example with A = [.7 .3; .1 .9] (2 stages: "juveniles" and "adults"):
- fraction of adults surviving one year?
- fraction of juveniles surviving one year?
- fraction of surviving juveniles maturing in one year?
- no. of juveniles produced per adult per year?
B = –(.7+.9) = –1.6
C = .7*.9 – .1*.3 = .6
L1,L2 = 1.6/2 +– sqrt(2.56 – 2.4)/2 = .8 +– sqrt(.16)/2 = .8 +– .2
L1 = 1    <<< DOMINANT EIGENVALUE
L2 = .6
[[ alternative: a (partially) biennial plant model, post-breeding census; stages: seeds on the ground, 1-y.o. plants:
A = [ g s0 f1   s1 f2;   g s0   0 ], i.e. A = [F1 F2; S0 0], e.g. A = [1 2; 1/2 0] ]]

One eigenvalue is larger than the other in absolute value (more generally, in magnitude, because some eigenvalues can be complex numbers). This is called the DOMINANT EIGENVALUE. (With only a few exceptions, for realistic population projection matrices, 1 e'val will always be larger than the others.)

To EACH eigenvalue corresponds a RIGHT EIGENVECTOR.
We can get the DOMINANT RIGHT EIGENVECTOR w1 (which has 2 entries, w1_1 and w1_2) by plugging L1 and w1=[1; w1_2] into the eigenvector equation (A – L1 I)*w1 = 0, solving for w1_2, and rescaling as w1=[1; w1_2]/(1+w1_2). ( The logic here is that we only care about the relative sizes of the elements of w1, so we can arbitrarily choose the first element to be 1 and then solve for the second. )
Using the above example: A = [.7 .3; .1 .9], L1=1
A – L1 I = [–.3 .3; .1 –.1]
w1 = [1; w1_2]
The first eq. of (A – L1 I)*w1 = 0 is –.3 + .3 w1_2 = 0, so w1_2 = 1 (the 2nd eq. yields the same result – as it must).
So w1 = [1; 1], and rescaling by dividing each entry by the vector's sum yields w1 = [.5; .5], which says that after convergence this population will have equal numbers of "juveniles" and "adults".
In fact, as we have 2 lambdas (L1 and L2), we can get two w's, w1 and w2 (get w2 by repeating the process above, but using L2 instead of L1).
EXERCISE: Compute w2 for the 2x2 matrix A above.

We could also use algebra to compute the e'vals and e'vecs for the 3x3 sandpiper matrix directly. But the char. eq. will now be a cubic eq. (with an L^3 term), which has a MUCH more cumbersome solution than a quadratic – and for 5th- and higher-order polynomials there is no general closed-form solution at all. So in practice, numerical methods are used to compute e'vals and e'vecs.
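In R this is a single call; a minimal sketch using the 2x2 example above (eigen sorts the e'vals by magnitude, so the dominant pair comes first, and the e'vecs are scaled to unit length rather than to proportions):

A=matrix(c(.7,.3,.1,.9),2,2,byrow=TRUE)
e=eigen(A)
e$values             # 1.0 and 0.6 -- the dominant e'val is first
w1=e$vectors[,1]     # right e'vec belonging to the dominant e'val
w1/sum(w1)           # rescaled to proportions: .5 .5, the stable structure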
Return to sandpiper.R and activate Section 1 to show how to calculate e'vals and e'vecs in R.
NOTES:
- For the sandpiper matrix, the "subdominant" e'vals are a pair of complex numbers
- The magnitude of these is < that of the dom. e'val [ mag(L2) = mag(L3) = sqrt( (-0.0160)^2 + (0.0610)^2 ), using the Pythagorean theorem ]
- eigen produces a matrix with the e'vecs as its columns, in the order corresponding to the lambdas, but they are not necessarily scaled to represent proportions
- but because c*w1 is also an e'vec, we can rescale by c=1/sum(w1) to get proportions

Return to the simulation results (fig. 1 from sandpiper.R):
- the asymptotic population growth rate equals the numerically calculated L1
- the asymptotic population structure equals the numerically calculated w1
But the numerical procedure only used A, showing that the asymptotic growth rate and structure are functions of the matrix alone.

Why the e'vals and e'vecs are important for understanding the process of convergence: THE SOLUTION OF THE MATRIX PROJECTION EQUATION n(t+1) = A %*% n(t).
In the scalar model, N(t+1) = lambda N(t) is a recursion relationship, not a solution. Its solution is N(t) = lambda^t N(0), which lets us predict any future population size knowing lambda and the initial population size. The equivalent solution of a 2-stage matrix equation is:
n(t) = c1 L1^t w1 + c2 L2^t w2
where c1 and c2 are (scalar) constants determined by the initial vector n(0) [ because n(0) = c1*w1 + c2*w2, a system of 2 eqs. w/ 2 unknowns ].
The solution explains why there is convergence: with t sufficiently large, L1^t becomes much larger than L2^t and "drags" w1 along with it.
Key result: for biologically realistic projection matrices, L1 will be positive and real.

Effect of raising e'vals to higher powers:
lam>1 and real: lam^t grows exponentially
0<lam<1 and real: lam^t declines exponentially
so for 2 real e'vals, the larger one comes to be much greater over time, even if both are >1
lam<0 and real: oscillations – damped if –1<lam<0, growing in amplitude if lam<–1
But even in the latter case, a pos. lam with greater magnitude will dominate over time.

What about complex eigenvalues? The sandpiper has 3 e'vals: L1 real, and L2, L3 a pair of complex eigenvalues. The solution is now:
n(t) = c1 L1^t w1 + c2 L2^t w2 + c3 L3^t w3
Use R to show the effect of raising the complex e'vals for the sandpiper to higher and higher powers:
L2 = -0.016+0.061i; L3 = -0.016-0.061i
Lt=L2
Lt=Lt*L2; Lt    # repeat the last command to iterate
Sequence produced:
-0.0160 + 0.0610i
-0.0035 - 0.0019i
 1.7403e-04 - 1.8041e-04i
 8.2287e-06 + 1.3494e-05i
-9.5440e-07 + 2.8668e-07i
Result: the real (and imag.) parts oscillate between pos. and neg., but because mag(L2) < 1, Lt -> 0 + 0i as t increases. If mag(L_complex) > 1, the real part will oscillate and grow, but more slowly than the powers of the dominant e'val.
[ note: for all t, the imag. parts of L2^t w2 cancel with those of L3^t w3 (c2 and c3, like w2 and w3, are complex conjugates), so the solution is purely real ]
The closer L1 and L2 are in magnitude (the smaller the "damping ratio"), the longer convergence takes.

NEXT BIG TOPIC: For conservation, we want to know: How will changing MATRIX ELEMENTS (MEs) and the UNDERLYING VITAL RATES (VRs) affect lambda?
We can envision changes in MEs, but they are caused by changes in VRs (see the sandpiper matrix as a fc. of its vital rates). Still, since lambda is a function of the matrix and its elements, to understand how changing VRs will change lambda we first need to understand how changing matrix elements changes lambda.
First, explore how lam1 changes as we vary two vital rates (or matrix elements), KEEPING ALL OTHERS CONSTANT:
b1 = .25, so a_11 = b1*c*s0 = .06. So if b1 can vary from 0 to 1, a_11 can go from 0 to .24.
s3+ (really s2 in the matrix, but let's only consider survival at age 3+) = a_33 is .593, but can go from 0 to 1.

CLASS ASSIGNMENT: Vary a11 over 100 values from 0 to .24, keeping all other elements constant, and vary a33 over 100 points from 0 to 1, keeping all other elements constant. For ea. value of a11 or a33, compute lam1 for the corresponding projection matrix, and then plot lam1 vs a11 and lam1 vs a33 over their entire ranges of feasible values. ( The resulting script should be like sandpiper.lam1.vs.a11.&.a33.R )

RESULTS:
- lam1 is a nonlinear function of a11 or a33
- the slope of lam1 vs a33 is generally steeper than the slope of lam1 vs a11: more 'bang for the buck' for a given amount of change in a33 than in a11 (but the actual bang depends on the current value of the m.e.)
- it is impossible to make the sandpiper pop. grow by changing b1 alone, whereas it is possible by changing s3+ alone

Usually we don't know by how much we can actually change b1 or s3+ (or any other v.r.). But we might ask: given the current values of the v.r.'s, what SMALL change would be most effective?

Two ways to think about changes: additive vs multiplicative (or alternatively, absolute vs proportional) changes in matrix elements. Multiplicative changes account for the different scales on which different elements are measured – esp. in size-based matrices for plants, or age-based models for fish, where fecundities can be much larger than size/age transitions (which must be <= 1).

Activate Section 2 in sandpiper.R to explore additive changes (or use sandpiper.num.sens.elast.R).
Notes:
- the additive change is small: 0.01 – we can't increase age transitions to above 1
- express the change PER UNIT ABSOLUTE CHANGE: (new lam1 – original lam1)/absolute change in the m.e. [i.e., .01]
- the resulting change in lambda is put in the corresponding position in a matrix
- all changes are positive >>> increasing an ME increases lambda
- largest effect: increasing adult "stasis"
- there is an effect (sometimes strong) of changing "impossible" MEs (structural zeros)

Next, activate Section 3 in sandpiper.R to explore multiplicative changes.
Notes:
- still using a small (0.01) but now proportional change
- express the change as PROPORTIONAL: (new lam1 – original lam1)/orig. lam1 / proportional change in the ME [i.e., .01], where proportional change in ME = (ME.orig*(1+eps) – ME.orig)/ME.orig = eps
- still the largest effect from increasing adult stasis; now even more dominant
- zero effect of "changing" zero MEs (because proportions of zero are zero)
- the sum of all elements of Emat is approx. 1

SENSITIVITY = absolute change in lambda in response to an absolute change in an ME
ELASTICITY = proportional change in lambda in response to a multiplicative (proportional) change in an ME
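A bare-bones version of the brute-force calculation (along the lines of sandpiper.num.sens.elast.R, assuming the 3x3 sandpiper matrix A from before; eps is the small change):

lam1.of=function(A) Re(eigen(A)$values[1])   # dominant e'val (take the real part)
L=lam1.of(A); eps=0.01
S=E=A*0
for(i in 1:3) for(j in 1:3){
  Ad=A; Ad[i,j]=A[i,j]+eps                   # additive perturbation
  S[i,j]=(lam1.of(Ad)-L)/eps                 # numerical sensitivity
  Am=A; Am[i,j]=A[i,j]*(1+eps)               # proportional perturbation
  E[i,j]=((lam1.of(Am)-L)/L)/eps             # numerical elasticity
}
S; E; sum(E)                                 # sum of elasticities is approx. 1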
IF the matrix embodies the ultimate growth rate and structure of the population, it must also be able to tell us DIRECTLY how changing MEs or VRs will change lambda. The above is a crude, brute-force approach, which doesn't give us much insight into WHY changes to particular MEs have large or small effects on lam1. The MATHEMATICALLY ELEGANT approach (which does yield insight) is to recognize that:
Sensitivity = the partial derivative of lambda with respect to an ME or VR.
To compute these derivatives directly, we must first learn about LEFT EIGENVECTORS.
The DOMINANT LEFT E'VEC contains the "reproductive values":
Repro. value = the relative contribution of a single individual in a given life-history stage to the future size of the population. So, individuals with high RV contribute more to future population size.
Use repro.value.R to explore differences in repro. value among 1yo's, 2yo's, and 3+yo's in the sandpiper.
How to get left e'vecs:
Directly: solve v1 (A – lam1 I) = 0 using v1 = [1 v1_2] (review left multiplication of a matrix by a row vector)
Using R: the rows of V = (complex conjugate of) the inverse of W (where the cols of W are the right e'vecs) are the left e'vecs.
Meaning of the matrix inverse: just as 1/x = x^-1 and x * 1/x = 1 (x a scalar), so 1/W = W^-1 and W * W^-1 = I (W a matrix); that is, the inverse of W is the matrix such that, when W is rt.-multiplied by it, the identity matrix results.
The left e'vec "associated with" lambda1 is the DOMINANT LEFT EIGENVECTOR.

Finally, computing S and E analytically (see sandpiper.sens.elast.R):
denom = sum over all i of ( v1_i * w1_i ) – the same for all matrix elements
Sij = v1_i * w1_j / denom
So, the effect on lambda1 of changing matrix element aij depends on:
- the fraction of the population at the stable structure that is in stage j, w1_j
- the relative contribution to future population growth of ea. individual PRODUCED by matrix element aij, which is v1_i
Eij = aij * Sij / L1
Computing all elements of the S and E matrices simultaneously:
v1 = row vec of repro vals; w1 = col vec of stable fractions
S = v1' * w1' (an outer product: column times row)
E = A .* S / L (where .* is element-wise multiplication)
In R:
# "The sensitivity matrix"
S=(cbind(v1) %*% w1)/((v1 %*% w1)[1]); S
# "The elasticity matrix"
E=A*S/lam1; E

EXAMPLE of the use of sensitivities: loggerhead sea turtles – figs. in Crowder et al. 1992.

We now know how to get the sens/elast of lam1 to matrix elements. What about the underlying vital rates (accounting for their possible contributions to multiple m.e.'s)?
Numerical approach: activate sandpiper.change.vrs.R to explore changes in VRs.
"Elasticity of lam1 to vital rate p" = proportional change in lam1 / prop. change in p
~ [ del lam1/lam1 ] / [ del p/p ]
= [ (lam1new – lam1)/lam1 ] / [ (pnew – p)/p ]
= [ (lam1new – lam1)/lam1 ] / [ (p*(1+eps) – p)/p ]
= (lam1new – lam1)/(lam1*eps)
Results:
- strongest effect of changing s2 – why?
- next strongest effect of changing c or s0 – why?
- same effect of changing s0 or c – why?
- smallest effect of changing b1 – why?

Directly computing sensitivities and elasticities to underlying VRs:
The chain rule: if f = f(g(x),h(x)),
df/dx = df/dg dg/dx + df/dh dh/dx    (partials)
In our case, lam1 = lam1(aij, akl, …) = lam1( aij(vr1, vr2, …), akl(vr1, vr2, …), … )
Sv = sum_i sum_j Sij * daij/dv
Example: compute S_s0, the sensitivity to the vital rate s0, for the sandpiper matrix. Since a1j = bj*c*s0, da1j/ds0 = bj*c, so
S_s0 = S11 b1 c + S12 b2 c + S13 b3 c
– this accounts for the influence of s0 on all the MEs in which it appears.

Including management in the matrix:
Let s2^ = survival of class 2 at the current livestock abundance
s2* = survival of class 2 with no livestock
A = livestock abundance; A^ = current livestock abundance
Graph s2 as a line with negative slope between (0, s2*) and (A^, s2^).
Equation of the line: s2 = s2* – ((s2* – s2^)/A^) * A
similarly for s1, etc.; some functions could increase with A.
Next, incorporate the functions directly into the matrix. Now we can compute a sensitivity to A, and can vary A to calculate its effect on lambda.

Sandpiper example: assume fledgling survival is reduced by an introduced predator:
P^ = current predator abundance
s0^ = estimate of current fledgling survival
s0* = estimate of fledgling survival in the absence of the predator (e.g. from predator-free sites, P=0)
s0(P) = s0* + [ (s0^ – s0*)/(P^ – 0) ] * P
Put s0(P) in the matrix in place of s0, then compute d.lam1/d.P using the chain rule – this accounts for the effects of P on all the relevant matrix elements.
Question: by inspection, what is the sign of d.lam1/d.P, and how do you interpret this?

Another example: a 5x5 version of the loggerhead matrix:
A = [ 0     0          0           f4 s4       f5 s5;
      s1    s2(1–g2)   0           0           0;
      0     s2 g2      s3(1–g3)    0           0;
      0     0          s3 g3       0           0;
      0     0          0           s4          s5 ]
Question: why s4 in a_14 and s5 in a_15?
Question: is this an age-, stage-, or size-based matrix?
Make s2–s5 increasing (linear) functions of T (0 to 1), the fraction of trawlers using TEDs, e.g.
s2(T) = s2^ + [ (s2* – s2^)/(1 – 0) ] * T
(assumes no TEDs were in use when s2^ was measured)
******************************************************************

NEXT TOPIC: Incorporating envt'al stochasticity into structured population models

Using sandpiper.show.var.vrs.R, plot the annual variation in the vital rates:
data=read.csv("sandpiper_vrs.csv")
vrs=data.matrix(data)   # convert the data frame "data" to a matrix "vrs"
# NOTE: rows of "vrs" are the different vital rates, columns are years
# use the following to show that the vital rate means are the same
# as the values given in sandpiper.R:
M=rowMeans(vrs); M
Examine the matrix 'vrs' and its row means.
Data from Hitchcock and Gratto-Trevor show annual variation in 4 vital rates over 6 years (they did not estimate annual variation in the breeding probabilities, and note that some of the other vr's are replaced by their means in some years).
Also note that some of the vr's are correlated with ea. other. Notably, s0 covaries with c (perhaps b/c good years are good both for the no. fledged and for their subsequent survival – caveat: one point in this graph is the means of the 2 vrs.)

Measuring covariation – see fig. 7.4 in M&D. Definition of Cov:
Cov(x,y) = (1/n) * sum(over i=1..n) of ( xi – xbar )( yi – ybar )
meaning of pos. and neg. cov.
Note the similarity between the formulas for cov and var:
Var(x) = (1/n) * sum(over i=1..n) of ( xi – xbar )^2
Relationship between covariance and correlation:
Corr(x,y) = Cov(x,y) / [ SD(x) SD(y) ]
Computing a matrix of covariances or correlations in R: apply cov or cor to a matrix to get the cov or cor matrix of the COLUMNS of the matrix ( so we need to transpose vrs first: e.g. cor(t(vrs)) ).
Result: a large positive cor between s0 and c for the sandpiper (why are the corr's NA for b1–b3?)

We can simulate envt'al stoch. by creating a proj. matrix for ea. year and choosing among them at random across years – the IID CASE (independently and identically distributed). (We could also choose some more often than others, or build in envt'al autocorrelation by choosing sequences of matrices.)
NEW TRICKS TO BE USED:
* Store the annual matrices in a 3-D array (depths = years)
* Use the function make.sand.mat.R to make a matrix from a vector of vital rates
use sandpiper.stoc.sim.R
Simulate trajectories of total population size (the sum of the population vector). On the log scale, the trajectories fall on a normal curve w/ var. increasing with time, as in the scalar case.

Long-term stoch. growth rate – can compute it 2 ways:
1) the limit of the t'th root of N(t)/N(0) as t goes to inf. (subject to rounding error)
2) the arithmetic mean of the annual log growth rates ( see the program – note that the vectors are renormalized to sum to 1 ea. year to prevent very small or very large numbers, because we need to iterate for a long time until the vectors settle into their stoch. distribution )
use sandpiper.stoc.lam.by.sim.R
stoclam is less than the lam1 computed from the mean matrix. So variation reduces population growth, as in the scalar case – but we haven't yet learned much about how var in particular underlying vital rates or matrix elements, and the cov. between them, affects stoclam.
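A stripped-down sketch of that simulation (it assumes a function make.sand.mat that builds a matrix from a column of the vital-rate matrix vrs, as in the course files; the details there differ):

set.seed(1)
tmax=10000
n=c(1,1,1)/3
loglams=numeric(tmax)
for(t in 1:tmax){
  A=make.sand.mat(vrs[,sample(ncol(vrs),1)])   # a randomly chosen year's matrix (iid)
  n=A %*% n
  loglams[t]=log(sum(n))                       # sum(n) was 1, so this is the log growth rate
  n=n/sum(n)                                   # renormalize to avoid under/overflow
}
stoclam=exp(mean(loglams))                     # long-term stochastic growth rate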
All variability is not created equal: EXERCISE
Varying sandpiper.stoc.lam.by.sim.R: make ONLY s0 OR ONLY s2 variable, fixing all other rates at their means, and compute the stochastic lambda numerically, comparing the results.
RESULT: letting s2 vary (but not s0) depresses stoclam more (rel. to lam1) than does the opposite.
^ BUT this exercise uses the OBSERVED var in the different rates, rather than comparing EQUAL variances in the two. Also, the vr's covary, as we've seen.

How do the Var's and Cov's of the underlying rates affect stoclam in a structured population?
Recall from the scalar models the approx. for the geom. mean in terms of the arith. mean and variance:
Lg ≈ La exp{ – Var(lam)/ [ 2 La^2 ] }
In matrix land,
stoclam ≈ lam1bar * exp{ – Var(lam1)/ [ 2 * lam1bar^2 ] }
But lam1 = f( a11(v1,v2,…), a12(v1,v2,…), etc. ). What is the variance of a function in terms of the var's and cov's of the underlying variables?
Simple case – a function of only 1 variable:
Var(f(x)) ~ (df/dx)^2 Var(x)
interpretation using a graph of f(x) vs. x; why the square?
More complex case – the variance of a function of 2 variables:
Var(f(x,y)) ~ (df/dx)^2 Var(x) + (df/dy)^2 Var(y) + 2 * (df/dx) * (df/dy) * Cov(x,y)

Tulja's approx. (in terms of vital rates):
If only 1 vital rate varied:
Var( lambda(v_i) ) ~ (d lambda/d v_i)^2 Var(v_i) = (Si)^2 Var(v_i)
since (d lambda/d v_i) = Si (the sensitivity computed earlier).
SO… for a given amount of var. in a vital rate, lambda will vary more when lambda has a higher sensitivity to that v.r.
- graphical interpretation
- Pfister prediction
Now the full Tulja approx.:
log lambda_s ≈ log lam1bar – (1/(2 lam1bar^2)) * [ sum(over i) of (Si)^2 Var(vi) + 2 * sum(over i<j) of Si Sj Cov(vi,vj) ]
Note: for m.e.'s, the sensitivities (S's) are always positive.
The 1st summation is understandable from the graphical explanation above.
Impt. message: pos. cov. reduces lam_s more than neg. cov. does, between matrix elements (or between vital rates with positive sens's).
The above arguments can be made in terms of m.e.'s or v.r.'s (but sensitivities can be negative for v.r.'s, not for m.e.'s).

NEXT TOPIC: Incorporating d.d. into matrix models
Example: Chinook salmon (from M&D Ch. 8)
Biology:
- semelparous – can spawn at age 3, 4, or 5, but die after spawning
- census just before spawning, so the youngest class is 1 yo's and the oldest is 5 yo's
Let b_i = breeding probability of age-i females (i = 3, 4)
f_i = eggs per age-i female (i = 3, 4, or 5)
s_0[E(t)] = survival of eggs to 1 yo's, as a function of the total eggs at time t, E(t)
s_1 to s_4 = survival of females aged 1 to 4

Matrix (classes are ages 1–5):
A(t) = [ 0     0     b3 f3 s0[E(t)]   b4 f4 s0[E(t)]   f5 s0[E(t)];
         s1    0     0                0                0;
         0     s2    0                0                0;
         0     0     s3 (1–b3)        0                0;
         0     0     0                s4 (1–b4)        0 ]
where E(t) = sum(over i=3..5) of b_i f_i n_i(t), n_i(t) is the no. of females of age i at time t, and b_5 = 1.
Notes:
- entries only on the 1st row and the subdiagonal – an age-based ("Leslie") matrix
- spawning reduces survival at ages 3 and 4
- all age-5 individuals spawn and die
- egg survival depends on the total no. of eggs, not on the age of the mom; but the NUMBER of eggs does depend on the age of the mom and her breeding prob.
- because E(t) is a function of n_3, n_4, and n_5, we have introduced densities into the matrix itself

Possibilities for egg survival:
Ricker function: s0[E(t)] = s0(0) * exp[ –beta * E(t) ], where s0(0) is survival when egg density is close to zero. Graph this – meaning of beta.
Beverton-Holt function: s0[E(t)] = s0(0) / ( 1 + beta * E(t) ). Graph this. The params. have the same basic meanings as for the Ricker.
Although the shapes of the two curves are similar, they make different predictions about how the no. of survivors depends on the initial no. of eggs:
No. survivors = E(t) * s0[E(t)]
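A quick sketch contrasting the two (the values of s0(0) and beta are arbitrary):

s00=0.1; beta=0.001
E=seq(0,10000,10)
plot(E, s00*E*exp(-beta*E), type="l", ylim=c(0,s00/beta),
     xlab="Eggs laid, E(t)", ylab="Surviving eggs")   # Ricker: rises, then falls back toward 0
lines(E, s00*E/(1+beta*E), lty=2)                     # B-H: saturates
abline(h=s00/beta, col="grey", lty=3)                 # the B-H asymptote, s0(0)/beta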
Ricker: No. survivors = s0(0) * E(t) * exp[ –beta * E(t) ] -> 0 as E(t) goes to inf.
"Overcompensatory (negative) density dependence": additional eggs reduce the survival of all eggs.
BH: No. survivors = s0(0) * E(t) / ( 1 + beta * E(t) ) -> s0(0)/beta as E(t) goes to inf.
"Compensatory (negative) density dependence": beyond some point, all extra eggs added die, but they don't reduce the survival of the others.
Possible causes of compensatory d.d. in salmon: later redds merely replace earlier ones.
Possible causes of over-compensatory d.d. in salmon:
- more eggs -> less O2 in streams, lower survival for all eggs
- more eggs -> more hatchlings -> more competition for food, so fewer can reach the size needed to develop to the next stage
These 2 types of negative d.d. have very different implications for the possible population dynamics. Use salmon_dd_2.m to explore the effect of increasing the fertilities in the 2 models. Harding et al. 2001 Cons Biol.
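To close, a rough sketch of how such a density-dependent projection runs (all vital-rate and Ricker values here are invented for illustration; salmon_dd_2 does the real exploration):

s1=.05; s2=.5; s3=.6; s4=.7                  # survivals (invented)
b=c(0,0,.2,.6,1); f=c(0,0,3000,4000,5000)    # breeding probs and eggs/female; b5 = 1
s00=.02; beta=1e-6                           # Ricker egg-survival params (invented)
tmax=100
n=matrix(0,5,tmax+1); n[,1]=10
for(t in 1:tmax){
  E=sum(b*f*n[,t])                           # total eggs, E(t)
  s0E=s00*exp(-beta*E)                       # density-dependent egg survival
  A=matrix(0,5,5)
  A[1,3:5]=b[3:5]*f[3:5]*s0E                 # reproduction row, rebuilt every year
  A[2,1]=s1; A[3,2]=s2                       # subdiagonal survival
  A[4,3]=s3*(1-b[3]); A[5,4]=s4*(1-b[4])     # spawners die after spawning
  n[,t+1]=A %*% n[,t]
}
matplot(0:tmax, t(n), type="l", xlab="Year", ylab="No. of females")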