Bill Shipley - Community assembly through trait selection - Eco

Community assembly through
trait selection (CATS):
Modelling from incomplete information
Bill.Shipley@USherbrooke.ca
A seminar in three parts
The ecological concept
The maximum entropy formalism
Edwin Jaynes
The CATS model
Part 1
The ecological concept
Trait-based filtering
Regional species pool: determined by traits + history
Immigration rate: determined by abundance in region + traits
Abiotic filters (determined by traits)
Biotic filters (determined by traits)
Local community
Relative abundance in local community
A plant strategy is a suite of physiological, morphological or
phenological traits of individuals that affect probabilities of
survival, reproduction or immigration and that is systematically
associated with particular environmental conditions
The most common trait values in a local community will be possessed by
those individuals having the greatest probabilities of survival,
reproduction and immigration.
relative abundance ~ probability of passing filters
Philip Grime
$\sum_{i=1}^{S} p_i t_i = t_{wm}$ (community mean trait value)
This average trait value will reflect the selective
advantage/disadvantage of this trait in passing
through the various abiotic and biotic filters
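As a minimal sketch of the community-weighted mean defined above, the following computes the abundance-weighted average of one trait. The five relative abundances and trait values are hypothetical numbers chosen only for illustration; the function name is likewise an assumption, not part of the seminar.

```python
def community_weighted_mean(rel_abund, trait_values):
    """Abundance-weighted mean of one trait: sum_i p_i * t_i.

    Abundances are renormalised so that they sum to 1.
    """
    total = sum(rel_abund)
    return sum(p * t for p, t in zip(rel_abund, trait_values)) / total


# Hypothetical relative abundances of species A-E in a local community
p = [0.40, 0.30, 0.15, 0.10, 0.05]
# Hypothetical trait values tA-tE (e.g. a leaf trait, arbitrary units)
t = [2.0, 3.5, 1.0, 4.0, 6.0]

cwm = community_weighted_mean(p, t)
print(round(cwm, 3))
```

Species with high relative abundance pull the mean towards their own trait value, which is why the community-weighted mean reflects the traits that pass the filters most easily.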
Species: A, B, C, D, E
Traits of species: tA, tB, tC, tD, tE
Two consequences of Grime’s ideas
for community assembly…
Philip Grime
1. If we know the values of particular abiotic variables determining the filters,
then we should be able to predict the « typical » values of
the functional traits found in this local community; i.e. community-weighted
means.
2. If we know the « typical » values of the functional traits that are found in this
community, and we know the actual trait values of the species in a regional
species pool, then we should be able to predict which of these species will be
dominants, which will be subordinates, and which will be rare or absent.
The three basic parts of the CATS model
Metacommunity
q: distribution of relative abundances of species in the metacommunity
Trait (t)
λ: selection on the trait in the local community (greater probability of passing if smaller…)
Local community
p: distribution of relative abundances of species in the local community
Community-weighted mean trait: $\bar{t} = \sum_{i=1}^{S} p_i t_i$
With several traits: $\bar{t} = \{\bar{t}_1, \dots, \bar{t}_j\}$
Grime’s plant strategy
Measuring abundances
Local abundance
Abundance of plant species can be measured as
- numbers of stems
- biomass or indirect measures of this (cover, dbh …)
Units of observation are ambiguous
- numbers of what? (ramets ≠ genets, biomass units?)
Units of observation are never independent
Most values will be zero (species present in region but not in local site)
Measuring abundances
Meta-community abundances
Sometimes no information
Sometimes vague information (very common, common, rare)
Sometimes more quantitative information
Part 2
The maximum entropy formalism
Edwin Jaynes
Jaynes, E.T. 2003. Probability theory. The logic of science.
Cambridge U Press.
How can we quantify learning?
If the sun does set tonight (and our previous information told us
it would, p=0.99999) then we will have learned almost nothing
new (almost no new information)
If the sun doesn’t set tonight (and our previous information told
us it would, 1-p=0.00001) then we will have learned something
incredible (lots of new information)
The amount of learning (new information):
$I = \ln\left(\frac{1}{p}\right) = -\ln(p)$
Historically log2; we will use log_e = ln
Claude Shannon
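The sunset example can be checked numerically. This is a small sketch using the two probabilities quoted above; the function name `surprisal` is chosen here for illustration.

```python
import math


def surprisal(p):
    """New information (in nats) learned when an event of probability p occurs."""
    return -math.log(p)


# The sun sets, as expected (p = 0.99999): almost nothing is learned.
sun_sets = surprisal(0.99999)
# The sun does not set (p = 0.00001): a great deal is learned.
sun_does_not_set = surprisal(0.00001)

print(f"{sun_sets:.2e} nats vs {sun_does_not_set:.2f} nats")
```

The rare outcome carries roughly a million times more new information than the expected one.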
Average information content (i.e.
new information)
Specify all of the logically possible states in which some phenomenon can exist
(i=1,2,…,S)
Based on what we know before observing the actual state, assign values between
0 and 1 to each possible state: p=(p1, p2, …, pS)
The amount of new information that we will learn if state i occurs is
$I_i = -\ln p_i$
The average amount of new information that we will learn is
$I = \sum_{i=1}^{S} p_i I_i = \sum_{i=1}^{S} p_i \left(-\ln p_i\right) = -\sum_{i=1}^{S} p_i \ln p_i$
Information entropy
Information content and uncertainty
New information = information that we don’t yet possess = uncertainty
Information entropy measures the amount of new information we will gain
once we learn the truth
Information entropy therefore measures the amount of information that we
don’t yet possess
Information entropy is a measure of the degree of uncertainty that we have
about the state of a phenomenon
Maximum uncertainty = maximum entropy
A betting game: game A
I bring you, blindfolded, to a field outside Sherbrooke.
I tell you that there is a plant at your feet that belongs to either species A or
species B.
You have to assign probabilities to the two possible states (species A or B) and
will get that proportion of $1,000,000 once you learn the species name.
What is your answer?
$I = -\sum_{i=1}^{S} p_i \ln p_i = -\left(0.5 \ln(0.5) + 0.5 \ln(0.5)\right) = 0.69$
A betting game: game B
New game with some new clues:
(i) The site is a former cultivated field that was abandoned by a farmer last
year.
(ii) Species A is an annual herb. Species B is a climax tree.
You have to assign probabilities to the two possible states and will get that
proportion of $1,000,000.
What is your answer?
$I = -\sum_{i=1}^{S} p_i \ln p_i = -\left(0.9 \ln(0.9) + 0.1 \ln(0.1)\right) = 0.33$
If you changed your bet in this second game then these new clues provided you
with some new, but incomplete, information before you learned the answer.
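The two games can be compared with a short sketch. It assumes the bets (0.5, 0.5) for game A and (0.9, 0.1) for game B, and reads the information contained in the clues as the drop in entropy between the two games; the variable names are illustrative.

```python
import math


def entropy(probs):
    """Information entropy in nats: -sum_i p_i ln p_i (terms with p_i = 0 contribute 0)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


game_a = entropy([0.5, 0.5])      # no clues: maximum uncertainty, about 0.69
game_b = entropy([0.9, 0.1])      # with the ecological clues, about 0.33
info_in_clues = game_a - game_b   # uncertainty removed by the clues, about 0.37

print(round(game_a, 2), round(game_b, 2), round(info_in_clues, 2))
```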
[Figure: entropy as a function of p(species A), from 0.0 to 1.0. Maximum entropy (maximum uncertainty) occurs at p=(0.5,0.5). At p(annual herb)=0.9, p(climax tree)=0.1, the figure marks the amount of remaining uncertainty, given the ecological clues, and the amount of information contained in the ecological clues.]
Relative entropy (Kullback-Leibler divergence)
$-\sum_{i=1}^{S} p_i \ln\frac{p_i}{q_i}$
Relative to a “prior” or reference distribution (q)
Let $q_i = 1/S$ and S be the fixed number of unordered, discrete states; q is a
uniform distribution
$-\sum_{i=1}^{S} p_i \ln\frac{p_i}{1/S} = -\sum_{i=1}^{S} p_i \ln p_i - \ln S = -\sum_{i=1}^{S} p_i \ln p_i + \text{constant}$
Given a uniform prior, relative entropy equals information entropy plus a constant ($-\ln S$)
Maximizing relative entropy = maximizing entropy
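A quick numerical check of this identity, as a sketch: with a uniform prior over S states, relative entropy and information entropy differ only by the constant ln S, so both are maximised by the same p. The three-state distribution is a hypothetical example.

```python
import math


def entropy(p):
    """Information entropy: -sum_i p_i ln p_i."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def relative_entropy(p, q):
    """Relative entropy as written on the slide: -sum_i p_i ln(p_i / q_i)."""
    return -sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.3, 0.2]          # hypothetical distribution over S = 3 states
S = len(p)
q_uniform = [1.0 / S] * S    # uniform prior

lhs = relative_entropy(p, q_uniform)
rhs = entropy(p) - math.log(S)
print(abs(lhs - rhs) < 1e-12)

# The relative entropy is largest (zero) when p equals the prior.
at_prior = relative_entropy(q_uniform, q_uniform)
```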
The maximum entropy formalism in an
ecological context: A three-step program
Step 1: specifying the relative abundances in the regional metacommunity (the prior distribution)
Specify a reference (q, prior) distribution that encodes what you know about the
relative abundances of each species in the species pool before obtaining any
information about the local community.
“All I know is that there are S species in the meta-community”: $q_i = 1/S$
“All I know is that there are S species in the meta-community
and I know their relative abundances in this meta-community”: $q_i$ (the known meta-community relative abundances)
Step 2: quantifying what we know about trait-based community
assembly in the local community
“I have measured the average value of my functional traits
(community-weighted trait values)”: $\bar{t}_1 = 23.3$, $\bar{t}_2 = 147.1$, etc.
“I have measured the environmental conditions, and know
the function linking these environmental conditions to the
community-weighted trait values”
[Figure: y against x, with the fitted line y=2.5+0.23x.]
Step 3: choose a probability distribution which agrees with what
we know, but doesn’t include any more information (i.e. don’t lie).
Choose values of p that agree with what we know:
$q$, $\sum_{i=1}^{S} p_i = 1$
$\bar{t}_1 = \sum_{i=1}^{S} p_i t_{i1} = 23.3$
$\bar{t}_2 = \sum_{i=1}^{S} p_i t_{i2} = 147.1$
etc.
And that maximize the remaining uncertainty:
$-\sum_{i=1}^{S} p_i \ln\frac{p_i}{q_i}$
The solution is a general exponential distribution
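A brief sketch of why the solution is exponential, using Lagrange multipliers. Here $\lambda_0$ is a multiplier for the normalisation constraint, introduced only for this derivation, and $\bar{t}_j$ denotes the known community-weighted mean of trait j.

```latex
\begin{aligned}
\mathcal{L} &= -\sum_{i=1}^{S} p_i \ln\frac{p_i}{q_i}
  - \lambda_0\Big(\sum_{i=1}^{S} p_i - 1\Big)
  - \sum_{j=1}^{T} \lambda_j\Big(\sum_{i=1}^{S} p_i t_{ij} - \bar{t}_j\Big) \\
\frac{\partial \mathcal{L}}{\partial p_i} &= -\ln\frac{p_i}{q_i} - 1 - \lambda_0
  - \sum_{j=1}^{T} \lambda_j t_{ij} = 0 \\
p_i &= q_i \, e^{-(1+\lambda_0)} \, e^{-\sum_{j=1}^{T} \lambda_j t_{ij}} \\
\sum_{i=1}^{S} p_i = 1 \;&\Rightarrow\;
  e^{1+\lambda_0} = \sum_{m=1}^{S} q_m \, e^{-\sum_{j=1}^{T} \lambda_j t_{mj}}
\end{aligned}
```

Substituting the last line into the third removes $\lambda_0$ and leaves the normalised exponential form.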
$p_i = \frac{q_i \, e^{-\sum_{j=1}^{T} \lambda_j t_{ij}}}{\sum_{m=1}^{S} q_m \, e^{-\sum_{j=1}^{T} \lambda_j t_{mj}}}$
$q_i$: Prior distribution of species i
$t_{ij}$: Trait value of trait j of species i
$\lambda_j$: The amount by which a one-unit increase in trait j will result in a
proportional change in relative abundance ($p_i$)
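To make the role of λ concrete, here is a minimal sketch that evaluates the exponential solution for given values of q, λ and the traits. The four species, their trait values and the value λ = 0.5 are hypothetical, and the function name is chosen for illustration.

```python
import math


def cats_probabilities(q, traits, lam):
    """p_i proportional to q_i * exp(-sum_j lam_j * t_ij), normalised to sum to 1."""
    weights = [qi * math.exp(-sum(l * t for l, t in zip(lam, ti)))
               for qi, ti in zip(q, traits)]
    z = sum(weights)
    return [w / z for w in weights]


q = [0.25, 0.25, 0.25, 0.25]            # uniform prior over 4 hypothetical species
traits = [[1.0], [2.0], [3.0], [4.0]]   # one trait per species
lam = [0.5]                             # positive lambda: smaller trait values favoured

p = cats_probabilities(q, traits, lam)
# With equal priors, a one-unit increase in the trait multiplies
# relative abundance by exp(-lambda).
ratio = p[1] / p[0]
print([round(x, 3) for x in p], round(ratio, 3))
```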
Practical considerations
Except for maxent models with only a few constraints, we need numerical
methods in order to fit them to data. I use a proportionality between the
maximum likelihood of a multinomial distribution and the λ values of the
solution to the maxent problem (Improved Iterative Scaling), available in
the maxent() function of the FD library in R.
Della Pietra, S., Della Pietra, V., Lafferty, J. 1997. Inducing features of random fields. IEEE Transactions on Pattern Analysis and
Machine Intelligence 19:1-13
Laliberté, E., Shipley, B. 2009. Measuring functional diversity from multiple traits, and other tools for functional ecology. R
package, Vienna, Austria
By permuting trait vectors relative to species’ observed relative abundances, one
can develop permutation tests of significance concerning model fit
Shipley, B. 2010. Ecology 91:2794-2805
CATS
There are now many empirical applications of this model in many places
around the world, and applied at many geographical scales.
Two examples…
SHIPLEY et al. 2011. A strong test of a maximum entropy model of trait-based community assembly. Ecology 92: 507–517.
Daniel Laughlin’s PhD thesis
96 quadrats of 1 m² containing the understory herbaceous plants of ponderosa pine forests (Arizona, USA).
Quadrats were distributed across seven permanent sites within a 120 km² landscape between 2000 and 2500 m
altitude.
Available information
12 environmental variables measured in
each quadrat
20 functional traits measured per species
Relative abundance of each species in
each quadrat
If species is present, and not rare (>10%), CATS predicts
its abundance well
If species is present, but rare (<10%), CATS can’t
distinguish degrees of rarity
If species is absent (X) then CATS will predict
it to be rare
Significant predictive ability by 3 traits
After ~7 traits, mostly redundant information
[Figure: predicted vs. observed relative abundance, shown on linear (0.0 to 1.0) and logarithmic (1e-04 to 1e+00) axes; reported correlations r = 0.58 and r = 0.97.]
If we only know the environmental conditions of a site, and the general
relationship between community-weighted traits and the environment, how well
can CATS do?
General relationship: Y=2.5+0.73X
Predicted value for community-weighted trait, given that we know the environmental
value is 7: 2.5 + 0.73 × 7 = 7.61
Actual measured community-weighted value for this site = 9.5
[Figure: CWM trait against environment X, with the fitted line.]
The best possible prediction given the environment would be obtained with 79 separate
generalized additive (form-free) regressions – one for each species in the species pool – of
relative abundance vs. the environmental variables.
[Figure: abundance of species S against environment X, with the fitted curve gam(S~X).]
[Figure: explained variance (R², 0.0 to 1.0) against number of traits in model (0 to 20) for 3 types of model: Maxent using observed CWM; Using 79 gam models~environment; Maxent using predicted CWM~environment. Significant at p<1/1000.]
Second example: tropical forests in French Guiana
Shipley, Paine & Baraloto. 2012. Quantifying the importance of local niche-based and stochastic processes to tropical tree community
assembly. Ecology 93: 760-769
The unified neutral theory of biodiversity and biogeography
The per capita probabilities of immigration from the meta-community, and
the per capita probabilities of survival and reproduction of all species are
equal (demographic neutrality).
Subsequent population dynamics in the local community are determined
purely by random drift.
Stephen Hubbell
Local relative abundance ~meta-community relative abundance + drift
« Neutral prior » = meta-community relative abundances
1. Fit model using traits but a uniform prior (i.e. no effect of meta-community immigration)
2. Fit model permuting traits but with a neutral prior (i.e. no effects of local trait-based selection, but contribution from immigration)
3. Fit model with traits and with neutral prior (i.e. local trait-based selection plus immigration from meta-community)
Partition the total variance explained into:
1. That due only to immigration
2. That due only to local trait-based selection
3. That due jointly to immigration and traits (correlations with meta-community
4. Unexplained variation due to demographic stochasticity
We did this at three spatial scales: 1, 0.25 and 0.01 ha.
[Figure 1: proportion of the total information explained against spatial scale (ha, from 0.05 to 1.00), partitioned into Traits (trait-based selection), Dispersal (dispersal limitation from metacommunity), Traits x Dispersal, and Demographic stochasticity. Panels: a) Site-level trait means; b) Metacommunity trait means.]
Practical considerations. How do we fit this model
to empirical data? A trick involving likelihood
A total of A independent allocations of individuals or
units of biomass into each of the S species in the
species pool:
$P(a; A, p) = \frac{A!}{\prod_{i=1}^{S} a_i!} \prod_{i=1}^{S} p_i(\lambda, T_i, q_i)^{a_i}$
$a_i$: the number of independent allocations to species i
$p_i(\lambda, T_i, q_i)$: the probability of a single allocation going to species i (i.e. the probability
of sufficient resources being captured to produce one individual or unit of
biomass for species i); a function of its traits ($T_i$), the strength of selection
on the traits (λ), and its meta-community abundance ($q_i$)
The likelihood:
$L(p; a, A) = \frac{A!}{\prod_{i=1}^{S} a_i!} \prod_{i=1}^{S} p_i(\lambda, T_i, q_i)^{a_i} = C \prod_{i=1}^{S} p_i(\lambda, T_i, q_i)^{a_i}$
In practice we can never know this, since neither
individuals nor resources (biomass) are ever allocated
independently…
Taking logarithms, dividing both sides by A, and re-arranging, one obtains
$\frac{\ln L(p; a, A) - \ln(C)}{A} = \sum_{i=1}^{S} \frac{a_i}{A} \ln p_i(\lambda, T_i, q_i) = \sum_{i=1}^{S} o_i \ln p_i(\lambda, T_i, q_i)$
where $o_i = a_i / A$ is the observed relative abundance of species i, so that
$\ln L(p; a, A) \sim \sum_{i=1}^{S} o_i \ln p_i(\lambda, T_i, q_i)$
Maximising this will maximise the likelihood of the
unknown multinomial distribution
But we already know that
$p_i(\lambda, T_i, q_i) = \frac{q_i \, e^{-\sum_{j=1}^{k} \lambda_j t_{ij}}}{\sum_{m=1}^{S} q_m \, e^{-\sum_{j=1}^{k} \lambda_j t_{mj}}}$
Substituting this into the expression for the log-likelihood gives
$\ln L(p \mid a, A) \sim -\sum_{i=1}^{S} \sum_{j=1}^{k} o_i \lambda_j t_{ij} - \sum_{i=1}^{S} o_i \ln\left(\sum_{m=1}^{S} q_m \exp\left(-\sum_{j=1}^{k} \lambda_j t_{mj}\right)\right) + \sum_{i=1}^{S} o_i \ln(q_i)$
Choose values of λ that maximise this function, given the vector of traits ($t_i$) for
each species in the species pool and their meta-community abundances ($q_i$).
The dual solution (Della Pietra et al. 1997): the values of λ that
maximise the relative entropy in the maximum entropy formalism are the
same as the values that maximise the likelihood of a multinomial
distribution.
Della Pietra, S., Della Pietra, V., Lafferty, J. 1997. Inducing features of random fields. IEEE Transactions on Pattern
Analysis and Machine Intelligence 19:1-13.
Improved Iterative Scaling algorithm → maxent() in the FD package of R.
Laliberté, E., Shipley, B. 2009. Measuring functional diversity from multiple traits, and other tools for functional
ecology. R package, Vienna, Austria
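The likelihood route to λ can be sketched in a few lines. This is not the Improved Iterative Scaling algorithm used by maxent(); it is plain gradient ascent on $\sum_i o_i \ln p_i$, swapped in here because it is short. The gradient with respect to $\lambda_j$ is the predicted minus the observed community-weighted mean of trait j, so the ascent stops when the two agree. All data, function names, the step size and the iteration count are hypothetical choices for illustration.

```python
import math


def cats_probabilities(q, traits, lam):
    """p_i proportional to q_i * exp(-sum_j lam_j * t_ij)."""
    w = [qi * math.exp(-sum(l * t for l, t in zip(lam, ti)))
         for qi, ti in zip(q, traits)]
    z = sum(w)
    return [wi / z for wi in w]


def fit_lambdas(o, q, traits, steps=5000, lr=0.3):
    """Maximise sum_i o_i ln p_i(lambda) by gradient ascent (not IIS)."""
    k = len(traits[0])
    lam = [0.0] * k
    for _ in range(steps):
        p = cats_probabilities(q, traits, lam)
        # d/d lam_j = sum_i (p_i - o_i) * t_ij = predicted CWM - observed CWM
        grad = [sum((pi - oi) * ti[j] for pi, oi, ti in zip(p, o, traits))
                for j in range(k)]
        lam = [l + lr * g for l, g in zip(lam, grad)]
    return lam


def cwm(abund, traits, j):
    """Community-weighted mean of trait j."""
    return sum(a * ti[j] for a, ti in zip(abund, traits))


# Hypothetical data: 4 species, 2 traits, uniform prior
o = [0.50, 0.30, 0.15, 0.05]            # observed relative abundances
q = [0.25, 0.25, 0.25, 0.25]            # meta-community prior
traits = [[1.0, 0.2], [2.0, 0.9], [3.0, 0.4], [4.0, 0.7]]

lam = fit_lambdas(o, q, traits)
p = cats_probabilities(q, traits, lam)
cwm_obs = [cwm(o, traits, j) for j in range(2)]
cwm_fit = [cwm(p, traits, j) for j in range(2)]
print([round(x, 4) for x in cwm_obs], [round(x, 4) for x in cwm_fit])
```

At convergence the fitted distribution reproduces the observed community-weighted means, which is the maximum entropy constraint, illustrating the duality stated above.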
Permutation tests for the CATS model
Choose values of p that agree with what we know:
$q$, $\sum_{i=1}^{S} p_i = 1$
$\bar{t}_j = \sum_{i=1}^{S} p_i t_{ij} = \sum_{i=1}^{S} o_i t_{ij}$
And that maximize the remaining uncertainty:
$-\sum_{i=1}^{S} p_i \ln\frac{p_i}{q_i}$
H0: trait values (tij) are independent of the observed relative abundances (oi)
1. Calculate $f(o,p,q) = \sum_{i=1}^{S} o_i \ln\left(\frac{p_i(\lambda, T_i, q)}{q_i}\right)$
With f defined this way, the larger the value, the better the fit
2. Randomly permute the vector of trait values ($T_i^*$) between the species
(so traits are independent of observed relative abundances)
3. Calculate $f^*(o,p,q) = \sum_{i=1}^{S} o_i \ln\left(\frac{p_i(\lambda, T_i^*, q)}{q_i}\right)$
4. Repeat steps 2 & 3 a large number of times.
5. Count the number of times f*> f. This is an estimate of the null probability.
Shipley, B. 2010. Ecology 91:2794-2805
maxent.test() function in the FD library.
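The five steps can be sketched as follows. This is an illustration under stated assumptions, not the implementation in maxent.test(): the model is refitted by plain gradient ascent rather than Improved Iterative Scaling, the statistic is $\sum_i o_i \ln(p_i/q_i)$ as in step 1, permutations with f* at least as large as f are counted, and the usual (count + 1)/(permutations + 1) correction is applied. The data, the seed and the number of permutations are hypothetical.

```python
import math
import random


def cats_probabilities(q, traits, lam):
    """p_i proportional to q_i * exp(-sum_j lam_j * t_ij)."""
    w = [qi * math.exp(-sum(l * t for l, t in zip(lam, ti)))
         for qi, ti in zip(q, traits)]
    z = sum(w)
    return [wi / z for wi in w]


def fit_lambdas(o, q, traits, steps=500, lr=0.2):
    """Gradient ascent on sum_i o_i ln p_i (an assumption; not IIS)."""
    k = len(traits[0])
    lam = [0.0] * k
    for _ in range(steps):
        p = cats_probabilities(q, traits, lam)
        grad = [sum((pi - oi) * ti[j] for pi, oi, ti in zip(p, o, traits))
                for j in range(k)]
        lam = [l + lr * g for l, g in zip(lam, grad)]
    return lam


def fit_statistic(o, q, traits):
    """f = sum_i o_i ln(p_i / q_i) for the model fitted to these traits."""
    p = cats_probabilities(q, traits, fit_lambdas(o, q, traits))
    return sum(oi * math.log(pi / qi) for oi, pi, qi in zip(o, p, q) if oi > 0)


def permutation_test(o, q, traits, n_perm=99, seed=1):
    """Permute trait vectors among species and compare f* with f."""
    rng = random.Random(seed)
    f_obs = fit_statistic(o, q, traits)
    count = 0
    for _ in range(n_perm):
        shuffled = traits[:]          # copy, then permute species' trait vectors
        rng.shuffle(shuffled)
        if fit_statistic(o, q, shuffled) >= f_obs - 1e-12:
            count += 1
    return f_obs, (count + 1) / (n_perm + 1)


# Hypothetical data: abundance declines steadily with the trait value
o = [0.50, 0.25, 0.15, 0.07, 0.03]
q = [0.2] * 5
traits = [[1.0], [2.0], [3.0], [4.0], [5.0]]

f_obs, p_value = permutation_test(o, q, traits)
print(round(f_obs, 3), p_value)
```

With only five species there are few distinct permutations, so the estimated null probability is coarse; real applications would use larger species pools and more permutations.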