phylogenetic signal - ecoevol

Advanced Topics in Phylogenetic and Functional Ecology
Evolutionary models, phylogenetic signal, niche conservatism
José Alexandre Felizola Diniz-Filho
Department of Ecology, UFG
1. Introduction (research programs)
2. Phylogenies and relationship matrices among taxa
3. Evolutionary models
3.1. General concepts
3.2. Statistical methods
3.3. Approaches based on evolutionary models
3.4. Comparison of methods
4. Niche conservatism
4.1. General concepts
4.2. Phylogenetic signal and niche conservatism
1. Introduction: on the research traditions...
- Phylogenetic Comparative Methods: Paul Harvey (1980s)
- Phylogenetic Diversity: Dan Faith (1992)
- Community Phylogenetics: Campbell Webb (2002)
- Ecophylogenetics (traits, assemblages): Marc Cadotte (University of Toronto)
1985
- Traits: phylogenetic signal
- Traits: correlated evolution
2. Phylogenies and relationship matrices
Example tree with tips A, B and C (branch lengths 2, 2, 5 and 3)
Pairwise (patristic) distances
>primcor <- cophenetic(primtree)
     A    B    C
A    0   10   10
B   10    0    4
C   10    4    0
Shared proportion of branch length from root to tips
     A     B     C
A   1.0   0     0
B   0     1.0   0.39
C   0     0.39  1.0
((((homo: 0.22,pongo: 0.22): 0.25,macaca:0.47):0.14,ateles: 0.62): 0.38,galago: 1.00): 0.00;
[Figure: primate tree with nodes at 0, 0.38, 0.53 and 0.78 from the root; tips galago, ateles, macaca, pongo, homo]
>primcor <- vcv.phylo(primtree, cor=TRUE)
         homo  pongo  macaca  ateles  galago
homo     1.00   0.78    0.53    0.38    0.00
pongo    0.78   1.00    0.53    0.38    0.00
macaca   0.53   0.53    1.00    0.38    0.00
ateles   0.38   0.38    0.38    1.00    0.00
galago   0.00   0.00    0.00    0.00    1.00
Phylogenetic variance-covariance (vcv) matrix
This is an ultrametric tree: the distance from root to tip is constant for all species (the main diagonal)
PHYLOGENETIC CORRELATION = standardized variance-covariance = shared proportion of branch length
This ultrametric tree has a total length of 1.0
      t4     t5     t2     t8     t6     t3     t1     t7
t4  1.830  1.215  0.761  0.761  0.761  0.761  0.000  0.000
t5  1.215  1.761  0.761  0.761  0.761  0.761  0.000  0.000
t2  0.761  0.761  1.818  1.115  0.774  0.774  0.000  0.000
t8  0.761  0.761  1.115  1.536  0.774  0.774  0.000  0.000
t6  0.761  0.761  0.774  0.774  1.846  1.412  0.000  0.000
t3  0.761  0.761  0.774  0.774  1.412  1.524  0.000  0.000
t1  0.000  0.000  0.000  0.000  0.000  0.000  1.029  0.558
t7  0.000  0.000  0.000  0.000  0.000  0.000  0.558  0.816
The species "covary", but in terms of "what"?
PHENOTYPES!
So, the phylogenetic vcv matrix gives an EXPECTED covariance among the species for traits (which is actually the similarity of their mean values)...
ERM (Expected Relationship Matrix; Martins 1995)
The same phylogeny can generate different OBSERVED vcv matrices, for different traits, for example...
EVOLUTIONARY MODELS
3. EVOLUTIONARY MODELS
Mechanisms (selection, drift, mutations…) -> Evolutionary models -> Interspecific data
The analytical core of comparative analysis
Mechanisms (selection, drift, mutations…) <- ? <- Evolutionary models <- Interspecific data
The path from evolutionary mechanisms (selection, drift, mutation and so on) to interspecific variation is a conceptual idea, but it may be hard (or even impossible) to reverse it and actually recover such processes from empirical data...
I = selection intensity
R = response
T = time
h2 = heritability
Vp = phenotypic variance
'Mechanistic' versus phenomenological evolutionary models
Statistical models that "capture" the expectation of alternative evolutionary mechanisms or processes
BROWNIAN MOTION
-After Robert Brown (1827)
- Simplest continuous-time
stochastic process
Simple discrete
Random walks...
UNDERSTANDING BROWNIAN MOTION
In Excel, when A1=0...
=A1+(ALEATÓRIO()-0.5)
(ALEATÓRIO() is the Portuguese-language name of RAND(): a uniform distribution on 0-1)
[Figure: Y against time for single random walks (100 and 1000 time steps)]
[Figure: Y against time] 15 replications of the same process through time
[Figure: histogram (Count against Y at time 1000)] The distribution of Y at time step 1000, replicated 2000 times...
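The spreadsheet recipe above can be sketched in R. This is a minimal illustration, not the code used for the slides: the seed and the object names (`steps`, `walks`, `y_end`) are arbitrary choices, and the step is the same centered uniform draw as in the Excel formula.

```r
# Discrete random walk mirroring the spreadsheet recipe:
# Y[t] = Y[t-1] + (U - 0.5), with U ~ Uniform(0, 1) and Y[0] = 0.
set.seed(1)        # arbitrary seed, only for reproducibility
n_steps <- 1000    # time steps per walk
n_reps  <- 2000    # independent replications

# One column per replication; cumsum() accumulates the steps through time.
steps <- matrix(runif(n_steps * n_reps) - 0.5, nrow = n_steps, ncol = n_reps)
walks <- apply(steps, 2, cumsum)

# Y at the last time step, across replications.
y_end <- walks[n_steps, ]

# Each step has mean 0 and variance 1/12, so Var(Y) grows linearly as t / 12.
expected_var <- n_steps / 12
observed_var <- var(y_end)
print(c(mean = mean(y_end), expected_var = expected_var,
        observed_var = observed_var))
```

Calling `matplot(walks[, 1:15], type = "l")` and `hist(y_end)` afterwards should give pictures comparable to the 15 replications and the histogram described above.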
WHAT ABOUT PHYLOGENY?
[Figure: Y against time (Index of Case) for lineages that split at a speciation event; tree with branches labelled 50 or 100 time-steps]
Expected VCV matrix
1      0.333  0      0      0
       1      0      0      0
              1      0.333  0.333
                     1      0.666
                            1
[Figure: Y against time] Here we assumed that species are INDEPENDENT (they all started at the root)
[Figure: Y against time] Here species are PHYLOGENETICALLY STRUCTURED
If we repeat this many times...
But how?????
Each line is a simulation that gives Y values for each species...
             sp1     sp2     sp3     sp4     sp5
trait1    -0.928  -3.010   0.246  -0.433  -0.422
trait2    -2.914   0.788   2.486   3.308   1.628
trait3     6.631   2.590   4.200   2.394   3.227
trait4    -6.380  -5.593  -2.074   1.013  -0.208
trait5    -0.593   9.725   0.968   3.546   2.101
trait6     2.627  -4.549   1.953  -1.208   3.152
trait7     4.411  -2.070   0.513   5.043   6.609
trait8    -1.565  -9.055  -1.118   2.523  -3.547
trait9     1.329   1.315   5.062  -1.551  -0.145
trait10   -0.292  -1.601  -2.935  -5.727  -5.107
trait11   -1.430  -3.896  -2.494   0.280  -0.925
trait12   -0.585   2.413  -1.444  -1.901  -0.052
trait13   -2.029  -2.192  -3.938  -2.575  -5.659
trait14   -1.281  -1.863   3.187  -0.340  -1.974
trait15    4.104   9.415  -0.205   4.210   7.856
trait16   -2.212  -3.050  -4.495  -6.210  -6.638
trait17   -0.649  -7.015  -0.971  -2.823   2.670
trait18   -3.046   0.229  -4.418  -1.767   1.183
trait19    1.134   1.465   0.842  -2.105   0.011
trait20    1.241  -1.303  -0.091   4.491   0.607
...
trait1000 -3.246   0.329  -4.418  -2.767  -1.827
[Figure: Y against time]
Calculate a Pearson (or covariance) matrix among taxa (in "R mode")
"Observed" matrix (10000 "traits")
1      0.539  0.341  0.354  0.274
       1      0.350  0.360  0.285
              1      0.333  0.333
                     1      0.666
                            1
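ape
> library(ape)
> # Usage: rTraitCont(phy, model = "BM", sigma = 0.1, alpha = 1, theta = 0, ancestor = FALSE, root.value = 0, ...)
ntimes <- 100   # number of simulated traits
nsp <- 5        # number of species (tips of primtree)
simbw <- matrix(data = NA, nrow = ntimes, ncol = nsp)
for(i in 1:ntimes){
  simbw[i, ] <- rTraitCont(primtree)   # one Brownian-motion trait per row
}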
> simbw
            [,1]       [,2]       [,3]       [,4]       [,5]
[1,]    -0.04001   -0.053      0.07408   -0.05225   -0.13472
[2,]     0.246995   0.188368   0.210539   0.161954  -0.04256
[3,]     0.034313   0.015872  -0.02537    0.042092  -0.03787
[4,]     0.024264  -0.08208   -0.07415   -0.05169   -0.02666
[5,]    -0.07504   -0.09173   -0.05418   -0.09041    0.091738
[6,]     0.281138   0.210935   0.121205   0.162539   0.081836
[7,]     0.152936   0.169856  -0.01267   -0.00268   -0.00039
[8,]     0.009934  -0.09725   -0.08152   -0.20757    0.099189
[9,]    -0.03726    0.026658  -0.17218   -0.14235   -0.0787
[10,]   -0.33382   -0.20617   -0.17718   -0.29438    0.061293
[11,]   -0.05479   -0.16742    0.064186  -0.03345    0.003819
[12,]    0.046365  -0.08393   -0.11845   -0.19607    0.107281
[13,]   -0.15355   -0.10313   -0.19682   -0.2495     0.07867
[14,]    0.185026   0.130559   0.017491   0.111212   0.033344
[15,]    0.089726   0.031212   0.035245  -0.08706    0.059088
[16,]    0.009616  -0.01897   -0.00993    0.08443   -0.15238
[17,]   -0.01019    0.009079  -0.04108    0.072125   0.119902
...
[98,]    0.115672   0.091517   0.213318  -9.59E-03  -0.0636
[99,]    0.018725  -0.00479   -0.12521    1.13E-01  -0.0851
[100,]  -0.10961   -0.11279   -0.08101   -1.66E-01  -0.11171
[Figure: tree with branches of 100, 95, 100, 5, 100, 95, 75 and 25 time-steps]
Expected VCV (standardized) matrix
1      0.487  0      0      0
       1      0      0      0
              1      0.487  0.487
                     1      0.872
                            1
Observed matrix (10000 "traits")
1      0.425  0.042  0.044  0.061
       1      0.046  0.098  0.095
              1      0.569  0.497
                     1      0.861
                            1
[Figure: OBSERVED against EXPECTED matrix values] r = 0.991!!!!
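The observed-versus-expected comparison can be sketched without simulating every time step. Under Brownian motion the tip values are multivariate normal with covariance proportional to the shared branch lengths, so they can be drawn directly from the expected matrix. The sketch below assumes the standardized expected matrix shown above (0.487 and 0.872); the seed and object names are illustrative only.

```r
# Draw many "traits" whose tip values follow the expected (standardized) VCV
# matrix, then compare the observed correlations among taxa with the expected.
set.seed(42)   # arbitrary seed
C <- matrix(c(1,     0.487, 0,     0,     0,
              0.487, 1,     0,     0,     0,
              0,     0,     1,     0.487, 0.487,
              0,     0,     0.487, 1,     0.872,
              0,     0,     0.487, 0.872, 1),
            nrow = 5, byrow = TRUE)
n_traits <- 10000

# chol() returns an upper-triangular U with t(U) %*% U equal to C, so each row
# of Z %*% U is one multivariate-normal draw with covariance C.
U <- chol(C)
Z <- matrix(rnorm(n_traits * nrow(C)), nrow = n_traits)
traits <- Z %*% U            # rows = traits, columns = species

observed <- cor(traits)      # Pearson matrix among taxa ("R mode")
up <- upper.tri(C)
r_obs_exp <- cor(observed[up], C[up])

print(round(observed, 3))
print(r_obs_exp)
```

With many traits the observed matrix should approach the expected one; with fewer traits the agreement gets noisier, which is one way to read the scatter of observed against expected values.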
Properties of Brownian motion in comparative analysis
-Normal distribution of phenotypes (tips)
-Mean constant through time (absence of trends)
-Variance increases linearly with time (but remember that we do not know the
absolute expected variance)
The evolutionary interpretation of Brownian motion
-Genetic drift + Mutation = Neutral (sensu Kimura) evolution
-Stochastic adaptation in each lineage at each time step (multiple
independent adaptive forces)
Constrained Brownian motion: Ornstein-Uhlenbeck (O-U) process
…The Ornstein–Uhlenbeck (O-U) process (named after Leonard Ornstein and George Eugene Uhlenbeck) is a stochastic process that, roughly speaking, describes the velocity of a massive Brownian particle under the influence of friction.
Stabilizing selection...
Brownian motion and Ornstein-Uhlenbeck (OU) processes…
[Figure: interspecific covariance against time since divergence, for Brownian motion (O-U with alpha equal to zero) and for an O-U process]
Creating alternative models by warping the branch lengths...
The trick is to move from a "real" phylogeny (the sequence of branching events in time) to a "trait" or "model" phylogenetic structure that must be used in the statistical analyses....
Several options to transform branch lengths in GEIGER
deltaTree(phy, delta, rescale = T)
lambdaTree(phy, lambda)
kappaTree(phy, kappa)
ouTree(phy, alpha)
tworateTree(phy, breakPoint, endRate)
linearchangeTree(phy, endRate=NULL, slope=NULL)
exponentialchangeTree(phy, endRate=NULL, a=NULL)
speciationalTree(phy)
rescaleTree(phy, totalDepth)
[Figure: primate tree (galago, ateles, macaca, pongo, homo) under BM and after the OU transformation]
> primtreeOU <-ouTree(primtree,2.5)
> plot(primtreeOU)
>primcorOU <-vcv.phylo(primtreeOU,cor=TRUE)
> write.table(primcorOU, file="primcorOU.txt")
         homo   pongo  macaca  ateles  galago
homo    1.000   0.328   0.089   0.040   0.000
pongo   0.328   1.000   0.089   0.040   0.000
macaca  0.089   0.089   1.000   0.040   0.000
ateles  0.040   0.040   0.040   1.000   0.000
galago  0.000   0.000   0.000   0.000   1.000
THIS IS THE EXPECTED VCV UNDER OU PROCESS WITH alpha = 2.5!
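The OU correlations can also be derived from the shared branch lengths without transforming the tree. The sketch below assumes an ultrametric tree of height 1 and an OU process whose root state is fixed, so that the covariance between two tips is proportional to exp(-2·alpha·(T - s))·(1 - exp(-2·alpha·s)), with s the shared path length and T the tree height. The function name `ou_correlation` is hypothetical, and whether `ouTree` uses exactly this parameterization is an assumption, although the numbers agree closely.

```r
# Shared root-to-ancestor path lengths for the primate tree (height 1),
# taken from the BM correlation matrix shown earlier.
spp <- c("homo", "pongo", "macaca", "ateles", "galago")
S <- matrix(c(1.00, 0.78, 0.53, 0.38, 0.00,
              0.78, 1.00, 0.53, 0.38, 0.00,
              0.53, 0.53, 1.00, 0.38, 0.00,
              0.38, 0.38, 0.38, 1.00, 0.00,
              0.00, 0.00, 0.00, 0.00, 1.00),
            nrow = 5, byrow = TRUE, dimnames = list(spp, spp))

# Expected correlation among tips under an OU process (assumed form, see text).
ou_correlation <- function(shared, alpha, height = max(diag(shared))) {
  # Covariance up to the constant sigma^2 / (2 * alpha), which cancels out.
  v <- exp(-2 * alpha * (height - shared)) * (1 - exp(-2 * alpha * shared))
  cov2cor(v)
}

R_ou <- ou_correlation(S, alpha = 2.5)
print(round(R_ou, 3))

# As alpha approaches zero the OU correlations fall back to the BM ones.
R_bm_limit <- ou_correlation(S, alpha = 1e-6)
```

For alpha = 2.5 this gives about 0.328 for homo-pongo and 0.089 for homo-macaca, as in the matrix above; the ateles entries come out near 0.04, with small differences presumably due to rounding of the shared lengths.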
"COMPARATIVE" versus "NONCOMPARATIVE" ANALYSIS: The "STAR-PHYLOGENY"
-This is actually what you assume when you say that you did not use comparative methods (so you actually do use them, but with a particular vcv matrix)
-Doing a standard regression or correlation is a particular form of comparative analysis assuming a Star-Phylogeny
- This assumption indicates that the trait has no phylogenetic pattern (the interspecific variation is random with respect to phylogeny)
This does not indicate that there are no phylogenetic relationships among species, of course, only that the processes driving trait variation occurred in such a way that the pattern is completely lost.
[Figure: Y against time for independent lineages]
VCV matrix under the star phylogeny
1  0  0  0  0
0  1  0  0  0
0  0  1  0  0
0  0  0  1  0
0  0  0  0  1
PHYLOGENETIC SIGNAL: BASIC CONCEPTS
Relationship between species’ similarity for a trait and phylogenetic distance
- phylogenetic pattern;
- phylogenetic component;
- phylogenetic signal;
- phylogenetic correlation;
- phylogenetic inertia
Patterns and processes...
MEASURING PHYLOGENETIC SIGNAL
Statistical metrics or model-based methods?
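Moran's I coefficient for phylogenetic autocorrelation
$$ I = \frac{n}{W}\,\frac{\sum_{i}\sum_{j} w_{ij}\, z_i z_j}{\sum_{i=1}^{n} z_i^{2}} $$
n = number of species
w_ij = weights in the matrix W
z_i, z_j = species trait values, centered, for species i and j
W (in n/W) = sum of the weights in W
Numerator: phylogenetic covariance
Denominator: the variance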
CORRELOGRAMS IN POPULATION GENETICS
Robert Sokal (1924-2012)
Sokal, R. R. & Oden, N. L. 1978. Spatial autocorrelation in biology: 1. Methodology; 2. Some biological implications and four applications of evolutionary and ecological interest. Biological Journal of the Linnean Society 10: 199-249.
Worked example
Patristic distances D give the matrix W (w_ij = 1/D_ij); the centered trait gives the matrix Z of cross-products z_i z_j
Sum of W = 10.38333
Sum of Z_ij W_ij = 8.400781
-1.0 < Moran's I < 1.0
Maximum and minimum are a function of the eigenvalues of W (see Lichstein et al. 2002)
$$ I_{max}(d) = \frac{n}{W}\left[\frac{\sum_{i}\left(\sum_{j} w_{ij}\,(y_j-\bar{y})\right)^{2}}{\sum_{i}(y_i-\bar{y})^{2}}\right]^{1/2} $$
Numerator
phylogenetic covariance = 8.400781 / 10.3833 = 0.809
Denominator
variance = 23.375 / 8 = 2.984
I = 0.809 / 2.984 = 0.276
What is wrong?
The W matrix: "inverting" the relationship between W and D
Gittleman used something like this, but this is empirical...
[Figure: W against phylogenetic distance, for Wij = 1 / Dij and Wij = 1 / Dij^2]
With Wij = 1 / Dij^2, Moran's I = 0.72
Other possible functions linking W and D
- Wij = 1 / dij
- Wij = 1 / dij^2
- Wij = e^(-dij)
Or we can directly use any VCV matrix, previously defined...!!!!
The R matrix (shared branch lengths when root age is 1.0) is already a W matrix that can be used directly in Moran's I
Testing significance: the analytical solution...
$$ E(I) = -\frac{1}{n-1} $$
$$ Z = \frac{I - E(I)}{\sqrt{Var(I)}} $$
Standard normal deviate (SND, or Z), assuming a normal distribution of the statistic
[Figure: standard normal density f against Z, from -3 to 3]
If | Z | > 1.96, then Moran's I is significant at P < 0.05
Permutation test
Randomize the tip values in the phylogeny... and recalculate Moran's I many times...
[Figure: tree with randomized tip values; histogram of I (Frequency against I)]
The P-value (Type I error) is given by the proportion of randomized values of Moran's I that are equal to or higher than the observed Moran's I
The PRIMATE example (Lynch 1991):
Body weight and Longevity (log-scale)
spp       bw      long
homo     4.094   4.745
pongo    3.611   3.332
macaca   2.370   3.367
ateles   2.028   2.890
galago  -1.470   2.303
Let's use R as a weighting matrix
         homo  pongo  macaca  ateles  galago
homo     1.00   0.78    0.53    0.38    0.00
pongo    0.78   1.00    0.53    0.38    0.00
macaca   0.53   0.53    1.00    0.38    0.00
ateles   0.38   0.38    0.38    1.00    0.00
galago   0.00   0.00    0.00    0.00    1.00
Moran's I results
Body weight:
I = 0.200 ± 0.217;
E(I) = -1/(n-1) = -0.25
Z = 2.07
P = 0.038
Significant phylogenetic signal...
Longevity:
I = -0.121 ± 0.209;
E(I) = -1/(n-1) = -0.25
Z = 0.617
P = 0.537
Not significant phylogenetic signal...
> library(ape)
> primlog <- read.table("primlog.txt",header=TRUE,row.names="spp")
> primtree <- read.tree("primtree.txt")
> primcor <- vcv.phylo(primtree)
> diag(primcor) <- 0
> Moran.I(primlog[,c(1)],primcor)
The matrix W is wrongly defined in Paradis' book
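To see what goes into the statistic, the primate result can be recomputed by hand in base R. The sketch below is not the `ape` implementation: `moran_i` is a hypothetical helper that sets the diagonal of W to zero and row-standardizes the weights (rows summing to zero are left as zeros). This is an assumption about how the reported values were obtained, chosen because it reproduces them.

```r
# Primate data (log scale) and the R matrix of shared branch lengths.
spp <- c("homo", "pongo", "macaca", "ateles", "galago")
R <- matrix(c(1.00, 0.78, 0.53, 0.38, 0.00,
              0.78, 1.00, 0.53, 0.38, 0.00,
              0.53, 0.53, 1.00, 0.38, 0.00,
              0.38, 0.38, 0.38, 1.00, 0.00,
              0.00, 0.00, 0.00, 0.00, 1.00),
            nrow = 5, byrow = TRUE, dimnames = list(spp, spp))
bw   <- c(homo = 4.094, pongo = 3.611, macaca = 2.370,
          ateles = 2.028, galago = -1.470)
long <- c(homo = 4.745, pongo = 3.332, macaca = 3.367,
          ateles = 2.890, galago = 2.303)

# Moran's I = (n / W) * sum_ij(w_ij z_i z_j) / sum_i(z_i^2).
moran_i <- function(x, w) {
  diag(w) <- 0                 # a species is not its own neighbour
  rs <- rowSums(w)
  rs[rs == 0] <- 1             # avoid 0/0 for species with no neighbours
  w <- w / rs                  # row-standardize (assumption, see text)
  z <- x - mean(x)             # centered trait values
  (length(x) / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

I_bw   <- moran_i(bw, R)
I_long <- moran_i(long, R)
E_I    <- -1 / (length(bw) - 1)   # expectation under no autocorrelation
print(round(c(I_bw = I_bw, I_long = I_long, E_I = E_I), 3))
```

Under these assumptions the sketch gives I close to 0.200 for body weight and -0.121 for longevity, against an expectation of -0.25, in line with the results above.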
This is the "null" distribution for 1000 random normal values (close to theoretical inferred distribution).
ntimes<-5000
I <- numeric(ntimes)
diag(primcor) <- 0
for(i in 1:ntimes){
rnd_vec <-rnorm(5,0,1)
a<-Moran.I(rnd_vec,primcor)
I[i]<- a$observed
}
hist(I)
mean(I)
median(I)
[Figure: histogram of I (Frequency against I)]
Mean = -0.2506
Median = -0.287
This is the distribution randomizing BW 1000 times
vetor <- primlog[,c(1)]
ntimes<-5000
I <- numeric(ntimes)
diag(primcor) <- 0
for(i in 1:ntimes){
vec <- vetor[sample(length(vetor))]
a<-Moran.I(vec,primcor)
I[i]<- a$observed
}
hist(I)
mean(I)
median(I)
[Figure: histogram of I (Frequency against I), with the observed value (obs) marked]
Mean = -0.256
Median = -0.318
Out of 1000 randomized I, none was larger than the observed 0.2009, so
P = 1/1000 = 0.001
In ade4...
>gearymoran(primcor,primlog[,c(1)])
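A self-contained version of the randomization can be sketched in base R. It reuses the primate matrix and body weights from the slides; the helper `moran_i` (row-standardized weights), the seed and the number of permutations are illustrative assumptions. The P-value counts randomized values equal to or larger than the observed one and includes the observed value itself.

```r
# Permutation test for Moran's I on primate body weight (log scale).
set.seed(123)   # arbitrary seed
R <- matrix(c(1.00, 0.78, 0.53, 0.38, 0.00,
              0.78, 1.00, 0.53, 0.38, 0.00,
              0.53, 0.53, 1.00, 0.38, 0.00,
              0.38, 0.38, 0.38, 1.00, 0.00,
              0.00, 0.00, 0.00, 0.00, 1.00), nrow = 5, byrow = TRUE)
bw <- c(4.094, 3.611, 2.370, 2.028, -1.470)

# Hypothetical helper: Moran's I with zero diagonal and row-standardized W.
moran_i <- function(x, w) {
  diag(w) <- 0
  rs <- rowSums(w)
  rs[rs == 0] <- 1
  w <- w / rs
  z <- x - mean(x)
  (length(x) / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

obs    <- moran_i(bw, R)
n_perm <- 4999
# Shuffle the tip values and recompute I each time.
null_i <- replicate(n_perm, moran_i(sample(bw), R))

# One-tailed P-value; the small tolerance keeps exact ties with the observed
# value from being lost to floating-point rounding.
p_value <- (sum(null_i >= obs - 1e-12) + 1) / (n_perm + 1)
print(c(observed = obs, null_mean = mean(null_i), p_value = p_value))
```

The mean of the randomized values should sit near -1/(n-1) = -0.25. With only five tips there are just 5! = 120 distinct orderings, so a P-value from this kind of test cannot get much smaller than about 1/120, however many randomizations are drawn.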
Moran's I Correlograms
Moran's I
$$ I = \frac{n}{W}\,\frac{\sum_{i}\sum_{j} w_{ij}\, z_i z_j}{\sum_{i=1}^{n} z_i^{2}} $$
Allows evaluation of more complex structures in the matrix W of phylogenetic relationships…
Time slices
Gittleman & Kot (1990) used taxonomic levels to generate matrices W
[Figure: example tree with tips A to F; branches labelled 1.0, 2.0, 4.0 and 1.5]
Patristic distances
     A    B    C    D    E    F
A    0
B    1    0
C    2    2    0
D    4    4    4    0
E    4    4    4    1.5  0
F    4    4    4    1.5  1    0
W1 (distances 0 - 2): Moran's I for the first class
     A    B    C    D    E    F
A    0
B    1    0
C    1    1    0
D    0    0    0    0
E    0    0    0    1    0
F    0    0    0    1    1    0
W2 (distances > 2): Moran's I for the second class
     A    B    C    D    E    F
A    0
B    0    0
C    0    0    0
D    1    1    1    0
E    1    1    1    0    0
F    1    1    1    0    0    0
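The two-class correlogram can be sketched in base R from the six-tip patristic distances of this example. The trait values below are hypothetical (invented only to give two distinct clades), and the helpers `class_weights` and `moran_binary` are illustrative names. Here W is binary and not row-standardized, which is an assumption about how the class matrices are used.

```r
# Patristic distances among tips A-F (values from the slide example).
tips <- LETTERS[1:6]
D <- matrix(0, 6, 6, dimnames = list(tips, tips))
D[lower.tri(D)] <- c(1, 2, 4, 4, 4,   # A against B..F
                     2, 4, 4, 4,      # B against C..F
                     4, 4, 4,         # C against D..F
                     1.5, 1.5,        # D against E, F
                     1)               # E against F
D <- D + t(D)                         # make the matrix symmetric

# Binary connectivity: 1 if the distance falls in (lower, upper], else 0.
class_weights <- function(D, lower, upper) {
  W <- (D > lower & D <= upper) * 1
  diag(W) <- 0
  W
}

# Moran's I with a binary W.
moran_binary <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

W1 <- class_weights(D, 0, 2)     # first class: distances 0 - 2
W2 <- class_weights(D, 2, Inf)   # second class: distances > 2

trait <- c(A = 1.0, B = 1.2, C = 1.5, D = 3.0, E = 3.4, F = 3.3)  # hypothetical
correlogram <- c(class1 = moran_binary(trait, W1),
                 class2 = moran_binary(trait, W2))
print(round(correlogram, 3))
```

For a trait that is similar within clades, Moran's I should be positive in the first distance class and negative in the second, which is the declining profile a correlogram shows under strong phylogenetic signal.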
Diniz-Filho & Torres (2002, Evol. Ecol. 16: 351-367)
70 species of Carnivora in the New World
Body size, geographic range size
Supertree
CORRELOGRAM
Strong signal for body size
Weak signal for geographic range size
Autocorrelation statistics such as Moran's I test for randomness of trait variation in the phylogeny. But what about the EVOLUTIONARY MODELS?
Phylogenetic correlograms respond well to changes in the evolutionary models…
(Diniz-Filho 2001 Evolution 55: 1104-1109)
Partition Methods
Total variation T = Phylogenetic Component P + Specific Component S
T = P + S
P: ancestral environments, constraints, neutral evolution
S: recent adaptive or unique variation
$$ y = \rho W y + \varepsilon $$
Pure autoregressive model
The Y values are a function of all the other Y values, "weighted" by the relationships in the W matrix (i.e., ancestry)
Y1 = Y2*W12 + Y3*W13 + Y4*W14 + ... + Yn*W1n
> chev209 <-compar.cheverud(bs209,r209b)
> 1-var(chev209$residuals)/var(bs209)
Phylogenetic Eigenvector Regression (PVR)
Diniz-Filho et al. (1998) Phylogenetic eigenVector Regression (PVR) (Evolution 52: 1247-1262)
Phylogeny -> phylogenetic distances -> double centering -> eigenvectors (V) and eigenvalues
Multiple regression of the trait Y on the eigenvectors X: $$ Y = X\beta + \varepsilon $$
R2 = phylogenetic signal; estimated values = P; regression residuals = S
Phylogenetic eigenvectors represent linearly different cuts of phylogeny, allowing evaluation of phylogenetic effects at different 'scales'
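The PVR steps can be sketched in base R for a small case. The sketch assumes a standard principal coordinate analysis (double centering of -0.5·D²), reuses the six-tip patristic distances from the correlogram example, and uses a hypothetical trait; eigenvectors are added sequentially, without any of the selection criteria discussed later.

```r
# Patristic distances among tips A-F (six-tip example) and a hypothetical trait.
tips <- LETTERS[1:6]
D <- matrix(0, 6, 6, dimnames = list(tips, tips))
D[lower.tri(D)] <- c(1, 2, 4, 4, 4,  2, 4, 4, 4,  4, 4, 4,  1.5, 1.5,  1)
D <- D + t(D)
trait <- c(A = 1.0, B = 1.2, C = 1.5, D = 3.0, E = 3.4, F = 3.3)  # hypothetical

# Double centering of -0.5 * D^2 (principal coordinate analysis).
n <- nrow(D)
J <- diag(n) - matrix(1 / n, n, n)
G <- J %*% (-0.5 * D^2) %*% J

# Eigenvectors with positive eigenvalues are the phylogenetic eigenvectors.
eig  <- eigen(G, symmetric = TRUE)
keep <- which(eig$values > 1e-8)
V    <- eig$vectors[, keep, drop = FALSE]

# R2 of the regression of the trait on the first k eigenvectors.
pvr_r2 <- function(k) {
  summary(lm(trait ~ V[, seq_len(k), drop = FALSE]))$r.squared
}
r2_by_k <- sapply(1:4, pvr_r2)
print(round(r2_by_k, 3))

# Partition for k = 2: fitted values are P, residuals are S.
fit <- lm(trait ~ V[, 1:2])
P_component <- fitted(fit)
S_component <- residuals(fit)
```

Because the models are nested, R2 can only grow as eigenvectors are added, so the estimated signal depends on how many axes are kept.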
[Figure: eigenvalues (% trace) against rank for five phylogenies]
colour   Phy    λ1%
YELLOW   GR01    9.77
BLUE     comb   22.06
RED      bal    30.61
GREEN    norm   32.49
ORANGE   gr50   78.03
Principal coordinate analysis of truncated
geographic distances W (PCNM)
Pierre Legendre
Eigenvectors of double centered binary
(0/1) connectivity matrix
Daniel Griffith
Diniz-Filho & Torres (2002, Evol. Ecol. 16: 351-367)
70 species of Carnivora in the New World
Body size, geographic range size
Supertree (first 12 eigenvectors)
PVR
Body size: R2 = 0.75 (P << 0.01)
Geographic range: R2 = 0.28 (P = 0.06)
[Figure: squared correlation (R2) against number of eigenvectors, for body mass and geographic range size]
The estimated phylogenetic signal (R2) depends on how many axes are used…
2012
Table 1. Coefficient of determination (R2) and F-statistics evaluating the significance of each phylogenetic eigenvector regression (PVR) between Carnivora body size (log-transformed) and variable numbers of eigenvectors (k) under sequential and non-sequential selection.
Criteria        k    R2     F      IRES    P        rPVR,ARM  rcoph  AIC
Sequential      1    0.02    3.1    0.71   <0.001   0.73      0.86   408.50
                5    0.40   26.8    0.53   <0.001   0.80      0.98   314.90
                10   0.57   26.0    0.45   <0.001   0.77      0.99   255.79
                15   0.62   21.8    0.32   <0.001   0.79      0.99   238.71
                20   0.69   20.6    0.13   <0.001   0.81      0.99   212.44
                25   0.76   23.5   -0.03   0.487    0.75      0.99   206.28
                30   0.78   20.6   -0.05   0.256    0.73      1.00   168.78
                40   0.81   17.5   -0.11   0.006    0.70      1.00   168.29
                50   0.82   14.5   -0.12   0.002    0.69      1.00   184.86
                60   0.84   13.0   -0.13   <0.001   0.65      1.00   199.05
                70   0.86   11.6   -0.14   <0.001   0.64      1.00   223.20
Non-Sequential
ESRBS           27   0.82   29.9   -0.10   0.014    0.70      0.99   118.56
STEP            36   0.84   24.3   -0.12   0.001    0.69      1.00   126.43
MINI            14   0.70   31.6    0.04   0.172    0.77      0.96   191.00
PSR Curve (Phylogenetic Signal-Representation curve)
[Figure: eigenvectors on a tree with tips A to F, with Trait A and Trait B]
[Figure: PSR curve, R2 against eigenvalues, showing Random, Mean BM, Min and Max]
[Figure: PSR curves, R2 against eigenvalues: (a) Brownian versus Ornstein-Uhlenbeck with restraining forces 2, 4, 6, 8 and 10; (b) parameter values 0.1, 0.5, 1.0, 2.5 and 5.0]
Area of the PSR curve and Blomberg's K
Eigenvector selection based on PSR curve?
[Figure: PSR curves, R2 against eigenvalues, for body mass and geographic range size]
[Figure: PC4 (axis from -6 to 6) on a dinosaur phylogeny with tips Velociraptor, Tsaagan, Bambiraptor, Sinornithosaurus, Erlikosaurus, Incisivosaurus, Citipati, Rinchenia, Khaan, Conchoraptor, Shuvuuia, Garudimimus, Gallimimus, Ornitholestes, Compsognathus, Guanlong, Dilong, Bistahieversor, Gorgosaurus, Daspletosaurus, Tyrannosaurus, Tarbosaurus, Sinraptor, Acrocanthosaurus, Allosaurus, Monolophosaurus, Majungasaurus, Carnotaurus, Ceratosaurus, Limusaurus, Syntarsus, Coelophysis, Tawa, Eoraptor, Herrerasaurus; Velociraptor and Rinchenia highlighted]
The PVR/PSR Package (Functions: PVRdecomp, PVR, PSR, VarPartplot)
Santos et al. (in prep)
Null expectation
Brownian expectation
Model-Based Methods for Phylogenetic Signal
Blomberg's K
- This is the variance of the trait with respect to ancestral states
- This is the phylogenetically corrected variance (variance of PICs)
[Figure: three versions of the primate tree (galago, ateles, macaca, pongo, homo)]
- Original Phylogeny ("time")
- Delta-transform 0.25 ("time"): trait will evolve like this, but will be analyzed using the "known" (time) phylogeny
- OU (alpha 2.5): trait will evolve like this, but will be analyzed using the "known" (time) phylogeny
Results for the three trees, in the order shown:
MSE = 0.996, K = 1.018 ± 0.388
MSE = 0.541, K = 1.258 ± 0.442
MSE = 1.727, K = 0.810 ± 0.332
ntimes<-1000
K<- numeric(ntimes)
for(i in 1:ntimes){
trait<-rTraitCont(primtree)
K[i]<-Kcalc(trait,primtree)
}
K
hist(K)
mean(K)
sd(K)
[Figure: histogram of K (Frequency against K)] K = 1.018437 ± 0.388
> bw <-data.frame(primlog[,c(1)])
> multiPhylosignal(bw,primtree)
Blomberg's K
Body weight:
K = 0.728
K(null) = 0.796 ± 0.391
P(K=0) = 0.001
There is a significant phylogenetic signal
Longevity:
K = 0.200
K(null) = 0.775 ± 0.327
P(K=0) = 0.422
Phylogenetic signal is not significant...
library(picante)   # provides Kcalc
KOBS <- Kcalc(primlog[,c(1)],primtree)
vetor <- primlog[,c(1)]
ntimes<-1000
K <- numeric(ntimes)
for(i in 1:ntimes){
vec <- vetor[sample(length(vetor))]   # shuffle the tip values
K[i]<-Kcalc(vec,primtree)
}
hist(K)
mean(K)
sd(K)
# proportion of randomized K equal to or larger than the observed K
P1 <- (sum(K >= KOBS[1,1]) + 1)/(ntimes + 1)
[Figure: histogram of K (Frequency against K)]
...
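The two variances that enter K can be made explicit with a base R sketch. It assumes the usual definition, K = (MSE0 / MSE) observed divided by the ratio expected under Brownian motion, with MSE0 measured around the phylogenetic mean and MSE using the inverse of the vcv matrix. `blomberg_k` is a hypothetical helper, not the `picante` code, and the check below only simulates Brownian traits on the primate matrix.

```r
# Phylogenetic vcv matrix for the primate tree (shared branch lengths).
C <- matrix(c(1.00, 0.78, 0.53, 0.38, 0.00,
              0.78, 1.00, 0.53, 0.38, 0.00,
              0.53, 0.53, 1.00, 0.38, 0.00,
              0.38, 0.38, 0.38, 1.00, 0.00,
              0.00, 0.00, 0.00, 0.00, 1.00), nrow = 5, byrow = TRUE)

# Blomberg's K from a trait vector x and a vcv matrix C (assumed definition).
blomberg_k <- function(x, C) {
  n    <- length(x)
  Cinv <- solve(C)
  a    <- sum(Cinv %*% x) / sum(Cinv)       # phylogenetic (GLS) mean
  d    <- x - a
  mse0 <- sum(d^2) / (n - 1)                # variance around the root state
  mse  <- as.numeric(t(d) %*% Cinv %*% d) / (n - 1)  # phylogenetically corrected
  expected <- (sum(diag(C)) - n / sum(Cinv)) / (n - 1)
  (mse0 / mse) / expected
}

# Traits simulated under Brownian motion: x = L z, with L %*% t(L) equal to C.
set.seed(2024)   # arbitrary seed
L <- t(chol(C))
k_sim <- replicate(2000, blomberg_k(as.numeric(L %*% rnorm(nrow(C))), C))
print(c(mean_K = mean(k_sim), sd_K = sd(k_sim)))
```

Under Brownian motion the simulated K values should average close to 1, which is the benchmark against which observed K values are read.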
FITTING GENERAL MODELS OF TRAIT EVOLUTION USING PGLS
Gavin Thomas
Rob Freckleton
>library(motmot)
>primbw <-as.matrix(primlog[,c(1)])
>likTraitPhylo(primbw,primtree)
Get the maximum likelihood of the trait given the tree (the tree can be transformed into trees reflecting other models (in GEIGER), or...
>transformPhylo.ML(primbw,primtree,model="OU")
It can find the parameter alpha that maximizes the likelihood
> transformPhylo.ll(primbw,primtree,model="OU",alpha=2)
Gives the likelihood for a model and parameter
Several models, including lambda...
[Figure: log-likelihood (pglsfit) against Alpha and against lambda, for BW and LONG]
>library(motmot)
primbw <-as.matrix(primlog[,c(1)])
pglsfit <- numeric(10)
lambda <- seq(0.000001,1,0.1)
for(i in 1:length(lambda)){
primll <- transformPhylo.ll(primbw,primtree,model="lambda",lambda=lambda[i])
pglsfit[i] <- primll$logLikelihood[1,1]
}
plot(lambda, pglsfit)
None of the models differ from Brownian expectations...
Model      Likelihood   P(chi-squared)
Brownian   -7.080       *
Kappa      -5.940       0.133
Lambda     -6.080       0.156
Delta      -6.150       0.174
OU         -6.090       0.159
psi        -5.950       0.133
STAR       -6.090       0.160
Comparing metrics for phylogenetic signal...
But this is under the same phylogeny...
What are the different metrics “capturing” in trait evolution?
[Figure: the same 50-tip phylogeny (tips t1 to t50) under Lambda = 1 (Brownian), Lambda = 0.5 and Lambda = 0.1]
[Figure: signal against ALPHA for Blomberg's K, Moran's I and PVR's R2; Type I error (correlation among tips)]
PHYLOGENETIC SIGNAL &
NICHE CONSERVATISM
Townsend Peterson
John Wiens
Niche conservatism?
What is the relationship between phylogenetic signal and niche conservatism?
The short answer: NONE
The long answer: it depends, it is complicated...
1. No signal can indicate strong conservatism
2. Brownian motion can indicate strong conservatism with reduced variance (use Quantitative Genetic Models?)
Under Losos' / Wiens' reasoning:
- Fit BM, OU, and "white noise" (random) models: niche conservatism is better supported by OU (actually a balance between shift/conservatism)
Brownian motion with variable rates
These two patterns are very different...
Niche Conservatism under PSR Curve...
[Figure: PSR curves, R2 against eigenvalues, for parameter values 0.1, 0.5, 1.0, 2.5 and 5.0: multiple peak OU versus standard (single peak) OU]
Pierre Legendre
"A portion of the phylogenetic variation of the trait may be related to ecology. This portion is called 'phylogenetic niche conservatism', and we propose a method of variation partitioning that allows users to quantify this portion of the variation, called the 'phylogenetically structured environmental variation'."
First, compute the following regressions:
(1) Y and XE ("environmental variables"); R2 = [a]+[b]
(2) Y and P (eigenvectors); R2 = [b]+[c]
(3) Y = f (XE, P); R2 = [a]+[b]+[c]
The individual values of [a], [b] and [c] can be obtained by subtraction from the previous results:
[a] = R2 (step 3) - R2 (step 2), or ([a]+[b]+[c]) - ([b]+[c])
[b] = R2 (step 1) + R2 (step 2) - R2 (step 3)
[c] = R2 (step 3) - R2 (step 1)
[d] = 1 - ([a]+[b]+[c])
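The subtraction scheme can be sketched in base R with simulated data. Everything below is hypothetical: `P` stands in for phylogenetic eigenvectors, `XE` for an environmental variable that is partly phylogenetically structured, and the coefficients are arbitrary. Raw (unadjusted) R2 values are used, as in the steps above.

```r
# Variation partitioning among environment (XE) and phylogeny (P).
set.seed(7)   # arbitrary seed
n  <- 60
P  <- matrix(rnorm(n * 2), n, 2)            # stand-ins for eigenvectors
XE <- 0.7 * P[, 1] + rnorm(n, sd = 0.7)     # environment tracks phylogeny
Y  <- 1.0 * XE + 0.8 * P[, 2] + rnorm(n)    # trait

r2 <- function(formula) summary(lm(formula))$r.squared
r2_env <- r2(Y ~ XE)        # step 1: [a] + [b]
r2_phy <- r2(Y ~ P)         # step 2: [b] + [c]
r2_all <- r2(Y ~ XE + P)    # step 3: [a] + [b] + [c]

frac_a <- r2_all - r2_phy            # environment alone
frac_b <- r2_env + r2_phy - r2_all   # phylogenetically structured environment
frac_c <- r2_all - r2_env            # phylogeny alone
frac_d <- 1 - r2_all                 # unexplained
print(round(c(a = frac_a, b = frac_b, c = frac_c, d = frac_d), 3))
```

Fraction [b] is the one read as phylogenetic niche conservatism in this framework; note that, being a difference of R2 values, it can be slightly negative in small samples.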
Discussion, suggestions, ideas????
THANK YOU!