Modelling Phenotypic Evolution by Stochastic Differential Equations
Combining statistical time series with fossil measurements.
Tore Schweder and Trond Reitan, University of Oslo
Jorijntje Henderiks, University of Uppsala
ICES 2010, Kent
Overview
1. Introduction:
• Motivating example:
  - Phenotypic evolution: irregular time series related by common latent processes
  - Coccolith data (microfossils)
  - 205 data points in space (6 sites) and time (60 million years)
2. Stochastic differential equation vector processes
• Ito representation and diagonalization
• Tracking processes and hidden layers
• Kalman filtering
3. Analysis of coccolith data
• Results for original model
• 723 different models – Bayesian model selection and inference
4. Second application: Phenotypic evolution on a phylogenetic tree
• Primates - preliminary results
5. Conclusion
Irregular time series related by latent processes:
evolution of body size in Coccolithus
Henderiks – Schweder – Reitan
The size of a single-celled alga (Coccolithus) is measured by the diameter of its fossilized coccoliths (calcite platelets). We want to model the evolution of a lineage found at six sites:
• In continuous time and "continuously".
• By tracking a changing fitness optimum.
• Fitness might be influenced by observed (temperature) and unobserved processes.
• Both fitness and underlying processes might be correlated across sites.
Data: 19,899 coccolith measurements in 205 sediment samples (1 < n < 400 per sample) of body size, by site and time (0 to -60 Myr).
Our data – Coccolith size measurements
[Figure: 205 sample means of log coccolith size (1 < n < 400 per sample), by time and site.]
Individual samples
Bi-modality rather common.
Speciation? Not studied here!
Evolution of size distribution
Fitness = expected number of reproducing offspring.
The population tracks the fitness curve (natural selection).
The fitness curve moves about, and the population follows.
With a known fitness optimum, $\mu$, the mean phenotype should be an Ornstein-Uhlenbeck process (Lande 1976):
$dX(t) = -\alpha\,(X(t) - \mu)\,dt + \sigma\,dW(t)$
With fitness as a process, $\mu(t)$, we can make a tracking model:
$dX(t) = -\alpha\,(X(t) - \mu(t))\,dt + \sigma\,dW(t)$
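The Ornstein-Uhlenbeck process
$dX(t) = -\alpha\,(X(t) - \mu)\,dt + \sigma\,dW(t)$
Attributes:
• Normally distributed
• Markovian
• Long-term level: $\mu$
• Standard deviation: $s = \sigma/\sqrt{2\alpha}$
• $\alpha$: pull
• Time for the correlation to drop to 1/e: $\Delta t = 1/\alpha$
[Figure: process plotted against time $t$, with a band from $-1.96\,s$ to $1.96\,s$ around the level $\mu$.]
The parameters ($\mu$, $\Delta t$, $s$) can be estimated from the data. In this case: $\mu \approx 1.99$, $\Delta t = 1/\alpha \approx 0.80$ Myr, $s \approx 0.12$.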
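The attributes listed above can be checked with a small simulation. The following is a minimal sketch, not the authors' code: it uses the exact Gaussian transition of the Ornstein-Uhlenbeck process, so irregular time spacing needs no approximation. The function names `ou_transition` and `simulate_ou`, the random seed, and the uniformly drawn sampling times are assumptions made for illustration; only the three parameter values are taken from the slide.

```python
import math
import random

def ou_transition(x, dt, mu, alpha, sigma):
    """Mean and variance of X(t + dt) given X(t) = x for the OU process."""
    decay = math.exp(-alpha * dt)          # correlation over a gap of length dt
    mean = mu + decay * (x - mu)           # pulled towards the long-term level mu
    var = sigma ** 2 / (2 * alpha) * (1 - decay ** 2)
    return mean, var

def simulate_ou(times, mu, alpha, sigma, rng):
    """Draw one OU path at the given (possibly irregular) increasing times."""
    s = sigma / math.sqrt(2 * alpha)       # stationary standard deviation
    x = rng.gauss(mu, s)                   # start in the stationary distribution
    path = [x]
    for t0, t1 in zip(times, times[1:]):
        mean, var = ou_transition(x, t1 - t0, mu, alpha, sigma)
        x = rng.gauss(mean, math.sqrt(var))
        path.append(x)
    return path

# Parameter values quoted on the slide; everything else is hypothetical.
mu, dt_char, s = 1.99, 0.80, 0.12
alpha = 1 / dt_char                        # pull
sigma = s * math.sqrt(2 * alpha)           # diffusion implied by s and alpha
rng = random.Random(1)
times = sorted(rng.uniform(-60.0, 0.0) for _ in range(205))
path = simulate_ou(times, mu, alpha, sigma, rng)
```

After a gap of exactly $1/\alpha$ the expected deviation from $\mu$ has shrunk by the factor $1/e$, and for very long gaps the transition variance approaches $s^2$.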
One layer tracking another
$dX_1(t) = -\alpha_1\,(X_1(t) - X_2(t))\,dt + \sigma_1\,dW_1(t)$
$dX_2(t) = -\alpha_2\,(X_2(t) - \mu)\,dt + \sigma_2\,dW_2(t)$
[Figure: red process ($\Delta t_2 = 1/\alpha_2 = 0.2$, $s_2 = 2$) tracking black process ($\Delta t_1 = 1/\alpha_1 = 2$, $s = 1$).]
[Figure: auto-correlation of the upper (black) process, compared to a one-layered SDE model.]
A slow-tracking-fast process can always be re-scaled to a fast-tracking-slow process.
Impose identifying restriction: $\alpha_1 \ge \alpha_2$
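To get a feel for how one layer tracks another, the two equations can be simulated on a fine time grid. This is a rough sketch under stated assumptions: it uses the Euler-Maruyama approximation (chosen here for brevity, not taken from the slides), and the function name `simulate_two_layers`, the step size and all parameter values are hypothetical. The values respect the identifying restriction that the upper layer has the larger pull.

```python
import math
import random

def simulate_two_layers(n_steps, h, alpha1, alpha2, mu,
                        sigma1, sigma2, x1, x2, rng):
    """Euler-Maruyama steps for
         dX1 = -alpha1 (X1 - X2) dt + sigma1 dW1   (upper layer tracks X2)
         dX2 = -alpha2 (X2 - mu) dt + sigma2 dW2   (lower layer tracks mu)."""
    path = [(x1, x2)]
    sqrt_h = math.sqrt(h)
    for _ in range(n_steps):
        dw1 = rng.gauss(0.0, sqrt_h)       # Wiener increments over a step h
        dw2 = rng.gauss(0.0, sqrt_h)
        # Both updates use the state at the start of the step.
        new_x1 = x1 - alpha1 * (x1 - x2) * h + sigma1 * dw1
        new_x2 = x2 - alpha2 * (x2 - mu) * h + sigma2 * dw2
        x1, x2 = new_x1, new_x2
        path.append((x1, x2))
    return path

# Hypothetical values: fast upper layer (alpha1) tracking a slow lower layer.
rng = random.Random(0)
path = simulate_two_layers(n_steps=5000, h=0.01, alpha1=5.0, alpha2=0.5,
                           mu=0.0, sigma1=0.5, sigma2=1.0,
                           x1=0.0, x2=0.0, rng=rng)
```

With both noise terms switched off, the lower layer relaxes to $\mu$ and the upper layer follows it, which is a quick sanity check on the signs of the drift terms.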
Process layers - illustration
[Figure: layered structure, from top to bottom]
Observations
Layer 1 – local phenotypic expression
Layer 2 – local fitness optimum
External series T
Layer 3 – hidden climate variations or primary optimum
Fixed layer
Coccolith model
$X_{s,l}(t)$ = state of the $l$'th process at site $s$ at time $t$ (log size).
Population mean: $dX_{s,1}(t) = -\alpha_{s,1}\,(X_{s,1}(t) - X_{s,2}(t))\,dt + \text{noise}$
Fitness optimum: $dX_{s,2}(t) = -\alpha_{s,2}\,(X_{s,2}(t) - X_{s,3}(t) - \beta_s u(t))\,dt + \text{noise}$
Environment: $dX_{s,3}(t) = -\alpha_{s,3}\,(X_{s,3}(t) - \mu)\,dt + \text{noise}$
$u(t)$ = observed covariate process (temperature)
$\alpha_{s,l} > 0$ is the tracking speed of site $s$ belonging to the $l$'th layer
$\alpha_{s,l} \to 0$: random walk; $\alpha_{s,l} \to \infty$: perfect tracking
The noise is white, possibly correlated inside a layer.
$X_3, u \to X_2 \to X_1$
Observations $Z_{s,t}$ are $N(X_{s,1}(t), \sigma_{s,t}^2)$
Stochastic differential equation (SDE) vector processes
$X(t)$: p-dimensional vector state process
$W(t)$: q-dimensional vector Wiener process (Brownian motion)
$\mu(t) = \beta u(t)$: p-dimensional level process, $\beta$ a $p \times r$ regression matrix
$A$: $p \times p$ pull matrix
$\Sigma$: $p \times q$ diffusion matrix
$dX(t) = (A X(t) + \mu(t))\,dt + \Sigma\,dW(t)$
Ito solution:
$X(t) = B(t)^{-1} \left( X(0) + \int_0^t B(u)\,\mu(u)\,du + \int_0^t B(u)\,\Sigma\,dW(u) \right)$
$B(t) = e^{-At} = \sum_{n=0}^{\infty} A^n (-t)^n / n! = V^{-1} e^{-\Lambda t} V$ (when existing)
$\Lambda$ is a diagonal eigenvalue matrix of $A$, $V$ is the left eigenvector matrix.
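The diagonalization can be checked numerically on a small case. The sketch below assumes a two-layer tracking model, whose pull matrix is $A = [[-\alpha_1, \alpha_1], [0, -\alpha_2]]$ when the drift is written as $A X(t)$ plus a level term; the left eigenvectors are worked out by hand for this particular matrix. It compares $V^{-1} e^{-\Lambda t} V$ with a truncated power series for $e^{-At}$. The helper names and the parameter values are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def expm_series(m, terms=60):
    """Truncated power series sum_n m^n / n! for a small square matrix m."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]      # current term m^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

def b_by_diagonalization(alpha1, alpha2, t):
    """B(t) = V^{-1} exp(-Lambda t) V for A = [[-alpha1, alpha1], [0, -alpha2]]."""
    c = alpha1 / (alpha2 - alpha1)         # requires alpha1 != alpha2
    v = [[1.0, c], [0.0, 1.0]]             # rows are left eigenvectors of A
    v_inv = [[1.0, -c], [0.0, 1.0]]
    # Lambda = diag(-alpha1, -alpha2), so exp(-Lambda t) has growing entries.
    exp_diag = [[math.exp(alpha1 * t), 0.0], [0.0, math.exp(alpha2 * t)]]
    return matmul(matmul(v_inv, exp_diag), v)

alpha1, alpha2, t = 5.0, 0.5, 0.3          # hypothetical values
a = [[-alpha1, alpha1], [0.0, -alpha2]]
minus_at = [[-x * t for x in row] for row in a]
b_series = expm_series(minus_at)           # e^{-At} from the power series
b_diag = b_by_diagonalization(alpha1, alpha2, t)
```

The two matrices agree to rounding error. The diagonal form is the cheaper one when $B(t)$ is needed at many different time gaps, since only the scalar exponentials change with $t$.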
Model variants for Coccolith evolution
The individual size of the alga Coccolithus has evolved over time in the world oceans. What can we say about this evolution?
How fast do the populations track their fitness optimum?
Are the fitness optima the same or correlated across oceans? Do they vary in concert?
Does fitness depend on global temperature? How fast does fitness vary over time?
Are there unmeasured processes influencing fitness?
Model variations:
• 1, 2 or 3 layers.
• Inclusion of external time series
• In a single layer:
  - Local or global parameters
  - Correlation between sites (inter-regional correlation)
  - Deterministic response to the lower layer
  - Random walk (no tracking)
Likelihood: Kalman filter
Need a linear, normal Markov chain with independent normal observations:
Process: $X(t_1) \to X(t_2) \to \cdots \to X(t_n)$
Observations: $z_1, z_2, \ldots, z_n$, with $z_i$ observed from $X(t_i)$
The Ito solution
$X(t) = B(t)^{-1} \left( X(0) + \int_0^t B(u)\,\mu(u)\,du + \int_0^t B(u)\,\Sigma\,dW(u) \right)$
gives, together with measurement variances, what is needed to calculate the likelihood using the Kalman filter:
$E(z_i \mid z_1, \ldots, z_{i-1}) = m_i$ and $\mathrm{Var}(z_i \mid z_1, \ldots, z_{i-1}) = v_i$
$f(z \mid \theta) = \prod_{i=1}^{n} f_N(z_i \mid m_i, v_i, \theta)$
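The product of one-step predictive densities can be sketched for the simplest case. The code below is an illustration under stated assumptions, not the authors' implementation: it treats a single-layer, one-dimensional Ornstein-Uhlenbeck process observed with known measurement variances, and it starts the filter from the stationary distribution. The function name `ou_kalman_loglik` and the example numbers are hypothetical.

```python
import math

def ou_kalman_loglik(times, z, obs_var, mu, alpha, sigma):
    """Log-likelihood of observations z at increasing times under a scalar OU
    process with level mu, pull alpha and diffusion sigma."""
    s2 = sigma ** 2 / (2 * alpha)          # stationary variance
    mean, var = mu, s2                     # assumed stationary start
    loglik = 0.0
    prev_t = times[0]
    for t, z_i, r_i in zip(times, z, obs_var):
        # Predict the state across the (possibly irregular) time gap.
        decay = math.exp(-alpha * (t - prev_t))
        mean = mu + decay * (mean - mu)
        var = decay ** 2 * var + s2 * (1 - decay ** 2)
        # One-step predictive distribution of z_i: N(m_i, v_i).
        m_i, v_i = mean, var + r_i
        loglik += -0.5 * (math.log(2 * math.pi * v_i) + (z_i - m_i) ** 2 / v_i)
        # Update the state distribution with the new observation.
        gain = var / v_i
        mean = mean + gain * (z_i - m_i)
        var = (1 - gain) * var
        prev_t = t
    return loglik

# Hypothetical example: three sample means at irregular times (Myr).
example = ou_kalman_loglik(times=[-10.0, -9.3, -5.0], z=[2.10, 1.80, 2.00],
                           obs_var=[0.01, 0.02, 0.01],
                           mu=1.99, alpha=1.25, sigma=0.19)
```

In the layered model the same recursion runs on vectors and matrices, with the scalar decay factor replaced by the matrix exponential from the Ito solution.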
Kalman smoothing (state estimation)
ML fit of a tracking model with 3 layers
North Atlantic
Red curve: expectancy
Black curve: realization
Green curve: uncertainty
Snapshot:
Inference
• Exact Gaussian likelihood, multi-modal
• Maximum likelihood by hill climbing from 50 starting points
  - BIC for model comparison.
• Bayesian:
  - Wide but informative prior distributions respecting identifying restrictions
  - MCMC (with parallel tempering) + importance sampling (for model likelihood)
  - Bayes factor for model comparison and posterior probabilities
  - Posterior weight of a property $C$ from posterior model probabilities:
$W(C) = \dfrac{\frac{1}{n_C} \sum_{M \in C} f(\text{data} \mid M)}{\sum_{C'} \frac{1}{n_{C'}} \sum_{M \in C'} f(\text{data} \mid M)}$
Model inference concerns
Concerns:
• Identification restriction: increasing tracking speed up the layers
• In total: 723 models (when equivalent models are pruned out)
Enough data for model selection?
• Data simulated from the ML fit of the original model.
• Model selection over the original model plus 25 likely suspects.
• Correct number of layers generally found with the Bayesian approach.
• BIC seems too stingy on the number of layers.
Results, original model
3 layers, no regionality, no correlation
Bayesian inference on the 723 models
$W(C) = \dfrac{\frac{1}{n_C} \sum_{M \in C} f(\text{data} \mid M)}{\sum_{C'} \frac{1}{n_{C'}} \sum_{M \in C'} f(\text{data} \mid M)}$
95% credibility bands for the 5 most
probable models
Summary
• Best 5 models in good agreement (together, 19.7% of summed integrated likelihood):
  - Three layers.
  - Common expectancy in bottom layer.
  - No impact of exogenous temperature series.
  - Lowest layer: Inter-regional correlations, $\rho \approx 0.5$. Site-specific pull.
  - Middle layer: Intermediate tracking.
  - Upper layer: Very fast tracking.
[Figure labels: "Middle layers: fitness optima"; "Top layer: population mean log size"]
Layered inference and inference
uncertainty
Phenotypic evolution on a phylogenetic tree:
Body size of primates
Measurements $i = 1, \ldots, 209$ (92 extant measurements, 117 fossils)
$X_{i,1}(t)$ = mean phenotype (log mass)
$X_{i,2}(t)$ = fitness optimum
$X_{i,3}(t)$ = underlying environment, primary optimum
$X_{i,3} \to X_{i,2} \to X_{i,1}$: same tracking model as for coccoliths
Phylogenetic tree:
$T_{i,j}$ = time when $i$ split from $j$.
$X_{i,l}(t) = X_{j,l}(t)$ when $t \le T_{i,j}$.
Condition on these equalities for the given tree.
Observations $Z_{i,k} = X_{i,k}(t_k) + N(0, \sigma_k^2)$ will be multi-normally distributed, with mean vector and covariance matrix as parametric expressions from the SDE model.
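How a tree turns into a covariance matrix is easiest to see in a simplified case. The sketch below is an assumption-laden illustration, not the three-layer model of the slide: it uses a single stationary Ornstein-Uhlenbeck layer, for which two lineages that share their history up to the split time and evolve independently afterwards have covariance $s^2 e^{-\alpha (d_i + d_j)}$, with $d_i$ and $d_j$ the times elapsed since the split. The function names, the three taxa and all numbers are hypothetical.

```python
import math

def ou_tree_cov(t_i, t_j, split_time, alpha, s):
    """Covariance of X_i(t_i) and X_j(t_j) for two lineages of a stationary
    OU process that are identical up to split_time (t_i, t_j >= split_time)."""
    elapsed = (t_i - split_time) + (t_j - split_time)
    return s ** 2 * math.exp(-alpha * elapsed)

def covariance_matrix(obs_times, split_times, alpha, s, obs_var):
    """Covariance matrix of one noisy measurement per lineage."""
    n = len(obs_times)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Stationary process variance plus measurement variance.
                cov[i][j] = s ** 2 + obs_var[i]
            else:
                cov[i][j] = ou_tree_cov(obs_times[i], obs_times[j],
                                        split_times[i][j], alpha, s)
    return cov

# Hypothetical tree: taxa 0 and 1 split 5 Myr ago, both split from taxon 2
# 20 Myr ago; taxon 2 is a fossil measured at -10 Myr (time 0 = present).
obs_times = [0.0, 0.0, -10.0]
split_times = [[None, -5.0, -20.0],
               [-5.0, None, -20.0],
               [-20.0, -20.0, None]]
cov = covariance_matrix(obs_times, split_times, alpha=0.1, s=1.0,
                        obs_var=[0.01, 0.01, 0.04])
```

With the covariance matrix and the mean vector in hand, the likelihood is an ordinary multi-normal density, which is the direct route mentioned as the alternative to the Kalman filter.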
First results for some primates
Why linear SDE processes?
• Parsimonious: Simplest way of having a stochastic continuous time process that can track something else.
• Tractable: The likelihood, $L(\theta) = f(\text{Data} \mid \theta)$, can be analytically calculated by the Kalman filter or directly by the parameterized multi-normal model for the observations ($\theta$ = model parameter set).
• Some justification from biology, see Lande (1976), Estes and Arnold (2007), Hansen (1997), Hansen et al. (2008).
• Great flexibility, widely applicable...
Further comments
• Many processes evolve in continuous rather than discrete time. Thinking and modelling might then be more natural in continuous time.
• Tracking SDE processes with latent layers allow rather general correlation structure within and between time series. Inference is possible.
• Continuous time models allow related processes to be observed at variable frequencies (high-frequency data analyzed along with low-frequency data).
• Endogeneity, regression structure, co-integration, non-stationarity, causal structure, seasonality ... are possible in SDE processes.
• Extensions to non-linear SDE models...
• SDE processes might be driven by non-Gaussian instantaneous stochasticity (jump processes).
Bibliography
• Commenges D, Gégout-Petit A (2009). A general dynamical statistical model with causal interpretation. J. R. Statist. Soc. B 71, 719-736.
• Lande R (1976). Natural Selection and Random Genetic Drift in Phenotypic Evolution. Evolution 30, 314-334.
• Hansen TF (1997). Stabilizing Selection and the Comparative Analysis of Adaptation. Evolution 51(5), 1341-1351.
• Estes S, Arnold SJ (2007). Resolving the Paradox of Stasis: Models with Stabilizing Selection Explain Evolutionary Divergence on All Timescales. The American Naturalist 169(2), 227-244.
• Hansen TF, Pienaar J, Orzack SH (2008). A Comparative Method for Studying Adaptation to a Randomly Evolving Environment. Evolution 62(8), 1965-1977.
• Schuss Z (1980). Theory and Applications of Stochastic Differential Equations. John Wiley and Sons, Inc., New York.
• Schweder T (1970). Decomposable Markov Processes. J. Applied Prob. 7, 400-410.
Source code, example files and supplementary information can be found at http://folk.uio.no/trondr/layered/.