Modelling Phenotypic Evolution Stochastic Differential Equations by Combining statistical

advertisement

ICES 2010, Kent

Modelling Phenotypic Evolution

by

Stochastic Differential Equations

Tore Schweder and Trond Reitan

University of Oslo

Jorijntje Henderiks

University of Uppsala

Combining statistical timeseries with fossil measurements.

1

Overview

1. Introduction:

• Motivating example:

 Phenotypic evolution: irregular time series related by common latent processes

 Coccolith data (microfossils)

 205 data points in space (6 sites) and time (60 million years)

2. Stochastic differential equation vector processes

• Ito representation and diagonalization

• Tracking processes and hidden layers

• Kalman filtering

3. Analysis of coccolith data

• Results for original model

• 723 different models – Bayesian model selection and inference

4. Second application: Phenotypic evolution on a phylogenetic tree

• Primates - preliminary results

5. Conclusion

2

Irregular time series related by latent processes: evolution of body size in Coccolithus

Henderiks – Schweder - Reitan

The size of a single cell algae ( Coccolithus ) is measured by the diameter of its fossilized coccoliths (calcite platelets). Want to model the evolution of a lineage found at six sites.

•In continuous time and ”continuously”.

•By tracking a changing fitness optimum.

•Fitness might be influenced by observed (temperature) and unobserved processes.

•Both fitness and underlying processes might be correlated across sites.

19,899 coccolith measurements, 205 sediment samples (1< n < 400) of body size by site and time (0 to -60 my).

3

Our data – Coccolith size measurements

205 Sample mean log coccolith size (1 < n < 400) by time and site.

4

Individual samples

Bi-modality rather common.

Speciation? Not studied here!

5

Evolution of size distribution

Fitness = expected number of reproducing offspring.

The population tracks the fitness curve (natural selection)

The fitness curve moves about, the population follow.

With a known fitness, µ , the mean phenotype should be an

Ornstein-Uhlenbeck process

(Lande 1976).

dX ( t ) X ( t ) dt dW ( t )

With fitness as a process, µ(t)

,

, we can make a tracking model: dX ( t ) X ( t ) ( t ) dt dW ( t )

6

dX ( t )

The Ornstein-Uhlenbeck process

X ( t ) dt dW ( t )

1.96 s

Attributes:

• Normally distributed

• Markovian

• long-term level:

• Standard deviation: s= / 2α

• α : pull

• Time for the correlation to drop to 1/e: t =1/α t

-1.96 s

The parameters ( , t, s ) can be estimated from the data. In this case: 1.99, t=1/ α 0.80Myr, s 0.12.

7

One layer tracking another

dX

1

( t ) dX

2

( t )

1

X

1

( t )

2

X

2

( t )

X

2

( t ) dt dt

1 dW

1

( t )

2 dW

2

( t )

Red process ( t

2

=1/

2

=0.2, s

2

=2 ) tracking black process

( t

1

=1/

1

= 2, s =1 )

Auto-correlation of the upper (black) process, compared to a one-layered

SDE model.

A slow-tracking-fast can always be re-scaled to a fast-tracking slow process.

Impose identifying restriction:

1

2

8

Process layers - illustration

Observations

Layer 1 – local phenotypic expression

Layer 2 – local fitness optimum

T

External series

Layer 3 – hidden climate variations or primary optimum

Fixed layer

Coccolith model

X s , l

( t ) state of the l ' th process at site s at time t (log size).

dX s , 1

( t ) s , 1

X s , 1

( t ) X s , 2

( t ) dt noise Population mean dX u ( t ) s , 2

( t ) dX s , 3

( t ) observed s , 2 s , 3

X s , 2

( t )

X s , 3

( t ) covariate

X s , 3

( t ) dt s u ( t ) dt process (temperatu noise re)

Fitness optimum noise Environmen t

0 is the tracking speed of site s belonging to the l ' th layer s , l

0 random walk, s , l

The noise is white, possibly s , l correlated perfect tr acking inside a layer

X

3

, u X

2

X

1

Observatio ns Z s , t are N ( X s , 1

( t ), s , t

2

)

10

Stochastic differential equation (SDE) vector processes

X ( t )

W ( t )

( t )

: p

: q -

dimensiona dimensiona u ( t ) : p l vector state process l vector Wi dimensiona l ener process level process,

(Brownian a p motion) r regression matrix

A : p p p dimensiona q dimensiona l pull matrix l diffusion matrix dX ( t ) ( AX ( t ) ( t )) dt dW ( t )

Ito solution :

X ( t ) B ( t )

1

X ( 0 )

0 t

B ( u ) ( u ) du

0 t

B ( u ) dW ( u )

B ( t ) e

At

A n

( t ) n

/ n !

Λ is a diagonal n 0 eigenvalue

V matrix

1 e t

V (when existing) of A , V is the left eigenvecto r matrix.

11

Model variants for Coccolith evolution

 T he individual size of the algae Coccolithus has evolved over time in the world oceans. What can we say about this evolution?

 How fast do the populations track its fitness optimum?

 Are the fitness optima the same/correlated across oceans, do they vary in concert?

 Does fitness depend on global temperature? How fast does fitness vary over time?

 Are there unmeasured processes influencing fitness?

Model variations:

1, 2 or 3 layers.

Inclusion of external timeseries

In a single layer:

Local or global parameters

Correlation between sites (inter-regional correlation)

Deterministic response to the lower layer

 Random walk (no tracking)

12

Likelihood: Kalman filter

Need a linear, normal Markov chain with independent normal observations:

X ( t

1

) X ( t

2

)  X ( t n

) Process z

1 z

The Ito solution

2 z n

Observations t t

X ( t ) B ( t )

1

X ( 0 ) B ( u ) ( u ) du B ( u ) dW ( u )

0 0 gives, together with measurement variances, what is needed to calculate the likelihood using the Kalman filter:

E ( z i

| z

1

,  , z i 1

) m i and Var ( z i

| z

1

,  , z i 1

) v i f ( z | ) n i 1 f

N

( z i

| m i

, v i

, )

13

Kalman smoothing (state estimation)

ML fit of a tracking model with 3 layers

North Atlantic

Red curve: expectancy

Black curve: realization

Green curve: uncertainty

Snapshot:

14

Inference

 Exact Gaussian likelihood, multi-modal

 Maximum likelihood by hill climbing from 50 starting points

 BIC for model comparison.

 Bayesian:

 Wide but informative prior distributions respecting identifying restrictions

MCMC (with parallel tempering) +

Importance sampling (for model likelihood)

Bayes factor for model comparison and posterior probabilities

Posterior weight of a property C from posterior model probabilities

W ( C )

C '

1 n

C

1

M C f ( data | M ) f ( data | M ) n

C ' M C '

15

Model inference concerns

Concerns:

Identification restriction: increasing tracking speed up the layers

In total: 723 models when equivalent models are pruned out)

Enough data for model selection?

Data Simulated from the ML fit of the original model

Model selection over original model plus 25 likely suspects.

• correct number of layers generally found with the Bayesian approach.

• BIC seems too stingy on the number of layers.

16

Results, original model

3 layers, no regionality, no correlation

17

Bayesian inference on the 723 models

W ( C )

C '

1 n

C

1

M C f ( data | M ) f ( data | M ) n

C ' M C '

18

95% credibility bands for the 5 most probable models

19

Summary

 Best 5 models in good agreement.

(together, 19.7% of summed integrated likelihood):

Three layers.

Common expectancy in bottom layer .

No impact of exogenous temperature series.

Lowest layer: Inter-regional correlations, 0.5. Sitespecific pull.

Middle layer: Intermediate tracking.

Upper layer: Very fast tracking.

Middle layers: fitness optima

Top layer: population mean log size

20

Layered inference and inference uncertainty

21

Phenotypic evolution on a phylogenetic tree:

Body size of primates

Measuremen ts i 1 ,  , 209 (92 extant measuremen ts, 117 fossils)

X i , 1

( t )

X i , 2

( t ) mean fitness phenotype optimum

(log mass)

X i , 3

( t )

X i , 3 underlying

X i , 2

X environmen i , 1 same tracking t, primary optimum model as for coccoliths

Phylogenet

T i , j

X i , l

( t )

Time

X

when j , l i split from j .

( t ) when t T i , j

.

Condition ic tree : on these equalities for the given tree .

Observatio be multi ns Z i , k

normally matrix as parametric

X i , k

( t k

) distribute expression

N ( 0 , d with

2 k

) will mean vecto r and covariance s from the SDE model.

22

First results for some primates

23

Why linear SDE processes?

Parsimonious: Simplest way of having a stochastic continuous time process that can track something else.

Tractable: The likelihood, L( ) f(Data | ), can be analytically calculated by the Kalman filter or directly by the parameterized multinormal model for the observations. ( = model parameter set)

Some justification from biology, see Lande (1976), Estes and Arnold

(2007), Hansen (1997), Hansen et. al (2008).

Great flexibility, widely applicable...

24

Further comments

Many processes evolve in continuous rather than discrete time. Thinking and modelling might then be more natural in continuous time.

Tracking SDE processes with latent layers allow rather general correlation structure within and between time series. Inference is possible.

Continuous time models allow related processes to be observed at variable frequencies (high-frequency data analyzed along with low frequency data).

Endogeneity, regression structure, co-integration, non- stationarity, causal structure, seasonality …. are possible in SDE processes.

Extensions to nonlinear SDE models…

SDE processes might be driven by non-Gaussian instantaneous stochasticity (jump processes).

25

Bibliography

Commenges D and Gégout-Petit A (2009), A general dynamical statistical model with causal interpretation. J.R. Statist. Soc. B, 71,

719-736

Lande R (1976), Natural Selection and Random Genetic Drift in

Phenotypic Evolution, Evolution 30, 314-334

Hansen TF (1997), Stabilizing Selection and the Comparative

Analysis of Adaptation, Evolution , 51-5, 1341-1351

Estes S, Arnold SJ (2007), Resolving the Paradox of Stasis:

Models with Stabilizing Selection Explain Evolutionary Divergence on All Timescales, The American Naturalist , 169-2, 227-244

Hansen TF, Pienaar J, Orzack SH (2008), A Comparative Method for Studying Adaptation to a Randomly Evolving Environment,

Evolution 62-8, 1965-1977

Schuss Z (1980). Theory and Applications of Stochastic Differential

Equations. John Wiley and Sons, Inc., New York.

Schweder T (1970). Decomposable Markov Processes. J. Applied

Prob. 7, 400 –410

Source codes, examples files and supplementary information can be found at http://folk.uio.no/trondr/layered/.

26

Download