Supporting Materials to accompany Modelling Heterotachy in Phylogenetic Inference by
Reversible-Jump Markov chain Monte Carlo
Mark Pagel, School of Biological Sciences, University of Reading, and Andrew Meade, Institute
of Biological Sciences, University of Aberystwyth
Calculating Proposal Ratios and Jacobian terms for the reversible-jump Markov chain
Monte Carlo method (MCMC, see text of article)
For most applications of MCMC the dimensionality of the current and proposed models is the
same and thus the Jacobian can be ignored, as it takes the value of 1. An unusual feature of the
approach we describe in the associated article, however, is that we wish to explore models with
differing numbers of parameters, corresponding to some branches of the tree topology being
assigned more than one length. The reversible-jump MCMC (RJ-MCMC) algorithm can be used
to construct chains that jump among models of differing dimensionality. These require carefully
constructed proposal mechanisms and calculation of the Jacobian term. We describe these below.
Splitting and Merging. To implement the reversible-jump procedure for determining
branches to add to or subtract from the current model in the chain, the ‘normal’ moves of the
Markov chain (those that explore the parameters of the model of sequence evolution, Q, and the
possible different phylogenetic trees) can be ignored, as these moves do not change the
dimension of the parameter space. Instead we wish to define the proposal ratios and Jacobian
terms for moves that add a new branch length to the tree by splitting an existing branch into two,
or cause two distinct lengths to be merged into one. Call these moves “split” and “merge”
respectively.
Both moves begin by randomly selecting one of the 2s-3 edges of the tree. The model will
always have the same number of complete branch-length sets for every edge. Call this number k
where if k=1 the model is just a conventional non-mixture model, but for k>1 the model sums the
likelihood at each site over the k branch-length sets as in eq. 3. We will assume in what follows
that this number has been set in advance. Thus, if k=3 every edge is represented by three lengths.
In our RJ model, these three lengths can be identical in which case they represent a single ‘length
class’ and account for just one parameter. Alternatively, there could be two length classes among
the three lengths (two lengths being identical but differing from the third) in which case this edge
accounts for two parameters in the model (or one parameter over and above the conventional
non-mixture model). By comparison, the full branch-length-set mixture model would always treat
this edge as having three parameters.
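The parameter counting described above can be illustrated with a minimal Python sketch. The function names and the representation of an edge as a tuple of its k lengths are assumptions made for illustration only.

```python
def count_length_classes(edge_lengths):
    """Return the number of distinct length classes on one edge.

    `edge_lengths` is a hypothetical tuple holding the k lengths of an edge;
    identical values are treated as belonging to the same length class.
    """
    return len(set(edge_lengths))


def count_branch_length_parameters(tree_edges):
    """Sum the length classes over all edges of a (hypothetical) tree."""
    return sum(count_length_classes(edge) for edge in tree_edges)


# k = 3: edges with one, two, and three length classes respectively.
example_tree = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.5), (0.3, 0.4, 0.6)]
print([count_length_classes(edge) for edge in example_tree])  # [1, 2, 3]
print(count_branch_length_parameters(example_tree))           # 6
```

For the same three edges, a full branch-length-set mixture model with k=3 would count nine branch-length parameters.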
For a given k, if all k lengths associated with the edge are identical (there is a single length class),
then the probability of attempting a split move on that edge is 1. This is because we can only
increase the dimensionality of edges with one length class. The probability of attempting a merge
move on the same edge is zero: we cannot reduce the dimensionality of an edge with one length
class. For edges with at least two but fewer than k length classes, the probability of attempting a
split move on that edge is 1/2, as is the probability of attempting a merge move. That is, we can
attempt to split when there is at least one length class with two or more elements, and equally, so
long as the number of elements in a single class is less than k, we could also attempt a merge
move. If all of the k lengths associated with an edge are different, we can no longer attempt to
split, so the probability of attempting a merge move on that edge is 1.0.
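The three cases in this paragraph can be summarised in a short Python sketch. The function name and signature are hypothetical, and the sketch assumes k is at least 2, since with k=1 there is nothing to split or merge.

```python
def move_probabilities(num_classes, k):
    """Return (P_split, P_merge) for an edge with `num_classes` length classes.

    Assumes k >= 2 branch-length sets and 1 <= num_classes <= k.
    """
    if k < 2 or not 1 <= num_classes <= k:
        raise ValueError("require k >= 2 and 1 <= num_classes <= k")
    if num_classes == 1:
        # A single length class can only be split.
        return 1.0, 0.0
    if num_classes == k:
        # All k lengths differ, so only a merge is possible.
        return 0.0, 1.0
    # Otherwise split and merge are attempted with equal probability.
    return 0.5, 0.5


print(move_probabilities(1, 3))  # (1.0, 0.0)
print(move_probabilities(2, 3))  # (0.5, 0.5)
print(move_probabilities(3, 3))  # (0.0, 1.0)
```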
To conduct a split we first determine the number of different length classes that could be split,
that is, those containing at least two elements. If k=3, for example, it is possible to split so long
as there is either one or two length classes. If there are three (each containing one element) we
cannot split any of them because to do so would increase the number of length classes beyond k=3. Call the number of length
classes with more than one element m. Then the probability of selecting one of these m classes
to split is 1/m. If the selected class has n (identical) elements, some number $n_i$ of them will be assigned to
one of the new length classes to arise from the split and $n_j$ of them to the other, where $n_i + n_j = n$.
There are $2^{n_i + n_j - 1} - 1$ different ways of making these assignments. To accomplish the split we
draw a uniform random number, u, and create two new lengths by calculating $t_i = t + u/n_i$ and
$t_j = t - u/n_j$. If u is drawn on the interval $-n_i t$ to $n_j t$ then both new lengths are positive
and their average, weighted by $n_i$ and $n_j$, equals t.


A merge move seeks to combine two different length classes into one. If there is only one length
class associated with the edge it cannot be merged. If the edge has two or more length classes,
two are chosen at random from among the l classes available, and there are $\binom{l}{2}$ ways of doing
this. The two length classes chosen are merged creating one fewer length class, with all of the
elements in both classes being combined by a simple weighted average, the weights being the $n_i$
and $n_j$.
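The merge step can be sketched in a few lines of Python. The function names are hypothetical; the weighted average uses the class sizes n_i and n_j as weights, as described above.

```python
from math import comb


def count_merge_pairs(num_classes):
    """Number of ways to choose two of the l length classes on an edge."""
    return comb(num_classes, 2)


def merge_length_classes(t_i, n_i, t_j, n_j):
    """Merge two length classes by a weighted average of their lengths.

    Returns the merged length and the number of elements in the merged class.
    """
    n = n_i + n_j
    t = (n_i * t_i + n_j * t_j) / n
    return t, n


# One element at length 0.3 merged with two elements at length 0.15.
merged_t, merged_n = merge_length_classes(0.3, 1, 0.15, 2)
print(merged_t, merged_n)
print(count_merge_pairs(3))  # 3
```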
Following this algorithm the proposal ratio for a split move is:
$$\frac{P_m' \cdot \dfrac{1}{\binom{l'}{2}}}{P_s \cdot \dfrac{1}{m} \cdot \dfrac{1}{2^{n-1}-1} \cdot p(u)}$$
where $P_m$ and $P_s$ refer to the probabilities of merging and splitting respectively, p(u) is the
probability of observing u, and the primes denote the proposed model. Similarly, the proposal
ratio for a merge move is:
$$\frac{P_s' \cdot \dfrac{1}{m'} \cdot \dfrac{1}{2^{n-1}-1} \cdot p(u)}{P_m \cdot \dfrac{1}{\binom{l}{2}}}.$$
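The two proposal ratios can be evaluated with a short Python sketch. The function and argument names are hypothetical. In the example, p(u) is taken to be 1/(n t), which assumes that u is drawn uniformly on an interval of width n t; the functions themselves accept any value of p(u).

```python
from math import comb


def split_proposal_ratio(p_merge_new, l_new, p_split, m, n, p_u):
    """Proposal ratio for a split: reverse (merge) over forward (split)."""
    reverse = p_merge_new * (1.0 / comb(l_new, 2))
    forward = p_split * (1.0 / m) * (1.0 / (2 ** (n - 1) - 1)) * p_u
    return reverse / forward


def merge_proposal_ratio(p_split_new, m_new, n, p_u, p_merge, l):
    """Proposal ratio for a merge: reverse (split) over forward (merge)."""
    reverse = p_split_new * (1.0 / m_new) * (1.0 / (2 ** (n - 1) - 1)) * p_u
    forward = p_merge * (1.0 / comb(l, 2))
    return reverse / forward


# Example with k = 2: an edge whose two lengths share t = 0.2 is split,
# and the resulting two classes are later merged again.
t, n = 0.2, 2
p_u = 1.0 / (n * t)  # assumed density of u (uniform, interval width n * t)
split_ratio = split_proposal_ratio(
    p_merge_new=1.0, l_new=2, p_split=1.0, m=1, n=n, p_u=p_u)
merge_ratio = merge_proposal_ratio(
    p_split_new=1.0, m_new=1, n=n, p_u=p_u, p_merge=1.0, l=2)
print(split_ratio, merge_ratio, split_ratio * merge_ratio)
```

Because the merge in this example exactly reverses the split, the two ratios are reciprocals and their product is 1 up to rounding.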
The Jacobian is defined as the determinant of the square matrix of partial derivatives of the
proposed parameters with respect to the current parameters and to the amount by which they are
to be changed, u. Here the Jacobian for a split performed on set m is written as:
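$$J = \left| \det \begin{pmatrix} \dfrac{\partial \mathbf{t}_i}{\partial t} & \dfrac{\partial \mathbf{t}_i}{\partial u} \\ \dfrac{\partial \mathbf{t}_j}{\partial t} & \dfrac{\partial \mathbf{t}_j}{\partial u} \end{pmatrix} \right| = \frac{n_i + n_j}{n_i n_j},$$
where the bold elements in the Jacobian are square matrices of their respective partial derivatives
with dimensions corresponding to the number of elements in $n_i$ and $n_j$. The Jacobian for the merge
is the reciprocal of J.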
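The closed form for J can be checked numerically. The Python sketch below treats the scalar case, with one representative length for each new class, and assumes the split map t_i = t + u/n_i, t_j = t - u/n_j; the function names are hypothetical. It compares a central-difference estimate of the determinant, taken in absolute value, with (n_i + n_j)/(n_i n_j).

```python
def split_map(t, u, n_i, n_j):
    """Assumed split transformation from (t, u) to (t_i, t_j)."""
    return t + u / n_i, t - u / n_j


def numerical_jacobian(t, u, n_i, n_j, h=1e-6):
    """Absolute determinant of d(t_i, t_j)/d(t, u) by central differences."""
    ti_tp, tj_tp = split_map(t + h, u, n_i, n_j)
    ti_tm, tj_tm = split_map(t - h, u, n_i, n_j)
    ti_up, tj_up = split_map(t, u + h, n_i, n_j)
    ti_um, tj_um = split_map(t, u - h, n_i, n_j)
    dti_dt = (ti_tp - ti_tm) / (2 * h)
    dtj_dt = (tj_tp - tj_tm) / (2 * h)
    dti_du = (ti_up - ti_um) / (2 * h)
    dtj_du = (tj_up - tj_um) / (2 * h)
    return abs(dti_dt * dtj_du - dti_du * dtj_dt)


def analytic_jacobian(n_i, n_j):
    """Closed-form Jacobian for the split; the merge uses its reciprocal."""
    return (n_i + n_j) / (n_i * n_j)


for n_i, n_j in [(1, 1), (1, 2), (2, 3)]:
    print(n_i, n_j, numerical_jacobian(0.2, 0.05, n_i, n_j),
          analytic_jacobian(n_i, n_j))
```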
For the analyses reported here we have implemented the RJ model to be seeded with k=2
identical branch length sets, corresponding to two vectors of branch lengths in the above
equations. We then apply the split and merge moves to the shared edges of these vectors. The
lengths of the edges that never accept a split always remain identical across the two branch-length
sets although they can change as a pair in response to normal branch length updates in the
Markov chain. Edges that accept a split can adopt lengths that diverge between the two length
classes. At some later iteration of the chain they might even be merged to re-form a single length.
We use just two branch-length sets here because previous testing (Meade and Pagel, 2008) had
indicated that two were adequate for these data. In future implementations of the model we will
investigate allowing the number of distinct sets to change dynamically according to ‘augment’
and ‘reduce’ moves such as we have described elsewhere (Pagel and Meade, 2006). These moves
will propose to add or to remove an entire set of branch-lengths in one iteration of the chain,
being aided by an appropriate Jacobian term to account for the large change in dimensionality of
the chain.