Supplementary information for `The sequence of the Human X

Supplementary Discussion 1
Dating the recent ~4 Mb transposition event from the X to the Y chromosome
(XTR)
It is known from earlier work that the existence of the X-transposed region (XTR) on
the Y chromosome is specific to humans1. This is commonly supposed to place an upper
bound on the age of the transposition event, in that the transposition must have occurred
after the speciation event that led to the human and chimpanzee lineages, which has been
dated to ~6 MYA2. However, it is worth noting that the transposition event could have
predated this speciation event if the transposition had been polymorphic within the
human/chimpanzee common ancestor and was subsequently fixed only on the human
lineage. In reality, due to the short-lived nature of Y-linked polymorphism it is highly
unlikely that the transposition could remain polymorphic on the Y chromosome for long.
Therefore, a priori it is highly unlikely that the transposition event predates ~8 MYA.
The degree of sequence divergence between extant X and Y copies of the XTR is
related to the length of time since the transposition event. However, this is not a simple
relationship as the amount of sequence divergence is complicated by two factors:
(i) Polymorphism on the X chromosome at the time of transposition
(ii) Different rates of sequence evolution on the X and Y chromosomes
Polymorphism on the X chromosome at the time of transposition means that the X
chromosome that acted as donor to the Y chromosome is likely to have differed in
sequence within the transposed region from the X chromosomes that gave rise to the
extant XTR on the human X chromosome3. In other words, the extant X and Y copies of
the XTR in humans share a most recent common ancestor that predates the transposition
event by an unknown amount of time. Consequently, immediately after they shared this
common ancestor, the extant Y copy of the XTR was initially evolving on the X
chromosome and only later, after the transposition event, started evolving on the Y.
Because of the different mutation rates in the male and female germlines, the rate of
sequence evolution on the X chromosome (µx) is not the same as the rate of evolution on
the Y chromosome (µy). However, these two evolutionary rates can be related by the
male driven mutation rate (), which is the ratio of male and female germline mutation
rates. Y chromosomal sequences are inherited through the male germline all the time and
X chromosome sequences are inherited through the male germline one third of the time.
Consequently, if the mutation rate in the female germline is µ, then the evolutionary rates
of the sex chromosomes are:
µy = µ
µx = ((2+)/3)µ
The ancestral relationships among the extant XTR sequences in human and an outgroup
can be related by the following phylogeny.
[Figure: phylogeny relating O-XTR, X1-XTR, X2-XTR and Y-XTR, with the branches X and Y, the transposition event, and the intervals t1 and t2 marked.]
Parameters:
X1-XTR: extant human X copy of XTR
X2-XTR: extinct X copy that transposed to Y
Y-XTR: extant human Y copy of XTR
O-XTR: outgroup X copy of XTR
Y: mutations on Y branch since X1 split from X2
X: mutations on X branch since X1 split from X2
µx: mutation rate on X chromosome
µy: mutation rate on Y chromosome
α: ratio of male to female mutation rate
t1: time between divergence of X1 and X2 and transposition event
t2: time since transposition
Below we show that we can estimate the age of the transposition event by estimating
the amount of time the Y chromosomal copy of the XTR has been evolving on the Y
chromosome. We can estimate this time (by solving simultaneous equations) if we:
(i) count the number of mutations that have occurred on the lineages leading
to the extant X and Y copies of the XTR,
(ii) take a range of plausible values for α from the literature, and
(iii) estimate the rate of sequence evolution on the X chromosome.
Looking at the figure above we can see that:
X = µx·t1 + µx·t2    (1)
Y = µx·t1 + µy·t2
Therefore:
Y - X = µy·t2 - µx·t2
From the evolutionary rates given above (see also reference 4):
µy/µx = 3α/(2+α)
µy = 3α·µx/(2+α)
Therefore:
Y - X = 3α·µx·t2/(2+α) - µx·t2 = µx·t2·(2α-2)/(2+α)
Which rearranges to:
µx·t2 = (Y-X)(2+α)/(2α-2)    (2)
µxt2 can be estimated by determining Y and X, and taking a range of plausible α
values from the literature. Subsequently, µxt1 can then be calculated using equation (1).
This gives the relative values of t2 and t1, in other words the proportions of the time
since X1 and X2 split that occurred before and after the transposition event. To convert
these proportions to absolute values of t1 and t2 we need to know µx.
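The two-step solution can be sketched as follows. This is a minimal illustration, assuming X and Y are expressed per aligned site; the function name is hypothetical, and the example inputs are the counts reported later in this discussion (7,564 and 11,711 sites in 1,645,694 bp).

```python
def split_branch_times(x, y, alpha):
    """Solve equations (1) and (2) for (mu_x*t1, mu_x*t2).

    x, y  : mutations per site on the X and Y branches
    alpha : male/female mutation rate ratio (this sketch assumes alpha > 1)
    """
    if alpha <= 1:
        raise ValueError("this sketch assumes alpha > 1")
    mux_t2 = (y - x) * (2 + alpha) / (2 * alpha - 2)   # equation (2)
    mux_t1 = x - mux_t2                                # equation (1)
    return mux_t1, mux_t2


aligned_bp = 1645694
x = 7564 / aligned_bp
y = 11711 / aligned_bp
mux_t1, mux_t2 = split_branch_times(x, y, alpha=3)
# Proportion of the time since the X1/X2 split that follows the transposition
print(round(mux_t2 / (mux_t1 + mux_t2), 3))  # about 0.685
```

Dividing µxt2 by an estimate of µx then converts this proportion into an absolute time.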
Values for X, Y and µx are estimated below.
Counting the number of mutations occurring on the X and Y XTR lineages
(parameters X and Y)
To count the number of mutations that have occurred on the lineages leading to the X
and Y copies of the XTR we must use an outgroup to identify the ancestral state of each
mismatch between the X and Y copies of the XTR. Partial sequence (~2.2Mb) of the
XTR in chimpanzees is available from the November 13th 2003 sequence assembly from
the Chimpanzee Sequencing Consortium (http://www.ensembl.org/Pan_troglodytes/).
However, it has been pointed out3 that the chimpanzee sequence may not be a valid
outgroup as it could share a common ancestor with either the X or Y copy of the XTR to
the exclusion of the other (see phylogenies below).
To investigate whether the extant chimpanzee XTR sequence is indeed a valid
outgroup, we used previously published sequence data from a small portion (~40kb) of
the XTR in chimpanzee and gorilla5 to construct a phylogeny relating the chimpanzee
XTR, gorilla XTR, human Y-chromosomal XTR and human X-chromosomal XTR. The
gorilla sequence acts as an outgroup that reveals the order of branching of the other three
lineages. The neighbour-joining tree of these four sequences (see below) clearly shows
that the human X and Y chromosomal XTR sequences share a common ancestor to the
exclusion of chimpanzee, and therefore that the chimpanzee XTR sequence constitutes a
valid outgroup. Notably, the branch length (0.00179) between the human X&Y XTR
common ancestor and the common ancestor that these two sequences share with the
chimpanzee XTR is relatively short. This suggests that the two human XTR lineages
diverged from one another shortly after they diverged from the extant chimpanzee XTR
sequence.
Neighbour-joining tree relating the ~40kb sequences from reference 5.
Having established that the chimpanzee XTR sequence is a valid outgroup, we aligned this
sequence (~2.2 Mb) with the human X (~3.9 Mb) and Y (~3.4 Mb) copies of
the XTR using MLAGAN6. The resulting alignment (~1.6 Mb) was then visualised using
VISTA7 to reveal regions of incorrect alignment, which were subsequently edited
manually. Once all sites containing gaps or Ns in any sequence were removed, the
sequence similarity between human X and Y copies in this XTR alignment was 98.83%.
All sites within the ~1.6 Mb alignment of the three sequences that vary between the
human X and Y copies were identified (19,275 sites in 1,645,694 bp) and classified into
one of three classes:
(i) mutations occurring on the Y lineage (Y=11,563),
(ii) mutations occurring on the X lineage (X=7,469), and
(iii) sites at which the ancestral state was ambiguous (N=243).
To avoid underestimating the number of mutations that have occurred on the extant
X and Y lineages the sites with ambiguous ancestral state were apportioned to the X and
Y lineages in the same ratio as the sites that could be assigned to either lineage
(11,563:7,469). This gives final values of 7,564 and 11,711 for X and Y respectively.
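The apportioning step, and the similarity figure quoted for the alignment, can be checked with a short sketch. The function name is hypothetical and rounding to whole sites is an assumption; the inputs are the counts given above.

```python
def apportion_ambiguous(y_sites, x_sites, ambiguous):
    """Split ambiguous sites between lineages in the ratio y_sites:x_sites."""
    assigned = y_sites + x_sites
    y_total = y_sites + ambiguous * y_sites / assigned
    x_total = x_sites + ambiguous * x_sites / assigned
    return round(x_total), round(y_total)


x, y = apportion_ambiguous(y_sites=11563, x_sites=7469, ambiguous=243)
print(x, y)  # 7564 11711

# Human X/Y similarity implied by 19,275 variable sites in 1,645,694 bp
similarity = 100 * (1 - 19275 / 1645694)
print(round(similarity, 2))  # 98.83
```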
Estimating the mutation rate on the human X chromosome (µx)
The rate of X-chromosomal XTR sequence evolution (µx) can be estimated by
aligning the human and chimpanzee XTR sequences (Chimpanzee Sequencing
Consortium http://www.genome.wustl.edu/projects/chimp/ ), and dividing the sequence
divergence by the evolutionary time separating these two sequences. At present, sequence
divergence can most accurately be obtained by aligning the human X sequence to high-quality sites in individual shotgun sequence reads from the chimpanzee project (J.
Mullikin, personal communication). This analysis gives a sequence divergence of 1.1%
between the human and chimpanzee XTR sequences, which agrees closely with a
previous estimate of the overall divergence of 1.0% between human and chimpanzee X
chromosomes8.
We now turn our attention to the evolutionary time separating the human and
chimpanzee XTR sequences. Due to polymorphism in the human-chimp common
ancestor, human and chimpanzee XTR sequences must have diverged prior to the
speciation event. Just how long before the speciation event these two sequences diverged
depends on the effective population size of this common ancestor; there is a simple
relationship between the mean coalescence time of two neutrally-evolving lineages and
the effective population size (t=2Ne generations). One recent study has estimated the
effective population size (Ne) of the common ancestor to be 5-9 times greater than the
estimate of 10,000 for humans9. An older study with fewer data estimated the ancestral
Ne to be 3.5-6.5 times greater than in humans10. Similarly, it has been postulated that the
genome-wide nucleotide diversity in the common ancestor (and therefore the effective
population size) was about 4 times greater in this common ancestor than that among
extant humans8. Assuming a generation time of 15 years for the common ancestor9 and
correcting for the lower effective population size of the X chromosome, we estimate that the
human and chimpanzee XTRs shared a common ancestor some 2.25-4.05 MY before the
subsequent speciation event (6 MYA).
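The quoted range can be reproduced with the sketch below. It assumes ancestral Ne values of 50,000 and 90,000 (5 and 9 times 10,000) and a pre-speciation time of 3·Ne generations. That factor is inferred from the reported 2.25-4.05 MY range and is not spelled out in the text; it would correspond to 4·Ne generations scaled by 3/4 for the X chromosome. The function and constant names are hypothetical.

```python
GENERATION_YEARS = 15  # generation time assumed in the text


def x_pre_speciation_years(ne, generation_years=GENERATION_YEARS):
    """Years by which X-linked sequence divergence predates speciation.

    Assumption: 4*Ne generations, scaled by 3/4 for the X chromosome,
    i.e. 3*Ne generations (inferred from the reported range).
    """
    return 3 * ne * generation_years


for ne in (50_000, 90_000):
    print(ne, x_pre_speciation_years(ne) / 1e6, "MY")
```

This gives 2.25 MY and 4.05 MY for the two Ne values.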
Using these estimates for sequence divergence and evolutionary time separating the
human and chimpanzee XTRs, we estimate an X-chromosomal mutation rate (µx) of 5.5-6.7 × 10⁻¹⁰ per nucleotide per year, which agrees well with previous estimates11.
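A minimal sketch of this calculation follows. It assumes that the 1.1% divergence accumulated along both lineages, so the total evolutionary time is twice the time since the sequences diverged; this factor of two is an assumption consistent with the reported range. The function name is hypothetical.

```python
def mutation_rate_per_year(divergence, speciation_years, pre_speciation_years):
    """Per-site, per-year rate from pairwise divergence.

    Both lineages accumulate mutations, hence the factor of 2 (assumed).
    """
    time_since_split = speciation_years + pre_speciation_years
    return divergence / (2 * time_since_split)


for pre in (2.25e6, 4.05e6):
    rate = mutation_rate_per_year(0.011, 6e6, pre)
    print(f"{pre / 1e6:.2f} MY before speciation: {rate:.2e}")
```

The two results, roughly 6.7 × 10⁻¹⁰ and 5.5 × 10⁻¹⁰, bracket the range quoted above.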
Estimating the date of the transposition event (t2).
In the table below, a range of possible times for the transposition event (t2) is calculated
using equation (2) with a range of plausible values of α (reference 12) and the X-chromosomal mutation rate calculated above. Sampling error is likely to be negligible
compared to the uncertainty in our knowledge of these key evolutionary parameters.
Further comparative sequencing of great ape genomes can be expected to reduce
uncertainty in these parameters in the near future, which should allow for more precise
dating of the transposition event.
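The calculation behind the table can be sketched as below, assuming the site counts reported above (X = 7,564, Y = 11,711 in 1,645,694 aligned bp) and the rounded µx values 6.7E-10 and 5.5E-10. Because those rates are rounded, the output agrees with the tabulated dates only to within about 1%. The names are illustrative.

```python
X_SITES, Y_SITES, ALIGNED_BP = 7564, 11711, 1645694


def transposition_age(alpha, mu_x):
    """Years since transposition (t2) from equation (2)."""
    excess = (Y_SITES - X_SITES) / ALIGNED_BP        # Y - X, per site
    mux_t2 = excess * (2 + alpha) / (2 * alpha - 2)  # equation (2)
    return mux_t2 / mu_x


for mu_x in (6.7e-10, 5.5e-10):
    for alpha in (2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6):
        print(f"alpha={alpha:<4} mu_x={mu_x:.1e} t2={transposition_age(alpha, mu_x):.2e}")
```

Larger α values give younger dates, because a given excess of Y-lineage mutations then needs less time on the Y chromosome to accumulate.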
Note that this analysis suggests a minimum plausible value for α of 2.13, as this is
the minimal value that can account for the greater number of mutations on the Y lineage
even if the transposition event took place at the same time as the divergence of X1 and X2
(i.e. maximising the evolutionary time spent in the more mutagenic male germline).
Within this range of estimates we believe the most plausible estimate is that
corresponding to ancestral Ne = 50,000 and α = 3 (reference 8), which gives a date for the
transposition event of ~4.7 MYA.
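The lower bound on α follows from setting t1 = 0 in the model, so that Y/X = µy/µx = 3α/(2+α), which rearranges to α = 2Y/(3X - Y). A minimal check, with a hypothetical function name and the site counts reported above:

```python
def minimum_alpha(x, y):
    """Alpha at which t1 = 0, i.e. Y/X = 3*alpha/(2 + alpha)."""
    return 2 * y / (3 * x - y)


print(round(minimum_alpha(7564, 11711), 2))  # 2.13
```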
α      Ne       µx (per nucleotide per year)   t2 (years)
2.5    50000    6.7E-10                        5.67E+06
3      50000    6.7E-10                        4.72E+06
3.5    50000    6.7E-10                        4.16E+06
4      50000    6.7E-10                        3.78E+06
4.5    50000    6.7E-10                        3.51E+06
5      50000    6.7E-10                        3.31E+06
5.5    50000    6.7E-10                        3.15E+06
6      50000    6.7E-10                        3.02E+06
2.5    90000    5.5E-10                        6.91E+06
3      90000    5.5E-10                        5.75E+06
3.5    90000    5.5E-10                        5.06E+06
4      90000    5.5E-10                        4.60E+06
4.5    90000    5.5E-10                        4.27E+06
5      90000    5.5E-10                        4.03E+06
5.5    90000    5.5E-10                        3.84E+06
6      90000    5.5E-10                        3.68E+06
[Figure: Date of transposition event. Vertical axis: MYA (ticks from 0.00E+00 to 8.00E+06); horizontal axis: Alpha (2.5 to 6); series: µx=6.7E-10 and µx=5.5E-10.]
References
1. Page, D.C., Harper, M.E., Love, J. & Botstein, D. Occurrence of a transposition from
the X-chromosome long arm to the Y-chromosome short arm during human evolution.
Nature 311, 119-122 (1984).
2. Glazko, G.V. & Nei, M. Estimation of divergence times for major lineages of primate
species. Mol. Biol. Evol. 20, 424-434 (2003).
3. Makova, K.D. & Li, W.H. Strong male-driven evolution of DNA sequences in humans
and apes. Nature 416, 624-626 (2002).
4. Miyata, T., Hayashida, H., Kuma, K., Mitsuyasa, K. & Yasunaga, T. Male-driven
molecular evolution: a model and nucleotide sequence analysis. Cold Spring Harb. Symp.
Quant. Biol. 52, 863-867 (1987).
5. Bohossian, H.B., Skaletsky, H. & Page, D.C. Unexpectedly similar rates of nucleotide
substitution found in male and female hominids. Nature 406, 622-625 (2000).
6. Brudno, M. et al. LAGAN and Multi-LAGAN: efficient tools for large-scale multiple
alignment of genomic DNA. Genome Res. 13, 721-731 (2003).
7. Mayor, C. et al. VISTA: visualizing global DNA sequence alignments of arbitrary
length. Bioinformatics 16, 1046-1047 (2000).
8. Ebersberger, I., Metzler, D., Schwarz, C. & Paabo, S. Genomewide comparison of
DNA sequences between humans and chimpanzees. Am. J. Hum. Genet. 70, 1490-1497
(2002).
9. Chen, F.C. & Li, W.H. Genomic divergences between humans and other hominoids
and the effective population size of the common ancestor of humans and chimpanzees.
Am. J. Hum. Genet. 68, 444-456 (2001).
10. Ruvolo, M. Molecular phylogeny of the hominoids: inferences from multiple
independent DNA sequence data sets. Mol. Biol. Evol. 14, 248-265 (1997).
11. Nachman, M.W. & Crowell, S.L. Estimate of the mutation rate per nucleotide in
humans. Genetics 156, 297-304 (2000).
12. Li, W.H., Yi, S. & Makova, K. Male-driven evolution. Curr. Opin. Genet. Dev. 12,
650-656 (2002).