Supplementary Discussion 1
Dating the recent ~4 Mb transposition event from the X to the Y chromosome (XTR)
It is known from earlier work that the presence of the X-transposed region (XTR) on the Y chromosome is specific to humans (reference 1). This is commonly taken to place an upper bound on the age of the transposition event: the transposition must have occurred after the speciation event that led to the human and chimpanzee lineages, which has been dated to ~6 MYA (reference 2). However, the transposition event could have predated this speciation event if the transposition had been polymorphic within the human/chimpanzee common ancestor and was subsequently fixed only on the human lineage. In practice, because Y-linked polymorphism is short-lived, it is highly unlikely that the transposition could have remained polymorphic on the Y chromosome for long. Therefore, a priori it is highly unlikely that the transposition event predates ~8 MYA.
The degree of sequence divergence between the extant X and Y copies of the XTR is related to the length of time since the transposition event. However, this relationship is not simple, because the amount of sequence divergence is complicated by two factors:
(i) Polymorphism on the X chromosome at the time of transposition
(ii) Different rates of sequence evolution on the X and Y chromosomes
Polymorphism on the X chromosome at the time of transposition means that the X chromosome that acted as donor to the Y chromosome is likely to have differed in sequence within the transposed region from the X chromosomes that gave rise to the extant XTR on the human X chromosome (reference 3). In other words, the extant X and Y copies of the XTR in humans share a most recent common ancestor that predates the transposition event by an unknown amount of time.
Consequently, immediately after they shared this common ancestor, the lineage leading to the extant Y copy of the XTR was initially evolving on the X chromosome and only later, after the transposition event, started evolving on the Y. Because of the different mutation rates in the male and female germlines, the rate of sequence evolution on the X chromosome (µx) is not the same as the rate of evolution on the Y chromosome (µy). However, these two evolutionary rates can be related by the male-driven mutation rate (α), which is the ratio of male and female germline mutation rates. Y-chromosomal sequences are inherited through the male germline all the time and X-chromosomal sequences are inherited through the male germline one third of the time. Consequently, if the mutation rate in the female germline is µ, then the evolutionary rates of the sex chromosomes are:
$$\mu_y = \alpha\mu$$
$$\mu_x = \frac{2+\alpha}{3}\,\mu$$
The ancestral relationships among the extant XTR sequences in human and an outgroup can be represented by the following phylogeny.
[Figure: phylogeny of the XTR sequences. Labels: X1-XTR, X2-XTR, Y-XTR, O-XTR; branches X and Y; transposition event; time intervals t1 and t2.]
Parameters:
X1-XTR: extant human X copy of XTR
X2-XTR: extinct X copy that transposed to Y
Y-XTR: extant human Y copy of XTR
O-XTR: outgroup X copy of XTR
Y: mutations on Y branch since X1 split from X2
X: mutations on X branch since X1 split from X2
µx: mutation rate on X chromosome
µy: mutation rate on Y chromosome
α: ratio of male and female mutation rate
t1: time between divergence of X1 and X2 and transposition event
t2: time since transposition
Below we show that we can estimate the age of the transposition event by estimating the amount of time the Y-chromosomal copy of the XTR has been evolving on the Y chromosome. We can estimate this time (by solving simultaneous equations) if we: count the number of mutations that have occurred on the lineages leading to the extant X and Y copies of the XTR, take a range of plausible values for α from the literature, and estimate the rate of sequence evolution on the X chromosome.
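The relation between the two evolutionary rates can be checked numerically. The following is a minimal sketch, not part of the original analysis; the function name `sex_chromosome_rates` and the example values are hypothetical and chosen only for illustration.

```python
def sex_chromosome_rates(mu_female, alpha):
    """Return (mu_x, mu_y) given the female germline rate and alpha.

    The Y chromosome is always transmitted through males; the X chromosome
    spends two thirds of its time in females and one third in males.
    """
    mu_male = alpha * mu_female
    mu_y = mu_male
    mu_x = (2 * mu_female + mu_male) / 3
    return mu_x, mu_y


# Illustrative values only: female rate scaled to 1, alpha = 3.
mu_x, mu_y = sex_chromosome_rates(1.0, 3.0)
ratio = mu_y / mu_x  # should equal 3 * alpha / (2 + alpha)
print(mu_x, mu_y, ratio)
```

With α = 3 the Y-to-X rate ratio comes out at about 1.8, which is 3α/(2+α).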
From the phylogeny above we can see that:
$$X = \mu_x t_1 + \mu_x t_2 \qquad (1)$$
$$Y = \mu_x t_1 + \mu_y t_2$$
Therefore:
$$Y - X = \mu_y t_2 - \mu_x t_2$$
From the rates given above (reference 4):
$$\frac{\mu_y}{\mu_x} = \frac{3\alpha}{2+\alpha}, \qquad \mu_y = \frac{3\alpha\mu_x}{2+\alpha}$$
Therefore:
$$Y - X = \frac{3\alpha\mu_x t_2}{2+\alpha} - \mu_x t_2 = \frac{\mu_x t_2\,(2\alpha - 2)}{2+\alpha}$$
Which rearranges to:
$$\mu_x t_2 = \frac{(Y - X)(2+\alpha)}{2\alpha - 2} \qquad (2)$$
The product µx·t2 can be estimated by determining Y and X, and taking a range of plausible values of α from the literature. The product µx·t1 can then be calculated using equation (1). This gives the relative values of t2 and t1, in other words the proportions of the time since X1 and X2 split that occurred before and after the transposition event. To convert these proportions to absolute values of t1 and t2 we need to know µx. Values for X, Y and µx are estimated below.
Counting the number of mutations occurring on the X and Y XTR lineages (parameters X and Y)
To count the number of mutations that have occurred on the lineages leading to the X and Y copies of the XTR we must use an outgroup to identify the ancestral state of each mismatch between the X and Y copies of the XTR. Partial sequence (~2.2 Mb) of the XTR in chimpanzees is available from the November 13th 2003 sequence assembly from the Chimpanzee Sequencing Consortium (http://www.ensembl.org/Pan_troglodytes/). However, it has been pointed out (reference 3) that the chimpanzee sequence may not be a valid outgroup, as it could share a common ancestor with either the X or Y copy of the XTR to the exclusion of the other (see phylogenies below). To investigate whether the extant chimpanzee XTR sequence is indeed a valid outgroup, we used previously published sequence data from a small portion (~40 kb) of the XTR in chimpanzee and gorilla (reference 5) to construct a phylogeny relating the chimpanzee XTR, gorilla XTR, human Y-chromosomal XTR and human X-chromosomal XTR. The gorilla sequence acts as an outgroup that reveals the order of branching of the other three lineages.
The neighbour-joining tree of these four sequences (see below) clearly shows that the human X- and Y-chromosomal XTR sequences share a common ancestor to the exclusion of chimpanzee, and therefore that the chimpanzee XTR sequence constitutes a valid outgroup. Notably, the branch length (0.00179) between the human X and Y XTR common ancestor and the common ancestor that these two sequences share with the chimpanzee XTR is relatively short. This suggests that the two human XTR lineages diverged from one another shortly after they diverged from the extant chimpanzee XTR sequence.
[Figure: Neighbour-joining tree relating the ~40 kb sequences from reference 5.]
Having demonstrated the chimpanzee XTR sequence to be a valid outgroup, we aligned this sequence (~2.2 Mb) with the human X (~3.9 Mb) and Y (~3.4 Mb) copies of the XTR using MLAGAN (reference 6). The resulting alignment (~1.6 Mb) was then visualised using VISTA (reference 7) to reveal regions of incorrect alignment, which were subsequently manually edited. Once all sites containing gaps or Ns within any sequence were removed, the sequence similarity between the human X and Y copies in this XTR alignment was 98.83%.
All sites within the ~1.6 Mb alignment of the three sequences that vary between the human X and Y copies were identified (19,275 sites in 1,645,694 bp) and classified into one of three classes: mutations occurring on the Y lineage (Y=11,563), mutations occurring on the X lineage (X=7,469), and sites at which the ancestral state was ambiguous (N=243). To avoid underestimating the number of mutations that have occurred on the extant X and Y lineages, the sites with ambiguous ancestral state were apportioned to the X and Y lineages in the same ratio as the sites that could be assigned to either lineage (11,563:7,469). This gives final values of 7,564 and 11,711 for X and Y respectively.
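The apportioning step and the similarity figure can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the counts quoted above; the function name `apportion_ambiguous` is hypothetical.

```python
def apportion_ambiguous(x_sites, y_sites, ambiguous):
    """Split ambiguous sites between X and Y in the ratio x_sites:y_sites."""
    assigned = x_sites + y_sites
    x_total = x_sites + ambiguous * x_sites / assigned
    y_total = y_sites + ambiguous * y_sites / assigned
    return x_total, y_total


# Counts quoted in the text.
x_total, y_total = apportion_ambiguous(7_469, 11_563, 243)
print(round(x_total), round(y_total))  # 7564 11711

# Similarity between the human X and Y copies over the aligned sites.
similarity = 1 - 19_275 / 1_645_694
print(f"{similarity:.2%}")  # 98.83%
```

The unrounded totals still sum to the 19,275 variable sites, so no sites are lost or double counted by the apportioning.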
Estimating the mutation rate on the human X chromosome (µx)
The rate of X-chromosomal XTR sequence evolution (µx) can be estimated by aligning the human and chimpanzee XTR sequences (Chimpanzee Sequencing Consortium http://www.genome.wustl.edu/projects/chimp/ ), and dividing the sequence divergence by the evolutionary time separating these two sequences. At present, sequence divergence can most accurately be obtained by aligning the human X sequence to high-quality sites in individual shotgun sequence reads from the chimpanzee project (J. Mullikin, personal communication). This analysis gives a sequence divergence of 1.1% between the human and chimpanzee XTR sequences, which agrees closely with a previous estimate of 1.0% for the overall divergence between human and chimpanzee X chromosomes (reference 8).
We now turn to the evolutionary time separating the human and chimpanzee XTR sequences. Due to polymorphism in the human-chimpanzee common ancestor, the human and chimpanzee XTR sequences must have diverged prior to the speciation event. Just how long before the speciation event these two sequences diverged depends on the effective population size of this common ancestor; there is a simple relationship between the mean coalescence time of two neutrally evolving lineages and the effective population size (t = 2Ne generations). One recent study has estimated the effective population size (Ne) of the common ancestor to be 5-9 times greater than the estimate of 10,000 for humans (reference 9). An older study with fewer data estimated the ancestral Ne to be 3.5-6.5 times greater than in humans (reference 10). Similarly, it has been postulated that the genome-wide nucleotide diversity (and therefore the effective population size) in this common ancestor was about 4 times greater than that among extant humans (reference 8).
Assuming a generation time of 15 years for the common ancestor (reference 9) and correcting for the lower effective population size of the X chromosome, we estimate that the human and chimpanzee XTRs shared a common ancestor some 2.25-4.05 MY before the subsequent speciation event (6 MYA). Using these estimates for the sequence divergence and evolutionary time separating the human and chimpanzee XTRs, we estimate an X-chromosomal mutation rate (µx) of 5.5-6.7 x 10^-10 per nucleotide per year, which agrees well with previous estimates (reference 11).
Estimating the date of the transposition event (t2)
In the table below, a range of possible times for the transposition event (t2) is calculated using equation (2) with a range of plausible values of α (reference 12) and the X-chromosomal mutation rate calculated above. Sampling error is likely to be negligible compared to the uncertainty in our knowledge of these key evolutionary parameters. Further comparative sequencing of great ape genomes can be expected to reduce uncertainty in these parameters in the near future, which should allow for more precise dating of the transposition event.
Note that this analysis suggests a minimum plausible value for α of 2.13, as this is the minimal value that can account for the greater number of mutations on the Y lineage even if the transposition event took place at the same time as the divergence of X1 and X2 (i.e. maximising the evolutionary time spent in the more mutagenic male germline).
Within this range of estimates we believe the most plausible estimate is that corresponding to ancestral Ne = 50000 and α = 3 (reference 8), which gives a date for the transposition event of ~4.7 MYA.
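The estimates of µx, t2 and the minimum α can be reproduced from the quantities given in the text. The sketch below is illustrative rather than the original calculation, and rests on several assumptions that are inferred from the quoted numbers and are not stated explicitly: the lag before speciation is taken as 3 x Ne generations (this reproduces the quoted 2.25-4.05 MY range), the 1.1% divergence is halved because it accumulates along both lineages, the mutation counts are used unrounded, and the excess Y - X is normalised by the number of aligned sites. All function and constant names are hypothetical.

```python
# Counts and alignment length quoted in the text.
ALIGNED_SITES = 1_645_694
X_ASSIGNED, Y_ASSIGNED, AMBIGUOUS = 7_469, 11_563, 243

# Ambiguous sites apportioned in the X:Y ratio, kept unrounded (assumption).
_scale = 1 + AMBIGUOUS / (X_ASSIGNED + Y_ASSIGNED)
X_MUT = X_ASSIGNED * _scale
Y_MUT = Y_ASSIGNED * _scale

DIVERGENCE = 0.011        # human-chimpanzee XTR divergence (1.1%)
SPECIATION_YEARS = 6.0e6  # speciation date used in the text
GENERATION_YEARS = 15     # generation time of the common ancestor


def lag_before_speciation(ne):
    """Years by which the XTR lineages predate speciation.

    Assumption: 3 * Ne generations, chosen because it reproduces the
    quoted 2.25-4.05 MY range for Ne = 50,000-90,000.
    """
    return 3 * ne * GENERATION_YEARS


def mu_x(ne):
    """X-chromosomal rate per nucleotide per year.

    Assumption: divergence accumulates on both lineages, hence the factor 2.
    """
    return DIVERGENCE / (2 * (SPECIATION_YEARS + lag_before_speciation(ne)))


def t2_years(alpha, ne):
    """Time since transposition from equation (2), in years."""
    excess_per_site = (Y_MUT - X_MUT) / ALIGNED_SITES
    mu_x_t2 = excess_per_site * (2 + alpha) / (2 * alpha - 2)
    return mu_x_t2 / mu_x(ne)


def min_alpha():
    """Alpha at which t1 = 0, i.e. Y / X = 3 * alpha / (2 + alpha)."""
    ratio = Y_MUT / X_MUT
    return 2 * ratio / (3 - ratio)


if __name__ == "__main__":
    for ne in (50_000, 90_000):
        for alpha in (2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6):
            print(f"{alpha:<4} {ne:<6} {mu_x(ne):.1E} {t2_years(alpha, ne):.2E}")
    print(f"minimum alpha: {min_alpha():.2f}")
```

Under these assumptions, α = 3 with Ne = 50,000 gives a t2 of about 4.72 million years and the minimum α comes out at about 2.13, in line with the values quoted above.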
α      Ne       µx (per nucleotide per year)   t2 (years)
2.5    50000    6.7E-10                        5.67E+06
3      50000    6.7E-10                        4.72E+06
3.5    50000    6.7E-10                        4.16E+06
4      50000    6.7E-10                        3.78E+06
4.5    50000    6.7E-10                        3.51E+06
5      50000    6.7E-10                        3.31E+06
5.5    50000    6.7E-10                        3.15E+06
6      50000    6.7E-10                        3.02E+06
2.5    90000    5.5E-10                        6.91E+06
3      90000    5.5E-10                        5.75E+06
3.5    90000    5.5E-10                        5.06E+06
4      90000    5.5E-10                        4.60E+06
4.5    90000    5.5E-10                        4.27E+06
5      90000    5.5E-10                        4.03E+06
5.5    90000    5.5E-10                        3.84E+06
6      90000    5.5E-10                        3.68E+06
[Figure: Date of transposition event. Vertical axis labelled MYA (0.00E+00 to 8.00E+06); horizontal axis Alpha (2.5 to 6); two series, µx=6.7E-10 and µx=5.5E-10.]
References
1. Page, D.C., Harper, M.E., Love, J. & Botstein, D. Occurrence of a transposition from the X-chromosome long arm to the Y-chromosome short arm during human evolution. Nature 311, 119-122 (1984).
2. Glazko, G.V. & Nei, M. Estimation of divergence times for major lineages of primate species. Mol. Biol. Evol. 20, 424-434 (2003).
3. Makova, K.D. & Li, W.H. Strong male-driven evolution of DNA sequences in humans and apes. Nature 416, 624-626 (2002).
4. Miyata, T., Hayashida, H., Kuma, K., Mitsuyasu, K. & Yasunaga, T. Male-driven molecular evolution: a model and nucleotide sequence analysis. Cold Spring Harb. Symp. Quant. Biol. 52, 863-867 (1987).
5. Bohossian, H.B., Skaletsky, H. & Page, D.C. Unexpectedly similar rates of nucleotide substitution found in male and female hominids. Nature 406, 622-625 (2000).
6. Brudno, M. et al. LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA. Genome Res. 13, 721-731 (2003).
7. Mayor, C. et al. VISTA: visualizing global DNA sequence alignments of arbitrary length. Bioinformatics 16, 1046-1047 (2000).
8. Ebersberger, I., Metzler, D., Schwarz, C. & Paabo, S. Genomewide comparison of DNA sequences between humans and chimpanzees. Am. J. Hum. Genet. 70, 1490-1497 (2002).
9. Chen, F.C. & Li, W.H. Genomic divergences between humans and other hominoids and the effective population size of the common ancestor of humans and chimpanzees. Am. J. Hum. Genet. 68, 444-456 (2001).
10. Ruvolo, M. Molecular phylogeny of the hominoids: inferences from multiple independent DNA sequence data sets. Mol. Biol. Evol. 14, 248-265 (1997).
11. Nachman, M.W. & Crowell, S.L. Estimate of the mutation rate per nucleotide in humans. Genetics 156, 297-304 (2000).
12. Li, W.H., Yi, S. & Makova, K. Male-driven evolution. Curr. Opin. Genet. Dev. 12, 650-656 (2002).