The Y chromosome

The Y chromosome — with the genes to make
a man — has been sequenced. Often regarded
as a genetic wasteland, the chromosome has powers that we may have
underestimated, as its sequence reveals.
Here, Nature presents the research, as well as
news, reviews and analysis. As with all of
Nature's genome content, these articles are
available free online. To celebrate this
landmark, we are offering a 15% discount for
a subscription to Nature.
News and Views
Tales of the Y chromosome
H. F. Willard
Determining the sequence of the human Y chromosome presented
a daunting challenge to genome researchers. But the task is now
done, and the secrets revealed justify the effort.
Article
The male-specific region of the human Y chromosome is a mosaic of
discrete sequence classes
H. Skaletsky et al.
Letter
Abundant gene conversion between arms of palindromes in human and
ape Y chromosomes
S. Rozen et al.
Nature Science Update
Y chromosome sequence completed
DNA readout reveals genetic palindromes safeguard male-defining
chromosome.
Y chromosomes rewrite British history
Anglo-Saxons' genetic stamp weaker than historians suspected.
All articles here are available free to registered users
evolution
Human spermatozoa: The future of sex
R. John Aitken, Jennifer A. Marshall Graves
The vulnerability of the Y chromosome will be a key factor in
shaping the evolutionary future of our species.
Nature 415, 963 (28 Feb 2002)
Unexpectedly similar rates of nucleotide substitution found in male and
female hominids
Hacho B. Bohossian, Helen Skaletsky, David C. Page
Nature 406, 622 - 625 (10 Aug 2000)
Y-chromosome variation and Irish origins
Emmeline W. Hill, Mark A. Jobling, Daniel G. Bradley
Nature 404, 351 - 352 (23 Mar 2000)
The application of molecular genetic approaches to the study of human
evolution
L. Luca Cavalli-Sforza, Marcus W. Feldman
Nature Genetics 33, 266 - 275 (01 Mar 2003)
The human Y chromosome, in the light of evolution
Bruce T. Lahn, Nathaniel M. Pearson, Karin Jegalian
Nature Reviews Genetics 2, 207 - 216 (01 Mar 2001)
Y chromosome sequence variation and the history of human
populations
Peter A. Underhill et al.
Nature Genetics 26, 358 - 361 (01 Nov 2000)
development
DMY is a Y-specific DM-domain gene required for male development in
the medaka fish
Masaru Matsuda et al.
Nature 417, 559 - 563 (30 May 2002)
Male development of chromosomally female mice transgenic for Sry
Koopman P, Gubbay J, Vivian N, Goodfellow P, Lovell-Badge R.
Nature 351, 96 (9 May 1991)
Sox9 induces testis development in XX transgenic mice
Valerie P.I. Vidal, Marie-Christine Chaboissier, Dirk G. de Rooij,
Andreas Schedl
Nature Genetics 28, 216 - 217 (01 Jul 2001)
A transgenic insertion upstream of Sox9 is associated with dominant
XX sex reversal in the mouse
Colin E. Bishop et al.
Nature Genetics 26, 490 - 494 (01 Dec 2000)
genetics
Human mtDNA and Y-chromosome variation is correlated with
matrilocal versus patrilocal residence
H. Oota, W. Settheetham-Ishida, D. Tiwawech, T. Ishida, & M.
Stoneking
Nature Genetics 29, 20-21 (2001)
Retroposition of autosomal mRNA yielded testis-specific gene family on
human Y chromosome
Bruce T Lahn, David C Page
Nature Genetics 21, 429 - 433
Reduced adaptation of a non-recombining neo-Y chromosome
Doris Bachtrog, Brian Charlesworth
Nature 416, 323 - 326 (21 Mar 2002)
Strong male-driven evolution of DNA sequences in humans and apes
Kateryna D. Makova, Wen-Hsiung Li
Nature 416, 624 - 626 (11 Apr 2002)
A physical map of the human Y chromosome
Charles A. Tilford et al.
Nature 409, 943 - 945 (15 Feb 2001)
Picture credits:
The creator and voices behind the Simpsons Matt Groening © MC PHERSON
COLIN/CORBIS SYGMA
Vitruvian Man by Leonardo da Vinci © Bettmann/CORBIS
David by Michelangelo © Royalty-Free/CORBIS
Nature 423, 810 - 813 (19 June 2003); doi:10.1038/423810a
Genome biology: Tales of the Y chromosome
HUNTINGTON F. WILLARD
Huntington F. Willard is at the Institute for Genome Sciences and Policy, and the Department of Molecular Genetics and
Microbiology, Duke University, Durham, North Carolina 27710, USA.
e-mail: hunt.willard@duke.edu
Determining the sequence of the human Y chromosome presented a daunting
challenge to genome researchers. But the task is now done, and the secrets revealed
justify the effort.
Ancient maps showed the known world in colourful detail, beyond the edges of which
lay vast expanses of terra incognita. Much creative thought went into portraying this
unexplored territory, often featuring nasty-looking serpents and dragons. Only when
Magellan managed to circumnavigate the globe did it become apparent that the
unknown was in fact navigable, and that the serpents and dragons, if not illusory,
could at least be tamed. The human genome has its terra incognita too, some of it
known, much of it subject to alternating angst and fascination by genome biologists,
and all of it to be avoided if possible — until now. On pages 825 and 873 of this issue1,
2, a group of modern-day Magellans describe how they sailed headlong into the frothy
seas of duplicated, inverted and otherwise troublesome sequences on the human Y
chromosome. They have emerged safely on the other side, with tales to tell.
Because of its distinctive role in sex determination, the Y chromosome has long
attracted special attention from geneticists, evolutionary biologists and even the lay
public. It is known to consist of regions of DNA that show quite distinctive genetic
behaviour and genomic characteristics. The two human sex chromosomes, X and Y
(Fig. 1), originated a few hundred million years ago from the same ancestral autosome
— a non-sex chromosome — during the evolution of sex determination3. They then
diverged in sequence over the succeeding aeons. Nowadays, there are relatively short
regions at either end of the Y chromosome that are still identical to the corresponding
regions of the X chromosome, reflecting the frequent exchange of DNA between these
regions ('recombination') that occurs during sperm production4. But more than 95% of
the modern-day Y chromosome is male-specific, consisting of some 23 million base
pairs (Mb) of euchromatin — the part of our genome containing most of the genes —
and a variable amount of heterochromatin, consisting of highly repetitive DNA and
often dismissed as non-functional. Now, in an accomplishment that can only be
described as heroic, Skaletsky et al.1 report the complete sequence of the 23-Mb
euchromatic segment, which they designate the MSY, for 'male-specific region of the
Y'.
Figure 1 Male make-up.
Prioritization in the Human Genome Project had led to the heterochromatic regions of
the Y and other chromosomes being set aside to be dealt with later, if ever. But there
was reason to hope that the euchromatin of the Y chromosome would present no more
difficult a sequencing challenge than that found elsewhere in the genome. That
supposition could not have been more wrong. As Skaletsky et al. report, the MSY is a
mosaic of complex and interrelated sequences that made this one of the most
problematic regions of the human genome thus far to be successfully sequenced and
assembled.
For instance, about 10–15% of the MSY consists of stretches of sequence that moved
there from the X chromosome within only the past few million years. These stretches
are still 99% identical to their X-chromosome counterparts and are dominated by a
high proportion of interspersed repetitive sequences, with only two genes. A further
20% of the MSY consists of a class of sequences ('X-degenerate' sequences1) that are
more distantly related to the X chromosome, reflecting their more ancient common
origin. And the remainder comprises a web of Y-specific repetitive sequences that
make up a series of palindromes — sequences that read the same on both strands of
the DNA double helix, with two 'arms' stretching out from a central point of mirrored
symmetry. These palindromes come in a range of sizes, up to almost 3 Mb in length,
with more than 99.9% identity between the two arms of each palindrome.
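The strand symmetry described above can be made concrete with a short Python sketch. It is only a toy: the function names, the ten-base arms and the four-base spacer are assumptions chosen for illustration and are not taken from the sequencing work itself.

```python
# Toy model of a DNA palindrome: two arms that are reverse complements
# of each other, separated by a short central spacer.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the sequence as read 5'->3' on the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]

def arm_identity(left_arm: str, right_arm: str) -> float:
    """Fraction of positions at which the left arm matches the reverse
    complement of the right arm (equal-length arms assumed)."""
    if len(left_arm) != len(right_arm):
        raise ValueError("arms must have equal length in this sketch")
    mirrored = reverse_complement(right_arm)
    matches = sum(a == b for a, b in zip(left_arm, mirrored))
    return matches / len(left_arm)

# Hypothetical example sequence: arm, spacer, mirrored arm.
left = "GATTACAGGC"
spacer = "AATT"
right = reverse_complement(left)
palindrome = left + spacer + right

perfect = arm_identity(left, right)              # arms match exactly
mutated_right = "A" + right[1:]                  # one substitution in the right arm
imperfect = arm_identity(left, mutated_right)    # nine of ten positions match
```

In this picture, the arm-to-arm identity of more than 99.9% reported for the MSY palindromes corresponds to fewer than one mismatch per thousand positions.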
The repetitive sequences, particularly the palindromes, caused some difficulties for
sequence assemblers. Genome-sequencing projects involve fragmenting the genome in
question into small, overlapping pieces, sequencing them, and then using computer
algorithms to put the pieces together in the correct order. There are various ways of
doing this; assembling the MSY's palindromes (and discriminating between their arms)
required an iterative mapping and sequencing process more reminiscent of the
knowledge-based mapping approaches of the early days of the Human Genome Project
than the high-throughput assemblies that have emerged for most of the genome5, 6.
This strategy was aided by the fact that the sequence came from a single Y
chromosome, so Skaletsky et al. knew that minor sequence variations must have come
from duplicated copies on the same chromosome, rather than from different Y
chromosomes. Although necessarily more painstaking, this overall approach provides a
model for how researchers might attack at least some of the troublesome areas of the
rest of the genome — such as blocks of repetitive heterochromatin and the hundreds
of regions of substantial sequence duplication7 — where standard assembly programs
can be fooled.
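Why near-identical repeats fool an assembler can be seen in a small Python sketch. It assumes exact string matching and a made-up 45-base "chromosome"; the helper names `fragment` and `placements` are hypothetical, and real assembly programs are far more elaborate.

```python
def fragment(genome: str, read_len: int, step: int) -> list[tuple[int, str]]:
    """Cut the sequence into overlapping reads, remembering each true origin."""
    return [(i, genome[i:i + read_len])
            for i in range(0, len(genome) - read_len + 1, step)]

def placements(genome: str, read: str) -> list[int]:
    """All start positions at which the read matches the sequence exactly."""
    hits, start = [], genome.find(read)
    while start != -1:
        hits.append(start)
        start = genome.find(read, start + 1)
    return hits

# Hypothetical toy chromosome: unique flanks around two identical repeat copies.
repeat = "ACGTTGCAAGCT"
genome = "TTGACCA" + repeat + "GGATCTC" + repeat + "CATGAAT"

reads = fragment(genome, read_len=8, step=2)

# Reads that fit in more than one place cannot be positioned by overlap alone.
ambiguous = [(origin, read) for origin, read in reads
             if len(placements(genome, read)) > 1]
```

Every ambiguous read in this toy lies wholly inside one of the two repeat copies, which is the situation that the iterative mapping and sequencing strategy had to resolve.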
This is not just a celebratory tale of a successful sequencing journey, however. Along
the way, Skaletsky et al. picked up artefacts of Y-chromosome antiquity, dating as far
back as 300 million years, that allow a glimpse into the evolutionary strategies that the
Y chromosome has used to survive.
For instance, from the degree and patterns of divergence of the genes found on both
sex chromosomes, the authors provide evidence for the stepwise decay of the Y
chromosome over time and define changes in both Y-chromosome organization and
gene content and expression. Unlike the regions at the ends, most of the lengths of
the sex chromosomes do not exchange sequence during sperm production, and
Skaletsky et al. point to two consequences of this suppression of recombination. First,
selection occurred on the Y chromosome for a group of testis-specific genes that the
authors argue may have enhanced male fertility. Most of these genes are found within
the palindromes, showing why it can be important to sequence such difficult regions.
Second, as X–Y recombination became suppressed during evolution, an alternative
mechanism had to emerge to maintain the sequence and function of the remaining
Y-chromosome genes and to prevent the accumulation of inactivating mutations and the
ultimate demise of the chromosome8.
To gain insight into this alternative mechanism, Rozen et al.2 examined the hypothesis
that X–Y recombination has been replaced by extensive, ongoing recombination
between the arms of the MSY palindromes — where the sequence on one arm of the
palindrome alters or 'converts' the sequence on the other. To test the predictions of
this model, the authors sequenced one particular palindrome-embedded gene from Y
chromosomes from around the world, representing the full tree of the previously
established Y-chromosome genealogy9. They found several instances where the
sequence of the copy of the gene on one arm of the palindrome had altered the
sequence of the other arm's copy. From this, they calculate that as many as 600 base
pairs (from the 5.4 Mb contained in MSY palindromes) must be converted in each
newborn male in the human population.
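As a rough sense of scale, the two figures quoted above can be divided to give a naive average. The sketch below assumes, purely for illustration, that conversion is spread evenly over all palindrome bases; it is not the calculation performed by Rozen et al.

```python
converted_bp_per_generation = 600   # upper estimate quoted in the text
palindrome_bp = 5.4e6               # 5.4 Mb of sequence in MSY palindromes

# Naive average, assuming every palindrome base is equally likely to be converted.
per_base_rate = converted_bp_per_generation / palindrome_bp

# Equivalently: roughly one converted base per this many palindrome bases.
bases_per_conversion = palindrome_bp / converted_bp_per_generation
```

On these assumptions the average is about one converted base per 9,000 bases of palindrome in each generation.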
These data also indicate that gene conversion in general may be more common than
previously suspected, especially in other palindromic and duplicated regions around the
genome7. This supports a more dynamic view of genome change, in which, even within
a single generation, not only does the occasional mutation occur (there are estimated
to be as many as 100–200 new base-pair changes in each person), but also perhaps
thousands of gene-conversion events.
The tales told by these Magellans of the genome hold two lessons for those who might
question the wisdom of such exploration. First, even the most repetitive and seemingly
impenetrable stretches of the genome hold secrets that justify the effort. Second, each
chromosome has its own story to tell, quite apart from the story of the genome as a
whole. Although the sex chromosomes provide the strongest case for a special
relationship between genome organization and the unique biology of a chromosome10, 11,
the other chromosomes shouldn't feel left out. Each is the product of hundreds of
millions of years of evolution, shaped by processes that have rearranged and
exchanged sequences, contributed to the formation of new species, given birth to new
genes and gene families, and provided the basis for a range of genetically determined
or genomically influenced traits. Piecing together these events remains a worthwhile
challenge, for among the flotsam and jetsam of each chromosome lie clues to our
history.
References
1. Skaletsky, H. et al. Nature 423, 825-837 (2003).
2. Rozen, S. et al. Nature 423, 873-876 (2003).
3. Ohno, S. Sex Chromosomes and Sex-Linked Genes (Springer, Berlin, 1967).
4. Burgoyne, P. S. Hum. Genet. 61, 85-90 (1982).
5. International Human Genome Sequencing Consortium Nature 409, 860-921 (2001).
6. Venter, J. C. et al. Science 291, 1304-1351 (2001).
7. Bailey, J. A. et al. Science 297, 1003-1007 (2002).
8. Marshall Graves, J. A. Trends Genet. 18, 259-264 (2002).
9. Cavalli-Sforza, L. L. & Feldman, M. W. Nature Genet. 33, 266-275 (2003).
10. Lahn, B. T. & Page, D. C. Science 278, 675-680 (1997).
11. Carrel, L., Cottle, A. A., Goglin, K. C. & Willard, H. F. Proc. Natl Acad. Sci. USA 96, 14440-14444 (1999).
Nature 423, 825 - 837 (19 June 2003); doi:10.1038/nature01722
The male-specific region of the human Y chromosome is a mosaic of discrete
sequence classes
HELEN SKALETSKY*, TOMOKO KURODA-KAWAGUCHI*, PATRICK J. MINX†,
HOLLAND S. CORDUM†, LADEANA HILLIER†, LAURA G. BROWN*, SJOERD REPPING‡,
TATYANA PYNTIKOVA*, JOHAR ALI†, TAMBERLYN BIERI†, ASIF CHINWALLA†,
ANDREW DELEHAUNTY†, KIM DELEHAUNTY†, HUI DU†, GINGER FEWELL†,
LUCINDA FULTON†, ROBERT FULTON†, TINA GRAVES†, SHUN-FANG HOU†,
PHILIP LATRIELLE†, SHAWN LEONARD†, ELAINE MARDIS†, RACHEL MAUPIN†,
JOHN MCPHERSON†, TRACIE MINER†, WILLIAM NASH†, CHRISTINE NGUYEN†,
PHILIP OZERSKY†, KYMBERLIE PEPIN†, SUSAN ROCK†, TRACY ROHLFING†,
KELSI SCOTT†, BRIAN SCHULTZ†, CINDY STRONG†, AYE TIN-WOLLAM†,
SHIAW-PYNG YANG†, ROBERT H. WATERSTON†, RICHARD K. WILSON†, STEVE ROZEN* &
DAVID C. PAGE*
* Howard Hughes Medical Institute, Whitehead Institute, and Department of Biology, Massachusetts Institute of Technology, 9
Cambridge Center, Cambridge, Massachusetts 02142, USA
† Genome Sequencing Center, Washington University School of Medicine, 4444 Forest Park Boulevard, St Louis, Missouri 63108,
USA
‡ Center for Reproductive Medicine, Department of Gynaecology and Obstetrics, Academic Medical Centre, Amsterdam 1105 AZ,
the Netherlands
Correspondence and requests for materials should be addressed to D.C.P. (page_admin@wi.mit.edu). GenBank accession numbers
are listed in Fig. 2l and the Supplementary Information.
The male-specific region of the Y chromosome, the MSY, differentiates the
sexes and comprises 95% of the chromosome's length. Here, we report that
the MSY is a mosaic of heterochromatic sequences and three classes of
euchromatic sequences: X-transposed, X-degenerate and ampliconic. These
classes contain all 156 known transcription units, which include 78
protein-coding genes that collectively encode 27 distinct proteins. The X-transposed
sequences exhibit 99% identity to the X chromosome. The X-degenerate
sequences are remnants of ancient autosomes from which the modern X and
Y chromosomes evolved. The ampliconic class includes large regions (about
30% of the MSY euchromatin) where sequence pairs show greater than
99.9% identity, which is maintained by frequent gene conversion
(non-reciprocal transfer). The most prominent features here are eight massive
palindromes, at least six of which contain testis genes.
The history of human Y chromosome research can be divided into three eras. The first
era focused on mendelian examination of human family trees. In the opening decades
of the twentieth century, proponents of Mendel's concept of the gene observed three
modes of inheritance in our species: autosomal recessive, autosomal dominant and
X-linked recessive. Contemporaneously, other scholars sought to identify traits that
exhibited Y-linked (father to son) transmission. These scholars erroneously claimed
success, presenting family trees purported to demonstrate that Y-chromosomal genes
were responsible for hairy ears, scaly skin and other traits. Meanwhile, light
microscopic studies of human cells provided strong physical evidence of the existence
of a male-specific chromosome1. By 1950, studies of human pedigrees reported at
least 17 Y-linked traits2.
The second era was dominated by the view that the Y chromosome was a genetic
wasteland, based on the debunking of earlier studies and a dearth of new evidence for
genes. In the 1950s, Stern systematically exposed critical flaws in each of the
preceding pedigree studies and dismissed them2. In 1959, Jacobs' study of Klinefelter
(XXY) males3 and Ford's research on Turner (X0) females4 demonstrated that the Y
chromosome carries a pivotal sex-determining gene, but this gene was considered to
be an exception on a generally desolate chromosome. In the 1960s, Ohno proposed
that the mammalian X and Y chromosomes had evolved from an ordinary pair of
autosomes5. Ohno speculated that the X chromosome had retained the ancestral
autosome's gene content whereas the Y chromosome had lost all but perhaps one gene
involved in sex determination. Thus emerged the understanding of the human Y
chromosome as a profoundly degenerate X chromosome.
The hallmark of the third and present era has been the application of recombinant DNA
and genomic technologies to the Y chromosome, culminating in molecularly based
conclusions about its genes. In recent decades, an understanding of the Y
chromosome's biological functions has begun to emerge from DNA studies of
individuals with partial Y chromosomes, coupled with molecular characterization of
Y-linked genes implicated in gonadal sex reversal, Turner syndrome, graft rejection and
spermatogenic failure6. Genomic studies revealed that the Y chromosome contains a
region, comprising 95% of its length, where there is no X–Y crossing over. This region
came to be known as the non-recombining region, or NRY, although our discovery of
abundant recombination, as reported here and in the accompanying manuscript,
compels us to rename it the male-specific region, or MSY7. The MSY is flanked on both
sides by pseudoautosomal regions, where X–Y crossing over is a normal and frequent
event in male meiosis (see Supplementary Note 1).
Previous efforts to construct accurate, high-resolution physical maps of the MSY had
been stymied by an abundance of lengthy, intrachromosomal repetitive sequences, or
amplicons8. To overcome this difficulty, we identified minute variations between
amplicon copies, and then highlighted these minute variants (sequence family
variants9) as markers to be ordered with respect to one another, yielding a map
amenable to iterative refinement. However, the minute variants could only be found by
fully and accurately sequencing and comparing near-identical amplicon copies. Thus, in
our effort to determine the nucleotide sequence of the MSY, mapping and sequencing
activities were fused into a single, iterative analytic process. We have previously
reported the physical map that emerged from these efforts10. Here we report the
sequencing of the MSY.
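The idea of using minute differences between amplicon copies as map markers can be sketched in a few lines of Python. The sketch assumes that the copies have already been aligned to equal length, and the function name and toy sequences are hypothetical rather than part of the authors' pipeline.

```python
def sequence_family_variants(copies: dict[str, str]) -> dict[int, dict[str, str]]:
    """Return the positions at which pre-aligned, equal-length copies disagree,
    together with the base each copy carries there."""
    lengths = {len(seq) for seq in copies.values()}
    if len(lengths) != 1:
        raise ValueError("this sketch expects pre-aligned copies of equal length")
    variants = {}
    for pos in range(lengths.pop()):
        column = {name: seq[pos] for name, seq in copies.items()}
        if len(set(column.values())) > 1:   # at least one copy differs here
            variants[pos] = column
    return variants

# Hypothetical near-identical copies of one amplicon.
reference = "ACGTTGCAAGCTTAGGCTAACG"
copies = {
    "copy_A": reference,
    "copy_B": reference[:5] + "A" + reference[6:],     # G->A at index 5
    "copy_C": reference[:17] + "C" + reference[18:],   # T->C at index 17
}

variants = sequence_family_variants(copies)
```

Each variant position distinguishes one copy from the others, so a clone carrying a given base at that position can be assigned to a particular copy. This is also why a single sequencing error could mimic a marker, and why the copies had to be sequenced fully and accurately.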
We mapped and sequenced a tiling path of 220 bacterial artificial chromosome (BAC)
clones, each containing a portion of the MSY from the same individual. We used only
one man's Y chromosome to prevent any allelic variation, or polymorphism, from
confounding our search for minute sequence variation between amplicon copies. (MSY
amplicon copies can differ as little in sequence as two Y chromosomes chosen at
random from the population11.) We chose to sequence highly redundant BACs,
especially in amplicon-rich regions: about 12.7 million (roughly 60%) of the
euchromatic nucleotides were sequenced in at least two independent BAC clones. This
redundancy allowed us to refine and validate the MSY sequence by exhaustively
investigating, and in most cases resolving, sequence discrepancies between
overlapping BACs.
Sequencing of euchromatic and heterochromatic regions
We begin with a statistical synopsis of the MSY sequence, considering the euchromatic
and heterochromatic portions separately. (In this analysis, we have equated satellite
sequences with heterochromatin, and all other sequences with euchromatin.) The
product of our present research is a 'reference' sequence from one man's Y
chromosome. A full description of the nature and extent of Y chromosome variation in
human populations must await future studies. We and our colleagues have previously
reported the nucleotide sequence of two portions of the MSY (the AZFa and AZFc
regions12, 13). We have incorporated this previously reported sequence data in our
present analysis of the entire MSY.
The MSY's euchromatic DNA sequences total roughly 23 megabases (Mb), including
8 Mb on the short arm (Yp) and 14.5 Mb on the long arm (Yq) (Fig. 1). We obtained
finished sequence, with an estimated error rate of about 1 per 10^5 nucleotides, for all
MSY euchromatin, with two known exceptions. First, there remain two gaps, each of
which is roughly 50 kilobases (kb) long as judged by chromosomal fluorescence in situ
hybridization (FISH) (Supplementary Fig. 1). Second, we obtained representative but
incomplete sequence for a tandem array that spans roughly 0.7 Mb on Yp. We estimate
that we obtained finished nucleotide sequence for roughly 97% of the MSY
euchromatin, and that we captured 99% of the sequence complexity of MSY's
euchromatin.
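The figures in this paragraph can be cross-checked with simple arithmetic. The Python sketch below assumes, for illustration only, that the whole 0.7-Mb tandem array and both gaps count as unfinished sequence, and that the quoted error rate applies uniformly to everything else.

```python
euchromatin_mb = 23.0        # total MSY euchromatin, as quoted
gap_mb = 2 * 0.050           # two gaps of roughly 50 kb each
tandem_array_mb = 0.7        # array with representative but incomplete sequence

# Assumption: gaps plus the entire tandem array are treated as unfinished.
unfinished_mb = gap_mb + tandem_array_mb
finished_mb = euchromatin_mb - unfinished_mb
finished_fraction = finished_mb / euchromatin_mb

# Expected number of residual errors at about 1 error per 10^5 nucleotides.
error_rate = 1e-5
expected_errors = finished_mb * 1e6 * error_rate
```

Under these assumptions the finished fraction comes out at about 0.965, in line with the quoted figure of roughly 97%, and the finished sequence would be expected to contain on the order of 200 residual errors.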
Figure 1 The male-specific region of the Y chromosome.
So far, efforts to gain sequence-based understanding of human chromosomes have
largely by-passed heterochromatic regions (refs 14, 15; see also Supplementary Note
2), including a large block of heterochromatic sequences found in the centromeric
region of every nuclear chromosome16. In addition to its centromeric heterochromatin
(approximately 1 Mb, ref. 17), the Y chromosome was previously shown to contain a
second, much longer heterochromatic block (roughly 40 Mb) that comprises the bulk of
the distal long arm (Fig. 1; see also Supplementary Note 3). In the course of the
present sequencing project, we discovered and characterized a third heterochromatic
block—a sharply demarcated island that spans approximately 400 kb, comprises
>3,000 tandem repeats of 125 base pairs (bp), and interrupts the euchromatic
sequences of proximal Yq (Figs 1 and 2). The other two heterochromatic blocks also
consist of massively amplified tandem repeats of low sequence complexity. We
attempted to sequence BACs spanning the boundaries and representing the body of
each of the three heterochromatic blocks. We succeeded, with the exception that the
distal boundary of the major heterochromatic region, on distal Yq, was not identified
with certainty (Supplementary Fig. 1). In total, we found that the heterochromatin of
MSY encompasses at least six distinct sequence species (Table 1), each of which forms
long, homogeneous tandem arrays. Our findings are detailed in Supplementary Note 4
and Supplementary Fig. 3.
Figure 2 Sequence-based map of the MSY; a detailed view of
the 24-Mb region shown in Fig. 1b.
A catalogue of genes and transcription units
With a comprehensive reference sequence of the MSY in hand, we set out to catalogue
systematically the genes of the MSY. We electronically identified and manually
examined all matches to previously reported MSY genes. Furthermore, we used
polymerase chain reaction with reverse transcription (RT–PCR) and/or sequencing of
complementary DNA clones to evaluate electronic matches to publicly available
expressed sequence tags (ESTs), as well as potential genes that were predicted using
GenomeScan software18. For all experimentally verified genes whose expression
patterns had not been reported previously, we tested for expression in diverse human
tissues by RT–PCR and subsequent sequencing of RT–PCR products.
We found that the MSY includes at least 156 transcription units, half of which probably
encode proteins (Table 2 and Figs 2, 3; see also Supplementary Tables 1 and 2). All
156 transcription units identified are located in euchromatic sequences. We have no
evidence of transcription of the MSY heterochromatin. Of the approximately 78
protein-coding units, about 60 are members of nine different MSY-specific gene
families, each characterized by >98% nucleotide identity among family members, in
both exons and introns. The remaining 18 protein-coding genes are present in one
copy each in the MSY. (These include two genes, RPS4Y1 and RPS4Y2, that exhibit
93.6% nucleotide identity in coding exons but are much more diverged in introns.)
Thus, the MSY seems to encode at least 27 distinct proteins or protein families.
Figure 3 MSY genes, transcription units and palindromes.
Furthermore, the MSY includes at least 78 transcription units for which strong evidence
of protein coding is lacking; many of these transcription units are probably non-coding.
Of these 78 transcription units, 13 occur in single copy in the MSY and the remaining
65 are members of 15 MSY-specific families. Considering together both coding and
non-coding transcription units, the MSY appears to contain 24 MSY-specific families,
which collectively account for 125 of the 156 MSY transcription units identified so far.
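The counts given in the last two paragraphs can be tallied to confirm that they are mutually consistent. The Python sketch below simply restates the numbers from the text (several of which are given there as approximate); the dictionary keys are our own labels.

```python
# Counts as quoted in the text; several are stated there as approximate.
coding = {"in_families": 60, "single_copy": 18, "families": 9}
non_coding = {"in_families": 65, "single_copy": 13, "families": 15}

coding_units = coding["in_families"] + coding["single_copy"]              # 78
non_coding_units = non_coding["in_families"] + non_coding["single_copy"]  # 78
total_units = coding_units + non_coding_units                             # 156

total_families = coding["families"] + non_coding["families"]              # 24
units_in_families = coding["in_families"] + non_coding["in_families"]     # 125

# One distinct protein (or protein family) per coding family,
# plus one per single-copy coding gene.
distinct_proteins = coding["families"] + coding["single_copy"]            # 27
```

The totals reproduce the 156 transcription units, 24 MSY-specific families, 125 family members and 27 distinct proteins or protein families stated in the text.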
On the basis of earlier experiments, most of the genes of the MSY were thought to fall
into two functional classes, with genes in the first group expressed throughout the
body, in many organs, and genes in the second group expressed predominantly or
exclusively in testes19. Our present catalogue of MSY genes and their patterns of tissue
expression (Table 2) corroborates this model. Of the MSY's 27 distinct protein-coding
genes or gene families identified so far, 12 are expressed ubiquitously and 11 are
expressed exclusively or predominantly in testes.
Three classes of sequences in the MSY euchromatin
We find that nearly all of the euchromatic sequences fall into three classes, which we
have named X-transposed, X-degenerate and ampliconic. As shown in Figs 1 and 2,
the MSY euchromatin is a patchwork of these three sequence classes. The
characteristics of the classes are summarized in Fig. 4.
Figure 4 Three sequence classes in the MSY euchromatin.
The X-transposed sequences are 99% identical to DNA sequences in Xq21, a band in
the midst of the long arm of the human X chromosome. The X-transposed sequences
are so named because their presence in the human MSY is the result of a massive
X-to-Y transposition that occurred about 3–4 million years ago, after the divergence of
the human and chimpanzee lineages20-22. Subsequently, an inversion within the MSY
short arm cleaved the X-transposed block into two non-contiguous segments, as
observed in the modern MSY (Figs 1 and 2)21, 22. The X-transposed sequences do not
participate in X–Y crossing over during male meiosis, distinguishing them from the
pseudoautosomal sequences found in the telomeric regions of the human X and Y
chromosomes.
Within the X-transposed segments, which have a combined length of 3.4 Mb, we
identified only two genes, both of which have homologues in Xq21 (Table 2). Thus the
X-transposed sequences exhibit the lowest density of genes among the three sequence
classes in the MSY euchromatin (Figs 1 and 3), as well as the highest density of
interspersed repeat elements (Fig. 1). In particular, long interspersed nuclear element
1 (LINE1) elements account for 36% of all X-transposed sequence, or nearly twice the
genome average of 20%14, 15. As expected, low gene density and high repeat density
also characterize the homologous sequence block in Xq21.
In contrast to the X-transposed sequence blocks, the X-degenerate segments of the
MSY are dotted with single-copy gene or pseudogene homologues of 27 different
X-linked genes. These single-copy MSY genes and pseudogenes display between 60%
and 96% nucleotide sequence identity to their X-linked homologues, and they seem to
be surviving relics of ancient autosomes from which the X and Y chromosomes
co-evolved, as explained below. In 13 cases, the MSY homologue is a pseudogene with
sequence similarity to exons and introns of the functional X homologue
(Supplementary Table 3). In the remaining 14 cases, the MSY homologue seems to be
a transcribed, functional gene, and the X- and Y-linked genes encode very similar but
non-identical protein isoforms (Table 2 and Figs 2, 3). These include two cases in
which a functional X-linked gene has two expressed homologues in the MSY. The
Y-linked genes RPS4Y1 and RPS4Y2 are full-length homologues of the X-linked gene
RPS4X, and they apparently encode two different, full-length isoforms of ribosomal
protein S4. In contrast, the Y-linked genes CYorf15A and CYorf15B are homologous to,
respectively, 5' and 3' portions of the X-linked gene CXorf15, and they apparently
encode proteins homologous to, respectively, amino- and carboxy-terminal portions of
the predicted CXORF15 protein (Supplementary Fig. 4). Together, the X-degenerate
sequences encode 16 of the MSY's 27 distinct proteins or protein families.
Notably, all 12 ubiquitously expressed MSY genes reside in the X-degenerate regions;
no such genes have been identified elsewhere in the MSY. Conversely, among the 11
MSY genes found to be expressed predominantly in testes, only one gene, the
sex-determining SRY, is X-degenerate.
The third class of euchromatic sequences, the ampliconic segments, are composed
largely of sequences that exhibit marked similarity—as much as 99.9% identity over
tens or hundreds of kilobases—to other sequences in the MSY. We refer to these long,
MSY-specific repeat units, of which there are many families, as amplicons. The
amplicons are located in seven segments that are scattered across the euchromatic
long arm and proximal short arm (Figs 1 and 2), and whose combined length is
10.2 Mb.
We identified these ampliconic regions through a comprehensive analysis of similarities
within the sequenced portions of the MSY. We calculated the percentage nucleotide
identity between all pairs of known MSY sequences and then plotted the data in two
ways. First we determined, at each point along the length of the sequenced MSY, the
highest intrachromosomal similarity. The resulting graph (Fig. 5c) identifies the
ampliconic regions as those where intrachromosomal identity, over stretches of 50 kb
or more, generally exceeds 50%. Notably, 60% (6.1 Mb) of the ampliconic sequences
exhibit intrachromosomal identities of 99.9% or greater.
Figure 5 Sequence similarities within the MSY.
A more spatially detailed representation of intrachromosomal similarities is shown in
Fig. 5a, which records the locations of all MSY sequence pairs characterized by at least
65% identity within a sliding window of 2,000 nucleotides. After heterochromatic and
LINE1 repeats have been accounted for, the MSY is seen to contain many long
stretches of sequence that are similar to those elsewhere in the MSY. As shown in the
inset to Fig. 5a, the triangular plot can be broken down into two smaller triangles—one
representing sequence comparisons within Yp, the other depicting comparisons within
Yq—and a rectangle depicting comparisons between Yp and Yq. Scrutiny of these Yp,
Yq and Yp–Yq components of the plot reveals a wealth of sequence similarities within
and between ampliconic segments on both arms of the chromosome.
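The sliding-window idea behind such a dot plot can be sketched as follows. This is a toy under stated assumptions: it compares ungapped windows on the forward strand only (so it would not detect inverted repeats), it runs in quadratic time, and the function names and example sequence are hypothetical rather than taken from the analysis described here.

```python
def window_identity(a, b):
    """Percent of positions at which two equal-length windows agree."""
    if len(a) != len(b) or not a:
        raise ValueError("windows must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def self_dot_plot(seq, window, min_identity):
    """Return (i, j, identity) for every pair of window starts i < j whose
    windows reach the identity threshold."""
    points = []
    last = len(seq) - window
    for i in range(last + 1):
        for j in range(i + 1, last + 1):
            ident = window_identity(seq[i:i + window], seq[j:j + window])
            if ident >= min_identity:
                points.append((i, j, ident))
    return points


# Hypothetical toy sequence: the 8-mer at position 0 recurs, with one
# substitution, at position 10.
toy = "ACGTACGTTTACGTACGA"
points = self_dot_plot(toy, window=8, min_identity=65.0)
```

In the analysis described above the window is 2,000 nucleotides and the threshold is 65%; the toy uses a window of 8 only to keep the example readable.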
The ampliconic sequences exhibit by far the highest density of genes, both coding and
non-coding, among the three sequence classes in the MSY euchromatin (Figs 1 and 3).
We identified nine distinct MSY-specific protein-coding gene families, with copy
numbers ranging from two (VCY, XKRY, HSFY, PRY) to three (BPY2) to four (CDY, DAZ)
to six (RBMY) to approximately 35 (TSPY) (Table 2 and Figs 2, 3). (These copy
numbers pertain to the particular Y chromosome that we sequenced; they may vary in
human populations.) In aggregate, these nine coding families encompass roughly 60
transcription units. Furthermore, the ampliconic sequences include at least 75 other
transcription units for which strong evidence of protein coding is lacking (Figs 2 and 3;
see also Supplementary Table 2). Of these 75 putative non-coding transcription units,
65 are members of 15 MSY-specific families, and the remaining 10 occur in single
copy. Considering together both coding and non-coding elements, the ampliconic
sequences contain 135 of the 156 MSY transcription units identified so far.
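The counts in this paragraph can be cross-checked with a short tally. The figures are those stated in the text; treating each gene copy as one transcription unit is an assumption made for the check, and TSPY is given only as approximately 35.

```python
# Copy numbers stated in the text for the sequenced Y chromosome.
coding_family_copies = {
    "VCY": 2, "XKRY": 2, "HSFY": 2, "PRY": 2,
    "BPY2": 3, "CDY": 4, "DAZ": 4, "RBMY": 6,
    "TSPY": 35,  # "approximately 35" in the text
}
coding_units = sum(coding_family_copies.values())

# Putative non-coding transcription units in the ampliconic sequences.
noncoding_in_families = 65
noncoding_single_copy = 10
noncoding_units = noncoding_in_families + noncoding_single_copy

ampliconic_units = coding_units + noncoding_units
msy_units = 156  # all MSY transcription units identified so far
share_percent = round(100 * ampliconic_units / msy_units, 1)

print(len(coding_family_copies), coding_units, noncoding_units,
      ampliconic_units, share_percent)
```

Under that assumption the nine families sum to 60 coding units, and the ampliconic sequences account for 135 of 156 units, or about 87% of the total.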
In contrast to the ubiquitous expression of most X-degenerate genes, the ampliconic
genes and transcription units show highly restricted expression (Table 2). All nine
protein-coding families in the ampliconic regions are expressed predominantly or
exclusively in testes, as are most of the regions' non-coding transcription units.
Among the three euchromatic sequence classes, the ampliconic sequences exhibit by
far the lowest densities of LINE1 and total interspersed repeat elements (Fig. 1).
Indeed, the interspersed repeat content of the MSY's ampliconic sequences (35%) is
far below the mean for the human genome (44%; z-test yields P < 0.000001).
Eight palindromes comprising 25% of MSY euchromatin
The most pronounced structural features of the ampliconic regions of Yq are eight
massive palindromes (Table 3). In the dot plot of Fig. 5a, the longer palindromes are
visible as vertical blue lines that approach the baseline. An MSY map highlighting all
eight palindromes is shown in Fig. 3a. In all eight palindromes, the arms are highly
symmetrical, with arm-to-arm nucleotide identities of 99.94–99.997%. (By convention,
these percentage identities refer only to nucleotide substitutions and do not take
account of insertions and deletions by which palindrome arms differ.) The palindromes
are long, their arms ranging from 9 kb to 1.45 Mb in length. They are imperfect in that
each contains a unique, non-duplicated spacer, 2–170 kb in length, at its centre.
Palindrome P1 is particularly spectacular, having a span of 2.9 Mb, an arm-to-arm
identity of 99.97%, and bearing two secondary palindromes (P1.1 and P1.2, each with
a span of 24 kb) within its arms (ref. 13). The eight palindromes collectively comprise 5.7 Mb,
or one-quarter of the MSY euchromatin.
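The palindrome geometry described above (two arms in inverted orientation around a unique spacer, with identity scored on substitutions only) can be illustrated with a minimal sketch. The sequences and function names are hypothetical, and the arms are assumed to be already aligned and of equal length, so insertions and deletions are not handled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def arm_identity(left_arm, right_arm):
    """Percent identity between the left arm and the reverse complement of
    the right arm, counting substitutions only."""
    flipped = reverse_complement(right_arm)
    if len(flipped) != len(left_arm) or not left_arm:
        raise ValueError("arms must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(left_arm, flipped) if x == y)
    return 100.0 * matches / len(left_arm)


# Hypothetical toy palindrome: left arm, unique spacer, inverted right arm.
left = "ACGTTGCAAG"
spacer = "TTT"
right = reverse_complement(left)
palindrome = left + spacer + right

# Introduce a single substitution into the right arm.
right_mut = right[:3] + "A" + right[4:]

print(arm_identity(left, right), arm_identity(left, right_mut))
```

The span of a palindrome is twice the arm length plus the spacer, which is why P1, with arms of about 1.45 Mb, spans about 2.9 Mb.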
Six of the eight palindromes carry recognized protein-coding genes, all of which seem
to be expressed specifically in testes (Fig. 3b). In all known cases of genes on MSY
palindromes, identical or nearly identical gene copies exist on opposite arms of the
palindrome. Of the nine multi-copy, protein-coding gene families identified so far in the
MSY, eight have members on palindromes. Indeed, six families are located exclusively
in palindromes. These include the DAZ genes, which exist in four copies—two in
palindrome P1 and two in P2—and the CDY genes, which also occur in four copies—two
in P1 and two in P5 (Fig. 3b). In addition, the palindromes contain at least seven
families of apparently non-coding transcription units, all expressed exclusively or
predominantly in testes (Fig. 3e).
In addition to the eight palindromes, the ampliconic regions of Yq and Yp contain five
sets of more widely spaced inverted repeats with repeat lengths of 62–298 kb (Fig. 2;
see also Supplementary Table 4). Three of these inverted repeat pairs (IR1, IR2 and
IR3) exhibit nucleotide identities of 99.66–99.95%. Inversion of the IR3 repeats, both
located on Yp, was probably a direct consequence of the molecular evolutionary event
that cleaved the X-transposed sequences into two non-contiguous segments
(Supplementary Fig. 5). Subsequent homologous recombination between inverted IR3
repeats was responsible, we suspect, for a 3.6-Mb inversion polymorphism observed
on the short arm of the modern Y chromosome (Supplementary Figs 5 and 6; ref. 10).
Transcriptionally active tandem arrays
In addition to palindromes and inverted repeats, the ampliconic regions of Yq and Yp
contain a variety of long tandem arrays. Prominent among these are the newly
identified NORF (no long open reading frame) clusters, which in aggregate account for
about 622 kb on Yp and Yq, and the previously reported TSPY clusters, which comprise
about 700 kb of Yp (Fig. 2). Triangular dot plots that highlight the regularities and
relatively crisp borders of the NORF and TSPY arrays are shown in Supplementary Fig.
7 (see also Supplementary Note 5).
The NORF arrays are based on a repeat unit of 2.48 kb. A consensus sequence for the
repeat is readily identifiable (Supplementary File 2), but the sequence of individual
repeat elements typically diverges from that consensus by 14–20%. The NORF arrays
are so named because they harbour a great diversity of spliced but apparently non-coding transcription units, including the TTTY1, TTTY2, TTTY6, TTTY7, TTTY8, TTTY18,
TTTY19, TTTY21 and TTTY22 families. Both strands of the NORF arrays are transcribed;
3' portions of the TTTY1 and TTTY2 transcripts are complementary (Supplementary
Fig. 8).
The TSPY arrays are based on a 20.4-kb repeat unit (ref. 23) that encodes, on one strand, a
previously identified protein, TSPY. A newly identified transcription unit, CYorf16, is
found on the opposite strand; its protein coding potential remains to be tested.
Approximately 35 copies of this repeat unit—and hence 35 TSPY genes and 35 CYorf16
transcription units—are found in a single, highly regular tandem array in proximal Yp
(Fig. 2 and Supplementary Fig. 7d, e); here the sequences of individual repeat units
rarely differ from the consensus by more than 1%. Furthermore, a single, isolated
TSPY repeat unit, whose sequence diverges 3% from the consensus, is located more
distally in Yp, embedded in the distal IR3 inverted repeat (Fig. 2). The 35-unit TSPY
cluster is the largest and most homogeneous protein-coding tandem array identified so
far in the human genome.
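The array sizes quoted above can be checked against the repeat-unit lengths with simple arithmetic. The TSPY figures are those stated in the text; the NORF unit count is an inference from the aggregate length and is not a number reported here.

```python
# TSPY: repeat unit length and approximate copy number from the text.
tspy_unit_kb = 20.4
tspy_copies = 35
tspy_array_kb = tspy_unit_kb * tspy_copies

# NORF: repeat unit length and aggregate array length from the text.
norf_unit_kb = 2.48
norf_total_kb = 622
norf_units_estimate = norf_total_kb / norf_unit_kb  # inferred, not reported

print(round(tspy_array_kb), round(norf_units_estimate))
```

The TSPY product of roughly 714 kb agrees with the stated figure of about 700 kb. The NORF estimate of roughly 250 repeat units is indicative only, because the arrays may contain partial or diverged units.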
The evolution of the MSY
On the basis of our present findings and previous studies, we propose a model of MSY
evolution that addresses all three euchromatic sequence classes (Figs 6 and 7). In
developing the model, we will offer an evolutionary map of the MSY (Fig. 8). We will
then consider the two largest and most gene-rich sequence classes—X-degenerate and
ampliconic—arguing that two opposed evolutionary dynamics have been at work: gene
decay versus gene acquisition and conservation. Throughout, we will propose decisive
roles for modulation of DNA recombination, both crossing over and gene conversion, in
the evolution and on-going maintenance of the MSY (Fig. 9).
Figure 6 Molecular evolutionary pathways and processes that
gave rise to genes in three MSY euchromatic sequence
classes. X-degenerate genes and pseudogenes (yellow
background) derived from an autosomal pair that was
ancestral to both the X and Y chromosomes (and that was
enlarged by subsequent fusion with other autosomes or
autosomal segments; ref. 50). X-transposed genes (pink
background) derived from X-linked genes, which in turn
derived from the ancestral autosomal pair.
Figure 7 Plot of Ks (Supplementary Table 5) versus X-linked
gene order for 31 X–Y gene (or gene/pseudogene) pairs.
Figure 8 Evolutionary map of the MSY.
Figure 9 MSY sequences exhibiting at least 99.9%
intrachromosomal identity probably undergo Y–Y gene
conversion.
The human X and Y chromosomes are thought to have evolved from an ordinary pair
of autosomes (refs 5 and 24). Support for this hypothesis, and a proposed 300-million-year
timeline for human sex chromosome evolution, have emerged from studies of modern
X–Y gene pairs. In this context, investigators have interpreted the X–Y gene pairs as
surviving 'fossils' where extensive sequence identity between ancestral X and Y
chromosomes once existed (refs 25 and 26). Our present sequencing of the MSY euchromatin
expands the catalogue of known X–Y gene pair fossils, providing an opportunity to re-examine models developed in earlier studies.
Evolutionary stratification of X–Y genes. Lahn and Page previously studied the
evolutionary ages of X–Y gene pairs, as measured by synonymous X–Y nucleotide
divergence, or Ks (ref. 26). They reasoned that X–Y differentiation would have begun
only after X–Y crossing over ceased. They observed a strong correlation between the
age (Ks) of individual X–Y gene pairs and the locations of their X members on the
human X chromosome. Among the 19 X–Y gene pairs studied, age increased in a
stepwise fashion along the length of the X chromosome, in four 'evolutionary strata'.
This suggested that at least four events had punctuated human sex chromosome
evolution, with each event suppressing X–Y crossing over in one stratum without
grossly disturbing gene order in the X chromosome.
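The notion of a stepwise increase in Ks along the X chromosome can be illustrated with a simple grouping rule. This is only a sketch: the threshold rule, the function name and the toy values are hypothetical, the values are not taken from Supplementary Table 5, and the procedure used by Lahn and Page is not reproduced here.

```python
def split_into_strata(pairs, min_ratio):
    """Group gene pairs, ordered by X position, into strata. A new stratum
    starts whenever Ks changes by a factor of at least min_ratio between
    neighbouring pairs.

    pairs: list of (x_position, ks) tuples with ks > 0.
    """
    if not pairs:
        return []
    ordered = sorted(pairs)
    strata = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        ratio = max(prev[1], cur[1]) / min(prev[1], cur[1])
        if ratio >= min_ratio:
            strata.append([cur])
        else:
            strata[-1].append(cur)
    return strata


# Invented toy values: X position (arbitrary units from the distal short
# arm) and Ks for eight gene pairs.
toy_pairs = [
    (5, 0.05), (8, 0.06),
    (12, 0.11), (15, 0.12),
    (40, 0.25), (45, 0.30),
    (100, 0.95), (120, 1.10),
]
strata = split_into_strata(toy_pairs, min_ratio=1.6)
print([len(s) for s in strata])
```

With these invented values the rule recovers four groups of two pairs each. Real data carry standard errors on Ks, so a fixed ratio threshold would be too crude for an actual analysis.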
We re-analysed this published information and combined the results with Ks and map
location data for 12 additional X–Y gene pairs, thus compiling data on 31 X–Y pairs in
all (Supplementary Table 5). In each of 27 pairs, the Y member is an X-degenerate
gene or pseudogene. The other four pairs include two in which the Y member is an X-transposed gene and two in which the Y members are ampliconic gene families.
Among all X-degenerate pairs, and the two ampliconic pairs, the previously reported
correlation between age (Ks) and X map position is readily apparent, with age
increasing from the distal short arm to the long arm of the X chromosome (Fig. 7).
Furthermore, as observed in the earlier study, the order of the homologous genes in
the MSY appears to be scrambled with respect to Ks (Supplementary Fig. 9). These
observations, together with the earlier arguments of Lahn and Page, suggest three
conclusions. First, all MSY genes and pseudogenes identified here as X-degenerate
seem to be products of a single molecular evolutionary process: the region-by-region
suppression of crossing over in ancestral autosomes, with subsequent differentiation of
the Y from the X chromosome (Fig. 6). Second, at least two of the MSY's ampliconic
gene families, VCY and RBMY, also originated in this manner, but subsequently
acquired the characteristics of ampliconic sequences (Fig. 6; for independent evidence
concerning RBMY see refs 27 and 28). Third, as previously hypothesized, inversions in
the Y chromosome may have suppressed crossing over with the X chromosome.
X-transposed genes as exceptions. A very different evolutionary model accounts for
the X-transposed genes, as confirmed by our Ks analysis. If, as hypothesized, these
MSY genes are the result of a single, recent transposition from the X chromosome (Fig.
6), then the Ks values of the two X-transposed X–Y gene pairs should be similar to
each other but much lower than the Ks values of the nearby (X-degenerate) pairs in
the X-chromosome long arm. This prediction is met (Fig. 7). The two X-transposed X–Y
gene pairs seem to be orders of magnitude younger than the ancient pairs (group 1 in
Fig. 7) among which they are physically situated in the X chromosome.
Blurred boundaries. Our observations differ from those of Lahn and Page in that the
boundaries between X–Y gene groups 2 and 3, and between groups 3 and 4, now seem
less distinct (Fig. 7; compare with Fig. 2 in ref. 26). Whereas our present observations
could be interpreted as evidence that suppression of X–Y crossing over evolved in
more than four steps, such a conclusion would be premature. The apparent overlaps
between groups could be artefacts of local errors in ordering X-linked genes, these
regions not yet having been fully sequenced, or simply of large standard errors for
some Ks estimates (Fig. 7). Some changes in local gene order in the X chromosome
may also have occurred during its evolution. Another potentially confounding factor is
X–Y gene conversion, which would depress Ks values and estimated ages for gene-converted X–Y pairs. Gene conversion depends on high sequence similarity, and thus
one might expect any such effect to be greater among the younger X–Y pairs, in
groups 3 and 4. Indeed, comparisons of X and Y genomic sequences suggest that the
VCX/Y pair and 3' portions of the KAL1/P pair (both pairs in group 4) have engaged in
extensive gene conversion (Supplementary Fig. 10), depressing their Ks values below
those of the 5' portion of the KAL1/P pair and of other group 4 pairs (Fig. 7).
A map of male-specific ages. Having examined the evolutionary ages of all 31 X–Y
gene pairs, we used them to anchor an evolutionary map of the modern human MSY.
The map displays the male-specific ages of many sequence segments (Fig. 8). Here,
male-specific age is the estimated number of years that have passed since sequences
ancestral to that segment were incorporated into the MSY (having previously been
autosomal, pseudoautosomal, or X-linked). We estimated the age of each gene or
segment using Lahn and Page's methods that combined Ks analysis (Supplementary
Table 5) with comparative gene mapping data from other mammals. The resulting
estimated ages are graphed on a logarithmic scale to accommodate a range that
extends from approximately 4 million years (the X-transposed sequences; the
youngest known sequences in the MSY) to approximately 300 million years (SRY, the
sex determinant and arguably the oldest gene in the MSY).
As can be seen in Fig. 8, the MSY euchromatin is an elaborate patchwork of sequences
of diverse male-specific ages. The result of a single, recent transposition from the X chromosome, the MSY's X-transposed sequences are homogeneously youthful. The
sequences of both the X-degenerate and ampliconic classes are much older, and they
display a wide range of male-specific ages (Fig. 8). As we will argue, it is in comparing
and contrasting these two chronologically diverse classes that the central themes of
MSY evolution and function are revealed most clearly.
Evolutionary dynamics of X-degenerate and ampliconic sequences
To appreciate the evolutionary dynamics of these two sequence classes, we need to
consider both their similarities and differences. In many senses, the X-degenerate and
ampliconic sequences together dominate the euchromatic MSY. The X-degenerate and
ampliconic classes are physically intermingled in the MSY, and they are comparably
large, constituting, respectively, 38% and 45% of the MSY's euchromatic sequences
(Fig. 1 and Supplementary Table 6). Together, these two sequence classes carry all
but two of the MSY's 78 known protein-coding transcription units (Table 2). The X-degenerate and ampliconic classes display comparable diversities of male-specific
ages, from tens to hundreds of millions of years (Fig. 8). This implies that X-degenerate and ampliconic sequences evolved in parallel, as parts of a single DNA
molecule, for as much as 300 million years. Moreover, we infer that the X-degenerate
and ampliconic sequences evolved under similar, unusual circumstances: both were
transmitted exclusively through the male germ line, and neither participated in meiotic
crossing over with a homologous counterpart. However, a number of marked structural
and functional differences between these two sequence classes suggest that they
followed different evolutionary trajectories. Palindromes are prevalent in ampliconic
sequences. The density of transcription units is much higher and the density of
interspersed repeats is much lower in ampliconic than in X-degenerate sequences (Fig.
1). The two sequence classes also diverge starkly with respect to gene-expression
patterns. Most X-degenerate genes are expressed widely throughout the body, and
many are probably involved in cellular housekeeping activities that are critical in both
males and females. In contrast, most ampliconic genes are expressed predominantly
or exclusively in testes, where they probably function in spermatogenesis.
Decay in the absence of sexual recombination. The X-degenerate sequences are
adequately explained by the prevailing theory of sex chromosome evolution, which
states that as the X and Y chromosomes evolved from an autosomal pair, the X
chromosome maintained most of its ancestor's genes whereas the Y chromosome lost
them (refs 5, 24-26). Our findings support the two major premises of this theory: the
evolutionary genetic benefits of sexual recombination through meiotic crossing over,
and the deleterious consequences of its absence. According to this theory, most
ancestral genes remained functionally intact in the X chromosome, where the benefits
of crossing over (in females) continued. In the Y chromosome, in contrast, the shutting
down of X–Y crossing over during evolution triggered a monotonic decline in gene
function. This model is corroborated by the presence, in the MSY's X-degenerate
sequences, of decayed, intron-bearing pseudogenes of 13 different X-linked genes
(Supplementary Table 3). Presumably, many hundreds of other X-homologous genes
were deleted outright from the evolving MSY, leaving no trace in the DNA sequence of
the modern human MSY. Seen in this light, the 16 protein-coding genes in the modern
MSY's X-degenerate sequences (Table 2 and Fig. 3) appear as rare examples of
persistence in the absence of sexual recombination.
Acquisition and conservation of spermatogenic functions. This evolutionary
model of the Y chromosome as a decaying X chromosome, however, provides no
explanation for central characteristics of the MSY's ampliconic sequences, including
testis-specific gene expression, near-perfect palindromes, and an abundance of
autosomal (as well as X-chromosomal) sequence similarities. To account for these
characteristics, we propose that the MSY acquired, and evolved a means of conserving,
genes that specifically enhanced male fertility.
Unlike the X-degenerate sequences, all of which trace to the MSY's shared ancestry
with the X chromosome, the ampliconic sequences evolved from a great variety of
genomic sources, and by a diversity of molecular mechanisms (Fig. 6). As mentioned
previously, the ampliconic genes VCY and RBMY were, similar to the X-degenerate
genes, derived from common ancestors of the X and Y chromosomes (refs 27 and 28). In contrast,
the DAZ genes arose, during primate evolution, by transposition and subsequent
amplification of an autosomal transcription unit, DAZL, which still exists on human
chromosome 3 (ref. 29). Indeed, systematic analysis of MSY/autosome similarities
suggests that a series of autosomal transpositions contributed to the MSY's ampliconic
sequences during primate evolution (Fig. 8; see also ref. 13). Yet another molecular
mechanism accounts for the CDY genes, which arose by retroposition (and subsequent
amplification) of a processed messenger RNA derived from an autosomal gene (ref. 30). This
retroposition event was previously thought to have occurred during primate evolution,
but our present Ks analysis indicates a much older date, probably before the lineages
of marsupials and placental mammals diverged (Fig. 8; see also Supplementary Table
5).
Despite the wide variety of genomic sources and molecular evolutionary mechanisms
that gave rise to the ampliconic genes, they all came to exist in the MSY in multiple,
nearly identical copies, and they evolved remarkably uniform patterns of tissue
expression. Indeed, detailed studies of several ampliconic gene families have revealed
that they are expressed predominantly or exclusively in one cell lineage: the
spermatogenic cells of the testis. What accounts for this convergence of evolutionary
outcomes? The genesis of XY sex chromosomes during mammalian evolution, and
specifically the emergence of a male-specific domain, created a genomic niche where
selection could operate to enhance male germ-cell development. Amplification of the
testis genes might have enhanced sperm production through high levels of expression.
However, in a region devoid of crossing over, amplification might also have allowed
another type of homologous recombination, gene conversion, to emerge as a means of
conserving gene function.
Abundant Y–Y gene conversion in ampliconic regions
Gene conversion is the non-reciprocal transfer of sequence information from one DNA
duplex to another (ref. 31). This type of genetic recombination has been studied most
extensively in fungi, where it was originally demonstrated to occur between
chromosome homologues, or at lower frequency between sister chromatids, in meiosis.
It was later shown that gene conversion could also occur between duplicated
sequences on a single chromosome, and in mitosis (ref. 32). Here we will argue that gene
conversion (non-reciprocal recombination) is as frequent in the MSY as crossing over
(reciprocal recombination) is in ordinary chromosomes.
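The non-reciprocal character of gene conversion can be made concrete with a toy model in which a tract of the donor sequence overwrites the corresponding tract of the acceptor while the donor stays unchanged. The function names, sequences and tract coordinates are hypothetical; the sketch says nothing about the molecular mechanism or about tract lengths in the MSY.

```python
def gene_convert(donor, acceptor, start, end):
    """Return the acceptor with positions start..end-1 replaced by the
    donor's sequence. The donor itself is not modified (non-reciprocal)."""
    if len(donor) != len(acceptor):
        raise ValueError("donor and acceptor must have equal length")
    if not 0 <= start <= end <= len(donor):
        raise ValueError("tract coordinates out of range")
    return acceptor[:start] + donor[start:end] + acceptor[end:]


def percent_identity(a, b):
    """Percent of aligned positions at which two sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(1 for x, y in zip(a, b) if x == y) / len(a)


# Hypothetical arms that differ at positions 3 and 8.
arm_a = "ACGTACGTAC"
arm_b = "ACGAACGTTC"

before = percent_identity(arm_a, arm_b)
converted_b = gene_convert(arm_a, arm_b, 2, 5)   # tract covers position 3
after = percent_identity(arm_a, converted_b)

print(before, after)
```

A conversion tract that spans a difference erases it, so repeated conversion events keep duplicated sequences nearly identical even as new substitutions arise.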
Specifically, two major findings provide evidence that gene conversion occurs routinely
in 30% of the MSY euchromatin, including nearly all of the MSY's testis-specific gene
families. The accompanying study (ref. 7) reports the identification and sequencing of
chimpanzee Y-linked orthologues of human MSY palindromes and establishes that gene
conversion between palindrome arms has occurred in both the human and chimpanzee
lineages, and has continued to occur in human populations. Here we report that these
palindromes are representative of a large, discrete fraction of MSY sequences, all of
which bear at least 99.9% identity to other MSY sequences. These findings suggest
that the entire fraction is subject to frequent gene conversion.
Above we described calculations of percentage nucleotide identity between all pairs of
known MSY sequences. We defined and mapped the ampliconic regions by reporting,
at each point along the length of the MSY euchromatin, the highest percentage identity
to other MSY sequences (intrachromosomal similarity; Fig. 5). To view this data from
another perspective, we electronically fractionated all MSY sequences according to
intrachromosomal similarity. As seen in Fig. 9a, 30% of MSY euchromatic sequences
display intrachromosomal identities of 99.9–100%. As intrachromosomal identity
declines below 99.9%, the fractional representation of MSY sequences drops abruptly.
Thus, the sequences displaying intrachromosomal identities of at least 99.9% represent a
large and distinct subset of the MSY euchromatin.
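The electronic fractionation described above amounts to binning sequence length by intrachromosomal identity. A minimal sketch follows; the segment list, the bin edges and the function name are hypothetical, and the toy lengths were chosen so that 30% falls in the top bin, mirroring the figure quoted in the text rather than reproducing Fig. 9a.

```python
def fractionate(segments, edges):
    """Return the share of total length falling in each identity bin.

    segments: list of (length, identity_percent) tuples.
    edges: ascending bin boundaries. Bin k covers
           edges[k] <= identity < edges[k + 1]; the last bin also
           includes its upper edge.
    """
    n_bins = len(edges) - 1
    totals = [0] * n_bins
    for length, identity in segments:
        for k in range(n_bins):
            is_last = k == n_bins - 1
            below_upper = identity < edges[k + 1] or (
                is_last and identity == edges[k + 1]
            )
            if edges[k] <= identity and below_upper:
                totals[k] += length
                break
    grand_total = sum(length for length, _ in segments)
    return [t / grand_total for t in totals]


# Hypothetical segments: (length in kb, highest identity to other MSY sequence).
toy_segments = [(300, 99.95), (100, 99.5), (200, 90.0), (400, 0.0)]
toy_edges = [0.0, 50.0, 99.0, 99.9, 100.0]
shares = fractionate(toy_segments, toy_edges)
print(shares)
```

An abrupt drop in the share between the top bin and the bin just below it is the signature that the text describes as a large and distinct subset.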
This 99.9% subset comprises the eight palindromes as well as large portions of the
IR2 and IR3 inverted repeats described above (Figs 2 and 3). Indeed, nearly all of the
99.9% sequences exist as pairs in inverted orientation. Thus, the MSY palindromes in
which gene conversion has been demonstrated (ref. 7) are typical and representative of the
99.9% fraction. We extrapolate that nearly all of the 99.9% fraction is engaged in
gene conversion on a routine basis, resulting in a degree of identity among MSY's
inverted sequence pairs that rivals that of two autosomal homologues, or alleles,
chosen at random from the human population (refs 15 and 33).
Two modes of productive recombination in the human Y chromosome
Combined with previous discoveries in the pseudoautosomal regions, the present
findings imply that two modes of homologous recombination occur regularly in the
human Y chromosome. First, there is crossing over with the X chromosome in the
pseudoautosomal regions (aggregate length 3.0 Mb) (Supplementary Note 6). Second,
there is Y–Y gene conversion in the 99.9% regions (aggregate length 6.1 Mb)
dispersed throughout the MSY (Fig. 9b; ref. 7). We refer to both routine modes of Y
chromosome recombination as 'productive' to distinguish them from the relatively rare,
aberrant recombination events (typically Y–Y or X–Y) that perturb sex differentiation or
fertility and thereby diminish the reproductive fitness of affected individuals.
Genetic mapping studies have shown that, typically, one X–Y crossover occurs per
generation in the pseudoautosomal regions (Supplementary Note 7). As described in
the accompanying report (ref. 7), steady-state calculations suggest that, on average, multiple
Y–Y gene conversion events take place per generation in the MSY. Thus, most
homologous recombination events in the Y chromosome probably occur in the MSY.
In recent years, we and other investigators have referred to the MSY as the NRY, or
'non-recombining region of the Y chromosome'. This usage reflected both awareness
that productive X–Y crossing over did not occur in the MSY, and ignorance of the Y–Y
gene conversion that is apparently commonplace there. We now refer to the NRY as
the MSY, or 'male-specific region of the Y chromosome', because it is recombinogenic
and unique to males.
Gene conversion and the MSY's testis gene families
Examination of the MSY's testis gene families provides additional insight into the
potential biological significance of the 99.9% fraction and the gene conversion
associated with it. Eight of the MSY's nine identified testis gene families have members
in the palindromes or inverted repeats that comprise the 99.9% fraction just
described. (The exceptional family is TSPY, most of whose members are found in a
long tandem array.) Many of these family members are intact gene copies, but others
are apparent pseudogenes with disrupted splice sites or reading frames. For each of
the eight testis gene families, we counted the numbers of intact and pseudogene
copies, both within and without the 99.9% fraction (Table 4). Whereas large numbers
of pseudogenes are present both inside and outside the 99.9% fraction, the intact
gene copies, 25 in all, are located exclusively in the 99.9% fraction.
Thus, there is an evident association of intact testis genes with near-identical inverted
sequence pairs that undergo gene conversion. What is the biological significance of this
association? We envision two possibilities, which are not mutually exclusive. First, we
note that in all cases examined so far, expression of these testis-specific gene families
has been found to be limited to or most pronounced in cells of the spermatogenic
lineage—in germ cells. Perhaps these near-identical sequence pairs are
transcriptionally active in germ cells because there they generate cruciforms or other
unusual chromatin configurations. Second, the occurrence of MSY gene pairs that are
subject to frequent gene conversion might provide a mechanism for conserving gene
functions across evolutionary time in the absence of crossing over.
Implications for future studies
We anticipate that the nucleotide sequence reported here, and the methods with which
it was obtained, will find many applications in human biology and beyond.
Comparisons with other human Y chromosomes. The sequence of one man's MSY,
as reported here, provides a point of departure for systematic, comprehensive
characterization of MSY sequence variation in human populations. The MSY's unique
characteristics—male specificity, no crossing over and abundant gene conversion—
suggest that its sequence variation might differ markedly from that of ordinary human
chromosomes. Already the availability of MSY sequence information in public databases
has accelerated the emergence of MSY sequence variation as a powerful tool in
reconstructing the patrilineal origins of modern human populations (refs 11 and 34).
Comparisons (or lack of) with other species. Little is known about the DNA
sequences of Y chromosomes in other animals or plants, and thus it is not possible at
present to compare systematically the human MSY with that of any other species. Both
the Drosophila and mouse Y chromosomes contain genes required for
spermatogenesis, but meagre Y chromosome sequence data is available in either
species. In Drosophila, the sequences of autosomes and the X chromosome were
assembled from whole-genome shotgun data. Unfortunately, this shotgun analysis was
insufficient to assemble much Y chromosome sequence (refs 35 and 36), confirming prior suspicions
that, in Drosophila as in humans, the Y chromosome poses special challenges. In the
mouse, a draft sequence of the female genome is available (ref. 37), but systematic efforts to
sequence the male-specific region of the Y chromosome have yet to be initiated. If
undertaken, Y chromosome sequencing projects in Drosophila, mouse and other
species are likely to encounter special technical hurdles, but they are also likely to
yield entirely unforeseen biological insights, as was the case here for the human MSY.
The availability of human MSY sequence has already enabled new tests and rekindled
debate of Haldane's hypothesis that mutations in the male germ line greatly
outnumber those in the female germ line (Supplementary Note 8). This debate will
surely be fuelled by sequencing of other primate and mammalian Y chromosomes.
Methods for sequencing difficult genomic regions. Our strategy of iterative
mapping and sequencing was laborious but essential. Two faster, less costly strategies
have been used recently in sequencing large genomes: whole-genome shotgun
analysis (refs 15 and 35) and sequencing a tiling path of mapped clones (ref. 14 and
Supplementary Note 9). Neither of these sequencing strategies would have yielded a
coherent picture of the MSY. This is especially true of the MSY's ampliconic regions,
and most particularly the 30% of the MSY euchromatin (including the eight
palindromes) exhibiting intrachromosomal similarities of at least 99.9%. Large amplicons like
those described here are not unique to the MSY, but as in the MSY, they have proven
to be formidable obstacles to whole-genome methods (refs 38 and 39). The iterative mapping and
sequencing strategy used here should be considered by genome scientists wishing to
determine the structure and sequence of amplicon-rich regions of human autosomes,
the X chromosome and other genomes.
The medical relevance of the MSY. Propelled by advances in MSY genomics, the
biomedical significance of the MSY has begun to surface in recent years, with evidence
of roles in such diverse processes as gonadal sex determination, skeletal growth,
germ-cell tumorigenesis and graft rejection (ref. 6). Two research areas that should benefit
from the present MSY sequence and gene catalogue are of particular note. First, one of
the most common chromosomal disorders of girls and women is Turner syndrome,
classically associated with a 45,X (X0) karyotype. Haploinsufficiency of particular genes
common to the X and Y chromosomes may be responsible for somatic features of the
syndrome (refs 40-42). In most cases, the molecular identity of these Turner genes remains to
be determined. One or more Turner genes are likely to be found within the catalogue
of X-degenerate genes (and their X-linked homologues; see Table 2).
Second, a highly active area of MSY research explores spermatogenesis and the genetic basis
of male infertility. MSY deletions have emerged as the most common of the known
genetic causes of spermatogenic failure in human populations (refs 13, 43-46). The availability of
MSY sequence has already begun to transform our understanding, enabling
investigators to precisely define four distinct classes of recurrent MSY deletions causing
spermatogenic failure, identify the MSY genes absent as a result of these deletions
(typically members of testis-specific families), and demonstrate that most such
deletions are the result of homologous recombination between near-identical
amplicons13, 43-46. Thus, the ampliconic structures that may help preserve testis gene
function across evolutionary time (through gene conversion) also put individuals at risk
of spermatogenic failure (again, through homologous recombination).
Genetic and biological differences between males and females. It is commonly
stated that the genomes of two randomly selected members of our species exhibit
99.9% nucleotide identity. In reality, this statement holds only if one is comparing two
males, or two females. If one compares a female with a male, the second X
chromosome (160 Mb, or roughly 3% of the diploid DNA content) is replaced by the
largely dissimilar Y chromosome (60 Mb, or 1% of the diploid DNA content). This
common substitution of the Y chromosome for the second X chromosome dwarfs all
other DNA polymorphism in the human genome. In decades past, and with the
important exception of X-linked recessive diseases, biologists often judged this
genomic dimorphism to be of limited functional consequence, especially because of
inactivation of the second X chromosome in females and the presumed paucity of
genes in the Y chromosome. Now we must begin to reconsider this position, given the
unanticipated number and variety of MSY genes, many of which are expressed
throughout the body, and the fact that many X-linked genes are expressed from both
X chromosomes in female cells47. The present sequence of the MSY, and the emerging
sequence of the X chromosome, offer the near prospect of a comprehensive catalogue
of genetic and sequence differences between human males and females. Translating
this knowledge into an understanding of the myriad differences between the sexes in
anatomy, physiology, cognition, behaviour and disease susceptibility presents a
monumental challenge, but surely one of broad significance and interest.
Methods
Iterative mapping and sequencing. The method of iterative mapping and
sequencing used here has been described10, 13. All MSY BACs selected for sequencing
were isolated from the RPCI-11 library48, with the exception of 11 clones (nine
spanning the AZFa region12, and two used to narrow gaps10) from the CITB and CITC
libraries. We made frequent use of publicly available BAC-end sequences as a source of
markers during the final stages of map construction49. Two gaps were closed by long-range PCR; see Supplementary Fig. 11.
Unfortunately, no cell line is available from the donor of the RPCI-11 BAC library. Thus,
to confirm the large-scale organization of MSY sequences reported here, we PCR-amplified the inner and outer boundaries of all palindromes in ten men with genetically
diverse Y chromosomes (PCR primers in Supplementary Table 8). We sequenced all
resulting products. These experiments confirmed that each palindrome boundary is
present in the great majority of human Y chromosomes.
Intrachromosomal sequence similarity. Analyses of intrachromosomal similarity
were performed using custom Perl code. This code used BLAST (http://blast.wustl.edu)
to compare all 5-kb sequence segments, in 2-kb steps, to the entire remainder of the
MSY sequence.
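The windowing step described above can be illustrated with a minimal sketch. It is not the custom Perl code used in this work; it only shows how 5-kb segments in 2-kb steps might be enumerated, and how hits overlapping the query segment itself might be excluded so that each segment is compared only to the remainder of the sequence. The function names and the half-open coordinate convention are assumptions made for illustration, and the BLAST comparison itself is not shown.

```python
from typing import Iterator, Tuple

Interval = Tuple[int, int]  # half-open (start, end), 0-based; an assumed convention


def sliding_windows(seq_len: int, window: int = 5000, step: int = 2000) -> Iterator[Interval]:
    """Yield full-length windows along a sequence of length seq_len.

    A trailing stretch shorter than one full window is not covered.
    """
    for start in range(0, seq_len - window + 1, step):
        yield (start, start + window)


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def is_self_hit(query: Interval, hit: Interval) -> bool:
    """A hit that overlaps the query window is not part of 'the remainder'."""
    return overlaps(query, hit)
```

For a hypothetical 12-kb sequence, `list(sliding_windows(12000))` gives four windows starting at 0, 2,000, 4,000 and 6,000.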
Interspersed repeats. We electronically identified interspersed repeats with
RepeatMasker (http://repeatmasker.genome.washington.edu).
Homology to other chromosomes. To identify sequence similarities to other human
chromosomes, we conducted BLAST searches against GenBank databases with the
sequence of each MSY clone. Interspersed repeats and low-complexity regions were
masked using RepeatMasker. To experimentally verify the chromosomal origins of
sequences similar to the MSY, we designed STSs from those sequences and assayed
them against the NIGMS human/rodent somatic cell hybrid mapping panels 1 and 2
(NIGMS Human Genetic Cell Repository,
http://locus.umdnj.edu/nigms/maps/mapping.html).
Identification of new genes and transcription units. We identified potential
transcripts from three sources: (1) BLAST matches to cDNA sequences (EST or full
length). We pursued matches where the cDNA sequence showed evidence of
polyadenylation or splicing, or where there were multiple matching cDNA sequences.
(2) BLAST matches to fragments of putative MSY transcripts that had been cloned by
cDNA selection of testis cDNA against a flow-sorted, genomic Y-chromosome library19.
(3) GenomeScan18 predictions in the NCBI annotation of Y-chromosome contigs. We
then tested for transcription by RT–PCR as previously described13.
Chromosomal FISH. One- or two-colour FISH to human chromosomes was performed
as previously described9.
Calculation of Ks and Ka. We calculated the numbers of synonymous substitutions per
synonymous site (Ks) and of non-synonymous substitutions per non-synonymous site
(Ka) as follows. We used FASTA (ftp://ftp.virginia.edu/pub/fasta) to align the pairs of
coding sequences in Supplementary Table 5. For non-transcribed MSY pseudogenes,
we used FASTA to align the genomic sequence of pseudogene exons to the
corresponding transcribed coding sequence (Supplementary Table 5 and File 3). Then,
as is standard practice, insertions/deletions were manually removed from the
alignments. We calculated Ks and Ka for these alignments using the diverge function in
the Wisconsin Package (Version 10.2, Genetics Computer Group).
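The removal of insertions/deletions before calculating Ks and Ka was done manually here. As a rough illustration only, the sketch below shows one way such a step could be automated for a pairwise alignment of coding sequences. It assumes that the alignment is codon-aligned (its length is a multiple of three and column 0 is the first codon position) and that gaps are written as "-". Neither assumption comes from the procedure above, and the function name is hypothetical. Any codon that contains a gap in either sequence is dropped whole, so the reading frame of the retained codons is preserved.

```python
from typing import Tuple


def drop_gapped_codons(aln_a: str, aln_b: str, gap: str = "-") -> Tuple[str, str]:
    """Remove every aligned codon that contains a gap in either sequence.

    Both inputs are rows of the same pairwise alignment and must have equal
    length, a multiple of three (assumed codon-aligned input).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    if len(aln_a) % 3 != 0:
        raise ValueError("alignment length must be a multiple of three")

    kept_a, kept_b = [], []
    for i in range(0, len(aln_a), 3):
        codon_a, codon_b = aln_a[i:i + 3], aln_b[i:i + 3]
        if gap in codon_a or gap in codon_b:
            continue  # discard the whole codon to keep the frame intact
        kept_a.append(codon_a)
        kept_b.append(codon_b)
    return "".join(kept_a), "".join(kept_b)
```

The gap-free pair returned by this function would then be passed to whatever program computes Ks and Ka; that step is not shown.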
Supplementary information accompanies this paper.
Received 7 March 2003;
accepted 8 April 2003
References
1. Painter, T. S. The Y-chromosome in mammals. Science 53, 503-504 (1921)
2. Stern, C. The problem of complete Y-linkage in men. Am. J. Hum. Genet. 9, 147-166
(1957)
3. Jacobs, P. A. & Strong, J. A. A case of human intersexuality having a possible XXY sex
determining mechanism. Nature 183, 302-303 (1959)
4. Ford, C. E., Miller, O. J., Polani, P. E., de Almeida, J. C. & Briggs, J. H. A sex-chromosome
anomaly in a case of gonadal dysgenesis (Turner's syndrome). Lancet 1, 711-713
(1959)
5. Ohno, S. Sex Chromosomes and Sex-linked Genes (Springer, Berlin, 1967)
6. Vogt, P. H. et al. Report of the third international workshop on Y chromosome mapping 1997.
Cytogenet. Cell Genet. 79, 1-20 (1997)
7. Rozen, S. et al. Abundant gene conversion between arms of palindromes in human and ape Y
chromosomes. Nature 423, 873-876 (2003)
8. Foote, S., Vollrath, D., Hilton, A. & Page, D. C. The human Y chromosome: Overlapping DNA
clones spanning the euchromatic region. Science 258, 60-66 (1992)
9. Saxena, R. et al. Four DAZ genes in two clusters found in AZFc region of human Y
chromosome. Genomics 67, 256-267 (2000)
10. Tilford, C. et al. A physical map of the human Y chromosome. Nature 409, 943-945
(2001)
11. Shen, P. et al. Population genetic implications from sequence variation in four Y chromosome
genes. Proc. Natl Acad. Sci. USA 97, 7354-7359 (2000)
12. Sun, C. et al. An azoospermic man with a de novo point mutation in the Y-chromosomal gene
USP9Y. Nature Genet. 23, 429-432 (1999)
13. Kuroda-Kawaguchi, T. et al. The AZFc region of the Y chromosome features massive
palindromes and uniform recurrent deletions in infertile men. Nature Genet. 29, 279-286
(2001)
14. Human Genome Sequencing Consortium Initial sequencing and analysis of the human
genome. Nature 409, 860-921 (2001)
15. Venter, J. C. et al. The sequence of the human genome. Science 291, 1304-1351
(2001)
16. Schueler, M. G., Higgins, A. W., Rudd, M. K., Gustashaw, K. & Willard, H. F. Genomic and
genetic definition of a functional human centromere. Science 294, 109-115
(2001)
17. Tyler-Smith, C. et al. Localization of DNA sequences required for human centromere function
through an analysis of rearranged Y chromosomes. Nature Genet. 5, 368-375
(1993)
18. Yeh, R. F., Lim, L. P. & Burge, C. B. Computational inference of homologous gene structures
in the human genome. Genome Res. 11, 803-816 (2001)
19. Lahn, B. T. & Page, D. C. Functional coherence of the human Y chromosome. Science 278,
675-680 (1997)
20. Page, D. C., Harper, M. E., Love, J. & Botstein, D. Occurrence of a transposition from the
X-chromosome long arm to the Y-chromosome short arm during human evolution. Nature 311,
119-123 (1984)
21. Mumm, S., Molini, B., Terrell, J., Srivastava, A. & Schlessinger, D. Evolutionary features of the
4-Mb Xq21.3 XY homology region revealed by a map at 60-kb resolution. Genome Res. 7,
307-314 (1997)
22. Schwartz, A. et al. Reconstructing hominid Y evolution: X-homologous block, created by X-Y
transposition, was disrupted by Yp inversion through LINE-LINE recombination. Hum. Mol.
Genet. 7, 1-11 (1998)
23. Tyler-Smith, C., Taylor, L. & Muller, U. Structure of a hypervariable tandemly repeated DNA
sequence on the short arm of the human Y chromosome. J. Mol. Biol. 203, 837-848
(1988)
24. Graves, J. A. & Schmidt, M. M. Mammalian sex chromosomes: Design or accident? Curr.
Opin. Genet. Dev. 2, 890-901 (1992)
25. Jegalian, K. & Page, D. C. A proposed path by which genes common to mammalian X and Y
chromosomes evolve to become X inactivated. Nature 394, 776-780
(1998)
26. Lahn, B. T. & Page, D. C. Four evolutionary strata on the human X chromosome. Science 286,
964-967 (1999)
27. Delbridge, M. L., Lingenfelter, P. A., Disteche, C. M. & Graves, J. A. M. The candidate
spermatogenesis gene RBMY has a homologue on the human X chromosome. Nature Genet.
22, 223-224 (1999)
28. Mazeyrat, S., Saut, N., Mattei, M. G. & Mitchell, M. J. RBMY evolved on the Y chromosome
from a ubiquitously transcribed X-Y identical gene. Nature Genet. 22, 224-226
(1999)
29. Saxena, R. et al. The DAZ gene cluster on the human Y chromosome arose from an
autosomal gene that was transposed, repeatedly amplified and pruned. Nature Genet. 14,
292-299 (1996)
30. Lahn, B. T. & Page, D. C. Retroposition of autosomal mRNA yielded testis-specific gene family
on human Y chromosome. Nature Genet. 21, 429-433 (1999)
31. Szostak, J. W., Orr-Weaver, T. L., Rothstein, R. J. & Stahl, F. W. The double-strand-break
repair model for recombination. Cell 33, 25-35 (1983)
32. Jackson, J. A. & Fink, G. R. Gene conversion between duplicated genetic elements in yeast.
Nature 292, 306-311 (1981)
33. The International SNP Map Working Group. A map of human genome sequence variation
containing 1.42 million single nucleotide polymorphisms. Nature 409, 928-933
(2001)
34. Underhill, P. A. et al. Y chromosome sequence variation and the history of human populations.
Nature Genet. 26, 358-361 (2000)
35. Adams, M. et al. The genome sequence of Drosophila melanogaster. Science 287, 2185-2195
(2000)
36. Carvalho, A. B., Dobo, B. A., Vibranovski, M. D. & Clark, A. G. Identification of five new genes
on the Y chromosome of Drosophila melanogaster. Proc. Natl Acad. Sci. USA 98,
13225-13230 (2001)
37. Waterston, R. H. et al. Initial sequencing and comparative analysis of the mouse genome.
Nature 420, 520-562 (2002)
38. Bailey, J. A. et al. Recent segmental duplications in the human genome. Science 297,
1003-1007 (2002)
39. Stankiewicz, P. & Lupski, J. R. Genome architecture, rearrangements and genomic disorders.
Trends Genet. 18, 74-82 (2002)
40. Ferguson-Smith, M. A. Karyotype-phenotype correlations in gonadal dysgenesis and their
bearing on the pathogenesis of malformations. J. Med. Genet. 2, 142-155 (1965)
41. Zinn, A. R., Page, D. C. & Fisher, E. M. C. Turner syndrome: The case of the missing sex
chromosome. Trends Genet. 9, 90-93 (1993)
42. Rao, E. et al. Pseudoautosomal deletions encompassing a novel homeobox gene cause
growth failure in idiopathic short stature and Turner syndrome. Nature Genet. 16, 54-63
(1997)
43. Sun, C. et al. Deletion of azoospermia factor a (AZFa) region of human Y chromosome caused
by recombination between HERV15 proviruses. Hum. Mol. Genet. 9, 2291-2296
(2000)
44. Blanco, P. et al. Divergent outcomes of intrachromosomal recombination on the human Y
chromosome: male infertility and recurrent polymorphism. J. Med. Genet. 37, 752-758
(2000)
45. Kamp, C., Hirschmann, P., Voss, H., Huellen, K. & Vogt, P. H. Two long homologous retroviral
sequence blocks in proximal Yq11 cause AZFa microdeletions as a result of intrachromosomal
recombination events. Hum. Mol. Genet. 9, 2563-2572 (2000)
46. Repping, S. et al. Recombination between palindromes P5 and P1 on the human Y
chromosome causes massive deletions and spermatogenic failure. Am. J. Hum. Genet. 71,
906-922 (2002)
47. Carrel, L., Cottle, A. A., Goglin, K. C. & Willard, H. F. A first-generation X-inactivation profile of
the human X chromosome. Proc. Natl Acad. Sci. USA 96, 14440-14444
(1999)
48. Osoegawa, K. et al. A bacterial artificial chromosome library for sequencing the complete
human genome. Genome Res. 11, 483-496 (2001)
49. Zhao, S. et al. Human BAC ends quality assessment and sequence analyses. Genomics 63,
321-332 (2000)
50. Watson, J. M., Spencer, J. A., Riggs, A. D. & Graves, J. A. Sex chromosome evolution:
Platypus gene mapping suggests that part of the human X chromosome was originally
autosomal. Proc. Natl Acad. Sci. USA 88, 11256-11260 (1991)
Acknowledgements. We thank R. Giardine, R. Oates and S. Silber for patient samples; and
J. Alfoldi, C. Disteche, J. Koubova and J. Lange for comments on the manuscript. This
work was supported by the National Institutes of Health and the Howard Hughes Medical
Institute.
Competing interests statement. The authors declare that they have no competing financial
interests.
Figure 1 The male-specific region of the Y chromosome. a, Schematic representation of the whole
chromosome, including the pseudoautosomal and heterochromatic regions. b, Enlarged view of a
24-Mb portion of the MSY, extending from the proximal boundary of the Yp pseudoautosomal
region to the proximal boundary of the large heterochromatic region of Yq. Shown are three classes
of euchromatic sequences, as well as heterochromatic sequences. A 1-Mb bar indicates the scale of
the diagram. c, d, Gene, pseudogene and interspersed repeat content of three euchromatic sequence
classes. c, Densities (numbers per Mb) of coding genes, non-coding transcription units, total
transcription units and pseudogenes. d, Percentages of nucleotides contained in Alu, retroviral,
LINE1 and total interspersed repeats. The data shown in c and d are available in numerical form in
Supplementary Tables 6 and 7. Supplementary Table 6 also provides information about the size and
(G + C) content of each sequence class.
Figure 2 Sequence-based map of the MSY; a detailed view of the 24-Mb region shown in Fig. 1b. Backgroun
indicate the three classes of MSY euchromatic sequences: X-transposed (pink), X-degenerate (yellow) and am
(blue), as well as heterochromatic (red stripes) and pseudoautosomal (green) sequences and NORF arrays (gre
Two gaps in the sequence are indicated at the top edge of the diagram. a, Eight primary palindromes (P1–P8)
secondary palindromes (P1.1 and P1.2). Diverging black arrows mark the left and right arms of each palindrom
between diverging arrows represent non-palindromic spacers at the centres of these structures. b, Near-perfect
repeats (non-palindromic), three in all (IR1 to IR3; Supplementary Table 4). In each case, the left and right arm
>99.5% nucleotide identity. c, Other inverted repeats (non-palindromic). Grey arrows (IR4) denote two region
identity, one on Yp and one on Yq. Yellow arrows (IR5) denote four regions of >92% identity, all on Yq. d, D
any of the four indicated regions—AZFa, P5/proximal P1 (AZFb), AZFc, or P5/distal P1—cause spermatogen
46
. e, Previously reported genes and new, experimentally verified transcription units for which cDNA sequenci
protein-coding potential (Table 2). Plus (+) and minus (-) strand are indicated by the top or bottom row, respec
Fig. 5c. g, Scale (Mb). h, Sequences whose transcription has been verified (in this or previous studies) but for
little or no evidence of protein-coding potential (Supplementary Table 2). i, Previously reported pseudogenes
apparently non-transcribed homologues of known coding genes (Supplementary Table 3). j, (G + C) content (
in a 100-kb sliding window with 1-kb steps. k, Alu, LINE1 and human endogenous retroviral (HERV) repeat
expressed as percentage of nucleotides, calculated in a 200-kb sliding window with 1-kb steps. l, 220 BAC clo
completely or partially sequenced. Each bar represents size and position of one BAC clone, identified by the n
portion of its GenBank accession number (in each case beginning with the prefix AC). Black bars represent fin
sequences deposited in GenBank, where finished sequences are trimmed to retain only 200 bp of overlap with
BACs. Grey bars represent the 'trimmings' of those BACs, not deposited in GenBank. Striped bars represent B
sequence has not been finished but has been deposited in GenBank. See Supplementary Fig. 2 for a more detai
this figure. The composite sequence of the 24-Mb region studied is available as Supplementary File 1.
Figure 3 MSY genes, transcription units and palindromes. a, Triangles denote sizes and locations of arms of e
repeats (whose arms exhibit 99.95% identity). Gaps between opposed triangles represent the non-duplicated 's
schematic, as in Fig. 1b. c, Nine families of protein-coding genes. Solid triangles denote apparently intact gene
are not shown. d, Single-copy protein-coding genes. e, Single-copy transcription units. These give rise to splic
Fifteen families of transcription units. g, Merged map of all genes and transcription units.
Figure 4 Three sequence classes in the MSY euchromatin. Colour scheme as in Fig. 2.
Figure 5 Sequence similarities within the MSY. a, Triangular dot plot in which the MSY's sequence is compa
the plot, each dot represents a match of >65% within a window of 2,000 nucleotides. Green dots represent mat
between LINE1 elements; red dots represent matches between heterochromatic sequences; blue dots represent
other sequences. Direct repeats appear as horizontal lines, inverted repeats as vertical lines, and palindromes a
nearly intersect the baseline. Long arrays of tandem repeats appear as pyramids. The inset indicates that the la
contains two smaller triangles (one revealing sequence similarities within Yp, and one revealing similarities w
rectangle (revealing similarities between Yp and Yq). b, MSY schematic, as in Fig. 1b. c, Plot of intrachromos
similarity, which serves to identify ampliconic sequences (blue). Using a 50-kb sliding window and 1-kb steps
euchromatic sequence was compared to all other available MSY euchromatic sequences. (Long interspersed re
before analysis.) At each point along the length of the MSY, the highest sequence similarity (expressed as per
identity) was identified. All such values >50% are shown. An expanded version of this plot is shown in Fig. 2f
Figure 6 Molecular evolutionary pathways and processes that gave rise to genes in three MSY
euchromatic sequence classes. X-degenerate genes and pseudogenes (yellow background) derived
from an autosomal pair that was ancestral to both the X and Y chromosomes (and that was enlarged
by subsequent fusion with other autosomes or autosomal segments50). X-transposed genes (pink
background) derived from X-linked genes, which in turn derived from the ancestral autosomal pair.
Ampliconic genes (blue background) were derived through three converging processes:
amplification of X-degenerate genes (for example, RBMY, VCY); transposition and amplification of
autosomal genes (DAZ); and retroposition and amplification of autosomal genes (CDY). Boxes
enumerate dominant themes in X-degenerate (yellow) and ampliconic (blue) gene evolution. The
asterisk indicates that Y–Y gene conversion is apparently common in the 61% of ampliconic
sequences that exhibit intrachromosomal identities of 99.9%.
Figure 7 Plot of Ks (Supplementary Table 5) versus X-linked gene order for 31 X–Y gene (or
gene/pseudogene) pairs. Colour highlighting of X-linked gene names indicates whether Y
homologues are X-degenerate (yellow), ampliconic (blue) or X-transposed (pink). Within the plot,
four yellow rectangles denote four previously defined 'evolutionary strata', or groups of genes26; a
small pink rectangle highlights two X-transposed genes. Genes in the X chromosome are ordered
according to the NCBI sequence assembly of November 2002; distances between genes are not
drawn to scale. Standard errors for Ks values are shown.
Figure 8 Evolutionary map of the MSY. At the bottom is an MSY schematic, as in Fig. 1b. Coloured rectangl
schematic depict the estimated male-specific ages of the corresponding segments of the modern MSY. These a
logarithmic scale (a). b, X–Y strata 1, 2, 3 and 4 (ref. 26 and Fig. 7). c, The chromosomes (more properly, the
of the chromosomes) from which the indicated X-transposed or ampliconic sequences apparently arose throug
evolution. d, MSY genes that apparently arose at the indicated times. e, Approximate times of divergence betw
other vertebrate lineages. The methods used to estimate the male-specific ages of each of the sequences and ge
Supplementary Table 9.
Figure 9 MSY sequences exhibiting 99.9% intrachromosomal identity probably undergo Y–Y
gene conversion. a, Electronic fractionation of MSY euchromatic sequences according to
intrachromosomal similarity (per cent identity to other MSY sequences), plotted on a logarithmic
scale. Values <70% are not shown. b, Sites of productive recombination in the Y chromosome.
Shown at the top is a schematic representation of the entire Y chromosome, including the
pseudoautosomal regions (green). The pseudoautosomal regions are sites of frequent X–Y crossing
over. Within the MSY's ampliconic sequences are many sites of apparently frequent Y–Y gene
conversion; all of these sites display intrachromosomal identities of 99.9%.
Nature 423, 873 - 876 (19 June 2003); doi:10.1038/nature01723
Abundant gene conversion between arms of palindromes in human and ape
Y chromosomes
STEVE ROZEN*, HELEN SKALETSKY*, JANET D. MARSZALEK*, PATRICK J. MINX†,
HOLLAND S. CORDUM†, ROBERT H. WATERSTON†, RICHARD K. WILSON† &
DAVID C. PAGE*
* Howard Hughes Medical Institute, Whitehead Institute, and Department of Biology, Massachusetts Institute of Technology, 9
Cambridge Center, Cambridge, Massachusetts 02142, USA
† Genome Sequencing Center, Washington University School of Medicine, 4444 Forest Park Boulevard, St Louis, Missouri 63108,
USA
Correspondence and requests for materials should be addressed to D.C.P. (page_admin@wi.mit.edu). All new DNA sequences and
STSs were submitted to GenBank with accession numbers AC139189–AC139194 (chimpanzee BACs), AY090860–AY090881
(palindrome boundary sequences in apes), and G73582–G73595 (STS for amplifying palindrome boundaries); see Supplementary
Information for details.
Eight palindromes comprise one-quarter of the euchromatic DNA of the male-specific region of the human Y chromosome, the MSY1. They contain many
testis-specific genes and typically exhibit 99.97% intra-palindromic (arm-to-arm) sequence identity1. This high degree of identity could be interpreted as
evidence that the palindromes arose through duplication events that occurred
about 100,000 years ago. Using comparative sequencing in great apes, we
demonstrate here that at least six of these MSY palindromes predate the
divergence of the human and chimpanzee lineages, which occurred about 5
million years ago. The arms of these palindromes must have subsequently
engaged in gene conversion, driving the paired arms to evolve in concert.
Indeed, analysis of MSY palindrome sequence variation in existing human
populations provides evidence of recurrent arm-to-arm gene conversion in
our species. We conclude that during recent evolution, an average of
approximately 600 nucleotides per newborn male have undergone Y–Y gene
conversion, which has had an important role in the evolution of multi-copy
testis gene families in the MSY.
The human MSY palindromes, designated P1–P8, are surprisingly large, with arm
lengths that range from 9 kilobases (kb; P7) to 1.45 megabases (Mb; P1) (see Table 2
and Figs 2, 3 and 5 of the accompanying manuscript1). The paired arms of each
palindrome are separated by a non-duplicated spacer that measures 2–170 kb in
length. Fifteen gene and transcript families have been identified in the palindrome
arms (none in the spacers), and all seem to be expressed predominantly or exclusively
in testes1. Similar to the palindrome arms in which they reside, these gene families are
characterized by extremely low sequence divergence between the copies found in a
single Y chromosome.
The DAZ gene family of the MSY resides exclusively in the arms of palindromes P1 and
P2 (ref. 2). Near identity between DAZ copies in a single Y chromosome led some
investigators to conclude, based on molecular clock reasoning, that DAZ gene
amplification had occurred only within the last 200,000 years3. However, multiple Y-linked copies of DAZ also exist in apes and Old World monkeys3-6. This suggests that
palindromes P1 and P2, which contain the DAZ genes, might predate the divergence of
humans from other primate lineages. This may be true for the other MSY palindromes
as well. In that case, the near identity observed between palindrome arms could be
the consequence of gene conversion—"the non-reciprocal transfer of information from
one DNA duplex to another"7. Gene conversion sometimes involves transfer between
repeated sequences on the same chromosome8.
To test the ancient origins/gene-conversion hypothesis, we looked for evidence that
MSY palindromes were present in the common ancestor of humans and chimpanzees.
Specifically, we searched for orthologues of the eight human palindromes in
chimpanzees (Pan troglodytes), bonobos (pygmy chimpanzee, Pan paniscus) and
gorillas (Gorilla gorilla). In each species, and for each palindrome, we attempted to
amplify, by polymerase chain reaction (PCR), and sequence the two inner boundaries
(between spacer and arms) and the two outer boundaries (between arms and
surrounding sequences). We successfully amplified both inner boundaries in multiple
palindromes (Table 1). In all of these cases, the PCR products were observed only
when male genomic DNAs were used as templates, and never when using female
genomic DNAs (data not shown). This implies that the PCR products were amplified
from the male-specific regions of the great ape Y chromosomes. In all cases, the
boundary sequences were essentially identical in humans and great apes (Fig. 1a; see
also Supplementary Information). Only for P7 did we successfully amplify both outer
boundaries (in chimpanzee and bonobo). These findings suggested that: (1) most
palindromes found in the modern human MSY were already present, in the MSY, in the
common ancestor of humans and chimpanzees; and (2) inner boundaries are more
highly conserved than outer boundaries.
Figure 1 Sequence comparison of human and ape MSY
palindromes.
To enable detailed comparisons of human and chimpanzee palindromes, we screened a
male chimpanzee genomic bacterial artificial chromosome (BAC) library for clones
homologous to the inner boundaries of human palindromes P1–P8. We identified and
then sequenced chimpanzee BACs corresponding to palindromes P1, P2, P6 and P7.
(The BAC library provided only one- to twofold coverage, on average, of chimpanzee
MSY sequences and thus was not expected to contain all boundaries of MSY
palindromes.) Comparative sequence analysis confirmed the structural similarity of the
human and chimpanzee palindromes and, by inference, their common ancestry (Fig.
1b; see Supplementary Information for complete sequence alignments). We observed
1.44% sequence divergence, on average, between orthologous palindrome arms in
human and chimpanzee (Fig. 1b and Table 2). Such divergence between species
probably reflects the simple accumulation of neutral mutations in the human and
chimpanzee lineages after their separation. However, within each of the chimpanzee
palindromes studied, we observed markedly little arm-to-arm divergence: 0.028%, on
average, which is statistically indistinguishable from the 0.021% arm-to-arm
divergence observed in the human MSY palindromes (Table 2; see also Supplementary
Table 7). We conclude that the MSY palindromes predated separation of the human
and chimpanzee lineages, and that, in both the human and chimpanzee lineages, the
paired arms of the palindromes evolved in concert.
If gene conversion between palindrome arms was responsible for our findings, it might
leave traces in the recent genealogy of the human MSY. In particular, we might find
evidence that single nucleotide differences between the two arms of a human MSY
palindrome had been eliminated by gene conversion. Examination of two CDY genes—
one in each arm of palindrome P1—revealed a duplicated site of sequence variation
that fulfilled this prediction. By sequencing this duplicated site in diverse, unrelated
men, we identified some Y chromosomes with a C at this site in both arms of P1 (C/C
chromosomes), other chromosomes with a C in one arm and a T in the second arm
(C/T chromosomes), and other chromosomes with a T in both arms (T/T
chromosomes; Fig. 2a). We confirmed these findings using a PCR/restriction-digestion
assay (Supplementary Fig. 4). This single nucleotide substitution occurs at nucleotide
381 of the CDY coding region but does not alter the predicted amino acid sequence.
Figure 2 Site in CDY1 showing evidence of multiple
independent gene conversion events.
We then typed this nucleotide variant in 171 unrelated men chosen to represent the
great diversity of Y chromosomes that other investigators have discovered in human
populations. Specifically, these 171 Y chromosomes represented 42 distinct branches
of a robust tree of human Y chromosome genealogy (Supplementary Fig. 1)9. In this
sampling of the MSY genealogical tree, C/T chromosomes and T/T chromosomes were
confined to a young cluster of five closely related branches (Fig. 2b; see also
Supplementary Fig. 1). In the 37 other tested branches, only C/C chromosomes were
observed. This distribution (Fig. 2b) suggested that the chromosome immediately
ancestral to the five-branch cluster was C/T, and that this chromosome had arisen
(from a C/C chromosome) by a C→T substitution in one arm of palindrome P1.
In three of this cluster's five branches, we observed T/T as well as C/T chromosomes
(Fig. 2; see also Supplementary Fig. 1). This finding is readily explained by gene
conversion in a C/T chromosome—the ancestral chromosome for this cluster—replacing
the C in one arm of palindrome P1 with the T in the other arm. The data reveal at least
three such gene-conversion events—one in each of the branches that have T/T
chromosomes (Fig. 2b). In one of these branches, we also observed C/C chromosomes
alongside C/T and T/T chromosomes (Fig. 2b). Here we surmise that gene conversion
in a C/T chromosome replaced the T in one arm of P1 with the C in the other arm.
Thus, during recent human history, gene conversion in C/T chromosomes has used
either the C copy or the T copy as template. In addition, we investigated two other
duplicated sites of sequence variation, and at both sites we found evidence of
recurrent gene conversion during recent human history (Supplementary Figs 2 and 3).
How frequently does gene conversion occur in the MSY palindromes? Near uniformity
of arm-to-arm sequence divergence in both human and chimpanzee palindromes
(Table 2 in ref. 1 and Fig. 1b) suggests a steady-state balance between new mutations
that create differences between arms, and gene-conversion events that erase these
differences. Accordingly, we can calculate the rate of gene conversion needed to
maintain the observed divergence in the face of new mutations. Let $\mu$ be the human
MSY mutation rate, $1.6 \times 10^{-9}$ substitutions per nucleotide per year (see Methods). Let
$d$ be the observed divergence between human MSY palindrome arms ($3 \times 10^{-4}$
substitutions per duplicated nucleotide), and let $c$ be the (unknown) rate of gene
conversion (in both directions combined) per duplicated nucleotide per year.
Differences between arms are introduced at a rate of $2\mu$ (as a mutation in either arm
creates a difference between arms), and homogenized at a rate of $cd$. Thus, at steady
state, $cd = 2\mu$. Then $c = 2\mu/d = 2 \times 1.6 \times 10^{-9} / (3 \times 10^{-4}) = 1.1 \times 10^{-5}$ gene conversions
per duplicated nucleotide per year. For a 20-year human generation, this corresponds
to a rate of $2.2 \times 10^{-4}$ conversions per duplicated nucleotide per generation,
comparable to rates estimated directly in a mouse transgenic system10. Over the
5.4 Mb in human MSY palindromes ($2.7 \times 10^{6}$ duplicated nucleotides), then, an
average of about 600 duplicated nucleotides have undergone arm-to-arm gene
conversion for every son born in recent human evolution. Most of these conversions
would have involved two identical DNA sequences, and thus their products would be
unobservable. The inferred kinetics of gene conversion in MSY palindromes is
especially striking because the MSY was previously viewed as recombinationally inert
under normal circumstances: it was known previously as the non-recombining region,
or NRY.
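The arithmetic in the preceding paragraph can be retraced in a few lines. The sketch below is only an illustrative check, not part of the original analysis; it uses the figures quoted in the text, and the variable and function names are ours. Because the text rounds the yearly rate to 1.1 × 10^-5 before multiplying, the unrounded values come out slightly lower (roughly 2.1 × 10^-4 per generation and a little under 600 converted nucleotides), which is still consistent with "about 600".

```python
# Illustrative check of the steady-state estimate c = 2*mu/d.
# All numeric inputs are the values quoted in the text; names are hypothetical.

MU_PER_YEAR = 1.6e-9        # MSY mutation rate, substitutions per nucleotide per year
ARM_DIVERGENCE = 3e-4       # observed arm-to-arm divergence per duplicated nucleotide
GENERATION_YEARS = 20       # generation time assumed in the text
DUPLICATED_NT = 2.7e6       # duplicated nucleotides in human MSY palindromes


def conversion_rate_per_year(mu: float, d: float) -> float:
    """Rate of gene conversion that balances new mutations at steady state (c*d = 2*mu)."""
    return 2.0 * mu / d


c_year = conversion_rate_per_year(MU_PER_YEAR, ARM_DIVERGENCE)
c_generation = c_year * GENERATION_YEARS
converted_per_generation = c_generation * DUPLICATED_NT

print(f"c per year:        {c_year:.2e}")
print(f"c per generation:  {c_generation:.2e}")
print(f"converted nt per generation: {converted_per_generation:.0f}")
```

Changing any single input (for example the assumed generation time) shows how sensitive the "600 nucleotides per son" figure is to that assumption.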
At present, we do not know whether gene conversion in MSY palindromes occurs
during meiosis, mitosis, or both. It may involve homology-directed double-strand
break repair, as in gene conversion between homologous chromosomes or sister
chromatids11. An interesting observation is that human–chimpanzee divergence is
significantly reduced in MSY palindrome arms as compared with other MSY sequences
examined (Table 2). This reduction is evident even when comparing Alu and other
interspersed repeat sequences that are presumed to be of little functional consequence
(Supplementary Table 1). Thus, the reduced rate of evolution in palindrome arms does
not seem to be due to selective constraints. A weak directional bias in gene
conversion, favouring restoration of the original sequence, might account for these
observations.
Our finding of abundant gene conversion in MSY palindromes raises questions about
the molecular-clock dating of other segmental duplications in the human genome12.
Some of these were interpreted as being of recent origin based on low copy-to-copy
divergence13. In other cases, however, analysis by Southern blots14, 15 or quantitative
PCR16 indicated that these duplications exist in great apes as well as in humans. Thus,
these duplications might well represent conserved genomic organizations subject to
gene conversion and concerted evolution. In the case of human X-chromosomal colour
vision genes, 2 kb of comparative sequence data confirm concerted evolution 17, 18. Our
current findings, taken together with these previous results, raise the possibility that
gene conversion in primate genomes could be much more pervasive than previously
thought.
Finally, we note a strong association between gene conversion and MSY testis genes.
In humans, all genes in MSY palindromes seem to be expressed predominantly or
exclusively in testes, and most MSY genes with this expression pattern occur in
palindromes1. Given the abundance of gene conversion in palindromes, we infer that
Y–Y gene conversion has accompanied and shaped the evolution of multi-copy testis
gene families in the MSY. Perhaps some selective advantage stemmed from the
palindromic duplication of MSY testis genes during human evolution. If so, has Y–Y
gene conversion had a role in that advantage? Has it allowed genes in palindromes to
resist, or at least retard, the evolutionary decay that is a hallmark of Y chromosome
evolution19? This could explain the observation, as reported in the accompanying
paper, that intact testis-specific genes tend to be located in palindrome arms whereas
non-functional copies of these genes seem to be distributed randomly (see Table 4 in
ref. 1). A full understanding of the functional and evolutionary significance of our
findings will require further study in primates and other mammals.
Methods
Estimating the MSY mutation rate We estimated the MSY mutation rate in the
human lineage based on the data and analysis in ref. 20, and an estimate of 5.5
million years ago for the most recent common ancestor of humans and chimpanzees21.
The result is $1.6 \times 10^{-9}$ substitutions per nucleotide per year (Supplementary Fig. 5).
PCR amplification and sequencing of palindrome boundaries Supplementary
Table 2 lists the PCR primers and conditions used to amplify palindrome boundaries.
Supplementary Table 3 provides GenBank accession numbers for the chimpanzee,
bonobo and gorilla sequences obtained.
Identification and sequencing of chimpanzee BACs We screened high-density
filters from the RPCI-43 male chimpanzee BAC library22 (BACPAC resources) using
hybridization probes designed to detect sequences (1) near the inner boundaries of
palindromes P1–P6 and P8; (2) near P7; and (3) from a non-ampliconic region of the
human MSY. STS content and BAC-end sequences confirmed that, among the
candidate BACs identified by hybridization, six contained the central portions of
orthologues to human MSY palindromes. The BACs were sequenced as previously
described2. Supplementary Table 4 provides descriptions of the sequenced BACs and
their GenBank accession numbers.
Sequence analysis Sequences were aligned with CLUSTAL W using default
parameters23. In a few cases, the resulting alignments were adjusted manually. All
alignments are provided as Supplementary Information.
Typing nucleotide variants in palindrome arms The sites studied were CDY1 +
381 (Fig. 2 and Supplementary Fig. 1), CDY1 - 84 (Supplementary Fig. 2), and sY586
(Supplementary Fig. 3). sY586 was genotyped as previously described24. PCR primers
and conditions for amplifying CDY1 + 381 (sY1313) and CDY1 - 84 (sY1314) have
been deposited in GenBank (accession numbers G73596 and G73597, respectively).
When typing CDY1 + 381 by sequencing, 'primer A' in GenBank G73596 served as the
sequencing primer. CDY1 - 84 was typed by sequencing using 'primer B' in GenBank
G73597.
For the samples that showed evidence of gene conversion (Fig. 2 and Supplementary
Figs 1–3), we excluded the possibility of deletion of one copy of the variant site as
discussed in Supplementary Note 1.
Steady-state balance between mutations and gene-conversion To show that the
combined action of mutation and gene conversion results in a steady-state level of
arm-to-arm divergence, we use the following recursion: $d_{n+1} = (1 - c_g)d_n + 2\mu_g$, where
$d_n$ is the sequence divergence between repeat copies at generation $n$, $\mu_g$ is the
mutation rate per nucleotide per generation, and $c_g$ is the gene conversion rate per
duplicated nucleotide per generation. We presume that $d_0 = 0$, corresponding to no
differences between sequence copies immediately after the initial duplication event.
However, as $1 - c_g < 1$, $\lim_{n \to \infty} d_n = 2\mu_g/c_g$ for any value of $d_0$ small enough to
support $c_g$. Because $\mu_g$ and $d$ are very small, mutations almost never occur at sites that
already differ between the two palindrome arms, and this possibility can be ignored. As
shown in Supplementary Note 2, our analysis is a special case of Ohta's analysis25.
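The convergence claimed for this recursion can be checked numerically. The following is a minimal sketch, not the authors' code: it iterates the recursion as stated above and compares the result with the closed-form limit (twice the per-generation mutation rate divided by the per-generation conversion rate). The per-generation mutation rate is an assumption obtained by multiplying the yearly rate by the 20-year generation time used in the main text; the function name and iteration count are ours.

```python
# Numerical sketch of the recursion d_{n+1} = (1 - c_g) * d_n + 2 * mu_g.
# mu_g is ASSUMED to be the yearly rate (1.6e-9) times a 20-year generation.

MU_G = 1.6e-9 * 20      # mutation rate per nucleotide per generation (assumption)
C_G = 2.2e-4            # gene conversion rate per duplicated nucleotide per generation


def simulate_divergence(mu_g: float, c_g: float, d0: float = 0.0,
                        generations: int = 100_000) -> float:
    """Iterate the recursion for a fixed number of generations and return d_n."""
    d = d0
    for _ in range(generations):
        d = (1.0 - c_g) * d + 2.0 * mu_g
    return d


steady_state = 2.0 * MU_G / C_G
from_zero = simulate_divergence(MU_G, C_G, d0=0.0)
from_high = simulate_divergence(MU_G, C_G, d0=1e-3)

print(f"closed-form limit: {steady_state:.3e}")
print(f"simulated, d0 = 0:    {from_zero:.3e}")
print(f"simulated, d0 = 1e-3: {from_high:.3e}")
```

Both starting points approach the same value, close to the observed arm-to-arm divergence of about 3 × 10^-4, which illustrates why the initial divergence does not matter once enough generations have passed.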
Supplementary information accompanies this paper.
Received 10 March 2003;
accepted 7 April 2003
References
1. Skaletsky, H. et al. The male-specific region of the human Y chromosome is a mosaic of
discrete sequence classes. Nature 423, 825-837 (2003)
2. Kuroda-Kawaguchi, T. et al. The AZFc region of the Y chromosome features massive
palindromes and uniform recurrent deletions in infertile men. Nature Genet. 29, 279-286
(2001)
3. Agulnik, A. I. et al. Evolution of the DAZ gene family suggests that Y-linked DAZ plays little, or
a limited, role in spermatogenesis but underlines a recent African origin for human populations.
Hum. Mol. Genet. 7, 1371-1377 (1998)
4. Reijo, R. et al. Diverse spermatogenic defects in humans caused by Y chromosome deletions
encompassing a novel RNA-binding protein gene. Nature Genet. 10, 383-393 (1995)
5. Glaser, B. et al. Simian Y chromosomes: species-specific rearrangements of DAZ, RBM, and
TSPY versus contiguity of PAR and SRY. Mamm. Genome 9, 226-231 (1998)
6. Makova, K. D. & Li, W. H. Strong male-driven evolution of DNA sequences in humans and
apes. Nature 416, 624-626 (2002)
7. Szostak, J. W., Orr-Weaver, T. L., Rothstein, R. J. & Stahl, F. W. The double-strand-break
repair model for recombination. Cell 33, 25-35 (1983)
8. Jackson, J. A. & Fink, G. R. Gene conversion between duplicated genetic elements in yeast.
Nature 292, 306-311 (1981)
9. Underhill, P. A. et al. The phylogeography of Y chromosome binary haplotypes and the origins
of modern human populations. Ann. Hum. Genet. 65, 43-62 (2001)
10. Murti, J. R., Bumbulis, M. & Schimenti, J. C. High-frequency germ line gene conversion in
transgenic mice. Mol. Cell. Biol. 12, 2545-2552 (1992)
11. Johnson, R. D. & Jasin, M. Sister chromatid gene conversion is a prominent double-strand
break repair pathway in mammalian cells. EMBO J. 19, 3398-3407 (2000)
12. Bailey, J. A. et al. Recent segmental duplications in the human genome. Science 297,
1003-1007 (2002)
13. International Human Genome Sequencing Consortium Initial sequencing and analysis of the
human genome. Nature 409, 860-921 (2001)
14. Small, K., Iber, J. & Warren, S. T. Emerin deletion reveals a common X-chromosome inversion
mediated by inverted repeats. Nature Genet. 16, 95-99 (1997)
15. Aradhya, S. et al. Multiple pathogenic and benign genomic rearrangements occur at a 35 kb
duplication involving the NEMO and LAGE2 genes. Hum. Mol. Genet. 10, 2557-2567 (2001)
16. Rochette, C. F., Gilbert, N. & Simard, L. R. SMN gene duplication and the emergence of the
SMN2 gene occurred in distinct hominids: SMN2 is unique to Homo sapiens. Hum. Genet. 108,
255-266 (2001)
17. Deeb, S. S., Jorgensen, A. L., Battisti, L., Iwasaki, L. & Motulsky, A. G. Sequence divergence
of the red and green visual pigments in great apes and humans. Proc. Natl Acad. Sci. USA 91,
7262-7266 (1994)
18. Zhou, Y.-H. & Li, W.-H. Gene conversion and natural selection in the evolution of X-linked color
vision genes in higher primates. Mol. Biol. Evol. 18, 780-783 (1996)
19. Charlesworth, B. & Charlesworth, D. The degeneration of Y chromosomes. Phil. Trans. R. Soc.
Lond. B 355, 1563-1572 (2000)
20. Bohossian, H. B., Skaletsky, H. & Page, D. C. Unexpectedly similar rates of nucleotide
substitution found in male and female hominids. Nature 406, 622-625 (2000)
21. Kumar, S. & Hedges, S. B. A molecular timescale for vertebrate evolution. Nature 392,
917-920 (1998)
22. Fujiyama, A. et al. Construction and analysis of a human-chimpanzee comparative clone map.
Science 295, 131-134 (2002)
23. Thompson, J. D., Higgins, D. G. & Gibson, T. J. CLUSTAL W: improving the sensitivity of
progressive multiple sequence alignment through sequence weighting, position-specific gap
penalties and weight matrix choice. Nucleic Acids Res. 22, 4673-4680 (1994)
24. Saxena, R. et al. Four DAZ genes in two clusters found in AZFc region of human Y
chromosome. Genomics 67, 256-267 (2000)
25. Ohta, T. Allelic and nonallelic homology of a supergene family. Proc. Natl Acad. Sci. USA 79,
3251-3254 (1982)
26. Casanova, M. et al. A human Y-linked DNA polymorphism and its potential for estimating
genetic and evolutionary distance. Science 230, 1403-1406 (1985)
27. Underhill, P. A. et al. Detection of numerous Y chromosome biallelic polymorphisms by
denaturing high-performance liquid chromatography. Genome Res. 7, 996-1005 (1997)
28. Shen, P. et al. Population genetic implications from sequence variation in four Y chromosome
genes. Proc. Natl Acad. Sci. USA 97, 7354-7359 (2000)
Acknowledgements. We thank R. K. Alagappan and L. G. Brown for technical
contributions; N. A. Ellis, M. F. Hammer, T. Jenkins and P. A. Underhill for assistance
with genealogical studies; H. M. McClure and Yerkes Regional Primate Research Center
for samples; C. Disteche, A. E. Donnenfeld, J. H. Hersh, T. Jenkins, P. G. McDonough, B.
McGillivray, R. D. Oates, P. Patrizio, R. Rosenfield, L. Shapiro, S. Silber, M. C. Summers,
J. Weissenbach, B. Whitmire and S. Yang for patient samples; and J. E. Alfoldi, B.
Charlesworth, A. G. Clark, J. Koubova, J. Lange, B. Levy, T. L. Orr-Weaver, S. Repping,
W. R. Rice and J. Saionz for comments on the manuscript. This work was supported by the
National Institutes of Health and the Howard Hughes Medical Institute.
Competing interests statement. The authors declare that they have no competing financial
interests.
Figure 1 Sequence comparison of human and ape MSY palindromes. a, Nucleotide sequences of
inner boundaries of palindrome P6 in human and apes. Dots represent identity to human sequence.
Full interspecific alignments of this and other palindromes' boundaries are in Supplementary
Information. b, Overview of sequence divergence between human and chimpanzee palindromes,
and between palindrome arms within each species. Each palindrome is shown to scale, folded about
the centre of the spacer. For palindromes P1/P2 and P6, only the central portions are contained in
sequenced chimpanzee BACs, and the palindromes are not perfectly centred within the BACs.
Therefore, more sequence from one arm is available than from the other. For P1/P2 we include the
5' and 3' DAZ exons but exclude the central, intragenically duplicated regions of the gene24. The
CDY1 genes are not in the portions of P1 shown1. For palindrome P7, the entire sequence of both
arms is represented in sequenced chimpanzee BACs, as is extensive flanking, non-ampliconic
sequence. Supplementary Table 8 provides confidence intervals, calculations and links to sequence
alignments.
Figure 2 Site in CDY1 showing evidence of multiple independent gene conversion events. This
site, named CDY1 + 381, occurs in each arm of palindrome P1. a, Sequence traces for samples
PD365, PD335 and PD207 with C/C, C/T and T/T chromosomes, respectively. b, Distribution of
C/C, C/T and T/T chromosomes in the MSY genealogical tree, focusing on the cluster of related
branches to which C/T and T/T chromosomes are confined. M92, M67, M12, M172, p12f: biallelic
polymorphisms that define branch points in the part of the tree shown26–28. See Supplementary Fig.
1 for the full tree and inference of ancestral genotypes.
Y chromosome sequence completed
DNA readout reveals genetic palindromes safeguard male-defining chromosome.
19 June 2003
JOHN WHITFIELD
The Y chromosome has
sex with itself to guard
against mutation.
© GettyImages
Reports of the demise of the Y chromosome and an impending
extinction of men may have been exaggerated. The Y's full genome
sequence reveals that we have underestimated its powers of
self-preservation.
Instead of doubling up to protect its genetic cargo like other
chromosomes, the lone Y safeguards its genes by having sex with
itself, an international consortium has found.
"We're on a quest to bring respectability to the Y
chromosome," says geneticist David Page of the
Massachusetts Institute of Technology, leader of
the sequencing team. The male-defining
chromosome was previously thought of as a
wasteland where genes go to die.
The Y's defences are double-edged, however,
sometimes leading to infertility. The sequence
should help us to diagnose and treat such genetic
mishaps.
Two-way street
Human chromosome pairs swap genes to minimize
bad mutations. Y, which has no partner, faces
being whittled away by mutation. Some estimate
that the chromosome could be complete junk in
about ten million years.
The finished sequence shows that the chromosome
fights entropy with palindromes. About six million
of its 50 million DNA letters reside in sequences
that read the same, in opposite directions, on both
strands of the double helix. The longest is nearly
three million letters long1. "The Y chromosome is a
hall of mirrors," says Page.
These palindromes house many genes - which
means that there is a copy at each end of the
palindromic sequence. These provide back-ups
should harmful mutations arise. The mirror-image
structure also allows the arms to swap position
when DNA divides. Genes are shuffled and bad
copies are purged.
Page's team has calculated the amount of swapping needed in
each generation to produce the near-perfect palindromes of
the human Y. They estimate that every man's Y contains 600
DNA letters that differ from his father's2. This is thousands
of times more than the normal mutation rate.
There are 50 million letters in Y's finished sequence.
source: Nature
"No one had contemplated that there would be this level of
gene conversion in our own genome," says
Huntington Willard of Duke University, Durham,
North Carolina. "It gives us a glimpse of how the Y
has protected itself."
Other researchers see swapping as an evolutionary
accident, not a safeguard. "It's a daring
suggestion, but I find it a bit difficult to believe,"
says geneticist Mark Jobling of the University of
Leicester, UK.
Jobling is sceptical because the trick has a high
cost: good genes are just as liable to be lost as
bad. This is a major cause of male infertility, as
most of the genes within the palindromes control
testes development. One in every few thousand
men is infertile because key genes have been
deleted.
Y files
Genetic testing is already used to diagnose male
infertility. A fuller understanding of the Y's make-
up will help refine these tests, and improve
doctors' advice to couples. "We have a greater
knowledge of where the Y tends to break," says
Page. "Testing needs to be updated to reflect our
better understanding from the finished sequence."
The palindromes, and other forms of repeated
DNA, made the Y chromosome very tricky to
sequence. So the finished sequence comes from
just one man's Y. Getting more sequences is
essential, says Jobling, as the chromosome's
structure, and hence biology, varies greatly around
the world.
"We have a beautiful snapshot of the Y
chromosome," he says. "Now we need to look in
other lineages to build up a photo album of its
diversity."
References
1. Skaletsky, H. et al. The male-specific region of the human
Y chromosome is a mosaic of discrete sequence classes.
Nature, 423, 825 - 837, (2003).
2. Rozen, S. et al. Abundant gene conversion between arms
of palindromes in human and ape Y chromosomes. Nature,
423, 873 - 876, (2003).
Y chromosomes rewrite British history
Anglo-Saxons' genetic stamp weaker than historians
suspected
19 June 2003
HANNAH HOAG
A new survey of Y chromosomes in the British Isles suggests
that the Anglo-Saxons failed to leave as much of a genetic
stamp on the UK as history books imply1.
Some Scottish men's Y's are remarkably similar to those of
southern England.
© GettyImages
Romans, Anglo-Saxons, Danes, Vikings and Normans invaded
Britain repeatedly between 50 BC and AD 1050. Many historians
ascribe much of the British ancestry to the Anglo-Saxons
because their written legacy overshadows that of the Celts.
But the Y chromosomes of the regions tell a different story. "The Celts weren't
pushed to the fringes of Scotland and Wales; a lot
of them remained in England and central Ireland,"
says study team member David Goldstein, of
University College London. This is surprising: the
Anglo-Saxons reputedly colonized southern
England heavily.
The Anglo-Saxons and Danes left their mark in
central and eastern England, and mainland
Scotland, the survey says, and the biological traces
of Norwegian invaders show up in the northern
British Isles, including Orkney.
Similar studies, including one by the same team,
have looked at differences in mitochondrial DNA,
which we inherit from our mothers. They found
little regional variation because females tended to
move to their husbands.
But the Y chromosome shows sharper differences
from one geographic region to the next, says
geneticist Luca Cavalli-Sforza, of Stanford
University, California. "The Y chromosome has a
lower mutation rate than mitochondrial DNA."
Goldstein's team collected DNA samples from more
than 1,700 men living in towns across England,
Ireland, Scotland and Wales. They took a further
400 DNA samples from continental Europeans,
including Germans and Basques. Only men whose
paternal grandfathers had dwelt within 20 miles of
their current home were eligible.
The Y chromosomes of men from Wales and
Ireland resemble those of the Basques. Some
believe that the Basques, from the border of
France and Spain, are the original Europeans.
The new survey is an example of how
archaeologists, prehistorians and geneticists are
beginning to collaborate, comments Chris Tyler-Smith of
the University of Oxford, UK, who tracks
human evolution using the Y chromosome. "It
would be nice to see the whole world surveyed in
this kind of detail, but it's expensive and there are
other priorities."
References
1. Capelli, C. et al. A Y chromosome census of the British
Isles. Current Biology, 13, 979 - 984, (2003).
28 February 2002
Nature 415, 963 (2002); doi:10.1038/415963a
Human spermatozoa: The future of sex
R. JOHN AITKEN1 AND JENNIFER A. MARSHALL GRAVES2
1 R. John Aitken is at the Hunter Medical Research Institute and is in the Discipline of Biological Sciences, University of Newcastle,
Callaghan, New South Wales 2308, Australia.
2 Jennifer A. Marshall Graves is in the Research School of Biological Sciences, Australian National University, Canberra, ACT 2601,
Australia.
The desperate plight of the human spermatozoon is clearly reflected by the poor fecundity
of our species. Human spermatozoa stand apart from the gametes of virtually all other
mammals in the paucity of their phenotype, the inadequacy of their function, and the
sensitivity to fragmentation of their mitochondrial and nuclear genomes. Roughly one in
seven Western couples seek treatment for infertility, mostly because of problems with
semen quality.
Even when a human spermatozoon achieves the fertilization of an oocyte, damage can crop
up in the next generation. All dominant mutations in our species (such as those that cause
achondroplasia, multiple endocrine neoplasia and Apert's syndrome) seem to arise in the
male germ line. Spermatozoa are also important mediators of the environmental
contribution to cancer in young adults and children; for example, heavy smoking in fathers
is reported to confer a fourfold increase in cancer risk to their children.
The impact of environmental toxicants and the innate
inadequacy of human spermatozoa are compounded
by the advent of effective contraception and the
introduction of assisted-conception technologies. This
lifting of the selection pressure on fertility means that
those endowed with genes for high fecundity have lost
their advantage over those without. As a result, future
generations are bound to experience a further decline
in semen quality and, ultimately, human fertility.
What mechanisms are responsible for the poor
fertilizing potential and genetic damage shown by
human spermatozoa? Two main causes of germ-cell
dysfunction have recently been discovered: gene
deletions on the long arm of the male sex-determining
Y chromosome, and oxidative stress. We believe that
these aetiologies may be associated.
Swimming against the tide: sperm
quality seems set to decline still
further in future generations.
The Y chromosome is particularly vulnerable to gene deletions because it is not a matching
partner for the X chromosome, so it cannot retrieve lost genetic information by homologous
recombination. Over the past 300 million years, the mammalian Y chromosome has been
reduced from a pairing partner to the X chromosome to a shadow of its former self, rescued
only by a large addition from a non-sex-determining chromosome in 'placental' mammals.
Many of the remaining genes have acquired functions essential for sex determination and
spermatogenesis.
The original Y chromosome contained around 1,500 genes, but during the ensuing 300
million years all but about 50 were inactivated or lost. Overall, this gives an inactivation
rate of five genes per million years. The presence of many genes that have lost their
function (pseudogenes) on the Y chromosome indicates that this process of attrition is
continuing, so that even these key genes will be lost. At the present rate of decay, the Y
chromosome will self-destruct in around 10 million years. This has already occurred in the
mole vole, in which the Y chromosome (together with all of its genes) has been completely
lost from the genome.
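The extrapolation in this paragraph is a simple linear one, and it can be retraced as follows. This is only an illustrative sketch using the gene counts and time span quoted above; the assumption that genes are lost at a constant rate belongs to the article's argument and is not established here, and the variable names are ours.

```python
# Linear extrapolation of Y-chromosome gene loss, using the figures quoted in the text.
# ASSUMPTION (from the article's argument): genes are lost at a constant rate.

ORIGINAL_GENES = 1500     # approximate gene count of the original Y chromosome
REMAINING_GENES = 50      # approximate number of genes still active
ELAPSED_MYR = 300         # elapsed time in millions of years

# Average number of genes inactivated or lost per million years.
loss_rate_per_myr = (ORIGINAL_GENES - REMAINING_GENES) / ELAPSED_MYR

# Time until no genes remain if the same rate continues.
myr_until_empty = REMAINING_GENES / loss_rate_per_myr

print(f"loss rate: {loss_rate_per_myr:.2f} genes per million years")
print(f"time until no genes remain: {myr_until_empty:.1f} million years")
```

The result depends entirely on the constant-rate assumption; a rate that slows as fewer genes remain would push the projected date much further out.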
Accelerated degeneration of the Y chromosome is found in the 5–15% of severely infertile
men whose infertility is caused by wholesale deletions of parts of this chromosome.
Because mutations that cause infertility cannot be inherited, the relative abundance of
Y-chromosome deletions in male patients suggests an extremely high rate of spontaneous
DNA damage. Even microdeletions on the Y chromosome destabilize its transmission,
frequently causing it to be lost during gamete production.
One important mechanism by which DNA damage is induced in the male germ line is
oxidative stress. Spermatozoa are particularly vulnerable to this because they generate
reactive oxygen species and are rich in targets for oxidative attack. Moreover, because they
are transcriptionally inactive and have little cytoplasm, spermatozoa are deficient in both
antioxidants and DNA-repair systems. Oxidative stress is thus a major cause of male
infertility, and contributes to the high rate of DNA fragmentation in spermatozoa.
Such DNA fragmentation probably predisposes the cell to mutagenic change, which would
become fixed as a deletion in the embryo by aberrant recombination. The Y chromosome is
susceptible to such recombination because of its high frequency of repetitive elements.
Fragmentation induced by free radicals in the Y chromosome's DNA might also cause other
post-fertilization genetic changes, such as insertions and amplifications. Because mutations
that originate in this way precede the embryo's first cleavage division, they will enter the
germ line and contribute to infertility and morbidity, including cancer, in the offspring.
At present, we have no idea what causes oxidative stress, DNA fragmentation and
functional incompetence in human spermatozoa. But we do know for certain that such
events put pressure on the vulnerable Y chromosome. In the long term, absolute selection
against males with deletions that confer sex reversal or sterility will create strong pressure
either to retain (and amplify) fertility genes, or for any fertile variant that replaces it. Could
the present race of humans eventually be replaced by a new variant (or several independent
variants that cannot cross-hybridize) with an alternative sex-determining/differentiation
system? Such a new hominid race could differ from present humans in many other
characteristics, depending on the gene pool of the new variant's handful of founders.
There is evidence of rapid 'selective sweeps' in the Y chromosome's evolution. These take
place when a Y chromosome with an allele that confers a big selective advantage rapidly
replaces other Y chromosomes in the population. As the Y chromosome is never broken up
by recombination, whatever alleles lie at other genetic loci — even if they are deleterious
— will 'hitch-hike' to fixation along with the advantageous variant.
FURTHER READING
Aitken, R. J. J. Reprod. Fertil. 115, 1–7 (1999).
Marshall Graves, J. A. Biol. Reprod. 63, 667–676 (2000).
Just, W. et al. Nature Genet. 11, 117–118 (1995).
Kuroda-Kawaguchi, T. et al. Nature Genet. 29, 279–286 (2001).
Kamp, C. et al. Mol. Hum. Reprod. 7, 987–994 (2001).
10 August 2000
Nature 406, 622 - 625 (2000); doi:10.1038/35020557
Unexpectedly similar rates of nucleotide substitution found in male and female
hominids
HACHO B. BOHOSSIAN, HELEN SKALETSKY & DAVID C. PAGE
Howard Hughes Medical Institute, Whitehead Institute, and Department of Biology, Massachusetts Institute of Technology, 9
Cambridge Center, Cambridge, Massachusetts 02142, USA
Correspondence and requests for materials should be addressed to D.C.P. (e-mail: dcpage@wi.mit.edu).
In 1947, it was suggested that, in humans, the mutation rate is dramatically higher in
the male germ line than in the female germ line1. This hypothesis has been supported
by the observation that, among primates, Y-linked genes evolved more rapidly than
homologous X-linked genes2-6. Based on these evolutionary studies, the ratio ($\alpha_m$) of
male to female mutation rates in primates was estimated to be about 5. However,
selection could have skewed sequence evolution in introns and exons7-10. In addition,
some of the X–Y gene pairs studied lie within chromosomal regions with substantially
divergent nucleotide sequences7, 11, 12. Here we directly compare human X and Y
sequences within a large region with no known genes. Here the two chromosomes are
99% identical, and X–Y divergence began only three or four million years ago, during
hominid evolution13-15. In apes, homologous sequences exist only on the X
chromosome. We sequenced and compared 38.6 kb of this region from human X,
human Y, chimpanzee X and gorilla X chromosomes. We calculated $\alpha_m$ to be 1.7 (95%
confidence interval 1.15–2.87), significantly lower than previous estimates in primates.
We infer that, in humans and their immediate ancestors, male and female mutation
rates were far more similar than previously supposed.
Li et al. have suggested that the most accurate and precise estimates of $\alpha_m$ should emerge
from comparison of lengthy DNA segments that are non-functional but highly similar in
sequence7. We therefore focused upon a large region of 99% nucleotide identity between
the long arm of the human X chromosome (Xq) and the short arm of the human Y
chromosome (Yp). These sequences are present on human Yp because of a massive X-to-Y
transposition that occurred about three or four million years ago, after divergence of the
human and chimpanzee lineages13-15. This Xq–Yp region is poor in genes (T. Kawaguchi et
al., unpublished results). The 1% divergence between the human X- and Y-linked
sequences presumably reflects the random accumulation of new mutations on both the X
and Y chromosomes during the last three to four million years of hominid evolution. Given
this evolutionary history and the paucity of genes, the Xq–Yp region offers a nearly ideal
substrate for estimation of m in hominids.
From within this Xq–Yp region we selected a 38.6-kb segment for detailed study. Human
X and Y-chromosomal bacterial artificial chromosomes (BACs) containing this segment
had been isolated in our laboratory and sequenced at the Whitehead Institute/MIT Center
for Genome Research. This segment has a total content of guanine and cytosine (G+C
content) of 35%; Alus, LINEs, and other interspersed repeat elements account for 61% of
the total sequence on both X and Y. We found no known or electronically predicted genes
within this segment or within 100 kb to either side of the segment.
We compared the sequences of this 38.6-kb segment as found on the human X, human Y,
chimpanzee X and gorilla X chromosomes. For the chimpanzee and gorilla X
chromosomes, we generated sequencing templates by polymerase chain reaction (PCR)
amplification using female genomic DNAs as starting material. As controls, we
resequenced the corresponding portions of the human X and Y BACs, again using
PCR-generated templates. In this manner, we were able to assemble 38.6 kb of virtually
continuous sequence from all four chromosomes. Assembly and sequencing of genomic
PCR products across the entire 38.6-kb segment were straightforward because the region's
interspersed repeat elements, though numerous, were ancient and thus did not impede
selection of locus-specific oligonucleotide primers. This characteristic of the region,
together with the absence of genes, had led us to select it for study.
We discovered that, within this segment, the human X and Y chromosomes differed at 441
nucleotides (Table 1); the two human chromosomes were 98.86% identical, in good
agreement with previous estimates for the larger Xq–Yp region15. Of these 441 nucleotide
substitutions, we were able to infer in 413 cases whether the mutation had occurred on the
hominid X chromosome (175 cases) or on the hominid Y chromosome (238 cases). This
was done by examining, for each substitution, the corresponding nucleotide position on the
chimpanzee and gorilla X chromosomes. For example, if a T nucleotide was present on the
human Y chromosome, but an A was present at the corresponding site on human,
chimpanzee, and gorilla X chromosomes, we inferred that the primitive or ancestral state
was A, and that an A-to-T substitution had occurred on the hominid Y chromosome.
Conversely, if the human Y, gorilla X, and chimpanzee X chromosomes were identical to
each other but differed from the human X chromosome at a particular nucleotide, we
inferred that a mutation had arisen on the hominid X chromosome. We were unable to infer
the chromosomal origin of the substitution at only 28 nucleotide sites; in most such cases,
we observed nucleotide differences between chimpanzee and gorilla. (We also traced the
chromosomal origins of small insertions and deletions—15 on the hominid X chromosome,
23 on the hominid Y chromosome—but these events were too few to merit detailed
analysis.)
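The assignment rule described above can be expressed compactly. The following minimal Python sketch is illustrative only: the function name, the return labels and the toy alignment columns are assumptions introduced here, not a record of the software used in this study.

```python
from collections import Counter

def classify_site(human_x, human_y, chimp_x, gorilla_x):
    """Assign one aligned site to the lineage on which a substitution arose.

    Returns "identical", "X", "Y" or "unresolved" (labels are illustrative).
    """
    if human_x == human_y:
        return "identical"          # no X-Y difference at this site
    if chimp_x == gorilla_x:        # both outgroups agree on the ancestral base
        if chimp_x == human_x:
            return "Y"              # human Y carries the derived base
        if chimp_x == human_y:
            return "X"              # human X carries the derived base
    return "unresolved"             # outgroups disagree, or match neither human base

# Toy alignment columns (hypothetical data), ordered as:
# human X, human Y, chimpanzee X, gorilla X
columns = [
    ("A", "T", "A", "A"),
    ("G", "C", "C", "C"),
    ("A", "T", "A", "G"),
    ("C", "C", "C", "T"),
]
tally = Counter(classify_site(*col) for col in columns)
print(tally)
```

In a tally of this kind, the counts reported in the text (175, 238 and 28) correspond to the "X", "Y" and "unresolved" classes, respectively.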
Using the inferred numbers of nucleotide substitutions on the hominid X and Y
chromosomes, and ignoring the 28 unresolved differences, we estimated m by Miyata's
formula:
Y/X = 3m/(2 + m)
where Y/X is the ratio of mutation rates on the two sex chromosomes2. This formula is
based on the expectation that, in any generation, two-thirds of X chromosomes are
transmitted through the female germ line, the remaining one-third being transmitted
through the male germ line. By contrast, all Y chromosomes are transmitted through the
male germ line. Our observed Y/X ratio of (238/175) = 1.36 (95% confidence interval
1.12–1.65) implies that m = 1.66 (95% confidence interval 1.19–2.45). Restricting the
analysis to either transitions or transversions does not alter the estimate of m (Table 1).
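As a check on the arithmetic, the sketch below inverts Miyata's formula, Y/X = 3m/(2 + m), to give m = 2(Y/X)/(3 - Y/X), and computes a confidence interval for Y/X on the log scale with the standard relative-risk formula. It assumes that both substitution counts share a denominator of 38,600 aligned sites and that the interval for m is obtained by transforming the end points of the interval for Y/X. These choices, and all function names, are assumptions of this illustration rather than a record of the original calculation.

```python
import math

def m_from_ratio(r):
    """Invert Y/X = 3m/(2 + m); meaningful only for 0 < r < 3."""
    return 2.0 * r / (3.0 - r)

def ratio_with_ci(y_subs, x_subs, n_sites, z=1.96):
    """Y/X with a log-scale (relative-risk) confidence interval.

    n_sites is the assumed common denominator for both lineages.
    """
    r = y_subs / x_subs
    se_log = math.sqrt(1.0 / y_subs - 1.0 / n_sites
                       + 1.0 / x_subs - 1.0 / n_sites)
    return r, r * math.exp(-z * se_log), r * math.exp(z * se_log)

# Counts from the text: 238 Y-lineage and 175 X-lineage substitutions.
r, r_lo, r_hi = ratio_with_ci(238, 175, 38600)
m, m_lo, m_hi = (m_from_ratio(v) for v in (r, r_lo, r_hi))
print(f"Y/X = {r:.2f} ({r_lo:.2f}-{r_hi:.2f})")
print(f"m   = {m:.2f} ({m_lo:.2f}-{m_hi:.2f})")
```

Under these assumptions the sketch gives values close to those quoted above for both Y/X and m.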
Our calculations of Y/X and m are based on direct comparison of human X and human Y
chromosomal sequences with a readily inferred ancestral sequence. All previous estimates
of Y/X (and thus of m) in primates involved X–Y gene pairs that were too diverged to
allow accurate reconstruction of ancestral sequences. Previous estimates of Y/X required
construction and comparison of two trees of evolutionary distances: one tree for
orthologous Y-linked sequences in several species and a second tree for orthologous
X-linked sequences in the same species3-6. To ensure that discrepancies between past and
present findings were not attributable to different methods of calculation, we re-analysed
our sequence data using the traditional method. We first estimated evolutionary distances
between the human Y, human X, chimpanzee X and gorilla X sequences using the method
of ref. 16 (Table 2). As expected, the resulting phylogenetic tree (Fig. 1) casts chimpanzee
and gorilla as outgroups to the human X and Y sequences. The tree yields a Y/X value of
(0.67/0.51) = 1.31 (95% confidence interval 1.05–1.55), implying that m = 1.55 (95%
confidence interval 1.08–2.14). Thus, recalculating Y/X and m by the traditional
phylogenetic method yields essentially the same results as our direct calculation. The
phylogenetic tree also suggests that X–Y gene conversion was not a major factor in the
evolution of these sequences in hominids. (Even if some X–Y gene conversion had
occurred, it would not affect our estimates of m.) Finally, the phylogenetic tree
corroborates the conclusion15 that the X-to-Y transposition occurred within one to two
million years after the divergence (roughly four to six million years ago) of the human and
chimpanzee lineages.
Figure 1 Phylogenetic tree of nucleotide sequences.
Although our direct calculation indicated that m = 1.66, this may be a slight underestimate.
Our calculation assumed that the 1.1% divergence observed between the human X- and
Y-linked sequences was entirely attributable to mutations that arose after X-to-Y
transposition. However, the 38.6-kb sequence analysed may have been polymorphic (on the
X chromosome) at the time of transposition, three to four million years ago. Such ancient
polymorphism could account for part of the observed X–Y divergence and could result in
our underestimating both Y/X and m. Nonetheless, recent studies of sequence diversity on
X chromosomes in modern humans and chimpanzees suggest that the correction for ancient
polymorphism should be modest. Assuming that X-linked sequence diversity in ancient
hominids approximated that in modern human populations (4 × 10^-4 per base pair, bp)17, our
estimate of m is essentially unchanged. Alternatively, if X-linked sequence diversity in
ancient hominids was as high as that in modern chimpanzee populations (1.3 × 10^-3 per
bp)18, then our estimate of m should be corrected to about 1.8 (95% confidence interval
1.15–2.87).
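The size of this correction can be illustrated with a rough calculation. The sketch below assumes that the expected number of differences due to ancient polymorphism is the diversity multiplied by the 38,600 sites analysed, and that these differences would have been attributed equally to the X and Y lineages. Both assumptions, and the function name, belong to this illustration; the corrected values reported here were obtained with the delta method described in Methods, which the sketch does not reproduce.

```python
def corrected_m(y_subs, x_subs, diversity, n_sites):
    """Approximate m after removing differences due to ancient polymorphism.

    Assumes (for illustration) that the expected ancient differences,
    diversity * n_sites, split evenly between the X and Y lineage counts.
    """
    ancient = diversity * n_sites
    y = y_subs - ancient / 2.0
    x = x_subs - ancient / 2.0
    ratio = y / x
    return 2.0 * ratio / (3.0 - ratio)   # inverse of Y/X = 3m/(2 + m)

# Diversity values quoted in the text (per base pair).
m_human_like = corrected_m(238, 175, 4e-4, 38600)
m_chimp_like = corrected_m(238, 175, 1.3e-3, 38600)
print(round(m_human_like, 2), round(m_chimp_like, 2))
```

With these inputs the sketch gives approximately 1.7 for the human-like diversity and approximately 1.8 for the chimpanzee-like diversity, in line with the statements above.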
In any case, our estimate of m in hominids is much lower than previous estimates in
primates (Table 3). Several factors may account for this discrepancy. First, our study addressed
the last three to four million years of hominid evolution. By contrast, previous experimental
designs required comparisons among divergent primate lineages and thus yielded estimates
of m averaged across broad swaths of primate evolution3-6. These studies could not have
detected whether m was lower in hominids than in other primates. Second, our study
examined a much larger DNA segment and yielded more precise estimates of m. Large
standard errors in previous estimates (Table 3) may explain, in part, the apparent disparity
with present findings. Third, as has been pointed out, nucleotide sequence context may
influence or bias substitution rates7, 11, 12. This may have affected previous estimates of m,
as these were based on relative substitution rates in substantially diverged portions of the X
and Y chromosomes3-6. Such contextual bias should be negligible in our present
examination of X and Y-chromosomal regions whose DNA sequences are 99% identical.
Finally, and perhaps most importantly, the present analysis involved sequences which
appear to be physically distant from any gene; selective neutrality can reasonably be
assumed. By contrast, previous estimates of m were based on analyses of introns or exons.
The higher estimates of m in past studies could reflect (1) relaxed selective constraints on
Y-linked introns and exons as compared with their X-linked homologues7-9, (2) diminished
mutation rates in X-linked genes10, 19, or both. Future studies of genes within the region of
99% X–Y identity may allow investigators to document the effects, if any, of mutation and
selection bias on estimates of m.
Our findings suggest that substitution rates were only modestly higher in males than
females during the last three to four million years of hominid evolution. If these inferences
extend to modern humans, as seems likely, then they have ramifications for medical
genetics. Many individuals are afflicted by autosomal dominant or X-linked recessive
disorders because of new mutations that appeared in their parents' or grandparents'
germlines1, 20, 21. In several such disorders, and especially those caused by recurrence of
specific substitutions at particular nucleotide positions in one gene (such as
achondroplasia22), nearly all new mutations arise in the male germline. Bolstered by studies
of X–Y gene evolution in primates2-6, this dramatic sex bias at a small number of
extraordinarily mutable nucleotides has been taken as evidence that substitution rates across
the human genome are much higher in males than in females20, 21. Our results suggest a
reinterpretation of these medically ascertained hot spots for mutation. Our estimate of m ≈
1.7 is based on the complete, diverse set of germline substitutions that accumulated within a
large, selectively neutral region. Our data may provide the best global estimate to date of
substitutional sex ratios in the human genome. From this point of reference, the large sex
biases at some medically important hot spots appear as marked departures from global
norms, underscoring the importance of unexplored interactions between sequence context
and sex at these unusual sites.
Our findings also challenge the model that human mutation rates are directly proportional
to the number of cell divisions, regardless of sex. Beginning with Haldane1, high m values
have been attributed to the much greater number of germline cell divisions in males (with
mitotically active spermatogonial stem cells) than in females (where germ cells cease
dividing during foetal development)2, 3. Our results, however, suggest that sexual
asymmetry in substitution rates is far less striking than sexual asymmetry in numbers of
cell divisions, at least in hominids. We suggest two possibilities for further investigation.
Perhaps errors in mitotic DNA replication and repair account for a minority of germline
substitutions in human genes. Alternatively, perhaps DNA replication and repair are
unusually accurate in spermatogonial stem cells and their prospermatogonial precursors,
which account for most of the excess cell divisions in the male germ line.
Methods
Reference DNA sequences We studied a 38.6-kb X–Y homologous segment which
corresponds both to nucleotides 28,842–67,422 of a human X-chromosomal BAC
(GenBank AC002488) and to nucleotides 88,369–127,049 of a human Y-chromosomal
BAC (GenBank AC002509). These BACs, isolated in our laboratory from the California
Institute of Technology A (CTA) library23, had been sequenced at the Whitehead
Institute/MIT Center for Genome Research. Interspersed repetitive elements within the
38.6-kb segment were identified electronically using RepeatMasker
(http://ftp.genome.washington.edu/RM/RepeatMasker.html); all repeats were included in
the analysis of mutations. Electronic searches employing GenScan24, Grail25 and BLAST26
failed to identify any genes or exons within the 38.6-kb segment. Electronic searches also
failed to identify any genes or exons in three adjoining Y-chromosomal BACs (GenBank
AC012078, AC010094 and AC010737), all sequenced at the Washington University
Genome Sequencing Centre.
Resequencing A series of overlapping fragments, each about 1 kb in length, that
collectively spanned the 38.6-kb region was generated by PCR using each of four DNAs as
starting material: the human X-chromosomal BAC, the human Y-chromosomal BAC,
chimpanzee female genomic DNA, and gorilla female genomic DNA. PCR primers were
selected using Primer3 (ref. 27) and are available upon request. PCR products were purified
on Sephacryl-S300 columns and sequenced using fluorescent-dye-terminator cycle
sequencing protocols (ThermoSequenase kit; Amersham). Primers used in PCR generation
of sequencing templates were also used as sequencing primers; additional sequencing
primers were selected at sites internal to the PCR-generated templates.
The sequences of each of the four chromosomes (human X, human Y, chimpanzee X,
gorilla X) were assembled using Sequencher 3.1 (Gene Codes Corp.). The four
chromosomes were aligned using MegAlign (DNASTAR, Inc.), and the alignment was
edited manually (alignment available upon request). There was only one discrepancy (T→C
at nucleotide 56,776 in X-chromosomal sequence) between our PCR-generated human X
and Y-chromosomal sequences and the corresponding, GenBank-deposited reference
sequences.
Statistical calculations In this direct method, m was calculated directly from the inferred
numbers of Y and X substitutions via the formula Y/X = 3 m/(2 + m) (ref. 2). We
calculated confidence intervals for ratios of substitution rates using the formula for relative
risk28.
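For reference, the relation Y/X = 3m/(2 + m) follows from the transmission argument given in the main text. In the derivation below, the symbols \mu_m and \mu_f for the male and female germline mutation rates are notation introduced for this illustration; m is their ratio, as defined above.

```latex
\begin{align}
  \mu_Y &= \mu_m
    && \text{(Y transmitted only through males)} \\
  \mu_X &= \tfrac{2}{3}\,\mu_f + \tfrac{1}{3}\,\mu_m
    && \text{(X: two-thirds female, one-third male)} \\
  \frac{Y}{X} &= \frac{\mu_m}{\tfrac{2}{3}\mu_f + \tfrac{1}{3}\mu_m}
               = \frac{3m}{2 + m},
    && m = \frac{\mu_m}{\mu_f} \\
  m &= \frac{2\,(Y/X)}{3 - (Y/X)}
    && \text{(solving for } m\text{)}
\end{align}
```

Substituting the observed Y/X of 1.36 gives m = 2.72/1.64, or about 1.66.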
We also analysed our sequence data using the method of ref. 3. We began by calculating,
for each pairwise sequence comparison, the number of substitutions per 100 nucleotides16.
From there we estimated branch lengths and their variances29, and these values in turn
enabled us to estimate Y/X and m (ref. 3).
When correcting estimates of m for ancient polymorphism, we calculated means and
variances for the numbers of substitutions after X-to-Y transposition, and then calculated
means and standard deviations for Y/X using the delta method30.
GenBank accession numbers Gorilla: AF190869, AF190870 and AF190871.
Chimpanzee: AF190865, AF190866, AF190867 and AF190868.
Received 19 April 2000;
accepted 2 June 2000
References
1. Haldane, J. B. S. The mutation rate of the gene for haemophilia, and its segregation ratios in
males and females. Ann. Eugen. 13, 262-271 (1947).
2. Miyata, T., Hayashida, H., Kuma, K., Mitsuyasu, K. & Yasunaga, T. Male-driven molecular
evolution: A model and nucleotide sequence analysis. Cold Spring Harbor Symp. Quant. Biol.
52, 863-867 (1987).
3. Shimmin, L. C., Chang, B. H.-J. & Li, W.-H. Male-driven evolution of DNA sequences. Nature
362, 745-747 (1993).
4. Shimmin, L. C., Chang, B. H.-J. & Li, W.-H. Contrasting rates of nucleotide substitution in the
X-linked and Y-linked zinc finger genes. J. Mol. Evol. 39, 569-578 (1994).
5. Chang, B. H.-J., Hewett-Emmett, D. & Li, W.-H. Male-to-female ratios of mutation rate in higher
primates estimated from intron sequences. Zool. Stud. 35, 36-48 (1996).
6. Huang, W., Chang, B. H.-J., Gu, X., Hewett-Emmett, D. & Li, W.-H. Sex differences in mutation
rate in higher primates estimated from AMG intron sequences. J. Mol. Evol. 44, 463-465
(1997).
7. Shimmin, L. C., Chang, B. H.-J., Hewett-Emmett, D. & Li, W.-H. Potential problems in
estimating the male-to-female mutation rate ratio from DNA sequence data. J. Mol. Evol. 37,
160-166 (1993).
8. Charlesworth, B. The effect of background selection against deleterious alleles on weakly
selected, linked variants. Genet. Res. 63, 213-227 (1994).
9. Li, W.-H. Molecular Evolution (Sinauer, Sunderland, 1997).
10. McVean, G. T. & Hurst, L. D. Evidence for a selectively favourable reduction in the mutation
rate of the X chromosome. Nature 386, 388-392 (1997).
11. Bulmer, M. Neighboring base effects on substitution rates in pseudogenes. Mol. Biol. Evol. 3,
322-329 (1986).
12. Wolfe, K. H., Sharp, P. M. & Li, W.-H. Mutation rates differ among regions of the mammalian
genome. Nature 337, 283-285 (1989).
13. Page, D. C., Harper, M. E., Love, J. & Botstein, D. Occurrence of a transposition from the
X-chromosome long arm to the Y-chromosome short arm during human evolution. Nature 311,
119-123 (1984).
14. Mumm, S., Molini, B., Terrell, J., Srivastava, A. & Schlessinger, D. Evolutionary features of the
4-Mb Xq21.3 XY homology region revealed by a map at 60-kb resolution. Genome Res. 7,
307-314 (1997).
15. Schwartz, A. et al. Reconstructing hominid Y evolution: X-homologous block, created by X-Y
transposition, was disrupted by Yp inversion through LINE-LINE recombination. Hum. Mol.
Genet. 7, 1-11 (1998).
16. Tajima, F. & Nei, M. Estimation of evolutionary distance between nucleotide sequences. Mol.
Biol. Evol. 1, 269-285 (1984).
17. Kaessmann, H., Heissig, F., von Haeseler, A. & Paabo, S. DNA sequence variation in a
noncoding region of low recombination on the human X chromosome. Nature Genet. 22, 78-81
(1999).
18. Kaessmann, H., Wiebe, V. & Paabo, S. Extensive nuclear DNA sequence diversity among
chimpanzees. Science 286, 1159-1162 (1999).
19. Ellegren, H. & Fridolfsson, A. K. Male-driven evolution of DNA sequences in birds. Nature
Genet. 17, 182-184 (1997).
20. Vogel, F. & Motulsky, A. G. Human Genetics (Springer, Berlin, 1997).
21. Crow, J. The high spontaneous mutation rate: Is it a health risk? Proc. Natl Acad. Sci. USA 94,
8380-8386 (1997).
22. Wilkin, D. J. et al. Mutations in fibroblast growth-factor receptor 3 in sporadic cases of
achondroplasia occur exclusively on the paternally derived chromosome. Am. J. Hum. Genet.
63, 711-716 (1998).
23. Shizuya, H. B. et al. Cloning and stable maintenance of 300-kilobase-pair fragments of human
DNA in Escherichia coli using an F-factor-based vector. Proc. Natl Acad. Sci. USA 89,
8794-8797 (1992).
24. Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol.
Biol. 268, 78-94 (1997).
25. Uberbacher, E. C. & Mural, R. J. Locating protein-coding regions in human DNA sequences by
a multiple sensor-neural network approach. Proc. Natl Acad. Sci. USA 88, 11261-11265
(1991).
26. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search
tool. J. Mol. Biol. 215, 403-410 (1990).
27. Rozen, S. & Skaletsky, H. in Bioinformatics Methods and Protocols (eds Misener, S. &
Krawetz, S. A.) (Humana, Totowa, 1999).
28. Agresti, A. Categorical Data Analysis (Wiley, New York, 1990).
29. Li, W.-H. A statistical test of phylogenies estimated from sequence data. Mol. Biol. Evol. 6,
424-435 (1989).
30. Bishop, Y. V. V., Feinberg, S. E. & Holland, P. W. Discrete Multivariate Analysis (MIT Press,
Cambridge, Massachusetts, 1975).
Acknowledgements. We thank A. Schwartz for identifying homologous X- and
Y-chromosomal BACs, colleagues at the Whitehead Institute/MIT Centre for Genome
Research for sequencing those BACs, and J. Bradley, A. Chakravarti, B. Charlesworth, A.
Clark, D. Haig, T. Kawaguchi, L. Kruglyak, F. Lewitter, Y.-F. Lim, D. Reich, W. Rice, S.
Rozen, C. Tilford and J. Wang for comments on the manuscript. Supported in part by the
NIH.
Figure 1 Phylogenetic tree of nucleotide sequences. Branch lengths were estimated from the
pairwise evolutionary distances (substitutions per 100 sites) in Table 2.
23 March 2000
Nature 404, 351 - 352 (2000); doi:10.1038/35006158
Y-chromosome variation and Irish origins
A pre-Neolithic gene gradation starts in the Near East and culminates in western
Ireland.
Ireland's position on the western edge of Europe suggests that the genetics of its
population should have been relatively undisturbed by the demographic movements
that have shaped variation on the mainland. We have typed 221 Y chromosomes from
Irish males for seven (slowly evolving) biallelic and six (quickly evolving) simple
tandem-repeat markers. When these samples are partitioned by surname, we find
significant differences in genetic frequency between those of Irish Gaelic and of
foreign origin, and also between those of eastern and western Irish origin. Connaught,
the westernmost Irish province, lies at the geographical and genetic extreme of a
Europe-wide cline.
Surnames have been used in Ireland from about AD 950 as markers of complex local
kinship systems. As both surnames and Y chromosomes are paternally inherited, we
divided our Irish sample into seven surname cohorts for which ancient geographical
information is known, with some error. Four are of prehistoric, Gaelic origin (Ulster,
Munster, Leinster and Connaught) and three are diagnostic of historical influx (Scottish,
Norman/Norse and English)1.
The biallelic markers (SRY-1532, M9, YAP, SRY-2627 (ref. 2); SRY-8299 (ref. 3); sY81 (ref.
4); and 92R7 (ref. 5)) define nine haplogroups (clusters of genetic variants) which are
highly non-randomly distributed among human populations6, including our samples. In
particular, haplogroup 1 (hg 1) has a very high frequency in Ireland (78.1% in the island as
a whole).
Surname subdivision reveals a cline in Irish samples, with exogenous samples clearly
showing lower frequencies (English, 62.5%; Scottish, 52.9%; Norman/Norse, 83.0%) than
Gaelic Irish samples (Leinster, 73.3%; Ulster, 81.1%; Munster, 94.6%), which almost reach
fixation in the westernmost province (Connaught, 98.3%). These highly significant
differences in the frequency of hg 1 between Irish Gaelic and non-Gaelic Y chromosomes
(P < 0.001) and between eastern and western Gaelic Y chromosomes (P < 0.001) persist
when duplicated surnames are removed.
Eighty per cent (n = 26; ref. 7) of European hg 1 Y chromosomes belong to 'haplotype 15',
defined by using the complex p49f/TaqI polymorphic system8. Using this relationship, we
estimated that hg 1 frequencies follow a cline within Europe9, extending from the Near East
(1.8% in Turkey) to a peak in the Spanish Basque country (89%; ref. 10) in the west (Fig.
1). This cline mirrors other genetic gradients in Europe and is best explained by the
migration of Neolithic farmers from the Near East9. When the surname-divided Irish data
are appended to this cline, it continues to the western edge of Europe, with hg 1 — the
putative pre-Neolithic western European variant — reaching its highest frequency in
Connaught (98.3%).
Figure 1 Distribution of observed and estimated haplogroup 1
Y-chromosome haplotypes in Europe.
In a maximum-parsimony phylogenetic analysis of both biallelic and simple tandem-repeat
(STR) variation between Irish Gaelic haplotypes (Fig. 2), the hg 1 chromosomes cluster
together tightly, with the highest-frequency haplotypes occupying central positions,
suggesting a coherent common ancestry. The smaller number of non-hg 1 haplotypes shows
no such coherence, consistent with their being immigrants. Their concentration in the
eastern Gaelic cohorts may be indicative of a prehistoric influx or of later gene flow across
the linguistic barrier from historical migrant groups.
Figure 2 Consensus maximum parsimony networks of Irish
Gaelic haplotypes summarizing both simple tandem-repeat
(DYS19, DYS389I, DYS390, DYS391, DYS392 and DYS393;
ref. 14) and biallelic variation.
These findings suggest that hg 1 is the earlier, indigenous Irish variant. By taking the
ancestral haplotype as that with the most common allele for each STR and calculating the
average squared distance11 (assuming a generation time of 27 years and a mutation rate of
0.21%; ref. 12) between it and all variants (Fig. 2), we estimate a date for Irish hg 1
coalescence of 4,200 BP (95% c.i. 1,800–14,800 BP). This relatively recent date (a global
estimate of hg 1 coalescence is 30,000 BP; ref. 13) falls well within Ireland's 9,000-year
history of human habitation. Although error margins are considerable and include
uncertainty related to method, this also provides an upper bound for any agriculturally
facilitated population expansion, which, at the fringe of Europe, may have taken place in an
insular Mesolithic population of hg 1 genotype.
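The dating logic can be sketched as follows. The example assumes a single-step mutation model in which the expected squared distance from the ancestral haplotype equals the mutation rate multiplied by the number of generations, and it treats 0.21% as a per-locus, per-generation rate. The function names and the repeat counts are hypothetical and are not the Irish data.

```python
from collections import Counter

def modal_haplotype(haplotypes, counts):
    """Take the most common allele at each STR as the ancestral state."""
    ancestral = []
    for locus in range(len(haplotypes[0])):
        alleles = Counter()
        for hap, n in zip(haplotypes, counts):
            alleles[hap[locus]] += n
        ancestral.append(alleles.most_common(1)[0][0])
    return tuple(ancestral)

def coalescence_years(haplotypes, counts, mu=0.0021, generation_years=27):
    """Estimate age from the average squared distance (ASD) to the ancestor.

    Assumes E[ASD] = mu * t generations (single-step mutation model).
    """
    ancestral = modal_haplotype(haplotypes, counts)
    squared = sum(
        n * sum((a - b) ** 2 for a, b in zip(hap, ancestral))
        for hap, n in zip(haplotypes, counts)
    )
    asd = squared / (sum(counts) * len(ancestral))
    return asd / mu * generation_years

# Hypothetical repeat counts at six STRs: eight chromosomes carry the modal
# haplotype and two carry a one-step variant at the first locus.
haplotypes = [(14, 13, 24, 11, 13, 13), (15, 13, 24, 11, 13, 13)]
counts = [8, 2]
years = coalescence_years(haplotypes, counts)
print(round(years, 1))
```

Under the same assumed relation, a date of 4,200 years would correspond to an average squared distance of roughly 0.33.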
EMMELINE W. HILL*, MARK A. JOBLING† & DANIEL G. BRADLEY*
* Department of Genetics, Trinity College, Dublin 2, Ireland
† Department of Genetics, University of Leicester, University Road, Leicester LE1 7RH, UK
e-mail: dbradley@mail.tcd.ie
References
1. MacLysaght, E. The Surnames of Ireland (Irish Academic, Dublin, 1997).
2. Hurles, M. E. et al. Am. J. Hum. Genet. 63, 1793-1806 (1998).
3. Whitfield, L. S. et al. Nature 378, 379-380 (1995).
4. Seielstad, M. T. et al. Hum. Mol. Genet. 3, 2159-2161 (1994).
5. Hurles, M. E. et al. Am. J. Hum. Genet. 65, 1437-1448 (1999).
6. Jobling, M. A. & Tyler-Smith, C. Trends Genet. 11, 449-456 (1995).
7. Jobling, M. A. Hum. Mol. Genet. 3, 107-114 (1994).
8. Ngo, K. Y. et al. Am. J. Hum. Genet. 38, 407-418 (1986).
9. Semino, O. et al. Am. J. Hum. Genet. 59, 964-968 (1996).
10. Lucotte, G. & Hazout, S. J. Mol. Evol. 42, 472-475 (1996).
11. Goldstein, D. B. et al. Proc. Natl Acad. Sci. USA 92, 6723-6727 (1995).
12. Heyer, E. et al. Hum. Mol. Genet. 6, 799-803 (1997).
13. Hammer, M. F. et al. Mol. Biol. Evol. 15, 427-441 (1998).
14. Kayser, M. et al. Int. J. Legal Med. 110, 125-133 (1997).
Figure 1 Distribution of observed and estimated haplogroup 1 Y-chromosome haplotypes in
Europe. A cline stretches from a frequency of 1.8% in Turkey to peaks in the Basque country
(89%) and the west of Ireland (98% in Connaught, the westernmost marker).
Figure 2 Consensus maximum parsimony networks of Irish Gaelic haplotypes summarizing both
simple tandem-repeat (DYS19, DYS389I, DYS390, DYS391, DYS392 and DYS393; ref. 14) and
biallelic variation. Branch lengths are proportional to the number of mutational steps and node
areas are proportional to haplotype frequencies. Haplogroup (hg) 1, blue; hg 2, yellow; hg 21,
green; hg 26, white. Asterisk, estimated ancestral haplotype. The separate clustering of haplogroups
and the tight clustering of the Irish hg 1 haplotypes around a few numerous, central variants are
constant through all most-parsimonious trees and were resistant to repetition of the analysis with
randomized inputs.
doi:10.1038/ng1113
volume 33 supplement pp 266 - 275
The application of molecular
genetic approaches to the
study of human evolution
L. Luca Cavalli-Sforza1 & Marcus W. Feldman2
1. Department of Genetics, Stanford Medical School, Stanford University,
Stanford, California 94305-5120, USA.
2. Department of Biological Sciences, Stanford University, Stanford,
California 94305-5020, USA.
Correspondence should be addressed to M W Feldman. e-mail:
marc@charles.stanford.edu
The past decade of advances in molecular genetic
technology has heralded a new era for all evolutionary
studies, but especially the science of human evolution.
Data on various kinds of DNA variation in human
populations have rapidly accumulated. There is
increasing recognition of the importance of this variation
for medicine and developmental biology and for
understanding the history of our species. Haploid
markers from mitochondrial DNA and the Y chromosome
have proven invaluable for generating a standard model
for evolution of modern humans. Conclusions from
earlier research on protein polymorphisms have been
generally supported by more sophisticated DNA analysis.
Co-evolution of genes with language and some slowly
evolving cultural traits, together with the genetic
evolution of commensals and parasites that have
accompanied modern humans in their expansion from
Africa to the other continents, supports and supplements
the standard model of genetic evolution. The advances in
our understanding of the evolutionary history of humans
attest to the advantages of multidisciplinary research.
Reconstructing human evolution requires both historical and
statistical research. Although conclusions are not
experimentally verifiable because the process cannot be
repeated, various disciplines such as physical and social
anthropology, archaeology, demography and linguistics
provide complementary approaches to researching questions
of human evolution. The existence of molecular genetic
variation among human populations was first demonstrated
by Hirszfeld and Hirszfeld1 in a classic study published in
1919 of the first human gene to be described—ABO, which
determines ABO blood groups. The subsequent identification
of blood group protein markers, such as MNS and Rh,
expanded the repertoire of polymorphic markers that could be
analyzed using antibodies. R.A. Fisher showed that evolution
could be reconstructed by analyzing the multilocus genotypes
on a chromosome observed in populations and their
inheritance within families2. The term 'haplotype' for the
multilocus combination of alleles on a chromosome was
introduced by Ceppellini et al.3 during early research on the
major histocompatibility complex. Immunological methods
remained the only satisfactory technique for detecting genetic
variation until Pauling et al.4 introduced electrophoresis to
separate different mutants of hemoglobin, a technique that
was rapidly adapted to analyze variation in other blood
proteins.
It was soon obvious that genetic variation was not rare but, on
the contrary, that almost every protein had genetic variants5, 6.
These variants became useful markers for population studies.
The first book of allele frequencies in populations, published
in 1954, was limited almost completely to serological
variation7, and books listing genetic variation increased
rapidly in size and number8-10. In 1980, a method for studying
variation in DNA11 identified mutants of restriction sites by
using radioisotopes and generated several new markers. But
it was only with the development of PCR in 1986 that the
study of more general DNA variation became possible. The
development of automated DNA sequencing in the early
1990s paved the way for the application of systematic study
of genome variation to human evolutionary biology.
Data from protein markers (sometimes called 'classical'
markers) are still more abundant than are data from DNA,
although this situation is rapidly changing. For example,
Rosenberg et al.12 studied 377 autosomal microsatellite
polymorphisms in 1,065 individuals from 52 populations
producing a total of 4,199 different alleles, about half of which
were found in all principal continental regions. Another study13
of 3,899 single-nucleotide polymorphisms (SNPs) in 313
genes sampled in 82 Americans self-identified as African
American, Asian, European or Hispanic Latino found that only
21% of the sites were polymorphic in all four groups—a
fraction that would be expected to increase with more
sampled individuals. It is interesting to note, however, that so
far no conclusions derived from the earlier studies of classical
polymorphisms14 have been found to be in disagreement with
those obtained with DNA markers. Nonetheless, molecular
genetic markers have provided previously unavailable
resolution into questions of human evolution, migration and
the historical relationship of separated human populations. In
this review we discuss the evolutionary and historical forces
that have shaped genomic variation and how its interpretation
has led to a deeper understanding of the evolution of our
species.
Evolutionary events affecting genomic variation
All genetic variation is caused by mutations, of which there
are many different types. The most common and most useful
for many purposes are SNPs, which can be detected by DNA
sequencing and other recently developed methods, such as
denaturing high performance liquid chromatography15, mass
spectrometry16 and array-based resequencing17.
Allelic frequencies change in populations owing to two factors:
natural selection, which is the result of population variation
among individual genotypes in their probabilities of survival
and/or reproduction, and random genetic drift, which is due to
a finite number of individuals participating in the formation of
the next generation. Both natural selection and genetic drift
can ultimately lead to the elimination or fixation of a particular
allele. In the presence of mutation and in the absence of
selection (that is, under neutral conditions), the rate of neutral
evolution of a finite population is equal to the
mutation rate18.
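The cancellation of population size in the neutral rate can be made explicit with a few lines of arithmetic. The sketch below is illustrative only: it assumes a diploid population of N individuals, a per-copy neutral mutation rate mu, and the standard result that a new neutral mutation eventually fixes with probability 1/(2N). The function name is hypothetical.

```python
def neutral_substitution_rate(n_individuals: int, mu: float) -> float:
    """Expected neutral substitutions per generation at one locus."""
    copies = 2 * n_individuals            # gene copies in a diploid population
    new_mutations = copies * mu           # new neutral mutations per generation
    fixation_probability = 1.0 / copies   # chance that one new mutation drifts to fixation
    return new_mutations * fixation_probability


# The population size cancels: the rate stays at mu for every N.
for n in (100, 10_000, 1_000_000):
    print(n, neutral_substitution_rate(n, 1e-8))
```

Larger populations produce more new mutations, but each one is proportionally less likely to fix, so the two effects offset each other.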
The earliest evidence of selection acting on a human gene
was the discovery that heterozygotes of the hemoglobin A/S
polymorphism have greater resistance to malaria than do AA
or SS homozygotes. In malarial environments, this results in a
balanced polymorphism that maintains the S allele even
though SS individuals are severely ill with sickle-cell anemia.
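How heterozygote advantage holds a harmful allele at an intermediate frequency can be sketched with the textbook one-locus selection recursion. The fitness values below (s = 0.1 against AA, t = 0.8 against SS) are hypothetical choices for illustration, not estimates from the studies cited here.

```python
def next_s_frequency(q: float, s: float, t: float) -> float:
    """One generation of selection on the S allele frequency q.

    Relative fitnesses (hypothetical parameters): AA = 1 - s, AS = 1, SS = 1 - t.
    """
    p = 1.0 - q
    w_aa, w_as, w_ss = 1.0 - s, 1.0, 1.0 - t
    mean_fitness = p * p * w_aa + 2 * p * q * w_as + q * q * w_ss
    # S alleles come from half of the AS survivors and all of the SS survivors.
    return (p * q * w_as + q * q * w_ss) / mean_fitness


def equilibrium_s_frequency(s: float, t: float) -> float:
    """Stable S frequency under heterozygote advantage: s / (s + t)."""
    return s / (s + t)


q = 0.01  # start with S rare
for _ in range(2000):
    q = next_s_frequency(q, s=0.1, t=0.8)
print(round(q, 4), round(equilibrium_s_frequency(0.1, 0.8), 4))
```

Under these assumed fitnesses the recursion settles at the same value as the closed-form equilibrium, even though SS homozygotes are strongly selected against.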
Recent studies of DNA variation have focused on detecting
signatures of selection, either balancing or directional19. This
has produced many different statistical tests using DNA
diversity20, 21 and comparisons of nucleotide substitutions that
do or do not affect the amino acid sequence of proteins22, 23.
Strong molecular evidence of balancing selection, also in
malarial environments, has been found for the G6PD locus,
the low-activity alleles of which seem to confer resistance to
malaria24, 25. Other analyses26 have found evidence for
positive selection at both G6PD and another gene TNFSF5,
which is also implicated in the response to infectious agents.
Strong directional selection has also been proposed27 for
FOXP2, which shows a two amino-acid difference between
the human protein and the monomorphic form in primates. It
has been suggested that these changes may have been
selectively important for the evolution of speech and language
in modern humans27. In other genes, however, the agent of
selection is not at all obvious; for example, the CCR5 gene28
seems to be related to HIV resistance, and mutations in the
BRCA1 gene29 produce an increased risk of female breast
cancer. In such cases it is often very difficult to disentangle
the effects of population dynamics or structure from selective
pressures. These complications can be clearly observed in a
thorough analysis of the HFE locus30, mutations of which
result in hemochromatosis. In this study no evidence of
selection on single SNPs or on haplotypes was detected, but
significant between-continent variation was found. Unlike in
other studies12, 13, African samples showed only slightly more
rare SNPs than Europeans or Asians. This suggests the
possibility that different evolutionary models are relevant to
the different continents.
Genetic statistics of the substructure underlying human
populations may also suggest which genes are candidates to
have been under selection. The idea, originally proposed by
Cavalli-Sforza31 and expanded by Lewontin and Krakauer32, is
to compare the expected and observed values of FST statistics
(a measure of the relative amount of genetic diversity among
populations)33 for a large enough number of genes and focus
on those loci that produce extreme values. In a recent study
of 8,862 SNPs mapped to gene-associated regions34, 156
genes for which the FST value was exceptionally high and 18
for which it was exceptionally low were identified, suggesting
that these 174 genes are candidates for having been under
selection. Similar approaches have been applied to specific
genes such as G6PD24, the Duffy blood group locus35, lactase
haplotypes36, MAOA37 and skin pigmentation38; in each case,
unusually high variation among populations has been invoked
as a signature for the action of selection. The interactions
among population substructure, demography and phenotypic
variation are discussed in a recent review39.
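The FST comparison described above can be sketched for a single biallelic locus. This is a minimal illustration that assumes equally weighted subpopulations and uses the heterozygosity-based definition FST = (HT - HS) / HT; the locus names and allele frequencies are invented, and the cited studies may use different estimators.

```python
def fst(subpop_freqs):
    """FST at a biallelic locus from per-subpopulation allele frequencies."""
    n = len(subpop_freqs)
    p_bar = sum(subpop_freqs) / n
    h_total = 2 * p_bar * (1 - p_bar)                        # expected heterozygosity, pooled
    h_within = sum(2 * p * (1 - p) for p in subpop_freqs) / n  # mean within-subpopulation value
    if h_total == 0:
        return 0.0  # locus is monomorphic everywhere
    return (h_total - h_within) / h_total


# Hypothetical loci: allele frequency in each of three populations.
loci = {
    "locus_1": [0.50, 0.52, 0.48],
    "locus_2": [0.10, 0.50, 0.90],
    "locus_3": [0.30, 0.35, 0.25],
}
ranked = sorted(loci, key=lambda name: fst(loci[name]), reverse=True)
for name in ranked:
    print(name, round(fst(loci[name]), 3))
```

Loci at the top or bottom of such a ranking are the ones that would be examined as candidates for selection, subject to the demographic caveats discussed in the text.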
Migration is another important factor in human evolution that
can profoundly affect genomic variation within a population.
Most populations are relatively isolated, however, although
rare exchange of marriage partners between groups does
occur. An average of one immigrant per generation in a
population is sufficient to keep drift partially in check and to
avoid complete fixation of alleles. Sometimes a whole
population (or a fraction of it) migrates and settles elsewhere.
If the migrant group is initially small but subsequently
expands, by chance alone the frequencies of alleles among
the founders of the new population will differ from those of the
original population and even more so from those among
which it settles. In this situation, group migration has an effect
that in some respects is opposite to that of individual
migration among neighboring populations: it creates more
chances for drift and therefore divergence40. The effect will be
intergroup variation in allele frequencies.
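The remark that about one immigrant per generation keeps drift partly in check can be illustrated with Wright's island-model approximation, FST = 1 / (4Nm + 1). This formula is not given in the text; it is a standard idealization (many equal-sized populations, neutral alleles, equilibrium), and the parameter values below are hypothetical.

```python
def island_model_fst(effective_size: float, migration_rate: float) -> float:
    """Equilibrium FST under Wright's island model: 1 / (4 * N * m + 1)."""
    return 1.0 / (4.0 * effective_size * migration_rate + 1.0)


# N * m is the number of immigrants per generation.
for immigrants in (0.1, 1.0, 10.0):
    n = 1000
    m = immigrants / n
    print(immigrants, round(island_model_fst(n, m), 3))
```

Under these assumptions the result depends only on the product Nm, so one immigrant per generation has the same effect in a small population as in a large one.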
Genome structure and population history
A complete description of human genetic variation requires
more than just properties of isolated genes, microsatellites or
SNPs. Describing how these vary together within part or all of a
chromosome requires statistics of correlation between the
variation at different positions, and these are usually
described by patterns of linkage disequilibrium (LD). The
stronger the LD, the more likely that alleles at each of two
positions will be found in association with one another. Using
studies of protein variants in the 1970s and 1980s it was rare
to identify strong LD in populations of outcrossing diploids.
However, as more details become available on variation in
human DNA across populations, LD between polymorphic
DNA sites is increasingly being detected.
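As a concrete illustration of how LD between two sites is quantified, the sketch below computes the two most common pairwise statistics, D and r squared, from counts of the four possible haplotypes at two biallelic sites. The counts are invented and the function name is hypothetical.

```python
def linkage_disequilibrium(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, r_squared) from counts of the four two-site haplotypes."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at the first site
    p_B = (n_AB + n_aB) / total   # frequency of allele B at the second site
    d = p_AB - p_A * p_B          # excess of AB haplotypes over random association
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared


print(linkage_disequilibrium(50, 0, 0, 50))    # alleles always found together
print(linkage_disequilibrium(25, 25, 25, 25))  # alleles associated at random
```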
Standard population genetic theory suggests that LD between
pairs of genetic markers should decrease as the
recombination between them increases. But early studies of
short segments of DNA did not show this relationship for
SNPs41. Because the pattern of LD is expected to vary both
with local effects, such as the extent of selection, the degree
to which pairs of sites interact in response to selection
(epistasis), and with population-scale forces, such as drift
(reviewed in ref. 42), migration and non-random mating43,
genomic patterns of LD can be expected to be fairly complex.
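The theoretical expectation mentioned above, that LD falls as recombination increases, follows from a simple recursion: in a large random-mating population without selection, D shrinks by a factor (1 - c) each generation, where c is the recombination fraction between the two sites. The sketch below assumes exactly that idealized model; the values of c are hypothetical.

```python
import math


def ld_after_generations(d0: float, recombination_fraction: float, generations: int) -> float:
    """Expected D after t generations: D0 * (1 - c) ** t."""
    return d0 * (1.0 - recombination_fraction) ** generations


def ld_half_life(recombination_fraction: float) -> float:
    """Generations needed for D to halve; requires 0 < c < 1."""
    return math.log(0.5) / math.log(1.0 - recombination_fraction)


for c in (0.5, 0.01, 0.0001):
    print(c, round(ld_half_life(c), 1))
```

Selection, drift, migration and non-random mating all perturb this simple decay, which is why observed patterns can be more complex.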
Recent studies of relatively long (200–500 kb) stretches of
DNA, however, have produced a picture of blocks of high LD
interspersed by short intervals of low LD. Within the blocks of
high LD there is evidence of lack of recombination, whereas
the regions between the blocks seem to be 'hot spots' in
which recombination occurs frequently44-47. It has
therefore been suggested that the next phase of research into
human variation should focus on these blocks of high LD, for
which haplotypes, rather than single markers, will become the
unit of variation44. Although it has been known for many years
that the extent of LD among specific sets of genes shows
great variation around the world48—for example, it is usually
much weaker in African than in European populations49—
genome-wide studies covering representative worldwide
populations remain to be done50, 51.
Interpreting evolutionary history
The history of population differentiations using genetic data
was initially inferred from phylogenetic trees52-54 and from
multivariate statistical methods such as principal
components53, 55 (of which multidimensional scaling is a
derivative) that use allele frequencies. Population trees are
especially useful for reconstructing history if population
differences can be assumed to result from fissions that occur
randomly in time, with a constant rate of neutral evolution in
each population between fissions. This is likely to be roughly
true for data on several autosomal genes from large
populations that are geographically and genetically distant, as
illustrated in Figure 1, which shows nine such groups from
around the world. Completely different types of DNA variation
provide the same basic conclusion regarding the relationships
between these populations (refs. 56, 57; and L.A.
Zhivotovsky, N.A. Rosenberg and M.W. Feldman, manuscript
in preparation).
Violation of the above assumptions, such as the presence of
migration or selection, affects the interpretation of population
trees. However, when migration between geographic
neighbors is frequent, principal components displayed in two
dimensions reflect the geographical distribution of
populations. Under the simple evolutionary model described
above, trees and principal components give similar results58.
For populations that are geographically close, genetic and
geographic distances are often highly correlated (Fig. 2), with
an asymptote for the genetic distance at about 1,000–2,600
miles on average (but higher for Asia and the world, which are
not at equilibrium). Recent statistical developments in
detecting clustering among populations based on highly
polymorphic autosomal markers59 have been valuable for
analyzing very large population genetic data sets12. It is
important that this completely different approach produces the
same primary continental clusters as the earlier methods. In
its application to data sets with numerous polymorphic loci,
however, it does seem to be more sensitive in detecting and
assessing individual ancestry.
Early studies showed that genetic differences between
populations are relatively small as compared with those within
populations60, 61. Subsequent analyses, including molecular
polymorphisms of 14 populations representing all continents,
confirmed that the within-population variance was about 85%
of the total (Table 1)62. A recent analysis of 377 autosomal
microsatellite markers12 in 1,065 individuals from 52
worldwide populations found that only 5–7% of the variation
was between populations. It is the remaining 5–15%—the
between-population component—that can be used to
reconstruct the evolutionary history of populations.
Dating the origin of our species using genetic data
Archeological evidence is generally considered to support the
initial spread of humans within Africa from an East African
origin during the first half of the last 100 ky and the spread
from the same origin to all the world in the last 50–60 ky.
Analyses of numerous classical markers under this
assumption have estimated the dates of first occupation by
anatomically modern humans of Asia, Europe and Oceania at
60–40 kya, in agreement with archeological and fossil data.
Dates for the first occupation of America are estimated at 15–
35 kya. Thus, genetically derived dates are consistent with
evidence from physical anthropology, providing support for
the use of population trees63. Below we discuss how recent
analysis of DNA polymorphisms supports this timing of the
earliest split between Africans and non-Africans.
Studies of variation in DNA became possible in the early
1980s (refs. 64, 65). Subsequent estimates for the
emergence of modern humans from Africa using autosomal
restriction fragment length polymorphisms were consistent
with earlier estimates56, 66. From the analysis of several
mitochondrial DNA (mtDNA) polymorphisms, Cann et al.67
derived two important conclusions: the first major separation
in the evolutionary tree of modern humans was between
Africans and non-Africans; and the time back to the most
recent common ancestor (TMRCA) of modern human mtDNA
was 190,000 years (however, with a large error). After early
doubts about the statistical validity of these interpretations of
the data68, the order of magnitude was confirmed69-71. It is
important to note that TMRCA is usually significantly earlier
than the first archaeologically observable divergence among a
set of populations72, 73. Also, TMRCA does not necessarily
coincide with the onset of population expansion. The
'mismatch' method74 for mtDNA, which analyzes the
distribution of pairwise differences between sequences, gives estimates
that are more compatible with the beginning of expansions
inferred from archeology.
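The raw quantity that the mismatch method works with is easy to compute: for every pair of aligned sequences, count the positions at which they differ, then tabulate how often each count occurs. The sketch below shows only this tabulation step, on invented toy sequences; fitting a demographic model to the distribution is not attempted.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(sequences):
    """Map each number of pairwise differences to the number of sequence pairs showing it."""
    counts = Counter()
    for a, b in combinations(sequences, 2):
        if len(a) != len(b):
            raise ValueError("sequences must be aligned to the same length")
        differences = sum(x != y for x, y in zip(a, b))
        counts[differences] += 1
    return dict(sorted(counts.items()))


toy_sample = ["AAAA", "AAAT", "AATT"]  # hypothetical aligned haplotypes
print(mismatch_distribution(toy_sample))
```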
Because mitochondria are transmitted along only female
lineages and mtDNA is genetically haploid, the effective size
of a population of mtDNAs is a quarter of that of the
corresponding autosomes. The mutation rate of the
mitochondrial genome is about ten times higher than that of
nuclear DNA75, which provides an abundance of polymorphic
sites, but creates difficulties in reconstructing genealogies
owing to repeated and reverse mutations. As with the
nonrecombining part of the Y chromosome (NRY), there is no
evidence for recombination in mtDNA, although low-frequency
rearrangements of somatic mtDNA have been observed in
heart muscle76.
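The factor of one quarter for mtDNA can be checked by counting gene copies per breeding pair. This is a simplified sketch that assumes equal numbers of breeding males and females and ignores variance in reproductive success; the function name is hypothetical.

```python
def relative_effective_sizes(breeding_pairs: int) -> dict:
    """Gene copies per locus type, expressed relative to an autosomal locus."""
    copies = {
        "autosome": 4 * breeding_pairs,  # two copies in each parent
        "X": 3 * breeding_pairs,         # two in the mother, one in the father
        "mtDNA": 1 * breeding_pairs,     # only the mother's copy is transmitted
        "NRY": 1 * breeding_pairs,       # only the father carries one
    }
    return {name: n / copies["autosome"] for name, n in copies.items()}


print(relative_effective_sizes(500))
```

The same counting gives one quarter for the NRY, which is why both haploid systems are expected to coalesce more recently than autosomal loci.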
The mutation rate of the NRY is comparable to that of nuclear
DNA, which means that polymorphisms are more difficult to
find but genealogies are easier to reconstruct. The greater
length of DNA on the NRY (perhaps 30 million bases of
euchromatic DNA) relative to mtDNA compensates in data
analyses for its lower mutation rate. Even though the NRY
behaves effectively as a single locus, which is usually
insufficient for evolutionary analyses, it has provided results
that are consistent across many studies and in agreement
with many archeological findings. In fact, the NRY genealogy
constructed from 167 mutations77 has been replicated with a
totally independent set of 114 mutations75 and confirmed
independently using mostly different population samples78, 79.
Statistical analyses of Y chromosome data have been carried
out using coalescent theory devised by Kingman80.
Coalescent-based techniques using numerical methods to
study complex likelihood functions derived from Bayesian
analyses were developed subsequently81, 82 and have
facilitated estimation of key parameters in the Y chromosome
genealogy (ref. 83; and H. Tang et al., manuscript in
preparation) under specific assumptions about demographic
history. Tang et al.75 have shown that important evolutionary
properties of the Y chromosome TMRCA, which is close to
100 kya, can be derived under few demographic
assumptions.
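For orientation, the simplest form of the coalescent gives a closed-form expectation for the TMRCA. The sketch below assumes a neutral, constant-size population of N haploid gene copies (not the Bayesian likelihood methods of the cited work): while k lineages remain, the expected wait to the next coalescence is N / C(k, 2) generations, and the waits sum to 2N(1 - 1/n) for a sample of n. The effective size used is hypothetical.

```python
from math import comb


def expected_tmrca_generations(sample_size: int, effective_size: float) -> float:
    """Expected TMRCA in generations under the standard neutral coalescent."""
    if sample_size < 2:
        raise ValueError("need at least two sampled lineages")
    # Sum the expected waiting times while k = n, n-1, ..., 2 lineages remain.
    return sum(effective_size / comb(k, 2) for k in range(2, sample_size + 1))


for n in (2, 10, 100):
    print(n, round(expected_tmrca_generations(n, 5000)))
```

Because the total approaches 2N quickly as n grows, adding more sampled individuals changes the expected TMRCA very little under this model.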
Two recent estimates of TMRCA from mtDNA have been
made using different methods. From complete mtDNA
sequences (excluding the D loop) in a sample of 53
individuals, 516 segregating sites were seen and a TMRCA
was estimated at 171 ± 50 kya70. From a sample of 179
individuals with 971 SNPs, the TMRCA was estimated at
200–281 kya using a generation time of 25 years, and 160–
225 kya using a generation time of 20 years75. Corresponding
estimates for the NRY-based TMRCA are 60–130 kya and
72–156 kya, with generation times of 25 and 30 years,
respectively75.
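The paired ranges quoted above differ only in the assumed generation time: a TMRCA is estimated in generations, so its value in years scales in proportion to the generation length. The sketch below reproduces the quoted conversions; the helper name is hypothetical and rounding to the nearest thousand years is an assumption.

```python
def rescale_tmrca(kya_range, from_generation_years, to_generation_years):
    """Convert a TMRCA range (in kya) from one assumed generation time to another."""
    return tuple(
        round(kya * to_generation_years / from_generation_years)
        for kya in kya_range
    )


# mtDNA: 200-281 kya at 25 years per generation, re-expressed at 20 years.
print(rescale_tmrca((200, 281), 25, 20))
# NRY: 60-130 kya at 25 years per generation, re-expressed at 30 years.
print(rescale_tmrca((60, 130), 25, 30))
```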
It is important to stress that such estimates of TMRCAs do not
imply that the human population contained only one woman at
230 kya (the time of the mtDNA-based TMRCA, assuming
constant mutation rates) or only one man at 100 kya (the time
of the NRY-based TMRCA). The only implication is that all
human mitochondria existing today descend from that of a
single woman living 230 kya, and all NRYs descend from that
of a single man living 100 kya. In both cases, it is likely that
there were many more human individuals alive at the
TMRCA—whether they were of the same species as Homo
sapiens is hard to determine, but descendants of other
species are either absent or extremely rare.
Although the reconstructed genealogies of mtDNA and NRY
are broadly similar, there are some notable differences,
probably owing to social differences in migration customs. For
example, patrilocal marriage has historically been more
common than matrilocal84, which can explain differences in
mtDNA and Y chromosome data in a number of populations85-93.
Demographic differences between the sexes, such as
greater male than female mortality, the greater variance in
reproductive success of males than females and possibly the
greater frequency of polygyny than polyandry, may explain
the discrepancy between the NRY and mtDNA dates. These
factors reduce the effective number of males and may explain
the more than twofold difference between the NRY-based and
the mtDNA-based TMRCA. Another attractive alternative
explanation is that mutation rates in mtDNA are very variable,
and when this variation is taken into account TMRCA of
mtDNA could become closer to that of NRY.
Estimates of TMRCAs from autosomal genes are higher than
those from mtDNA or NRY. In theory, they should be higher
by a factor of four and the estimates are in this direction,
although the number of autosomal genes studied is small and
estimates of TMRCAs vary considerably94. For analyses of
autosomal and X chromosomes, recombination can
complicate genealogies and make TMRCAs impossible to
estimate. There is also the possibility of heterozygote
advantage, which has the potential to increase estimates of
TMRCA. Heterozygote advantage may be widespread
throughout the human genome but has been very difficult to
show unequivocally, and the only fully confirmed example is
sickle cell anemia, for which very large samples were
required. There is some optimism, however, that the
development of techniques that can detect heterosis for some
genes in yeast95 may lead to greater success in other
organisms, including humans.
Tracking migrations of our species using DNA
A recent synthesis of Y chromosome phylogeography,
paleoanthropological and paleoclimatological evidence
suggests a possible hypothesis for the evolution of human
diversity96-98. Around 100 kya or shortly after, a small
population of about 1,000 individuals (that is, a tribe), most
probably from East Africa, expanded throughout much of
Africa. Then, between 60 and 40 kya there was a second
expansion, most probably from a descendant population, into
Asia and from there to the other continents (Fig. 3). This may
be referred to as the 'standard model of modern human
evolution'; it is also called 'out of Africa 2' in recognition of an
earlier expansion of Homo erectus from Africa into Eurasia
around 1.7 million years ago and assumes that anatomically
modern humans (also called Homo sapiens sapiens) replaced
earlier poorly known species of Homo that descended from
the first migrants of H. erectus98. Genetic data provide some
indication that the spread of humans into Asia occurred
through two routes. The first was a southern route, perhaps
along the coast to south and southeast Asia, from where it
bifurcated north and south99. In the south, these modern
humans reached Oceania between 60 and 40 kya, whereas
the northern expansion later reached China, Japan and
eventually America (this might represent the second migration
to America, associated with the Na-Dene languages,
postulated by Greenberg100). The second was a central route
through the Middle East, Arabia or Persia to central Asia, from
where migration occurred in all directions reaching Europe,
east and northeast Asia about 40 kya, after which the first and
principal migration to America suggested by Greenberg
occurred not later than 15 kya101.
It is still unresolved whether the divergence between these
two expansion routes occurred in Africa or after entry into
west Asia, and, if the latter, where it happened. Most literature
accepts without discussion that the entry to Europe and
central Asia was through the Levant. It is not at all certain that
this was the only or the earliest route. These two initially
divergent routes converged later, especially in the extreme
East and America.
An alternative to the out of Africa 2 hypothesis, originated by
Weidenreich102 and expanded and called 'multiregional' by
Wolpoff103, maintains that all human populations living today
originated in their various continents and evolved in parallel
into modern humans. The main basis of this hypothesis is the
claim that most ancient fossils (essentially those from Europe
and Asia but not Oceania and America, where the human
fossils found are all very recent and of modern human type)
show a continuous morphological transition to modern
humans. An extreme example of parallel evolution that
included the doubling of brain volume is invoked to explain
this scenario. In later versions of the multiregional model,
parallelism is claimed to be the result of substantial
intermigration100, 104.
Recent quantitative anthropological research on several
human skulls has shown no morphological continuity in the
various continents85. In addition, in the only part of the world
where there existed a human type with some clear similarity
to modern humans (namely Neandertals in Europe and west
Asia), this purported ancestor of modern Europeans
disappeared shortly after the appearance of modern humans
(40–30 kya). MtDNA analysis of three Neandertals from
Germany105, 106, Croatia107 and the Caucasus108 detected no
similarity with modern humans and indicated that the
evolutionary separation of Neandertal from modern humans
took place at least 500 kya.
It has been claimed that the age of TMRCA derived from the
few human autosomal genes examined (between 500 and
1,000 kya) is proof of early expansions that have not been
detected in NRY and mtDNA but are compatible with the
multiregional hypothesis109, 110. Templeton100 proposes that
this ancient TMRCA of autosomal genes is due to multiple
migrations from Africa of H. erectus types before out of Africa 2
and the origin of modern humans. There is no evidence for
such early migrations; even small populations tend to
maintain high genetic variation (Table 1), and the amount of
variation observed between human populations today is so
small relative to the average variation within populations that
it could have easily accumulated in the 100–200 ky before the
present.
Recent simulation-based tests of the nested-clade method
used by Templeton have found that it may produce an
inference of long-term recurrent gene flow where this is
specifically excluded from the simulation111. It is also
important that the extent of LD in autosomal genes is much
lower in African than non-African populations48, 49, 112, 113,
suggesting that non-African populations represent a small
genetic subset of the Africans. LD has had a long time to
dissipate in Africa, and the polymorphisms of the autosomal
genes from which their (expected) long TMRCAs are
calculated are much more likely to have arisen in Africa than
in Asia.
High resolution history using haploid markers
The identification in recent years of a large number of SNPs
on the NRY and mtDNA has afforded higher resolution of
population history through the reconstruction of the
phylogenetic relationships of extant Y chromosomes and
mtDNA (Fig. 4). Using the nomenclature developed by the Y
Chromosome Consortium114, the first two haplogroups (Fig.
4a; A and B) are almost completely African and even today
represent mostly hunter-gatherers or their descendants, who
have never reached high population densities or undergone
high rates of increase. Slow growth is indicated by the
accumulation of many mutations within a branch, as in most
descendants of haplogroup A and B and in those of the
earliest branches of haplogroups C, D, E and F. By contrast,
when there are many branches (called a starburst) after a
specific mutation or group of mutations, we can infer rapid
growth115, 116. The major expansions are those of haplogroup F
(seven branches) after an initial lag in population growth, and
even more remarkable is the later expansion of haplogroup K
(nine branches). These began in the last 40 ky and led to the
major settlement of all continents from Africa, first to Asia, and
from Asia to the other three continents. The tree of mtDNA
(Fig. 4b) is more bushy, but there are more haplogroups
because of the higher mutation rate. The general structures of
the male and female genealogies in Figure 4 are the same.
The earliest branches all remain in Africa; in both trees they
clearly refer to the slowly growing hunter-gatherers. In both
trees the major growth in Africa is due to a late branch, taking
place in the second part of the last 100,000 years and clearly
connected with the expansion to Asia. The M, N branches of
the mtDNA phylogeny indicate the separation of the
expansion from Africa to Asia into a southern and a northern
branch. In the NRY genealogy, the southern branch is on
average earlier than the northern, and includes mostly
haplogroups C, D, H, M and L. Of these, H and L remained in
India and part of C went to Oceania, the rest to Mongolia,
Siberia, and eventually to northwest America (Na-Dene
speakers). D went as far as southeast Asia and Japan. The
northern Asian expansion remains mostly in East Asia
(haplogroup O—its branch N has a major propagule to N.E.
Europe, among Uralic speakers). Haplogroup I from north
Asia generates what is probably the first major Paleolithic
expansion to central Europe. G and J are found today in the
Middle East and from there expanded to Europe, mostly in the
south and probably with Neolithic farmers. R is found in
Europe, India, Pakistan, and America, but an early branch
seems to have returned to the central part of the Sahel in
North Africa. Haplogroup Q generates most Amerinds, except
for Na-Dene speakers and Eskimos. Haplogroup I is also
found in north and central Europe, where it probably
originated around 20 kya. A few indigenous individuals in
America and Australia probably inherited European Y
chromosomes.
Parallel developments to human evolution
What were the causes of the expansions that increased the
number of modern humans by a million times or more over
the past 100 kyr? Many capabilities distinguish modern
humans from our predecessors (especially our closest
relative, Neandertal): sophistication of stone tools, art, religion
and, above all, language. We cannot totally exclude art or
religion among Neandertals, but it is usually claimed that
modern humans showed a very early, sudden development of
art, with common themes related to magic, religion and an
afterlife linked to the making of tombs117, 118, although there is
evidence that many of these aspects of modern human
behavior have a long history in Africa119.
It has been argued on anatomical grounds that Neandertals could
not speak languages like ours, but the evidence offered is
considered inconclusive120, 121. Modern human languages are
mutually incomprehensible and superficially unrelated to each
other. A general classification based on 12 language families
has been suggested by Greenberg (Fig. 5)88, 122-125. For
geneticists like us, it seems natural to think that modern
languages derive mostly or completely from a single language
spoken in East Africa around 100 kya, given that today's
genes also derive from that population. This does not mean
that this was the only language in existence at the time; in
parallel with genetic TMRCAs, it was the only language then
existing that survived and evolved with rapid differentiation
and transformation. Evidence supporting the existence of a
common single language includes the shared lexicon, sounds
and grammar of present-day languages. Language, like many
other forms of cooperation, must have originated as
intrafamilial communication126.
The expansion of modern humans may have been stimulated
by the development of a new, more sophisticated culture of
stone tools (called Aurignacian), which developed at the time
of the expansion127. It is also very likely that navigation
became available (or else the passage from southeast Asia to
Oceania would have been impossible) and may even have
been used earlier, such as in coastal south Asia87, or later
along the Pacific American coast.
Innovations that increased food availability may have then
allowed groups to remain in the same area and to increase in
size. This apparently happened in many parts of the world on
a massive scale starting 10–13 kya with the adoption of
agriculture and pastoralism. From the beginning of food
production to the present, there must have been a
thousandfold population increase. Demographic growth in the
well-identified, specific areas of origin of agriculture must have
stimulated a continuous peripheral population expansion
wherever the new technologies were successful. 'Demic
expansion' is the name given to the phenomenon (that is,
farming spread by farmers themselves) as contrasted with
'cultural diffusion' (that is, the spread of farming technique
without movement of people). Innovations favoring
demographic growth would be expected to determine both
demic and cultural diffusion55, 128, 129. Recent research
suggests a roughly equal importance of demic and cultural
diffusion of agriculture from the Near East into Europe in the
Neolithic period130, 131.
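The thousandfold increase since the beginning of food production implies only a modest average growth rate per generation. The sketch below works this out under two assumptions that are not from the text: steady exponential growth and a 25-year generation time.

```python
import math


def per_generation_growth(fold_increase: float, years: float, generation_years: float = 25.0) -> float:
    """Average growth rate per generation that yields the given fold increase."""
    generations = years / generation_years
    return math.exp(math.log(fold_increase) / generations) - 1.0


# A thousandfold increase spread over 10,000 and over 13,000 years.
for span in (10_000, 13_000):
    print(span, round(100 * per_generation_growth(1000, span), 2), "% per generation")
```

Real growth was certainly uneven in time and space, so these figures are long-run averages, not a description of any particular period.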
Demic diffusion also results in the spread of the language of
the initiators of the expansion. This probably occurred for
Indo-European languages spreading from the Middle East to
Europe and India132, or for Austronesian languages spreading
to Polynesia133. There is generally a strong correlation
between linguistic families and the genetic tree of major
populations14, 63, 134, with some important exceptions: there are
clear examples of historically dated language replacements. It is likely that
these language shifts have become more common recently,
with massive colonizations made possible by development of
transportation and military technology.
Knowledge, which forms the basis of human behavior, is
accumulated by 'cultural transmission' over generations and is
subject to rapid change within generations. We have
developed a theory of cultural transmission, in which the most
important feature is 'duality': culture is transmitted either
'vertically' from parents to children or 'horizontally' between
people with no particular age or genetic relationship135.
Evolution under vertical transmission is slow, although faster
than genetic evolution, and its time unit of one generation is
the same. In assessing the importance of vertical
transmission, we note that children are more prone to accept
parental education because of specific susceptibilities during
'critical periods' of maturation135, 136. For example, most
'mother tongues' are learned without accent only in the first 4–
5 years. But under coercion or other special circumstances,
the language of a whole population can be fully replaced in 3–
4 generations. Although complete rapid replacement of
languages may occur, such events are probably rare.
Evolution under horizontal cultural transmission is usually
much faster than under vertical transmission, and modern
means of communication have made it exceptionally fast.
Present-day humans are a 'cultural animal', but even today
old customs may persist because some vertical cultural
transmission remains important.
Humans carry many parasites or commensal organisms,
some of which began their relationship with humans more
than 100 kya. If their transmission is even partly vertical—as it
is for hepatitis B virus—then their evolution is similar to that of
humans, with origins in Africa and a spread first to Asia and
then, independently, from Asia to the other three continents. It
has been suggested that this is true of other viruses, such as
polyomavirus137, and also of the bacterium Helicobacter
pylori138, which was recently found to be the causative agent
of gastric ulcer. It is likely that the same evolutionary
properties will be detected for other commensals and
parasites, indicating that at least part of their transmission is
vertical.
Summary and outlook
Late twentieth century population genetic research was
marked by a significant expansion in the available research
tools through a greater appreciation of the level of
polymorphism in the human genome. The development of
assays for loci that allowed inferences about female
(mtDNA)– or male (Y chromosome)–specific histories yielded
new insights into human history. A growing appreciation of the
importance of the genetic structure of human populations has
seen the scope and application of population genetic studies
expand.
In many ways we are currently hampered by the limited range
of populations from which samples are available for detailed
analysis. The World Cell Line Collection of 1,064 individuals
from 52 populations is a beginning, but at least 5,000–10,000 individuals
from a more representative sampling of all continents would
be preferable. Inferences about human history from small
samples13, 17 are invariably fallible. Most published analyses
concern genes chosen because of a putative relation to some
phenotype, but sampling of DNA variation should be random
with respect to both coding and non-coding regions71.
Current statistical procedures to estimate the extent of
migration or to measure the strength of selection from
patterns of nucleotide variation are still primitive. New
computational and analytical methods are needed for both if
we are to increase our confidence in the calculation of ages of
mutations and TMRCAs. A key requirement here is the ability
to separate selection from demographic effects. Comparative
sequencing of primates may facilitate the detection and
estimation of selection.
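As a rough illustration of why demographic assumptions matter for such
calculations, the sketch below computes the expected TMRCA under the
standard neutral coalescent (refs 80, 81) for a constant-size population,
in which a sample of n lineages drawn from N gene copies has an expected
TMRCA of 2N(1 - 1/n) generations. The function names, the population
sizes and the choice of Python are illustrative assumptions, not part of
the analyses reviewed here.

```python
import random


def expected_tmrca(n_lineages, n_copies):
    """Expected TMRCA (in generations) of n_lineages sampled from a
    constant-size population of n_copies gene copies, under the
    standard neutral coalescent: 2 * N * (1 - 1/n)."""
    if n_lineages < 2:
        raise ValueError("need at least two lineages")
    return 2.0 * n_copies * (1.0 - 1.0 / n_lineages)


def simulate_tmrca(n_lineages, n_copies, rng):
    """Draw one TMRCA by summing the exponential waiting times
    between successive coalescence events."""
    t = 0.0
    k = n_lineages
    while k > 1:
        # With k lineages, pairs coalesce at rate k(k-1)/(2N) per generation.
        rate = k * (k - 1) / (2.0 * n_copies)
        t += rng.expovariate(rate)
        k -= 1
    return t


def mean_simulated_tmrca(n_lineages, n_copies, replicates=20000, seed=1):
    """Average many simulated TMRCAs (hypothetical helper for checking
    the closed-form expectation)."""
    rng = random.Random(seed)
    total = sum(simulate_tmrca(n_lineages, n_copies, rng)
                for _ in range(replicates))
    return total / replicates
```

Under this simple model the expectation scales linearly with the number
of gene copies, so a fourfold difference in assumed effective size (as
between autosomes and the Y chromosome when equal numbers of males and
females breed) shifts the expected TMRCA fourfold in the absence of any
selection.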
For haplotype determination, large samples of trios—father,
mother, son—would be useful but expensive to obtain on a
worldwide scale. Thus, improved algorithms for estimating
haplotypes are required. Systems that combine SNPs and
microsatellites may provide a way to map haplotypes more
finely, to assess erosion of LD and to reconstruct the
evolutionary history of gene regions139. Construction of
somatic cell hybrids might, in the future, enable individual
chromosomes to be isolated and made available for
haplotypic analysis.
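To show why trios are informative, the sketch below phases a child's
genotypes at biallelic SNPs from the parents' genotypes by simple
Mendelian logic. It assumes genotypes coded as counts of the alternative
allele (0, 1 or 2), no genotyping error and no de novo mutation; the
function names are hypothetical.

```python
def transmissible(genotype):
    """Alleles (0 = reference, 1 = alternative) that a parent with the
    given alternative-allele count can pass on."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def phase_trio(father, mother, child):
    """Phase the child's genotype at each SNP.

    Each argument is a list of alternative-allele counts, one per SNP.
    Returns a list with one entry per SNP: a (paternal_allele,
    maternal_allele) tuple when the phase is determined, or None when
    it is ambiguous (all three individuals heterozygous).
    """
    phased = []
    for f, m, c in zip(father, mother, child):
        # Enumerate every paternal/maternal allele pair that is
        # compatible with the child's genotype.
        options = [(a, b)
                   for a in sorted(transmissible(f))
                   for b in sorted(transmissible(m))
                   if a + b == c]
        if not options:
            raise ValueError("genotypes are not Mendelian-consistent")
        phased.append(options[0] if len(options) == 1 else None)
    return phased
```

Sites where father, mother and child are all heterozygous remain
unresolved, which is one reason statistical haplotype estimation is
still needed even when trios are available.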
There is great scope for more interaction among
anthropologists and population geneticists. Recent work by
Hewlett et al.140 suggests that correlation of microcultural
variation and genetic variation in the same groups can be
very informative about population interactions on various
timescales. In the same vein, there are still few studies that
compare patterns of variation in representative populations of
human pathogens with those in their hosts. Perhaps this is a
symptom of our focus on the genetics and diseases of
developed countries and of the tiny fraction of available
resources allocated to studying genetic variation in those
populations about whom we have the least knowledge.
REFERENCES
1. Hirszfeld, L. & Hirszfeld, H. Essai d'application des
méthodes au problème des races. Anthropologie 29, 505-537 (1919).
2. Race, R.R. & Sanger, R. Blood Groups in Man (Blackwell
Scientific, Oxford, 1975).
3. Ceppellini, R. et al. Genetics of leukocyte antigens. A
family study of segregation and linkage. In
Histocompatibility Testing (eds. Curtoni, E.S., Mattiuz,
P.L. & Tosi, R.M.) (Munksgaard, Copenhagen, 1967).
4. Pauling, L., Itano, H.A., Singer, S.J. & Wells, I.C. Sickle
cell anemia, a molecular disease. Science 110, 543-548
(1949).
5. Harris, H. Enzyme polymorphisms in man. Proc. R. Soc.
Lond. B 164, 298-310 (1966).
6. Lewontin, R.C. & Hubby, J.L. A molecular approach to the
study of genetic heterozygosity in natural populations. II.
Amount of variation and degree of heterozygosity in
natural populations of Drosophila pseudoobscura.
Genetics 54, 595-609 (1966).
7. Mourant, A.E. The Distribution of Human Blood Groups
(Blackwell Scientific, Oxford, 1954).
8. Mourant, A.E., Kopec, A.C. & Domaniewska-Sobczak, K.
The Distribution of the Human Blood Groups and Other
Polymorphisms (Oxford Univ. Press, London, 1976).
9. Mourant, A.E., Kopec, A.C. & Domaniewska-Sobczak, K.
Blood Groups and Diseases (Oxford Univ. Press, Oxford,
1978).
10. Nei, M. & Roychoudhury, A.K. Human Polymorphic Genes:
World Distribution (Oxford Univ. Press, New York, 1988).
11. Botstein, D., White, R.L., Skolnick, M. & Davis, R.W.
Construction of a genetic linkage map in man using
restriction fragment length polymorphisms. Am. J. Hum.
Genet. 32, 314-331 (1980).
12. Rosenberg, N.A. et al. Genetic structure of human
populations. Science 298, 2381-2385 (2002).
13. Stephens, J.C. et al. Haplotype variation and linkage
disequilibrium in 313 human genes. Science 293, 489-493
(2001).
14. Cavalli-Sforza, L.L., Menozzi, P. & Piazza, A. The History
and Geography of Human Genes (Princeton Univ. Press,
Princeton, NJ, 1994).
15. Xiao, W. & Oefner, P.J. Denaturing high-performance
liquid chromatography. Hum. Mutat. 17, 439-474 (2001).
16. Oberacher, H. et al. Re-sequencing of multiple single
nucleotide polymorphisms by liquid chromatography-electrospray
ionization mass spectrometry. Nucleic Acids Res. 30, e67 (2002).
17. Patil, N. et al. Blocks of limited haplotype diversity
revealed by high-resolution scanning of human
chromosome 21. Science 294, 1719-1723 (2001).
18. Kimura, M. Evolutionary rate at the molecular level.
Nature 217, 624-626 (1968).
19. Przeworski, M., Hudson, R.R. & Di Rienzo, A. Adjusting
the focus on human variation. Trends Genet. 16, 296-302
(2000).
20. Hudson, R.R., Kreitman, M. & Aguadé, M. A test of
neutral molecular evolution based on nucleotide data.
Genetics 116, 153-159 (1987).
21. Tajima, F. Statistical method for testing the neutral
mutation hypothesis by DNA polymorphism. Genetics
123, 585-595 (1989).
22. Muse, S.V. & Gaut, B.S. A likelihood approach for
comparing synonymous and non-synonymous nucleotide
substitutions. Mol. Biol. Evol. 11, 715-724 (1994).
23. Yang, Z. & Nielsen, R. Synonymous and non-synonymous
rate variation in nuclear genes of mammals. J. Mol. Evol.
46, 409-418 (1998).
24. Tishkoff, S.A. et al. Haplotype diversity and linkage
disequilibrium at human G6PD: recent origin of alleles
that confer malarial resistance. Science 293, 455-462
(2001).
25. Verrelli, B.C. et al. Evidence for balancing selection from
nucleotide sequence analyses of human G6PD. Am. J.
Hum. Genet. 71, 1112-1128 (2002).
26. Sabeti, P.C. et al. Detecting recent positive selection in
the human genome from haplotype structure. Nature
419, 832-837 (2002).
27. Enard, W. et al. Molecular evolution of FOXP2, a gene
involved in speech and language. Nature 418, 869-872
(2002).
28. Bamshad, M.J. et al. A strong signature of balancing
selection in the 5' cis-regulatory region of CCR5. Proc.
Natl. Acad. Sci. USA 99, 10539-10544 (2002).
29. Huttley, G.A. et al. Adaptive evolution of the tumour
suppressor BRCA1 in humans and chimpanzees. Nat.
Genet. 25, 410-413 (2000).
30. Toomajian, C. & Kreitman, M. Sequence variation and
haplotype structure at the human HFE locus. Genetics
161, 1609-1623 (2002).
31. Cavalli-Sforza, L.L. Population structure and human
evolution. Proc. R. Soc. Lond. B 164, 362-379 (1966).
32. Lewontin, R.C. & Krakauer, J. Distribution of gene
frequency as a test of the theory of the selective
neutrality of polymorphisms. Genetics 74, 175-195
(1973).
33. Weir, B.S. Genetic Data Analysis II (Sinauer, Sunderland,
MA, 1996).
34. Akey, J.M., Zhang, G., Zhang, K., Jin, L. & Shriver, M.D.
Interrogating a high-density SNP map for signatures of
natural selection. Genome Res. 12, 1805-1814 (2002).
35. Hamblin, M.T., Thompson, E.E. & Di Rienzo, A. Complex
signatures of natural selection at the Duffy blood group
locus. Am. J. Hum. Genet. 70, 369-383 (2002).
36. Hollox, E.J. et al. Lactase haplotype diversity in the Old
World. Am. J. Hum. Genet. 68, 160-172 (2001).
37. Gilad, Y., Rosenberg, S., Przeworski, M., Lancet, D. &
Skorecki, K. Evidence for positive selection and
population structure at the human MAO-A gene. Proc.
Natl. Acad. Sci. USA 99, 862-867 (2002).
38. Rana, B.K. et al. High polymorphism at the human
melanocortin 2 receptor locus. Genetics 151, 1547-1557
(1999).
39. Goldstein, D.B. & Chikhi, L. Human migrations and
population structure: what we know and why it matters.
Annu. Rev. Genom. Hum. Genet. 3, 129-152 (2002).
40. Cavalli-Sforza, L.L. Some current problems in human
population genetics. Am. J. Hum. Genet. 25, 82-104
(1973).
41. Clark, A.G. et al. Haplotype structure and population
genetic inferences from nucleotide-sequence variation in
human lipoprotein lipase. Am. J. Hum. Genet. 63, 595-612
(1998).
42. Ewens, W.J. in Mathematical Population Genetics 98-104
(Springer, Berlin, 1979).
43. Feldman, M.W. & Christiansen, F.B. The effect of
population subdivision on two loci without selection.
Genet. Res. 24, 151-162 (1974).
44. Daly, M.J., Rioux, J.D., Schaffner, S.F., Hudson, T.J. &
Lander, E.S. High-resolution haplotype structure in the
human genome. Nat. Genet. 29, 229-232 (2001).
45. Jeffreys, A.J., Kauppi, L. & Neumann, R. Intensely
punctate meiotic recombination in the class II region of
the major histocompatibility complex. Nat. Genet. 29,
217-222 (2001).
46. Goldstein, D.B. Islands of linkage disequilibrium. Nat.
Genet. 29, 109-111 (2001).
47. Reich, D.E. et al. Human genome sequence variation and
the influence of gene history, mutation and
recombination. Nat. Genet. 32, 135-142 (2002).
48. Payne, R., Feldman, M.W., Cann, H. & Bodmer, J.G. A
comparison of HLA data of the North American black with
African black and North American caucasoid populations.
Tissue Antigens 9, 135-147 (1977).
49. Kidd, K.K. et al. A global survey of haplotype frequencies
and linkage disequilibrium at the DRD2 locus. Hum.
Genet. 103, 211-227 (1998).
50. Reich, D.E. et al. Linkage disequilibrium in the human
genome. Nature 411, 199-204 (2001).
51. Gabriel, S.B. et al. The structure of haplotype blocks in
the human genome. Science 296, 2225-2229 (2002).
52. Edwards, A.W.F. & Cavalli-Sforza, L.L. Reconstruction of
evolutionary trees. In Phenetic and Phylogenetic
Classification (eds. Heywood, V.E. & McNeill, J.) 67-76
(The Systematics Association, London, 1964).
53. Cavalli-Sforza, L.L. & Edwards, A.W.F. Analysis of human
evolution. Proc. 11th Int. Congr. Genet. 2, 923-933
(1964).
54. Cavalli-Sforza, L.L. & Edwards, A.W.F. Phylogenetic
analysis: models and estimation procedures. Am. J. Hum.
Genet. 19, 223-257 (1967).
55. Menozzi, P., Piazza, A. & Cavalli-Sforza, L.L. Synthetic
maps of human gene frequencies in Europe. Science 201,
786-792 (1978).
56. Bowcock, A.M. et al. Drift, admixture, and selection in
human evolution: A study with DNA polymorphisms. Proc.
Natl. Acad. Sci. USA 88, 839-843 (1991).
57. Bowcock, A.M. et al. High resolution of human
evolutionary trees with polymorphic microsatellites.
Nature 368, 455-457 (1994).
58. Cavalli-Sforza, L.L. & Piazza, A. Analysis of evolution:
evolutionary rates, independence, and treeness. Theor.
Popul. Biol. 8, 127-165 (1975).
59. Pritchard, J.K., Stephens, M. & Donnelly, P.J. Inference of
population structure using multilocus genotype data.
Genetics 155, 945-959 (2000).
60. Lewontin, R.C. The apportionment of human diversity. In
Evolutionary Biology Vol. 6 (eds. Dobzhansky, T.H.,
Hecht, M.K. & Steere, W.C.) 381-398
(Appleton-Century-Crofts, New York, 1972).
61. Nei, M. & Roychoudhury, A.K. Genic variation within and
between the three major races of Man, Caucasoids,
Negroids, and Mongoloids. Am. J. Hum. Genet. 26, 421-443
(1974).
62. Barbujani, G., Magagni, A., Minch, E. & Cavalli-Sforza,
L.L. An apportionment of human DNA diversity. Proc.
Natl. Acad. Sci. USA 94, 4516-4519 (1997).
63. Cavalli-Sforza, L.L., Piazza, A., Menozzi, P. & Mountain, J.
Reconstruction of human evolution; bringing together
genetic, archaeological, and linguistic data. Proc. Natl.
Acad. Sci. USA 85, 6002-6006 (1988).
64. Brown, W.M., George, M. Jr. & Wilson, A.C. Rapid
evolution of animal mitochondrial DNA. Proc. Natl. Acad.
Sci. USA 76, 1967-1971 (1979).
65. Johnson, M.J., Wallace, D.C., Ferris, S.D., Rattazzi, M.C.
& Cavalli-Sforza, L.L. Radiation of human mitochondrial
DNA types analyzed by restriction endonuclease cleavage
patterns. J. Mol. Evol. 19, 255-271 (1983).
66. Mountain, J.L., Lin, A.A., Bowcock, A.M. & Cavalli-Sforza,
L.L. Evolution of modern humans: evidence from nuclear
DNA polymorphisms. Phil. Trans. R. Soc. Lond. B. 337,
159-165 (1992).
67. Cann, R.L., Stoneking, M. & Wilson, A.C. Mitochondrial
DNA and human evolution. Nature 325, 31-36 (1987).
68. Templeton, A.R. Human origins and analysis of
mitochondrial DNA sequences. Science 255, 737 (1992).
69. Chen, Y.-S., Torroni, A., Excoffier, L.,
Santachiara-Benerecetti, A.S. & Wallace, D.C. Analysis of mtDNA
variation in African populations reveals the most ancient
of all human continent-specific haplogroups. Am. J. Hum.
Genet. 57, 133-149 (1995).
70. Ingman, M., Kaessmann, H., Pääbo, S. & Gyllensten, U.
Mitochondrial genome variation and the origin of modern
humans. Nature 408, 708-713 (2000).
71. Shen, P. et al. Population genetic implications from DNA
polymorphism in random human genomic sequences.
Hum. Mutat. 20, 209-217 (2002).
72. Satta, Y., Klein, J. & Takahata, N. DNA archives and our
nearest relative: the trichotomy problem revisited. Mol.
Phylogenet. Evol. 14, 259-275 (2000).
73. Rosenberg, N.A. & Feldman, M.W. The relationship
between coalescent times and population divergence
times. In Modern Developments in Theoretical Population
Genetics (eds. Slatkin, M. & Veuille, M.) 130-164 (Oxford
Univ. Press, Oxford, 2002).
74. Rogers, A.R. & Harpending, H. Population growth makes
waves in the distribution of pairwise genetic differences.
Mol. Biol. Evol. 9, 552-569 (1992).
75. Tang, H., Siegmund, D.O., Shen, P., Oefner, P.J. &
Feldman, M.W. Frequentist estimation of coalescence
times from nucleotide sequence data using a tree-based
partition. Genetics 161, 447-459 (2002).
76. Kajander, O.A., Karhunen, P.J., Holt, I.J. & Jacobs, H.T.
Prominent mitochondrial DNA recombination
intermediates in human heart muscle. EMBO Rep. 2,
1007-1012 (2001).
77. Underhill, P.A. et al. Y chromosome sequence variation
and the history of human populations. Nat. Genet. 26,
358-361 (2000).
78. Hammer, M.F. et al. Hierarchical patterns of global human
Y-chromosome diversity. Mol. Biol. Evol. 18, 1189-1203
(2001).
79. Paracchini, S., Arredi, B., Chalk, R. & Tyler-Smith, C.
Hierarchical high-throughput SNP genotyping of the
human Y chromosome using MALDI-TOF mass
spectrometry. Nucleic Acids Res. 30, e27 (2002).
80. Kingman, J.F.C. The coalescent. Stochastic Processes and
their Applications 13, 235-248 (1982).
81. Hudson, R.R. Gene genealogies and the coalescent
process. Oxf. Surv. Evol. Biol. 7, 203-217 (1990).
82. Griffiths, R.C. & Tavaré, S. Ancestral inference in
population genetics. Stat. Sci. 9, 307-319 (1994).
83. Thomson, R., Pritchard, J.K., Shen, P., Oefner, P.J. &
Feldman, M.W. Recent common ancestry of human Y
chromosomes: evidence from DNA sequence data. Proc.
Natl. Acad. Sci. USA 97, 7360-7365 (2000).
84. Seielstad, M.T., Minch, E. & Cavalli-Sforza, L.L. Genetic
evidence for a higher female migration rate in humans.
Nat. Genet. 20, 278-280 (1998).
85. Salem, A.H., Badr, F.M., Gaballah, M.F. & Pääbo, S. The
genetics of traditional living: Y-chromosomal and
mitochondrial lineages in the Sinai Peninsula. Am. J.
Hum. Genet. 59, 741-743 (1996).
86. Sajantila, A. et al. Paternal and maternal DNA lineages
reveal a bottleneck in the founding of the Finnish
population. Proc. Natl. Acad. Sci. USA 93, 12035-12039
(1996).
87. Finnilä, S., Hassinen, I.E., Ala-Kokko, L. & Majamaa, K.
Phylogenetic network of the mtDNA haplogroup U in
northern Finland based on sequence analysis of the
complete coding region by conformation-sensitive gel
electrophoresis. Am. J. Hum. Genet. 66, 1017-1026
(2000).
88. Richards, M. et al. Tracing European founder lineages in
the near eastern mtDNA pool. Am. J. Hum. Genet. 67,
1251-1276 (2000).
89. Zerjal, T. et al. Genetic relationships of Asians and
northern Europeans, revealed by Y-chromosomal DNA
analysis. Am. J. Hum. Genet. 60, 1174-1183 (1997).
90. Richards, M., Oppenheimer, S. & Sykes, B. mtDNA
suggests Polynesian origins in eastern Indonesia. Am. J.
Hum. Genet. 62, 1234-1236 (1998).
91. Kayser, M. et al. Melanesian origin of Polynesian Y
chromosomes. Curr. Biol. 10, 1237-1246 (2000).
92. Underhill, P.A. et al. Maori origins, Y chromosome
haplotypes and implications for human history in the
Pacific. Hum. Mutat. 17, 271-280 (2001).
93. Oota, H., Settheetham-Ishida, W., Tiwaweck, D., Ishida,
T. & Stoneking, M. Human mtDNA and Y-chromosome
variation is correlated with matrilocal versus patrilocal
residence. Nat. Genet. 29, 20-21 (2001).
94. Templeton, A.R. Out of Africa again and again. Nature
416, 45-51 (2002).
95. Steinmetz, L.M. et al. Dissecting the architecture of a
quantitative trait locus in yeast. Nature 416, 326-330
(2002).
96. Underhill, P. et al. The phylogeography of Y chromosome
binary haplotypes and the origins of modern human
populations. Ann. Hum. Genet. 65, 43-62 (2001).
97. Lahr, M.M. & Foley, R. Multiple dispersals and modern
human origins. Evol. Anthropol. 3, 48-60 (1994).
98. Lahr, M.M. & Foley, R.A. Towards a theory of modern
human origins: geography, demography, and diversity in
recent human evolution. Am. J. Phys. Anthropol. 27, 137-176
(1998).
99. Stringer, C. Coasting out of Africa. Nature 405, 24-27
(2000).
100. Greenberg, J. Language in the Americas (Stanford
Univ. Press, Stanford, CA, 1987).
101. Fagan, B.M. The Great Journey: The Peopling of
Ancient America (Thames and Hudson, London, 1987).
102. Weidenreich, F. Apes, Giants, and Man (Univ.
Chicago Press, Chicago, IL, 1946).
103. Wolpoff, M.H. Multiregional evolution: The fossil
alternative to Eden. In The Human Revolution:
Behavioural and Biological Perspectives on the Origins of
Modern Humans (eds. Mellars, P. & Stringer, C.) 62-108
(Princeton Univ. Press, Princeton, NJ, 1989).
104. Weiss, K.M. & Maruyama, T. Archeology,
population genetics and studies of human racial ancestry.
Am. J. Phys. Anthropol. 44, 31-50 (1976).
105. Krings, M. et al. Neandertal DNA sequences and
the origin of modern humans. Cell 90, 19-30 (1997).
106. Krings, M., Geisert, H., Schmitz, R.W., Krainitzki,
H. & Pääbo, S. DNA sequence of the mitochondrial
hypervariable region II from the Neandertal type
specimen. Proc. Natl. Acad. Sci. USA 96, 5581-5585
(1999).
107. Krings, M. et al. A view of Neandertal genetic
diversity. Nat. Genet. 26, 144-146 (2000).
108. Ovchinnikov, I.V. et al. Molecular analysis of
Neanderthal DNA from the northern Caucasus. Nature
404, 490-493 (2000).
109. Harding, R.M. et al. Archaic African and Asian
lineages in the genetic ancestry of modern humans. Am.
J. Hum. Genet. 60, 772-789 (1997).
110. Harris, E.E. & Hey, J. X-chromosome evidence for
ancient human histories. Proc. Natl. Acad. Sci. USA 96,
3320-3324 (1999).
111. Knowles, L.L. & Maddison, W.P. Statistical
phylogeography. Mol. Ecol. 11, 2623-2635 (2002).
112. Kidd, J.R. et al. Haplotypes and linkage
disequilibrium at the phenylalanine hydroxylase locus,
PAH, in a global representation of populations. Am. J.
Hum. Genet. 66, 1882-1899 (2000).
113. Osier, M.V. et al. A global perspective on genetic
variation at the ADH genes reveals unusual patterns of
linkage disequilibrium and diversity. Am. J. Hum. Genet.
71, 84-99 (2002).
114. The Y Chromosome Consortium. A nomenclature
system for the tree of human Y-chromosomal binary
haplogroups. Genome Res. 12, 339-348 (2002).
115. Slatkin, M. & Hudson, R.R. Pairwise comparisons of
mitochondrial DNA sequences in stable and exponentially
growing populations. Genetics 129, 555-562 (1991).
116. Donnelly, P. Interpreting genetic variability: the
effects of shared evolutionary history. In Variation in the
Human Genome, Ciba Foundation Symposium No. 197
(Wiley, Chichester, UK, 1996).
117. Anati, E. The Intellectual Expressions of Prehistoric
Man: Art and Religion. Acts of the Valcamonica
Symposium '79. Centro Camuno di Studi Preistorici, Capo
di Ponte, Brescia, Italy, and (Editoriale Jaca Book SpA,
Milano, Italy, 1983).
118. Conkey, M.W., Soffer, O., Stratmann, D. &
Jablonski, N.G. (eds.) Beyond Art: Pleistocene Image and
Symbol, Watts Symposium Series in Anthropology:
Memoirs of the California Academy of Sciences No. 23
(California Academy of Sciences, San Francisco, CA,
1997).
119. McBrearty, S. & Brooks, A.S. The revolution that
wasn't: a new interpretation of the origin of modern
human behavior. J. Hum. Evol. 39, 453-563 (2000).
120. Lieberman, P. & Crelin, E.S. On the speech of
Neanderthal man. Linguistic Inquiry 2, 203-222 (1971).
121. Arensburg, B., Schepartz, L.A., Tillier, A.M.,
Vandermeersch, B. & Rak, Y. A reappraisal of the
anatomical basis for human speech in Middle Paleolithic
hominids. Am. J. Phys. Anthropol. 83, 137-146 (1990).
122. Greenberg, J.H. The Languages of Africa
(Bloomington, Indiana, 1963).
123. Greenberg, J.H. The Indo-Pacific Hypothesis.
Current Trends in Linguistics, Volume 8, 809-871 (1971).
124. Greenberg, J.H. Indo-European and Its Closest
Relatives: The Eurasiatic Language Family: Grammar
(Stanford Univ. Press, Stanford, California, 2000).
125. Ruhlen, M. On the Origin of Languages: Studies in
Linguistic Taxonomy (Stanford Univ. Press, Stanford,
California, 1994).
126. Eshel, I. & Cavalli-Sforza, L.L. Assortment of
encounters and evolution of cooperativeness. Proc. Natl.
Acad. Sci. USA 79, 1331-1335 (1982).
127. Klein, R.G. The Human Career 2nd edn (Univ. of
Chicago Press, Chicago, IL, 1999).
128. Ammerman, A.J. & Cavalli-Sforza, L.L. The
Neolithic Transition and the Genetics of Populations in
Europe (Princeton Univ. Press, Princeton, New Jersey,
1984).
129. King, R. & Underhill, P.A. Congruent distribution of
Neolithic painted pottery and ceramic figurines with
Y-chromosome lineages. Antiquity 76, 707-714 (2002).
130. Chikhi, L., Destro-Bisol, G., Bertorelle, G., Pascali,
V. & Barbujani, G. Clines of nuclear DNA markers suggest
a largely Neolithic ancestry of the European gene pool.
Proc. Natl. Acad. Sci. USA 95, 9053-9058 (1998).
131. Chikhi, L., Nichols, R.A., Barbujani, G. &
Beaumont, M.A. Y genetic data support the Neolithic
demic diffusion model. Proc. Natl. Acad. Sci. USA 99,
11008-11013 (2002).
132. Renfrew, C. Archaeology and Language: The
Puzzle of Indo-European Origins (Jonathan Cape, London,
1987).
133. Bellwood, P.S. The colonization of the Pacific:
some current hypotheses. In The Colonization of the
Pacific: A Genetic Trail (eds. Hill, A.V.S. & Serjeantson,
S.W.) 1-59 (Oxford Univ. Press, New York, 1989).
134. Cavalli-Sforza, L.L., Minch, E. & Mountain, J.
Coevolution of genes and languages revisited. Proc. Natl.
Acad. Sci. USA 89, 5620-5624 (1992).
135. Cavalli-Sforza, L.L. & Feldman, M.W. Cultural
Transmission and Evolution: A Quantitative Approach
(Princeton Univ. Press, Princeton, New Jersey, 1981).
136. Cavalli-Sforza, L.L. Genes, Peoples and Languages
(North Point Press, New York, 2000).
137. Sugimoto, C. et al. Typing of urinary JC virus DNA
offers a novel means of tracing human migrations. Proc.
Natl. Acad. Sci. USA 94, 9191-9196 (1997).
138. Covacci, A., Telford, J.L., Del Giudice, G.,
Parsonnet, J. & Rappuoli, R. Helicobacter pylori virulence
and genetic geography. Science 284, 1328-1333 (1999).
139. Mountain, J.L. et al. SNPSTRs: Empirically derived,
rapidly typed, autosomal haplotypes for inference of
population history and mutational processes. Genome
Res. 12, 1766-1772 (2002).
140. Hewlett, B.S., De Silvestri, A. & Guglielmino, C.R.
Semes and genes in Africa. Curr. Anthropol. 43, 313-321
(2002).
141. Latter, B.D.H. Genetic differences within and
between populations of the major human subgroups. Am.
Nat. 116, 220-237 (1980).
142. Ryman, N., Chakraborty, R. & Nei, M. Differences
in the relative distribution of human gene diversity
between electrophoretic, and red and white cell antigen
loci. Hum. Hered. 33, 93-102 (1983).
143. Kivisild, T. et al. The genetic heritage of earliest
settlers persist in both the Indian tribal and caste
populations. Am. J. Hum. Genet. (in the press).
144. Thangaraj, K. et al. Genetic affinities of the
Andaman islanders, a vanishing human population. Curr.
Biol. (in the press).
145. Semino, O., Santachiara-Benerecetti, A.S.,
Falaschi, F., Cavalli-Sforza, L.L. & Underhill, P.A.
Ethiopians and Khoisan share the deepest clades of the
human Y-chromosome phylogeny. Am. J. Hum. Genet.
70, 265-268 (2002).
146. Cruciani, F. et al. An Asia to sub-Saharan Africa
back migration is supported by high-resolution analysis of
human Y chromosome haplotypes. Am. J. Hum. Genet.
70, 1197-1214 (2002).
Figure 1: Summary tree of world populations.
Phylogenetic tree based on polymorphisms of 120 protein genes in 1,915
populations grouped by continental sub-areas and Fst genetic distances14. Root
placed assuming a constant rate of evolution.
Figure 2: Relationship between genetic and geographic distance.
Genetic distance of population pairs measured by Fst as a function of geographic
distance between members of the pairs14. Only samples from indigenous people
were included. Continents where primitive economies predominate
(hunting-gathering or tropical gardening) show the highest asymptotes. Asia and
the world do not asymptote within the range shown.
Figure 3: The migration of modern Homo sapiens.
The scheme outlined above begins with a radiation from East Africa to the rest of
Africa about 100 kya and is followed by an expansion from the same area to Asia,
probably by two routes, southern and northern, between 60 and 40 kya. Oceania,
Europe and America were settled from Asia in that order.
Figure 4: High resolution molecular phylogeny to study human history.
a, Phylogeny of human mtDNA haplogroups and their continental affiliation
composed from resequencing of 277 individuals (refs 143, 144). The length of the
branches corresponds approximately to the number of mutations. b, Phylogeny of
human Y chromosome haplogroups and the continental affiliation of their most
frequent occurrence, composed from population genotyping of over 1,000
individuals and resequencing of over 100 (refs 145, 146). The length of the
branches corresponds
approximately to the number of mutations.
Figure 5: Language families of the world.
The 12 families of the Greenberg classification (refs 100, 122-125). The Eurasiatic superfamily
includes six families (most of which are recognized by most linguists) and an
isolate, Gilyak, listed in the central column. The oldest family is the Khoisan that
includes Bushmen and Hottentots, many of whom also belong genetically to the
oldest haplogroups of both mtDNA and NRY. Australian and Indopacific are also
old families. Other African languages are Niger-Kordofanian (mostly west Africa),
Nilo-Saharan and Afroasiatic (that includes Semitic languages like Arabic and
Hebrew). American languages belong to three families: Amerinds were the first to
migrate from Asia, according to some (Fagan, ref. 101) as late as 15 kya, and
Amerind shows affinities with Eurasiatic. One of the other two American families is
Na-Dene (belonging to Dene-Caucasian), a family that probably spread to Eurasia
before Eurasiatic and includes Sinotibetan, spoken in almost all of China, as well
as some isolated, probably relic, languages (Basque, a few Caucasian languages
and Burushaski, spoken in N. Pakistan) that all survived the later spread of
Eurasiatic languages. The third American family is Eskimo-Aleut, the last to spread
to America from N.E. Siberia. The Austric family is very large and is spoken in S.E.
Asia, Indonesia, all of Polynesia to the east and Madagascar to the west.
Table 1: Variation components within and between populations
Nature Reviews Genetics 2, 207-216 (2001); doi:10.1038/35056058
THE HUMAN Y CHROMOSOME, IN THE LIGHT OF
EVOLUTION
Bruce T. Lahn1, Nathaniel M. Pearson1, 2 & Karin Jegalian3
1 Howard Hughes Medical Institute, Department of Human Genetics, University of Chicago, 920 East 58th Street,
Chicago, Illinois 60637, USA. blahn@genetics.uchicago.edu
2 Department of Ecology and Evolution, University of Chicago, 920 East 58th Street, Chicago, Illinois 60637, USA. npearson@uchicago.edu
3 National Human Genome Research Institute, National Institutes of Health, 9000 Rockville Pike, Bethesda,
Maryland 20892, USA.
Most eukaryotic chromosomes, akin to messy toolboxes, store jumbles of
genes with diverse biological uses. The linkage of a gene to a particular
chromosome therefore rarely hints strongly at that gene's function. One
striking exception to this pattern of gene distribution is the human Y
chromosome. Far from being random and diverse, known human Y-chromosome
genes show just a few distinct expression profiles. Their relative
functional conformity reflects evolutionary factors inherent to sex-specific
chromosomes.
In many DIOECIOUS taxa, karyotype determines sex. To mediate this developmental
decision, sex-chromosome pairs have arisen independently among such lineages from
separate pairs of ordinary autosomes1. The sex chromosomes of one taxon can,
therefore, differ phylogenetically and structurally from those of another. The
mammalian sex chromosomes, for example, are not specifically related to those of
birds, insects or plants.
Despite their many origins, the sex chromosomes of diverse life forms are strikingly
alike. Ever-hemizygous chromosomes (that is, the Y chromosome (hereafter the Y) in
XY or the W chromosome in ZW systems) tend to be small, gene-poor and rich in
repetitive sequence. Their non-sex-specific partners, the X chromosome (hereafter the
X) and Z chromosome, tend to be more autosome-like in form and content, and in
many cases undergo dosage compensation to equalize gene activity between the
sexes. This gross convergence of sex chromosomes among disparate lineages hints
that common factors drive their evolution. Such factors are increasingly well
understood, thanks largely to studies of the mammalian sex chromosomes and of the
human Y in particular. Here, we review how studies of the human Y have already cast
a spotlight on the role of evolution in moulding the distinctive biological properties of
sex chromosomes.
Classes of human Y-chromosome genes
A typical eukaryotic chromosome encodes a motley assortment of gene products;
functionally related genes do not tend to jointly occupy particular chromosomes. It is
curious, then, that one of the shortest human chromosomes — the Y — might contain
the longest human genomic region, in which genes show only a few distinct expression
profiles. To the extent that tissue specificity reflects functionality, the human Y thus
harbours remarkably low gene-functional diversity. In fact, if classified jointly by
location and apparent function, known human Y genes boil down to pseudoautosomal
loci and three basic classes of non-recombining, male-specific loci.
The pseudoautosomal regions (PARs) at the ends of the human Y comprise 5% of its
sequence (this fraction, consistently small, varies among mammals)2, 3. In male
meiosis, the PARs of the X and Y recombine with each other at high, if subregionally
varied, rates4, 5. Accordingly, PAR genes, like autosomal genes, are shared freely
between the sexes. Although highly recombinogenic relative to the human genome as
a whole, the human PARs generally resemble autosomes in base composition, and in
gene density and diversity. About a dozen pseudoautosomal genes, most of them on
the short arm, have been identified. Most of these genes elude X inactivation, as would
be expected of genes with sex-uniform dosage. Curiously, two genes on the long arm
human PAR, SYBL1 (synaptobrevin-like 1) and HSPRY3 (sprouty (Drosophila)
homologue 3), reportedly undergo X and Y inactivation in females and males,
respectively, which indicates that this region might have a complex evolutionary
history that involves recent X-to-Y translocation6.
Most of the remainder of the human Y recombines with neither the X nor any other
chromosome. This non-recombining region of the Y (NRY) consists largely of highly
repetitive sequences that are rich in transposons and other elements whose replication
and/or expression is unlikely to directly benefit the human host7. Of the 60-megabase (Mb) human NRY, 35 Mb are euchromatic. Most of the remainder is a block
of heterochromatin on the long arm. Nearly one-half of the euchromatic portion of the
NRY has been sequenced through the publicly funded Human Genome Project.
Representative sequencing of the entire euchromatic NRY is expected to be completed
within this year. So far, 21 distinct genes or gene families that are expressed in
healthy tissues have been identified in the human NRY. These group into three salient
classes — classes 1, 2 and 3 — largely on the basis of expression profile and homology
to the X.
The eight known class 1 genes are single copy, are expressed widely in the body and
have like-functioning X-linked homologues. Class 2 also has eight known members,
each of which is multicopy, expressed only in the testis and without an active X
homologue (Fig. 1, Table 1). Class 3 contains the human NRY genes that blur an
otherwise sharp bipartition defined by classes 1 and 2. Most prominent among these is
the SRY (sex-determining region Y) gene, the master trigger of male embryonic
differentiation. The single-copy SRY gene is expressed in the embryonic BIPOTENTIAL
GONAD — where it initiates the development of the testis — and also in the adult testis.
The X carries the SOX3 (SRY-box 3) gene, an active homologue of SRY8, 9. Two other
notable class 3 NRY genes are AMELY (amelogenin Y) and PCDHY (protocadherin Y).
Unlike the widely expressed class 1 and testis-specific class 2 genes, AMELY and its X
homologue, AMELX (amelogenin X), are expressed only in developing tooth buds10.
Similarly, PCDHY and its X homologue, PCDHX (protocadherin X), are expressed mainly
in the brain11, 12. The remaining NRY genes are RBMY (RNA-binding motif protein Y)
and VCY (variable charge Y, previously called BPY1), which have features of both
classes 1 and 2. Like class 1 genes, they have active X homologues (named RBMX
(RNA-binding motif protein X) and VCX (variable charge X), respectively); like class 2
genes, they are expressed from multiple copies, but in the testis only. The single-copy
X homologue of RBMY is widely expressed and dosage compensated8, 9, whereas the
many X homologues of VCY are expressed only in the testis (and so are inactive in
females)13.
Figure 1 | Active genes on the human Y chromosome.
Yellow bar, euchromatic portion of the non-recombining region of the Y chromosome
(NRY); black bar, heterochromatic portion of the NRY; grey bar, centromere; red bars,
pseudoautosomal regions (genes omitted). Genes named to the right of the
chromosome have active X-chromosome homologues. Genes named to the left of the
chromosome lack known X homologues. Genes in red are widely expressed
housekeeping genes; genes in black are expressed in the testis only; and genes in
green are expressed neither widely, nor testis specifically (AMELY (amelogenin Y) is
expressed in developing tooth buds, whereas PCDHY (protocadherin Y) is expressed in
the brain). With the exception of the SRY (sex-determining region Y) gene, all the testis-specific Y
genes are multicopy. Some multicopy gene families form dense clusters, the constituent loci of which
are indistinguishable at the resolution of this map. Three regions often found deleted in infertile men,
AZFa, b, c (azoospermia factor region a, b, c), are indicated.
Table 1 | Classification of human Y-chromosome genes
Converging theoretical and empirical evidence shows how and why the gene content of
the NRY reflects the region's distinctive history. Altogether, the three gene classes of
the region show markedly limited functional themes — in stark contrast to the genic
miscellany of other human chromosomes. This remarkable functional specialization
highlights two evolutionary processes inherent to Ys: genetic decay and the
accumulation of genes that specifically benefit male fitness.
Degeneration of the Y chromosome
The mammalian sex chromosomes are thought to have arisen from an ordinary pair of
autosomes 300 million years ago14. Until then, ambient temperature during
embryonic development might have determined the sex of mammalian ancestors, as in
many modern reptiles and other descendants of bony fish15. The foremost sex-chromosome bearers in this CLADE are, notably, birds and mammals, both
HOMEOTHERMS, for whom temperature might have ceased to be useful as a signal for
developmental switching. In mammals, sex chromosomes probably arose with the
differentiation of SRY from its homologue, SOX3, which persists on the mammalian X8, 9.
Sequence and expression comparisons indicate that SRY and SOX3 descend from a
specific progenitor gene, with the more derived SRY having gained and kept the male-determining function9. The emergence of a dominant and PENETRANT sex-determining
allele of the proto-SOX3/SRY gene would have effectively rendered an autosome pair
into sex chromosomes, starting a long and dramatic evolutionary process. Over aeons,
the mammalian X and Y diverged, with the gross structure of the X changing
remarkably little, while the Y rapidly degenerated1, 14, 16, 17.
The rampant attrition of gene activity from evolving Ys has long been noted. In fact,
MEROHAPLODIPLOID sex determination (for example, XX:XO) is thought to represent a
relatively stable endgame in sex-chromosome evolution18. Potential causes and
mechanisms of Y-specific degeneration have drawn heated speculation. Why and how
have large X and Y regions stopped recombining with each other? And why might Y
genes tend to decay once they stop recombining with their X counterparts?
Recent results indicate that, on the evolutionary lineage leading to humans, the
mutually non-recombining portions of the human Xs and Ys greatly expanded several
times, each time converting a block of previously freely recombining sequence into X- and Y-specific regions14. The striking similarity in gene order seen among disparate
mammalian Xs, compared with the relative scrambling of genes seen among
mammalian Ys (Fig. 2), indicates that such coarse blockwise (versus smooth)
consolidation of Y-haplotype linkage was probably caused by serial, large-scale
inversion of much of the Y itself. Such inversions would have disrupted alignment, and
thus recombination, between progressively larger regions of the Xs and Ys. At least
four multigene inversions seem to mark the human Y lineage: the first 300 million
years ago and the last 30 million years ago14 (Fig. 3). Consolidating linkage across
wide swathes of the chromosome, such inversions might have swept to fixation in
ancestral populations either by GENETIC DRIFT, or by selection if they bound together
alleles that conferred benefit only in the presence of the sex-determining gene. That
gene, SRY, seems to have been the first active gene on the Y to cease recombination
with the X, as ranked by silent divergence between X and Y homologues14. The history
of Y gene rearrangement (as well as gain and loss) varies among mammalian lineages
(Fig. 2); such variation will prove phylogenetically informative as more non-human
mammalian Y sequences become available.
Figure 2 | Sex chromosomes in mammals.
The radiation hybrid maps show a | conservation of locus order in disparate mammalian
X chromosomes (cat and human) compared with b | the relative rearrangement of Y
chromosomes in the same taxa. A similar comparison of the human Y to those of other
primates (omitted for simplicity) reveals more recent taxon-specific rearrangements108.
Adapted from Ref. 109.
Figure 3 | Human sex-chromosome evolution.
The figure shows the overall shrinkage of the Y chromosome and the
blockwise expansion of its non-recombining region (NRY), probably mediated by serial large-scale
inversion as posited by Lahn and Page14. Main events are noted and roughly dated (Myr ago, millions
of years ago), with new NRY genes placed in parentheses, and phylogenetic branches indicated by
arrows. Blue regions are freely recombining. Yellow regions are X-chromosome specific. Red regions
are Y-specific (NRY). The green region represents PCDHX/Y (protocadherin X/Y)-containing sequence
that has translocated from the X to the NRY (some other likely translocations are omitted for
simplicity). The diagram is not drawn to scale and centromeres are omitted, as their locations are
uncertain for many evolutionary stages. (PARp, short arm pseudoautosomal region.)
But why do NRY genes tend to decay? Several models point to their lack of
recombination as a key factor. Edmund Wilson, and later Hermann Muller, proposed
that the NRY accumulates null alleles because intact X homologues shelter them19, 20;
such defunct loci are not selectively purged as they would be if rendered homozygous
by recombination. A more general theory by Muller, dubbed "Muller's ratchet" (and
extended by Brian Charlesworth and others), holds that, in the face of largely harmful
mutations, only recombination can adequately regenerate highly fit alleles (that is,
crossover between harmful variants that occupy different sites in a locus can yield a
repaired allele)21, 22. William Rice invoked Muller's ratchet in considering tight linkage
across multiple loci, not all of which carry beneficial alleles; he gave the name "genetic
hitchhiking" to the spread of potentially harmful alleles that are linked to selectively
favoured alleles, with a concomitant reduction in local nucleotide diversity23.
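The ratchet logic described above can be illustrated with a toy simulation. The sketch below is not taken from the cited models: it assumes a haploid, non-recombining population in which every mutation is harmful and none reverts, and the population size, mutation rate, selection coefficient and function name are arbitrary illustrative choices.

```python
import random


def simulate_ratchet(pop_size=200, generations=300, mut_rate=0.3,
                     sel_coeff=0.02, seed=1):
    """Track the least-loaded class in a non-recombining haploid population.

    Each individual is represented only by its count of harmful mutations.
    Without recombination or back mutation, a lost least-loaded class
    cannot be regenerated (illustrative assumption of this sketch).
    """
    rng = random.Random(seed)
    loads = [0] * pop_size
    min_load_history = []
    for _ in range(generations):
        # Selection: fitness declines multiplicatively with mutation count.
        weights = [(1.0 - sel_coeff) ** k for k in loads]
        parents = rng.choices(loads, weights=weights, k=pop_size)
        # Mutation: each offspring gains one harmful mutation
        # with probability mut_rate.
        loads = [k + (1 if rng.random() < mut_rate else 0) for k in parents]
        min_load_history.append(min(loads))
    return min_load_history
```

Calling simulate_ratchet() returns the smallest mutation count present in each generation. Under these assumptions that minimum can stay flat or rise but never fall, which is the sense in which the ratchet "clicks"; allowing crossover between differently mutated haplotypes would be needed to restore a less loaded class.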
Human NRY haplotypes are — as predicted by such models — nearly static, strikingly
poor in variation (despite relatively frequent mutation, apparently owing to greater
male than female germ-cell turnover in mammals) and greatly eroded in function
relative to other genomic regions24. By recombination, such other regions can maintain
diverse, highly fit haplotypes that readily spread by selection, thanks to the greater,
and thus less genetic-drift-prone, EFFECTIVE POPULATION SIZE of diploid versus haploid
regions. Although the details of relevant models spur debate, most evolutionary
biologists agree that recombination shuffles alleles so that well-adapted haplotypes can
readily replace ill-adapted ones. Indeed, experimentally restricting local recombination
in laboratory fruitfly populations has been shown to threaten their long-term genetic
integrity25.
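The difference in effective population size mentioned above can be made concrete with a simple copy count. The sketch below assumes, purely for illustration, equal numbers of breeding females and males and no other demographic complications; the function names are hypothetical.

```python
from fractions import Fraction


def copies_per_breeding_pair():
    # One XX female plus one XY male: count copies of each chromosome type.
    return {"autosome": 4, "X": 3, "Y": 1}


def relative_effective_size(region, reference="autosome"):
    """Effective size of one region relative to another, by copy count only."""
    counts = copies_per_breeding_pair()
    return Fraction(counts[region], counts[reference])
```

Under these assumptions the Y has one-quarter of the effective population size of an autosome and one-third of that of the X, so drift acts more strongly on it. Real populations, in which male reproductive success often varies more than female success, can depart from these ratios.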
The functional blight of NRYs might also explain their characteristic shrinkage and/or
accumulation of non-essential — perhaps even parasitic — retroviral and
heterochromatic sequences. Many gene-like NRY loci are not expressed in humans, as
in many other taxa with XY systems, whereas their X counterparts remain active. This
observation reflects the pervasive decay that is associated with overly robust linkage.
Nevertheless, a handful of non-recombining homologue pairs remain active on both
chromosomes. Bucking the decay trend, these genes attest to the common ancestry of
the Xs and Ys. Two alternative scenarios might account for their persistence in the
NRY.
Persistence of XY-chromosome homologues
In the first scenario, X-homologous NRY genes might have functions crucial to both
sexes. Such genes persist, with little differentiation, if proper development requires
their double dosage (two X copies in females, or X and Y copies in males)26. In that
case, X and Y homologues should function roughly equivalently, and, to maintain sex-uniform dosage, the former should elude X inactivation. Class 1 human NRY genes
meet these conditions. They and their X homologues encode widely expressed
housekeeping proteins, many of which are crucial to viability26. The observed ratio of
protein-to-nucleotide divergence between such XY homologues is significantly lower
than that for other neighbouring loci — consistent with the idea that selection has
conserved the functional similarity of X and Y copies26. Finally, the X homologues of
nearly all these class 1 NRY genes elude X inactivation26-28.
In the second scenario, NRY genes persist because they have specialized in male-specific functions, such as somatic masculinization or spermatogenesis. As such, they
differ significantly in function from their X homologues (which presumably preserve
ancestral functions). An exemplar is SRY, which apparently differentiated from its
widely expressed X homologue, SOX3, to gain and maintain a key function in male
development8, 9. Another is the testis-specific class 3 gene RBMY, the X homologue of
which, RBMX, is expressed in diverse tissues29, 30. Presumably, in both cases, the
progenitor of the XY-homologue pair was widely expressed. During subsequent
evolution, the X homologue (SOX3 or RBMX) maintained this expression status,
whereas the activity of the Y homologue (SRY or RBMY) became testis-specific (and
thus male-specific). Other examples of NRY genes that have adopted specialized male
functions are reported in the mouse. Three mouse NRY genes, Zfy (zinc-finger
protein), Ube1y (ubiquitin-activating enzyme E1) and Usp9y (ubiquitin-specific
protease 9), show testis-specific expression, whereas their X homologues are
expressed in many other tissues31-33.
Accumulation of spermatogenic genes
Although NRY genes with X homologues clearly attest to ancestral XY homology, the
evolutionary origins of class 2 NRY genes (which lack X homologues) are less obvious.
Early clues to the history of these testis-specific genes came from studies of the CDY
(chromo-domain protein Y) and DAZ (deleted in azoospermia) genes. Both have
specific autosomal PARALOGUES: CDYL (chromodomain protein Y-like) and DAZL (deleted
in azoospermia-like), respectively. These autosomal genes are found throughout
mammals, whereas CDY and DAZ are found only on primate Ys. These observations
indicate that early mammals might have had only DAZL and CDYL, the paralogues of
which arose de novo at some point and were maintained on the primate Y lineage26, 34-36.
DAZ and DAZL are spliced alike, which indicates that DAZ might have reached the Y
by inter-chromosomal transposition of DAZL35. CDY is an intronless version of CDYL,
which indicates that CDY might have arisen by retroposition of CDYL mRNA36.
The gain and retention of genes that specifically benefit male fecundity — and promote
spermatogenesis in particular — seems to be a global theme in Y evolution. Biologists
have long suspected, and sometimes confirmed, the great importance of male-specific
chromosomes in spermatogenesis34, 37-45. Male fruitflies that lack a Y, for example,
produce no fertile sperm39, 42. Factors that potentially drive the accumulation of
spermatogenic function in Ys have drawn much speculation. Ronald Fisher posited a
selective advantage in sequestering, within a male-specific portion of the genome, any
genes that benefit males but harm females46. This sexual antagonism model was
invoked to account for the Y linkage of ornamentation genes in guppies47 (Fig. 4);
these genes probably enhance male attractiveness and fecundity, but would reduce
fecundity in female carriers, as female ornamentation increases predation risk without
effectively boosting mating chances.
Figure 4 | Example of a Y-chromosome-linked trait.
Male (top) and female (bottom) guppies (Poecilia reticulata). Colourful male
ornamentation, which enhances both sexual attractiveness to females and
visibility to would-be predators, reflects the expression of Y-chromosome-linked genes. Photo courtesy of N.M.P.
Sexual antagonism might plausibly explain the accumulation of spermatogenic genes
on Ys, because such genes clearly benefit males but might harm females. Indeed,
women who carry Y fragments are especially prone to gonadoblastoma, a form of
ovarian tumour48, 49. However, impairment of female fitness by spermatogenic genes
could alternatively be mitigated, potentially at low metabolic cost, by transcriptionally
silencing these genes in females, instead of moving them to the Y. This possibility
makes the sexual antagonism model less generally compelling.
Accordingly, we invoke an additional argument — "constant selection" — to further
explain the preferential accumulation on the Y of any spermatogenesis genes that
might be nearly neutral in females. Studies in several taxa indicate that genes that
drive sperm production evolve unusually rapidly, presumably owing to fierce rivalry
among sperm from one or multiple males, whose fecundity tends to vary more than
that of females50-52. Under such stringent selection for winning strategies in the race
for fertilization, alleles that enhance sperm success might readily spread in a
population. Their actual selective advantage, however, is likely to vary with
chromosomal linkage. If they are Y linked, such alleles are always favoured, because
they are expressed in each generation; if they are autosomal or X linked, they can
selectively spread only when male-transmitted — roughly every other generation for
autosomal loci and every third generation for X loci. So, generation-invariant selection
on spermatogenic genes might intensify the overall selective advantage for their gain,
retention and adaptive change on the male-specific NRY. Whether such intensified
advantage actually makes allelic fixation significantly more likely on the NRY than
elsewhere in the genome remains to be fully modelled. The several-fold lower effective
population size of the NRY than of the X or an autosome, for example, might diminish
the advantage of constant selection, because small populations allow non-advantageous alleles a greater chance to drift to fixation in place of advantageous
alleles53.
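The "every generation, every other generation, every third generation" reasoning above follows from counting how many copies of each chromosome type reside in males. The sketch below is a bookkeeping illustration only, assuming an equal sex ratio and an allele that is favoured in males and neutral in females; the names are hypothetical and it does not model fixation probabilities.

```python
from fractions import Fraction

# Chromosome copies carried by one female and one male.
COPIES = {
    "Y": {"female": 0, "male": 1},
    "X": {"female": 2, "male": 1},
    "autosome": {"female": 2, "male": 2},
}


def fraction_of_time_in_males(linkage):
    """Share of a chromosome type's copies that sit in males each generation."""
    c = COPIES[linkage]
    return Fraction(c["male"], c["female"] + c["male"])


def average_male_advantage(s_male, linkage):
    """Per-generation average advantage of an allele favoured by s_male in
    males and neutral in females (illustrative assumption)."""
    return s_male * fraction_of_time_in_males(linkage)
```

With these counts a Y-linked allele is exposed to male-specific selection in every generation, an autosomal allele in half of them on average and an X-linked allele in one-third. Whether this translates into a higher fixation probability on the NRY also depends on drift, as the text notes.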
Amplification of gene copy number might be a second counter to the decay of NRY
genes. Most class 2 genes exist in multiple copies on the Y, although current counts
are inexact26. Gene amplification might buffer against harmful mutations: although
mutations accumulate to impair the function of single copies, other intact copies might
carry out a gene family's spermatogenic duties and, also, seed further amplification.
Notably, the great density of long-repeat sequences throughout the NRY might
mediate frequent amplification of repeat-flanked genic regions26, 54, 55.
Altogether, NRY genes have two distinct origins and three distinct evolutionary fates.
Their origins are: descent from the proto-Y, which was extensively homologous with
the X, or specific recruitment to the Y from elsewhere in the genome. The three
evolutionary fates of NRY genes are: functional decay, preservation in ancestral
(typically housekeeping) form, or specialization in male-specific function.
Despite degeneration, some Ys (for example, that of the fruitfly Drosophila miranda)
seem to have ballooned in size through large translocations from autosomes56. Such a
translocation apparently occurred in an early placental mammal ancestor, shortly after
the placental–MARSUPIAL split57, 58 (Fig. 3). This translocation generated new XY-homologous sequence, which then encountered the factors that drive ongoing XY
differentiation. Recombination was eventually suppressed in much of the new Y-linked
portion; most genes in the region then decayed, and their X homologues became
subject to inactivation in females.
Y-chromosome genes and disease
A striking feature of the human NRY is that its two largest gene classes correspond to
two disorders: Turner syndrome (TS) and male infertility. Turner syndrome results
from a 45,XO karyotype59-61. Most such embryos die in utero, accounting for roughly
one-tenth of recognized human foetal deaths. TS is detected in about 1 out of 3,000
human live-births62. Short stature, failure of gonadal development and diverse
macroanatomic anomalies typify the syndrome59-61.
The TS karyotype can be seen as the lack of either an X, relative to XX females, or a Y,
relative to XY males. Recognizing this, Malcolm Ferguson-Smith argued in 1965 that
the syndrome reflects the haploinsufficiency of "TS genes", which he predicted would
be common to the Xs and Ys and would elude X inactivation60. Class 1 human NRY
genes and their X homologues meet these conditions and are considered to be TS
candidates. Their widespread expression is consistent with the broad range of
symptoms observed in TS patients. Pseudoautosomal genes might also contribute to
TS, as they occupy both the Xs and Ys and typically elude female X inactivation2.
Indeed, a gene called short stature homeobox (SHOX), identified recently in the freely
recombining region of the human sex chromosomes, seems to contribute to the short
stature of TS individuals63, 64. The syndrome highlights the crucial importance of the Y
in body-wide housekeeping functions and underscores the incompleteness of human Y
degeneration. In the mouse, whose Y degeneration seems relatively more advanced,
XO individuals reportedly show no salient phenotype.
The second common Y-associated disorder is male infertility. About 1 out of 1,000
human males is infertile, owing to spermatogenic failure65. Remarkably, newly arisen Y
deletions account for 10% of such cases34, 44, which is consistent with a rate of de
novo partial Y deletion of at least 10^-4. Class 2 genes, which are testis-specific in
expression and male-specific in the genome, are probably important for
spermatogenesis. Deletion mapping in infertile men has defined particular Y regions
that are involved in fertility. Three such regions — AZF a, b, c (azoospermia factor
region a, b and c) — are well characterized (Fig. 1); deletion within any one region
might severely impair spermatogenesis34, 43, 44. Among the three, AZFc deletion is by
far the most common. The need for an intact Y for spermatogenesis might largely
reflect the presence of testis-specific genes in these regions. Still, the possibility
cannot be ruled out that the more widely expressed Y genes might also be required for
male fertility. For example, lesions of USP9Y (previously known as DFFRY) or DBY
(DEAD/H (Asp-Glu-Ala-Asp/His)-box polypeptide Y) genes, which are both widely
expressed class 1 genes in AZFa, have been linked to spermatogenic failure66, 67.
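The lower bound on the de novo deletion rate quoted above follows directly from the two figures given in the text. The short calculation below spells it out; the variable names are illustrative.

```python
from fractions import Fraction

# Figures taken from the text.
infertile_fraction = Fraction(1, 1000)          # males infertile from spermatogenic failure
share_from_new_y_deletions = Fraction(10, 100)  # share of those cases with a newly arisen Y deletion

# Lower bound on de novo partial Y deletions per male: only deletions
# that cause recognized infertility are counted here.
min_deletion_rate = infertile_fraction * share_from_new_y_deletions
print(float(min_deletion_rate))
```

The result is a minimum because deletions that do not abolish spermatogenesis, or that go undiagnosed, would not appear in the infertility figures.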
In summary, two principal Y-associated disorders reflect the two most salient
functional themes of the human Y, again highlighting the two main gene classes
therein.
Class 3 genes
The active human NRY genes that fit neither class 1 nor 2 have provoked considerable
curiosity, and some functional and phylogenetic inquiry. Five such genes are known;
perhaps other putative coding sequences on the NRY will, upon more thorough
expression assay in a broad range of tissues, prove to be additional class 3 genes. In
general, these genes seem to be in various states of evolutionary limbo. Some (for
example, RBMY and SRY) clearly reflect the evolutionary trend of the Y for male-specific fitness and, thus, most resemble class 2 genes; in some rodents, Sry is
multicopy68, as are human class 2 genes and RBMY. Other class 3 genes, especially
those that recombined recently, might still decay and join the ranks of evolutionarily
informative — if functionally inert — NRY pseudogenes. Some such genes, however,
might reflect the influence of additional evolutionary factors at work on the NRY. Here,
within the broad context of mammalian Y history, we speculate on potential biological
roles and evolutionary histories of the most intriguing class 3 genes.
Amelogenin X/Y genes. Amelogenin proteins aggregate to scaffold the accretion of
tooth enamel, which is the most densely mineralized vertebrate tissue69, 70. Placental
mammals express these proteins from an X locus and, in some taxa (for example,
primate, cat, cow, deer and horse, but not murid or pig), more weakly from a Y
locus71-73. In humans, some AMELX (but not AMELY) alleles reportedly segregate with
enamel defects, although studies on the X inactivation status of the gene are
inconclusive74, 75.
Given the expression profile of amelogenin, its active expression from Ys is puzzling. In
the light of basic trends of Y-gene evolution, such conservation might reflect chance
long-term persistence or, perhaps, adaptive evolution for some function specifically
benefiting males. The latter possibility is particularly intriguing in the human case.
Human AMELX and AMELY probably stopped recombining with each other between 30
and 50 million years ago — ample evolutionary time for Y-gene decay, as attested by
the fact that all other known human X genes that ceased X–Y recombination during
that time now lack active Y homologues14. Moreover, when aligned with one another,
human AMELX and AMELY show, in addition to a single-codon gap, the most amino-acid replacements per synonymous nucleotide divergence of known human XY
homologues, including those whose Y copies are pseudogenes. Likewise, partially
sequenced deer amelogenin homologues show 3 frame-preserving gaps and 11 amino-acid differences, but no synonymous differences76. Such sequence divergence might be
more consistent with differential adaptive protein evolution by the homologues than
with chance persistence of functionally unconstrained AMELY loci.
If AMELY has persisted by adaptive evolution in the mode of other NRY genes, what
male-specific benefit might it confer? Notably, to explain the evolution of genomic
imprinting, David Haig, Laurence Hurst and others have modelled sexual antagonism
as mediated through differences between maternal and paternal epigenetic regulation
of early growth. They posit that promiscuous, nurturing mothers prefer (in the
evolutionary sense) equitable offspring growth, whereas fathers prefer resource-intensive offspring growth at the expense of rival-fathered half-siblings77, 78. Imprinting
research has largely targeted systemic growth modifiers as candidates for such
parental antagonism, but one could also predict localized processes such as
mammalian tooth development as relevant to such conflict.
Namely, promiscuous mammal mothers might prefer relatively early teething of
offspring in order to speed weaning and regain fertility. By contrast, fathers might
prefer later teething, relative to other growth, in order to monopolize maternal
resources. Indeed, first molar eruption age in HAPLORHINE primates reportedly correlates
tightly with both weaning age and the inter-birth interval of the mother79.
Furthermore, the delay typical of marsupial primary incisor eruption is widely deemed
adaptive for prolonged suckling (K. Smith, personal communication).
Intriguingly, females in many primate populations teethe earlier overall than males80-83
(although females outpace males on other developmental fronts too). Moreover, there
is anecdotal evidence of delayed tooth eruption in XYY males84. Such observations are
grossly consistent with Y-linked tooth eruption delay, which might simply reflect
systemic sex-differential growth. Alternatively, might AMELY, acting as a parentally
antagonistic gene, delay male tooth eruption in at least some of the taxa that preserve
it?
Yoh Iwasa, Hurst and others have noted that sex-linkage, like imprinting, can mark
alleles by parentage85-87. Although any inhibition of tooth eruption by AMELY would be
manifest only in males, a Y harbouring such a parentally antagonistic gene would still
be predicted to spread at the expense of other Y variants in some populations, perhaps
as defined by the degree of POLYANDRY, distribution of litter size and other factors.
But how might AMELY actively delay teething? Recent work shows that a well-attested
short amelogenin splice product might strongly promote bone and/or cartilage growth,
rather than enamel formation, indicating a previously unsuspected regulatory function
for the gene88. Intriguingly, a splice junction crucial to this product has been eliminated
by separate mutations in both the human and cow AMELY loci, leaving them able to
encode only the long transcripts generally associated with enamel-forming, but not
osteogenic, function (Fig. 5). Notably, regulatory signals from the enamel organ are
implicated in the early stages of tooth eruption, which is thought to involve
programmatic turnover in local bone and cartilage tissues89. These observations are
consistent with, if not clearly supportive of, our speculation that AMELY of some
mammals might have diverged in function from AMELX in a manner benefiting males
through teething delay.
Figure 5 | Amelogenin gene-splicing patterns.
Comparative alignment of cow X- (GenBank accession number M63499),
cow Y- (M63500), human X- (M86932) and human Y- (M86933) chromosome-encoded amelogenins,
excluding cow alternatively spliced exon 3 for simplicity. Dots indicate identity to cow X-derived
sequence; hyphens indicate relative gaps. Purple regions, linked by lines to indicate mRNA splicing,
are homologous to a highly osteogenic splice product in rat88. Blue boxes show inferred parallel
mutations in the cow and human Y loci, which destroy an exonic splice site (ancestral CAG glutamine
codon) that is crucial to the osteogenic transcript. Green regions (notably excluded from the
osteogenic product) are homologous to the glyco-binding motif that is crucial for enamel formation in
rat, as reported by Ravindranath et al. in Ref. 110. Yellow sites have known variants associated with
human X-linked enamel defects, as in Ref. 75. Note that relative sequence similarities indicate that
the cow and human AMELY (amelogenin Y) loci became non-recombining separately after cow–human
divergence, consistent with the model posited in Fig. 3.
Rare human Y lineages that lack AMELY have been reported90. In the context of our
model, it will be of great interest to learn more about tooth eruption timing in these
lineages.
Variable charge X/Y genes. These genes are the only known active human XY
homologues that are both expressed exclusively in the testis. They form a large family:
two reported Y-linked loci, which encode identical proteins, and roughly a dozen X-linked loci, the protein products of which vary mainly in the tandem iteration of an
acidic ten-amino-acid motif present singly in the Y homologues13. The predicted VCX/Y
proteins are 125–206 amino acids long, with an invariant highly basic amino-terminal
segment. So, with predicted isoelectric points ranging from 4.3 to 9.4, these proteins
probably vary greatly in net charge at living pH, prompting their name: variable
charge, X and Y13.
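To make the charge argument concrete, here is a minimal Python sketch of how adding copies of an acidic motif to a basic segment shifts the predicted net charge and isoelectric point. Everything specific in it is an assumption made for illustration: the two toy sequences are not the real VCX/Y sequences, the pKa values are approximate textbook figures, and the function names are ours.

```python
# Approximate side-chain pKa values (textbook figures, assumed for illustration).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_N_TERMINUS = 9.0
PKA_C_TERMINUS = 2.0

# Hypothetical toy sequences, NOT the real VCX/Y protein sequences.
BASIC_SEGMENT = "MKRKGKRAKK"   # stands in for the invariant basic amino-terminal segment
ACIDIC_MOTIF = "SEEPLDQEEV"    # stands in for the acidic ten-amino-acid motif


def toy_protein(motif_copies):
    """Build a toy protein: basic segment followed by tandem acidic motifs."""
    return BASIC_SEGMENT + ACIDIC_MOTIF * motif_copies


def net_charge(sequence, ph):
    """Henderson-Hasselbalch estimate of net charge at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))    # free amino terminus
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))   # free carboxy terminus
    for residue in sequence:
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


def isoelectric_point(sequence, tolerance=1e-4):
    """Find the pH of zero net charge by bisection (charge falls as pH rises)."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if net_charge(sequence, middle) > 0:
            low = middle
        else:
            high = middle
    return (low + high) / 2.0


if __name__ == "__main__":
    for copies in (1, 2, 3):
        protein = toy_protein(copies)
        print(copies, round(net_charge(protein, 7.4), 2),
              round(isoelectric_point(protein), 2))
```

In this toy model a single motif copy leaves the protein positively charged near neutral pH, whereas additional copies make it negatively charged, which is the qualitative pattern that the variable motif iteration is proposed to produce.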
Human VCX/Y-derived probes hybridize well only in anthropoids, among those
mammals assayed. The gene family thus seems to have arisen recently and/or evolved
rapidly in the anthropoid lineage13. The cellular function(s) of VCX and VCY proteins
are unknown. But in size, absolute charge and superficial structural features, VCX and
VCY resemble chromatin-associated proteins such as histones and protamines; the
latter mediate condensed DNA packaging in sperm52.
More strikingly, however, the testis-specific expression, multiple copies of sex-linked
homologues, variable motif iteration and phylogenetic novelty of VCX/Y recall the
fruitfly X-linked Stellate (Ste) and Y-linked crystal (Su(Ste)) loci91. These genes,
confined to Drosophila melanogaster and close relatives, are contentiously viewed as
MEIOTIC DRIVE antagonists, with Stellate expression putatively hindering transmission of
Y-bearing sperm in a dosage-dependent manner and crystal expression putatively
suppressing such bias92-94. Sex-chromosome drive is theoretically predicted to arise
readily and is generally well attested in the heterogametic sexes of flies, lepidopterans,
birds and mammals95. Such drive, however, carries an unusual cost in skewing the sex
ratio; this is predicted to favour the genome-wide emergence of drive modifiers.
Human VCX copies cluster near the Xp telomere, in the X region that most recently
ceased to recombine with the Y14. There, two VCX clusters flank the STS (steroid
sulphatase) gene96, 97. Deletion-induced STS deficiency, seen mostly in males as the
skin anomaly called ichthyosis, might mark VCX-deficient individuals, because whole-gene deletions often reflect unbalanced recombination among flanking VCX repeat
clusters13, 98, 99. If VCX acts analogously to Stellate, males should overabound among
offspring of VCX-/VCY+ men. Sex-ratio assessment in X-linked ichthyotic pedigrees
might, therefore, reveal any resulting meiotic drive. Several such pedigrees are at
least partially reported98, 100-102. Interestingly, before knowledge of VCX/Y, there was
speculation of male-bias among offspring of ichthyosis-carrier females103 (rather than
of affected males, as expected in spermatogenic X versus Y drive). However, such
speculation was disputed on the grounds of male-biased ascertainment and
reporting102. Perhaps more concerted study of VCX/Y will ultimately provide a new
window on human sex-linked meiotic drive — a phenomenon so far only cursorily
studied103, 104.
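As a rough illustration of the sex-ratio assessment proposed above, the following Python sketch applies an exact two-sided binomial test to the number of sons among the offspring of affected men. The counts are hypothetical, the function name is ours, and the null male fraction of 0.5 is a simplifying assumption. The test does nothing to correct the ascertainment and reporting biases already mentioned, so a small p-value would be a prompt for closer study, not evidence of drive by itself.

```python
from math import comb


def sex_ratio_p_value(sons, offspring, null_male_fraction=0.5):
    """Exact two-sided binomial p-value for the observed number of sons.

    Sums the probabilities of every outcome that is no more likely than
    the observed one under the null male fraction.
    """
    if not 0 <= sons <= offspring:
        raise ValueError("sons must lie between 0 and the number of offspring")
    probabilities = [
        comb(offspring, k)
        * null_male_fraction ** k
        * (1.0 - null_male_fraction) ** (offspring - k)
        for k in range(offspring + 1)
    ]
    observed = probabilities[sons]
    # Small relative slack so floating-point ties are counted as ties.
    total = sum(p for p in probabilities if p <= observed * (1.0 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical pooled counts, for illustration only.
    print(sex_ratio_p_value(sons=30, offspring=40))
    print(sex_ratio_p_value(sons=20, offspring=40))
```

Pooling offspring across pedigrees, as assumed here, treats every birth as independent; a fuller analysis would model each family separately.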
Protocadherin X/Y genes. The recently characterized hominid PCDHX/Y loci encode
protocadherins expressed mainly in the brain11, 12. The X- and Y-derived protein
sequences have diverged slightly from one another and show different cellular
expression distributions, leading Patricia Blanco et al. to suggest that PCDHY might
have gained a male-specific function in brain morphogenesis12; the nature of such a
hypothetical function is unclear, although large-scale sexual dimorphism of the adult
human brain is well attested105. Alternatively, considering that the PCDHY region is
thought to have transposed to the Y from the X only 3–4 million years ago106, the
gene might simply be in an early stage of functional degeneration.
Conclusion
Theodosius Dobzhansky's claim that "nothing in biology makes sense except in the
light of evolution" is a mantra of the field107. Viewed practically, it might be an
overstatement: much coherent insight into the functioning of living systems has been
gained without explicitly invoking evolutionary arguments. However, reference to
evolution is crucial to a working understanding of Y functionality. As discussed here,
gross classification of the genes of the human Y elucidates much of its unusual history.
And in turn, such evolutionary insight helps to elucidate the functional ranges of the
molecules that those genes encode.
References
1. Bull, J. J. Evolution of Sex Determining Mechanisms (Benjamin Cummings, Menlo Park, California, 1983).
2. Rappold, G. A. The pseudoautosomal regions of the human sex chromosomes. Hum. Genet. 92, 315-324 (1993).
3. Graves, J. A. M., Wakefield, M. J. & Toder, R. The origin and evolution of the pseudoautosomal regions of human sex chromosomes. Hum. Mol. Genet. 7, 1991-1996 (1998).
4. Henke, A., Fischer, C. & Rappold, G. A. Genetic map of the human pseudoautosomal region reveals a high rate of recombination in female meiosis at the Xp telomere. Genomics 18, 478-485 (1993).
5. Lien, S., Szyda, J., Schechinger, B., Rappold, G. & Arnheim, N. Evidence for heterogeneity in recombination in the human pseudoautosomal region: high resolution analysis by sperm typing and radiation-hybrid mapping. Am. J. Hum. Genet. 66, 557-566 (2000).
6. Ciccodicola, A. et al. Differentially regulated and evolved genes in the fully sequenced Xq/Yq pseudoautosomal region. Hum. Mol. Genet. 9, 395-401 (2000).
7. Erlandsson, R., Wilson, J. F. & Paabo, S. Sex chromosomal transposable element accumulation and male-driven substitutional evolution in humans. Mol. Biol. Evol. 17, 804-812 (2000).
8. Stevanovic, M., Lovell-Badge, R., Collignon, J. & Goodfellow, P. N. SOX3 is an X-linked gene related to SRY. Hum. Mol. Genet. 2, 2013-2018 (1993).
9. Foster, J. W. & Graves, J. A. An SRY-related sequence on the marsupial X chromosome: implications for the evolution of the mammalian testis-determining gene. Proc. Natl Acad. Sci. USA 91, 1927-1931 (1994).
    References 8 and 9 together identify SOX3 as the X-chromosome homologue of the male-determining gene SRY, and show that the SRY-SOX3 split probably initiated the X-Y divergence in mammalian ancestors.
10. Salido, E. C., Yen, P. H., Koprivnikar, K., Yu, L. C. & Shapiro, L. J. The human enamel protein gene amelogenin is expressed from both the X and the Y chromosomes. Am. J. Hum. Genet. 50, 303-316 (1992).
11. Yoshida, K. & Sugano, S. Identification of a novel protocadherin gene (PCDH11) on the human XY homology region in Xq21.3. Genomics 62, 540-543 (1999).
12. Blanco, P., Sargent, C. A., Boucher, C. A., Mitchell, M. & Affara, N. A. Conservation of PCDHX in mammals; expression of human X/Y genes predominantly in brain. Mamm. Genome 11, 906-914 (2000).
13. Lahn, B. T. & Page, D. C. A human sex-chromosomal gene family expressed in male germ cells and encoding variably charged proteins. Hum. Mol. Genet. 9, 311-319 (2000).
14. Lahn, B. T. & Page, D. C. Four evolutionary strata on the human X chromosome. Science 286, 964-967 (1999).
    This paper shows that X and Y chromosomes in the human lineage ceased to recombine with each other in progressive blocks during evolution.
15. Korpelainen, H. Sex ratios and conditions required for environmental sex determination in animals. Biol. Rev. Camb. Phil. Soc. 65, 147-184 (1990).
16. Barlow, D. P. Imprinting: a gamete's point of view. Trends Genet. 10, 194-199 (1994).
17. Graves, J. A. The origin and function of the mammalian Y chromosome and Y-borne genes -- an evolving understanding. Bioessays 17, 311-320 (1995).
18. Rice, W. R. Evolution of the Y sex chromosome in animals. BioScience 46, 331-343 (1996).
    A highly readable synopsis of how the lack of recombination can foster functional decay of the Y chromosome, including a thorough explanation of concepts such as Muller's ratchet.
19. Wilson, E. B. Studies on chromosomes. III. The sexual difference of chromosome-groups in Hemiptera, with some consideration on the determination and inheritance of sex. J. Exp. Zool. 2, 507-545 (1906).
20. Muller, H. J. A gene for the fourth chromosome of Drosophila. J. Exp. Zool. 17, 325-336 (1914).
21. Muller, H. J. The relation of recombination to mutational advance. Mutat. Res. 1, 2-9 (1964).
22. Charlesworth, B. Model for evolution of Y chromosomes and dosage compensation. Proc. Natl Acad. Sci. USA 75, 5618-5622 (1978).
23. Rice, W. R. Genetic hitchhiking and the evolution of reduced genetic activity of the Y sex chromosome. Genetics 116, 161-167 (1987).
24. Shen, P. et al. Population genetic implications from sequence variation in four Y chromosome genes. Proc. Natl Acad. Sci. USA 97, 7354-7359 (2000).
    This paper, from a research group which has made considerable use of non-genic non-recombining Y region (NRY) sequence variation to infer aspects of human population history, summarizes the current understanding of human Y-chromosome population genetics using new data on sequence variation in NRY genes themselves.
25. Rice, W. R. Degeneration of a nonrecombining chromosome. Science 263, 230-232 (1994).
    Experimental evidence that the suppression of recombination is detrimental to the functional integrity of genes in diploid organisms.
26. Lahn, B. T. & Page, D. C. Functional coherence of the human Y chromosome. Science 278, 675-680 (1997).
    Genes in the non-recombining region of the human Y chromosome conform to a small number of functional themes.
27. Jegalian, K. & Page, D. C. A proposed path by which genes common to mammalian X and Y chromosomes evolve to become X inactivated. Nature 394, 776-780 (1998).
    The degeneration of genes on the Y chromosome is probably what drives the corresponding X-chromosome homologues to become subject to X-chromosome inactivation.
28. Brown, C. J., Carrel, L. & Willard, H. F. Expression of genes from the human active and inactive X chromosomes. Am. J. Hum. Genet. 60, 1333-1343 (1997).
29. Delbridge, M. L., Lingenfelter, P. A., Disteche, C. M. & Graves, J. A. M. The candidate spermatogenesis gene RBMY has a homologue on the human X chromosome. Nature Genet. 22, 223-224 (1999).
30. Mazeyrat, S., Saut, N., Mattei, M. & Mitchell, M. J. RBMY evolved on the Y chromosome from a ubiquitously transcribed X-Y identical gene. Nature Genet. 22, 224-226 (1999).
    References 29 and 30 show that the testis-specific human Y-chromosome gene RBMY evolved from a widely expressed X-Y-chromosome homologous gene, demonstrating that Y-chromosome genes can acquire a testis-specific expression pattern during evolution.
31. Mardon, G. & Page, D. C. The sex-determining region of the mouse Y chromosome encodes a protein with a highly acidic domain and 13 zinc fingers. Cell 56, 765-770 (1989).
32. Odorisio, T., Mahadevaiah, S. K., McCarrey, J. R. & Burgoyne, P. S. Transcriptional analysis of the candidate spermatogenesis gene Ube1y and of the closely related Ube1x shows that they are co-expressed in spermatogonia and spermatids but are repressed in pachytene spermatocytes. Dev. Biol. 180, 336-343 (1996).
33. Brown, G. M. et al. Characterisation of the coding sequence and fine mapping of the human DFFRY gene and comparative expression analysis and mapping to the Sxrb interval of the mouse Y chromosome of the Dffry gene. Hum. Mol. Genet. 7, 97-107 (1998).
34. Reijo, R. et al. Diverse spermatogenic defects in humans caused by Y chromosome deletions encompassing a novel RNA-binding protein gene. Nature Genet. 10, 383-393 (1995).
35. Saxena, R. et al. The DAZ gene cluster on the human Y chromosome arose from an autosomal gene that was transposed, repeatedly amplified and pruned. Nature Genet. 14, 292-299 (1996).
36. Lahn, B. T. & Page, D. C. Retroposition of autosomal mRNA yielded testis-specific gene family on human Y chromosome. Nature Genet. 21, 429-433 (1999).
    References 35 and 36 show that some testis-specific genes on the human Y chromosome are transposed copies of autosomal genes.
37. Bridges, C. B. Non-disjunction as proof of the chromosome theory of heredity (concluded). Genetics 1, 107-163 (1916).
38. Tiepolo, L. & Zuffardi, O. Localization of factors controlling spermatogenesis in the nonfluorescent portion of the human Y chromosome long arm. Hum. Genet. 34, 119-124 (1976).
39. Hardy, R. W., Tokuyasu, K. T. & Lindsley, D. L. Analysis of spermatogenesis in Drosophila melanogaster bearing deletions for Y-chromosome fertility genes. Chromosoma 83, 593-617 (1981).
40. Levy, E. R. & Burgoyne, P. S. The fate of XO germ cells in the testes of XO/XY and XO/XY/XYY mouse mosaics: evidence for a spermatogenesis gene on the mouse Y chromosome. Cytogenet. Cell Genet. 42, 208-213 (1986).
41. Bishop, C. E. Mouse Y chromosome. Mamm. Genome 3, S289-S293 (1992).
42. Gepner, J. & Hays, T. S. A fertility region on the Y chromosome of Drosophila melanogaster encodes a dynein microtubule motor. Proc. Natl Acad. Sci. USA 90, 11132-11136 (1993).
43. Ma, K. et al. A Y chromosome gene family with RNA-binding protein homology: candidates for the azoospermia factor AZF controlling human spermatogenesis. Cell 75, 1287-1295 (1993).
44. Vogt, P. H. et al. Human Y chromosome azoospermia factor (AZF) mapped to different subregions in Yq11. Hum. Mol. Genet. 5, 933-943 (1996).
    This paper describes several regions of the human Y chromosome that are critically involved in spermatogenesis.
45. Zhang, P. & Stankiewicz, R. L. Y-Linked male sterile mutations induced by P element in Drosophila melanogaster. Genetics 150, 735-744 (1998).
46. Fisher, R. A. The evolution of dominance. Biol. Rev. 6, 345-368 (1931).
47. Brooks, R. Negative genetic correlation between male sexual attractiveness and survival. Nature 406, 67-70 (2000).
48. Salo, P. et al. Molecular mapping of the putative gonadoblastoma locus on the Y chromosome. Genes Chromosomes Cancer 14, 210-214 (1995).
49. Tsuchiya, K., Reijo, R., Page, D. C. & Disteche, C. M. Gonadoblastoma: molecular definition of the susceptibility region on the Y chromosome. Am. J. Hum. Genet. 57, 1400-1407 (1995).
50. Metz, E. C. & Palumbi, S. R. Positive selection and sequence rearrangements generate extensive polymorphism in the gamete recognition protein bindin. Mol. Biol. Evol. 13, 397-406 (1996).
51. Ting, C. T., Tsaur, S. C., Wu, M. L. & Wu, C. I. A rapidly evolving homeobox at the site of a hybrid sterility gene. Science 282, 1501-1504 (1998).
52. Wyckoff, G. J., Wang, W. & Wu, C. I. Rapid evolution of male reproductive genes in the descent of man. Nature 403, 304-309 (2000).
    References 50-52 provide intriguing evidence from diverse taxa that male reproductive proteins might evolve relatively rapidly compared with most other proteins.
53. Gillespie, J. H. Population Genetics: A Concise Guide (Johns Hopkins Univ. Press, Baltimore, Maryland, 1998).
54. Foote, S., Vollrath, D., Hilton, A. & Page, D. C. The human Y chromosome: overlapping DNA clones spanning the euchromatic region. Science 258, 60-66 (1992).
55. Vollrath, D. et al. The human Y chromosome: a 43-interval map based on naturally occurring deletions. Science 258, 52-59 (1992).
56. Yi, S. & Charlesworth, B. Contrasting patterns of molecular evolution of the genes on the new and old sex chromosomes of Drosophila miranda. Mol. Biol. Evol. 17, 703-717 (2000).
57. Spencer, J. A., Sinclair, A. H., Watson, J. M. & Graves, J. A. Genes on the short arm of the human X chromosome are not shared with the marsupial X. Genomics 11, 339-345 (1991).
58. Watson, J. M., Spencer, J. A., Riggs, A. D. & Graves, J. A. Sex chromosome evolution: platypus gene mapping suggests that part of the human X chromosome was originally autosomal. Proc. Natl Acad. Sci. USA 88, 11256-11260 (1991).
    This paper shows that there was a large translocation from autosome to sex chromosome in an ancestor of placental mammals, resulting in great enlargement of the placental sex chromosomes.
59. Turner, H. H. A syndrome of infantilism, congenital webbed neck, and cubitus valgus. Endocrinology 23, 566-574 (1938).
60. Ferguson-Smith, M. A. Karyotype-phenotype correlations in gonadal dysgenesis and their bearing on the pathogenesis of malformations. J. Med. Genet. 2, 142-155 (1965).
    A classic paper that puts forward the hypothesis that Turner syndrome is due to haploinsufficiency of XY-chromosome-common genes.
61. Zinn, A. R., Page, D. C. & Fisher, E. M. Turner syndrome: the case of the missing sex chromosome. Trends Genet. 9, 90-93 (1993).
62. Hook, E. B. & Warburton, D. The distribution of chromosomal genotypes associated with Turner's syndrome: livebirth prevalence rates and evidence for diminished fetal mortality and severity in genotypes associated with structural X abnormalities or mosaicism. Hum. Genet. 64, 24-27 (1983).
63. Rao, E. et al. Pseudoautosomal deletions encompassing a novel homeobox gene cause growth failure in idiopathic short stature and Turner syndrome. Nature Genet. 16, 54-63 (1997).
64. Ellison, J. W. et al. PHOG, a candidate gene for involvement in the short stature of Turner syndrome. Hum. Mol. Genet. 6, 1341-1347 (1997).
65. Hull, M. G. et al. Population study of causes, treatment, and outcome of infertility. Br. Med. J. (Clin. Res. Ed.) 291, 1693-1697 (1985).
66. Sun, C. et al. An azoospermic man with a de novo point mutation in the Y-chromosomal gene USP9Y. Nature Genet. 23, 429-432 (1999).
67. Foresta, C., Ferlin, A. & Moro, E. Deletion and expression analysis of AZFa genes on the human Y chromosome revealed a major role for DBY in male infertility. Hum. Mol. Genet. 9, 1161-1169 (2000).
68. Bullejos, M., Sanchez, A., Burgos, M., Jimenez, R. & Diaz De La Guardia, R. Multiple mono- and polymorphic Y-linked copies of the SRY HMG-box in microtidae. Cytogenet. Cell Genet. 86, 46-50 (1999).
69. Moradian-Oldak, J. et al. A review of the aggregation properties of a recombinant amelogenin. Connect. Tissue Res. 32, 125-130 (1995).
70. Fincham, A. G., Moradian-Oldak, J. & Simmer, J. P. The structural biology of the developing dental enamel matrix. J. Struct. Biol. 126, 270-299 (1999).
71. Nakahori, Y., Takenaka, O. & Nakagome, Y. A human X-Y homologous region encodes 'amelogenin'. Genomics 9, 264-269 (1991).
72. Watson, J. M., Spencer, J. A., Graves, J. A., Snead, M. L. & Lau, E. C. Autosomal localization of the amelogenin gene in monotremes and marsupials: implications for mammalian sex chromosome evolution. Genomics 14, 785-789 (1992).
73. Samonte, R. V., Conte, R. A. & Verma, R. S. Molecular phylogenetics of the hominoid Y chromosome. J. Hum. Genet. 43, 185-186 (1998).
74. Alvesalo, L. Sex chromosomes and human growth. A dental approach. Hum. Genet. 101, 1-5 (1997).
75. Ravassipour, D. B. et al. Unique enamel phenotype associated with amelogenin gene (AMELX) codon 41 point mutation. J. Dent. Res. 79, 1476-1481 (2000).
76. Yamauchi, K. et al. Sex determination based on fecal DNA analysis of the amelogenin gene in sika deer (Cervus nippon). J. Vet. Med. Sci. 62, 669-671 (2000).
77. Haig, D. Intragenomic conflict and the evolution of eusociality. J. Theor. Biol. 156, 401-403 (1992).
78. Hurst, L. D. Is multiple paternity necessary for the evolution of genomic imprinting? Genetics 153, 509-512 (1999).
    References 77 and 78 represent, respectively, a statement of the original sexual antagonism model of genomic imprinting, and thoughtful secondary theoretical work exploring that model's assumptions and implications.
79. Lee, P. C. in Comparative Primate Socioecology (ed. Lee, P. C.) (Cambridge Univ. Press, Cambridge, UK, 1999).
80. Mooney, M. P., Siegel, M. I., Eichberg, J. W., Lee, D. R. & Swan, J. Deciduous dentition eruption sequence of the laboratory-reared chimpanzee (Pan troglodytes). J. Med. Primatol. 20, 138-139 (1991).
81. Kuykendall, K. L., Mahoney, C. J. & Conroy, G. C. Probit and survival analysis of tooth emergence ages in a mixed-longitudinal sample of chimpanzees (Pan troglodytes). Am. J. Phys. Anthropol. 89, 379-399 (1992).
82. Kaul, S. S., Pathak, R. K. & Santosh. Emergence of deciduous teeth in Punjabi children, north India. Z. Morphol. Anthropol. 79, 25-34 (1992).
83. Rajic, Z., Rajic Mestrovic, S. & Vukusic, N. Chronology, dynamics and period of primary tooth eruption in children from Zagreb, Croatia. Coll. Antropol. 23, 659-663 (1999).
84. Alvesalo, L., Osborne, R. H. & Kari, M. The 47,XYY male, Y chromosome, and tooth size. Am. J. Hum. Genet. 27, 53-61 (1975).
85. Iwasa, Y. & Pomiankowski, A. Sex specific X chromosome expression caused by genomic imprinting. J. Theor. Biol. 197, 487-495 (1999).
86. Hurst, L. D. Embryonic growth and the evolution of the mammalian Y chromosome. I. The Y as an attractor for selfish growth factors. Heredity 73, 223-232 (1994).
87. Hurst, L. D. Embryonic growth and the evolution of the mammalian Y chromosome. II. Suppression of selfish Y-linked growth factors may explain escape from X-inactivation and rapid evolution of Sry. Heredity 73, 233-243 (1994).
88. Veis, A. et al. Specific amelogenin gene splice products have signaling effects on cells in culture and in implants in vivo. J. Biol. Chem. 275, 41263-41272 (2000).
89. Marks, S. C. Jr The basic and applied biology of tooth eruption. Connect. Tissue Res. 32, 149-157 (1995).
90. Santos, F. R., Pandya, A. & Tyler-Smith, C. Reliability of DNA-based sex tests. Nature Genet. 18, 103 (1998).
91. Kogan, G. L., Epstein, V. N., Aravin, A. A. & Gvozdev, V. A. Molecular evolution of two paralogous tandemly repeated heterochromatic gene clusters linked to the X and Y chromosomes of Drosophila melanogaster. Mol. Biol. Evol. 17, 697-702 (2000).
92. Hurst, L. D. Is Stellate a relict meiotic driver? Genetics 130, 229-230 (1992).
93. Palumbo, G., Bonaccorsi, S., Robbins, L. G. & Pimpinelli, S. Genetic analysis of Stellate elements of Drosophila melanogaster. Genetics 138, 1181-1197 (1994).
94. Hurst, L. D. Further evidence consistent with Stellate's involvement in meiotic drive. Genetics 142, 641-643 (1996).
95. Hurst, L. D. Evolution. Sex, slime and selfish genes. Nature 354, 23-24 (1991).
96. Yen, P. H. et al. Cloning and expression of steroid sulfatase cDNA and the frequent occurrence of deletions in STS deficiency: implications for X-Y interchange. Cell 49, 443-454 (1987).
97. del Castillo, I., Cohen-Salmon, M., Blanchard, S., Lutfalla, G. & Petit, C. Structure of the X-linked Kallmann syndrome gene and its homologous pseudogene on the Y chromosome. Nature Genet. 2, 305-310 (1992).
98. Ballabio, A. et al. Deletions of the steroid sulphatase gene in 'classical' X-linked ichthyosis and in X-linked ichthyosis associated with Kallmann syndrome. Hum. Genet. 77, 338-341 (1987).
99. Li, X. M., Yen, P. H. & Shapiro, L. J. Characterization of a low copy repetitive element S232 involved in the generation of frequent deletions of the distal short arm of the human X chromosome. Nucleic Acids Res. 20, 1117-1122 (1992).
100. Filippi, G. & Meera Khan, P. Linkage studies on X-linked ichthyosis in Sardinia. Am. J. Hum. Genet. 20, 564-569 (1968).
101. Went, L. N., De Groot, W. P., Sanger, R., Tippett, P. & Gavin, J. X-linked ichthyosis: linkage relationship with the Xg blood groups and other studies in a large Dutch kindred. Ann. Hum. Genet. 32, 333-345 (1969).
102. Adam, A. Sex ratio in families with X-linked ichthyosis. Am. J. Hum. Genet. 32, 763-764 (1980).
103. Gladstien, K., Shapiro, L. J. & Spence, M. A. Estimating sex ratio biases in X-linked disorders: is there an excess of males in families with X-linked ichthyosis? Am. J. Hum. Genet. 31, 741-746 (1979).
104. Naumova, A. & Sapienza, C. The genetics of retinoblastoma, revisited. Am. J. Hum. Genet. 54, 264-273 (1994).
105. Nopoulos, P., Flaum, M., O'Leary, D. & Andreasen, N. C. Sexual dimorphism in the human brain: evaluation of tissue volume, tissue composition and surface anatomy using magnetic resonance imaging. Psychiatry Res. 98, 1-13 (2000).
106. Schwartz, A. et al. Reconstructing hominid Y evolution: X-homologous block, created by X-Y transposition, was disrupted by Yp inversion through LINE-LINE recombination. Hum. Mol. Genet. 7, 1-11 (1998).
107. Dobzhansky, T. G. Genetic Diversity and Human Equality (Basic Books, New York, New York, 1973).
108. Archidiacono, N. et al. Evolution of chromosome Y in primates. Chromosoma 107, 241-246 (1998).
109. Murphy, W. J. et al. Extensive conservation of sex chromosome organization between cat and human revealed by parallel radiation hybrid mapping. Genome Res. 9, 1223-1230 (1999).
110. Ravindranath, R. M., Tam, W. Y., Nguyen, P. & Fincham, A. G. The enamel protein amelogenin binds to the N-acetyl-D-glucosamine-mimicking peptide motif of cytokeratins. J. Biol. Chem. 275, 39654-39661 (2000).
Acknowledgements
We thank B. Charlesworth, S. Dorus, R. Hudson, M. Kreitman, E. Stahl, A. Veis, G.
Wyckoff and S. Yi for stimulating discussion; G. Wyckoff for computational support;
and C. Andrews and J. Socha for help with the guppies.
Glossary
BIPOTENTIAL GONAD The last embryonic tissue precursor that can differentiate into
either the ovary or the testis.
CLADE An organismal lineage comprising an ancestor and all its descendants.
DIOECIOUS Having separate male and female organisms.
EFFECTIVE POPULATION SIZE (Ne) The theoretical number of organisms or copies
of a locus for which the genetic variation in a given sample of the organisms or copies
can be explained solely by mutation and genetic drift; Ne is related to, but never
exceeds, the actual population size (N).
GENETIC DRIFT The random fluctuation of allele frequencies across generations in a
finite population.
HAPLORHINE A member of the clade comprising apes, monkeys and tarsiers only.
HOMEOTHERM An organism that uses cellular metabolism specifically to stabilize its
own body temperature.
MARSUPIAL Non-placental mammal whose liveborn young suckle in maternal
pouches.
MEIOTIC DRIVE Preferential transmission of one gamete genotype over another
genotype, in which the genotypes in question might derive from the same meiosis.
MEROHAPLODIPLOID Characterized by one sex lacking part, but less than half, of
the diploid chromosome set typical of the other sex.
PARALOGUE A locus that is homologous to another within the same haploid genome.
PENETRANCE The frequency of affected individuals among the carriers of a particular
genotype.
POLYANDRY A population mating structure in which a female might mate with
multiple males during her lifetime.
30 May 2002
Nature 417, 559 - 563 (2002); doi:10.1038/nature751
Nature AOP, published online 12 May 2002
DMY is a Y-specific DM-domain gene required for male development in the
medaka fish
MASARU MATSUDA*, YOSHITAKA NAGAHAMA*, AI SHINOMIYA†, TADASHI SATO†,
CHIKA MATSUDA*, TOHRU KOBAYASHI*, CRAIG E. MORREY*, NAOKI SHIBATA‡,
SHUICHI ASAKAWA§, NOBUYOSHI SHIMIZU§, HIROSHI HORI∥, SATOSHI HAMAGUCHI† &
MITSURU SAKAIZUMI†
* Laboratory of Reproductive Biology, National Institute for Basic Biology, Okazaki 444-8585, Japan
† Graduate School of Science and Technology, Niigata University, Ikarashi, Niigata 950-2181, Japan
‡ Department of Biology, Faculty of Science, Shinshu University, Asahi 3-1-1, Matsumoto, Nagano 390-8621, Japan
§ Department of Molecular Biology, Keio University School of Medicine, 35 Shinanomachi, Shinjuku-ku, Tokyo 160-8582, Japan
∥ Graduate School of Science, Nagoya University, Chikusa-ku, Nagoya 464-8602, Japan
Correspondence and requests for materials should be addressed to Y.N. (e-mail: nagahama@nibb.ac.jp). The DNA Data Bank of
Japan (DDBJ) accession number of the medaka DMY cDNA sequence is AB071534.
Although the sex-determining gene Sry has been identified in mammals1, no
comparable genes have been found in non-mammalian vertebrates. Here, we used
recombinant breakpoint analysis to restrict the sex-determining region in medaka fish
(Oryzias latipes) to a 530-kilobase (kb) stretch of the Y chromosome. Deletion analysis
of the Y chromosome of a congenic XY female further shortened the region to 250 kb.
Shotgun sequencing of this region predicted 27 genes. Three of these genes were
expressed during sexual differentiation. However, only the DM-related2 PG17 was Y
specific; we thus named it DMY. Two naturally occurring mutations establish DMY's
critical role in male development. The first heritable mutant—a single insertion in
exon 3 and the subsequent truncation of DMY—resulted in all XY female offspring.
Similarly, the second XY mutant female showed reduced DMY expression with a high
proportion of XY female offspring. During normal development, DMY is expressed
only in somatic cells of XY gonads. These findings strongly suggest that the sex-specific DMY is required for testicular development and is a prime candidate for the
medaka sex-determining gene.
The medaka has two major advantages for genetic research: a large genetic diversity within
the species3, 4 and the existence of several inbred strains5. As in mammals, sex
determination in medaka is male heterogametic6, although the Y chromosome is not
cytogenetically distinct7. Alteration of phenotypic sex with no reproductive consequences,
and recombination over the entire sex chromosome pair8-10, suggest that there are no major
differences, other than a sex-determining gene, between the X and Y chromosomes. To
clone positionally the sex-determining region, we generated a Y congenic strain to
highlight the genetic differences between the X and Y chromosomes from inbred strains of
medaka11. The Y congenic strain has a sex-determining region derived from the HNI-strain
Y chromosome on the genetic background of an Hd-rR strain. In this congenic strain, the
wild-type allele (R) of the r locus (a sex-linked pigment gene) is located only on the Y
chromosome. Therefore, the female XrXr results in a white body colour, and the male XrYR
results in an orange-red body colour. Using this strain, we had previously constructed a
genetic map of the medaka sex chromosome9.
In this study, we first performed chromosome walking using several recombinants to map
the sex-determining region of the Y congenic strain (Fig. 1a). Two recombinants (Hd-rR.YHNI(R1) and (R2)) between the sex-determining (SD) locus and a sex-linked marker
(SL1) (centromere side of SD) were obtained from an oestrogen-induced XY female of the
Y congenic strain9. To obtain recombinants between SD and r (a body-colour gene), we re-screened progeny from an oestrogen-induced XY female of the recombinant Hd-rR.YHNI
(R1) strain. We subsequently found one white male (Hd-rR.YHNIrr) and one orange-red
female, which were recombinants between the SD and r loci. After confirming their
genotypes from fin clippings, these recombinants were maintained as strains.
Figure 1 Positional cloning strategy of the sex-determining
region and subsequent identification of PG17/DMY.
For chromosome walking from SL1, we constructed a bacterial artificial chromosome
(BAC) genomic library from the Y congenic strain. By this approach, we obtained 47 sex-linked BAC clones, sequenced their end fragments, and designed polymerase chain reaction
(PCR) primers for sequence tag site (STS) markers on the Y chromosome. Analyses of
recombinant genotypes with these STS markers indicated that the sex-determining region
was located between 135D12.F and 51H7.F (Fig. 1a). This stretch of the Y chromosome
was encompassed by four BAC clones (Fig. 1b), although 19 other BAC clones included
some portion of this region.
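The interval-narrowing logic of recombinant breakpoint analysis can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis: the marker names (M1 to M6), their order and the three recombinant genotypes are hypothetical, and each recombinant chromosome is assumed to carry a single crossover.

```python
from typing import Dict, List, Tuple

def candidate_interval(
    markers: List[str],
    recombinants: List[Tuple[Dict[str, str], str]],
) -> Tuple[str, str]:
    """Return the pair of markers that flank the sex-determining locus.

    markers: marker names in chromosomal order (hypothetical here).
    recombinants: (origin, sex) pairs, where origin maps each marker to
        "Y" (derived from the congenic Y) or "X", and sex is "male" or "female".
    """
    # Start with every gap between adjacent markers as a candidate location.
    gaps = set(range(len(markers) - 1))
    for origin, sex in recombinants:
        for i in list(gaps):
            left, right = origin[markers[i]], origin[markers[i + 1]]
            if left != right:
                continue  # the breakpoint lies in this gap, so it cannot be excluded
            gap_is_y_derived = left == "Y"
            # A male must carry the locus on Y-derived DNA; a female must not.
            if gap_is_y_derived != (sex == "male"):
                gaps.discard(i)
    if not gaps:
        raise ValueError("genotypes are inconsistent with a single locus")
    return markers[min(gaps)], markers[max(gaps) + 1]

# Hypothetical example data (not taken from the paper).
MARKERS = ["M1", "M2", "M3", "M4", "M5", "M6"]
RECOMBINANTS = [
    (dict(zip(MARKERS, "XXYYYY")), "male"),
    (dict(zip(MARKERS, "YYYYXX")), "male"),
    (dict(zip(MARKERS, "XXXYYY")), "female"),
]

if __name__ == "__main__":
    print(candidate_interval(MARKERS, RECOMBINANTS))
```

Each additional recombinant can only shrink the candidate interval, which is why typing more STS markers on the same few recombinants progressively tightens the region.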
We then determined the location of the sex-determining region on the Y chromosome by
fluorescence in situ hybridization (FISH) using one of the BAC clones as a probe. The sex-determining region is located on the centromere side of the long arms of the sex
chromosomes (Fig. 1c, d). Centromere and DNA marker locations on the sex
chromosomes, determined using triploid hybrid12 and gynogenetic diploids10, confirmed the
findings of the FISH analysis.
We used shotgun sequencing to determine the sequence covered by the four BAC clones.
The entire sequence of the two centromere-side BAC clones (mCON089P3 and
mCON144M14) was determined; however, the remaining two, telomere-side, BAC clones
(mCON104P2 and mCON137M1) could not be completely sequenced mainly owing to
numerous repetitive sequences. Consequently, we sequenced 422,202 nucleotides and
estimated that the four BAC clones covered about 530 kb (Fig. 1b). The gene-predicting
program Genscan (Version 1.0, http://genes.mit.edu/GENSCAN.html) predicted 52 genes
in this region.
We also found an orange-red female in our congenic progeny (Fig. 2). Mating this female
with a sex-reversed (androgen-induced) XX Hd-rR male resulted in all female offspring
with a 50:50 (white:orange-red) body colour ratio. Assuming this female's X chromosome
was derived from a recombination event between the SD and r loci, sex-linked DNA
markers flanking the SD locus should be homozygous (Hd-rR/Hd-rR type). The markers
(for example, SL1 and 51H7.F) flanking the SD were heterozygous (Hd-rR/HNI type),
indicating that this particular orange-red female had the congenic sex-determining region
on the Y chromosome (Fig. 2b), a conclusion confirmed by other DNA markers within the
sex-determining region. From these observations, this XY female was determined to lack
250 kb within the sex-determining region of the Y chromosome (Fig. 1e). Consequently,
the corresponding region of a normal Y chromosome should contain the sex-determining
gene.
Figure 2 Characteristics of medaka lacking a part of the Y
chromosome.
Genscan analysis of the deleted 250-kb region predicted 27 genes. To identify which
predicted genes were expressed, we designed specific primers for each, and examined their
expression in medaka embryos during sexual differentiation (before hatching to 10 days
after hatching, d.a.h.). PCR with reverse transcription (RT–PCR) detected expression of
three of these 27 genes in embryos (Fig. 1f). Furthermore, only one of these three genes,
PG17 (predicted gene 17)/DMY, was expressed exclusively in XY embryos. In contrast, the
remaining two genes (PG21, PG30) were expressed in both XY and XX embryos. BAC
clones derived from the X and Y chromosomes confirmed that PG17 is specific to the Y
chromosome whereas the other two genes are present on both the X and Y chromosomes.
The full-length complementary DNA sequence (1,320 base pairs) of DMY was obtained by
5' and 3' rapid amplification of cloned ends (RACE). The longest open reading frame spans
six exons and encodes a putative protein of 267 amino acids, including the highly
conserved DM domain (Fig. 1g, h). The DM domain was originally described as a DNA-binding motif shared between doublesex (dsx) in Drosophila melanogaster and mab-3 in
Caenorhabditis elegans2. Since the initial characterizations of dsx and mab-3, DM-related
genes have been identified from virtually all species examined, including medaka13-19. In
vertebrate species, DMRT1 (DM-related transcription factor 1), the DM-related gene most
homologous to DMY (about 80%; data not shown), correlates with male development13-18,20.
Combined with its Y chromosome specificity, this finding suggests that DMY is pivotal
in testicular differentiation.
To establish a role for DMY during sexual differentiation, we screened wild medaka
populations for naturally occurring DMY mutants. Two XY females with distinct mutations
in DMY were found in separate populations (Awara and Shirone). The XYwAwr mutant from
Awara contains a single nucleotide insertion in exon 3 of DMY. Although the DM domain
remains intact, this insertion causes a frame shift from residue 110 and premature
termination at residue 139 (Fig. 1h). Offspring obtained by mating the XYwAwr female with
an Hd-rR male (XY, DMY on Y chromosome) revealed typical mendelian combinations
(XX, XY, XYwAwr, YYwAwr); however, phenotypes were female, male, female and male,
respectively (Table 1). Despite the presence of pseudo-DMY messenger RNA (which does
not encode full-length DMY) in offspring containing the PG17wAwr mutant allele (Fig. 1h),
the absence of about two-thirds of the protein presumably renders DMY nonfunctional, thus
resulting in XY sex reversal (female phenotype). Although the mechanism apparently
differs, the second mutant also results in XY sex reversal. The entire DMY coding region of
the XYwSrn mutant is intact; however, an unknown transcriptional anomaly severely
depresses or eliminates DMY expression in embryos (Fig. 3). In addition to the original
XYwSrn female mutant, 60% of XYwSrn offspring developed as phenotypic females (Table
1), suggesting that a threshold level of DMY expression is required for male development.
Taken together, the loss-of-function mutant (XYwAwr) and the depressed expression mutant
(XYwSrn) strongly suggest that DMY is required for normal testicular differentiation.
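How a single-nucleotide insertion produces a frame shift and a premature stop can be illustrated with a toy example. The sketch below uses the standard genetic code; the 27-nucleotide sequence and the inserted base are invented for illustration and are not the DMY sequence. The final helper only restates the arithmetic implied by the text (a 267-residue protein with the frame shift beginning at residue 110).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(cds: str) -> str:
    """Translate from the first base until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino = CODON_TABLE[cds[i:i + 3]]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)

def insert_base(cds: str, position: int, base: str) -> str:
    """Insert one base before the given 0-based position."""
    return cds[:position] + base + cds[position:]

# Hypothetical toy coding sequence (not DMY).
WILD_TYPE = "ATGGCTGAAACTGAAGGTCTGAAATAA"
MUTANT = insert_base(WILD_TYPE, 6, "C")

def fraction_wild_type_retained(first_shifted_residue: int, full_length: int) -> float:
    """Share of the protein translated in the original frame."""
    return (first_shifted_residue - 1) / full_length

if __name__ == "__main__":
    print(translate(WILD_TYPE), translate(MUTANT))
    print(round(fraction_wild_type_retained(110, 267), 3))
```

In the toy mutant the residues after the insertion differ from the wild type and translation stops early, which mirrors the pattern described for the Awara allele: a stretch of altered residues followed by premature termination.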
Figure 3 cDNA PCR of progeny from XYwAwr and XYwSrn
female × Hd-rR male crosses.
To confirm the role of DMY during normal development, in situ hybridization was used to
determine its spatial expression during gonadal differentiation. Because DMRT1 and DMY
are very similar, the in situ hybridization probe was not able to discriminate between the
two mRNAs. We therefore determined the temporal expression of both DMY and DMRT1
during sexual differentiation using specific RT–PCR. At hatching and 5 d.a.h., DMY
mRNA was present in XY embryos, but not in XX embryos. In contrast, there was no
DMRT1 expression in either XX or XY embryos at either of these time points (data not
shown). Because the expression of DMY and DMRT1 does not appear to overlap during this
period, we assumed that the in situ hybridization signal represented DMY expression. As
predicted from our PCR studies, DMY signal was detected only in the somatic cells
surrounding germ cells in XY embryos (Fig. 4). Although its function in the pre-Sertoli
cells remains unclear, these data further indicate that DMY has a critical role in testicular
differentiation.
Figure 4 PG17/DMY mRNA expression in fry gonads shown
by digoxigenin-labelled in situ hybridization of larval sections.
Although evidence of sufficiency is required to definitively identify DMY as a sex-determining gene, we have shown that it is necessary for normal male development and
falls within the sex-determining region of the Y chromosome. Given the absence of other
reasonable candidate genes within the region, DMY is certainly the leading candidate for
the medaka sex-determining gene. Interestingly, phylogenetic analyses indicate that DMY
was probably derived from DMRT1, suggesting an evolutionary pattern similar to one of
the proposed origins of Sry. Sry is thought to have either arisen from an autosomal Sox gene
duplication event and subsequent formation of the Y chromosome or, more probably,
divergence from Sox3 after formation of the sex (X) chromosomes21. Regardless, the
linkage of Sry to the Y chromosome renders the entire male pathway dependent on Sry.
Further evidence concerning the ability of DMY to trigger male development and its
evolutionary relationship to DMRT1 is needed to confirm this hypothesis, but the linkage
of DMY to the Y chromosome and its requirement for testicular differentiation strongly
suggest that DMY represents a non-mammalian vertebrate equivalent of Sry.
Methods
Fish Two recombinant strains between SL1 and SD, Hd-rR.YHNI (R1) and Hd-rR.YHNI
(R2), were established from recombinant offspring of sex-reversed XY females of the
described Hd-rR.YHNI strain9. Another recombinant strain, Hd-rR.YHNIrr, was established
from a recombinant between SD and r. This recombinant was obtained from offspring of a
sex-reversed Hd-rR.YHNI (R1) XY female crossed with a sex-reversed Hd-rR XX male. Sex
reversal in medaka was accomplished by oestrogen (XY females) or androgen (XX males)
treatment during early development, according to previously published methods9. Naturally
occurring mutants XYwAwr and XYwSrn were found in wild populations near Awara (Fukui
prefecture) and Shirone (Niigata prefecture), Japan, respectively.
RNA and DNA extraction Total RNA and genomic DNA were extracted from each
hatched embryo after homogenization in a 1.5-ml tube with 350 µl RLT buffer supplied
with the RNeasy Mini Kit (Qiagen). The homogenized lysates were centrifuged and
supernatants were used for RNA extraction using the RNeasy Mini Kit with the RNase-Free DNase Set protocol (Qiagen). Precipitated material was used for DNA extraction using
the DNeasy tissue Kit (Qiagen) according to the manufacturer's protocol.
Chromosome walking A BAC genomic library was constructed from the Y congenic Hd-rR.YHNI strain as described22. High-molecular-mass genomic DNA was extracted from
sperm, partially digested with HindIII, and selected for a size range of 150–250 kb. The
size-selected DNA fragments were ligated to pBAC-lac23 and used to transform DH10B. A
total of 55,292 BAC clones was picked and arrayed to 144 microtitre plates each with 384
wells. The library was gridded at high density on Hybond-N+ nylon membranes
(Amersham Pharmacia Biotech). Chromosome walking started at SL1. Two-thirds of the
library was usually screened. Inserted end fragments of positive BAC clones were
amplified by vectorette PCR24 and used for assembling the positive clones. An amplified
end fragment at the far end of the SD side was used in subsequent screening of the BAC
library.
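The library figures quoted above can be checked with simple arithmetic, and a rough fold-coverage estimate follows once an insert size and genome size are supplied. The sketch below is illustrative only: the mean insert size (200 kb, the midpoint of the selected 150–250 kb range) and the genome size (800 Mb) are assumptions that are not stated in the text, and the function names are hypothetical.

```python
PLATES = 144
WELLS_PER_PLATE = 384
CLONES_PICKED = 55_292

# Assumptions, not values reported in the paper.
ASSUMED_MEAN_INSERT_BP = 200_000
ASSUMED_GENOME_BP = 800_000_000

def library_capacity(plates: int, wells_per_plate: int) -> int:
    """Number of wells available for arrayed clones."""
    return plates * wells_per_plate

def fold_coverage(n_clones: int, insert_bp: int, genome_bp: int) -> float:
    """Expected genome equivalents represented by the clones."""
    return n_clones * insert_bp / genome_bp

def prob_locus_present(n_clones: int, insert_bp: int, genome_bp: int) -> float:
    """Chance that a given locus is in at least one clone (random sampling model)."""
    return 1.0 - (1.0 - insert_bp / genome_bp) ** n_clones

if __name__ == "__main__":
    print(library_capacity(PLATES, WELLS_PER_PLATE))
    print(fold_coverage(CLONES_PICKED, ASSUMED_MEAN_INSERT_BP, ASSUMED_GENOME_BP))
    # Screening two-thirds of the library scales the coverage accordingly.
    screened = CLONES_PICKED * 2 // 3
    print(fold_coverage(screened, ASSUMED_MEAN_INSERT_BP, ASSUMED_GENOME_BP))
```

The 144 plates of 384 wells hold 55,296 clones, which is consistent with the 55,292 clones reported. The coverage values depend entirely on the two assumed constants and should be recomputed with measured values.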
RT–PCR First-strand cDNA was synthesized from 0.5 µg total RNA in 25 µl using
PowerScript (Clontech) with oligo-dT primers. PCR was carried out in a 25 µl reaction
mixture containing 0.25 µl of the first-strand cDNA. PCR conditions were 5 min at 95 °C;
20 s at 96 °C, 30 s at 55 °C, 30 s at 72 °C for 30 cycles; and 5 min at 72 °C. For PG17,
0.01 µl of the initial PCR products were re-amplified under the same conditions. Specific
primers for each gene were as follows. PG17, first round: PG17.11,
GAGTCGGAGCCAAGCGGGTACAACATTC; PG17.12,
GACCATCTCATTTTTTATTCTTGATTTT. PG17, second round: PG17.5,
CCGGGTGCCCAAGTGCTCCCGCTG; PG17.6,
GATCGTCCCTCCACAGAGAAGAGA. PG21: PG21.1,
TGTGATTCTGAAGGGGGAGTTTGTAA; PG21.3,
GACCTCCAGAGTCATCTTGCACAC. PG30: PG30.1,
GGAGGAAAGTGTCAGGAGTGTTGTGT; PG30.2,
GCCGTCCCTCTGATGTACTCGTTCCT.
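The cycling conditions and primers described above can be written down as data, which makes quantities such as the programmed hold time and primer GC content easy to check. This is a minimal sketch: the `Step` class and helper names are hypothetical, the hold time ignores ramping between temperatures, and only four of the listed primers are included.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    seconds: int
    celsius: int

# Program as described in the Methods.
INITIAL = Step(5 * 60, 95)
CYCLE = [Step(20, 96), Step(30, 55), Step(30, 72)]
N_CYCLES = 30
FINAL = Step(5 * 60, 72)

def total_hold_seconds() -> int:
    """Sum of programmed hold times, excluding temperature ramps."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + N_CYCLES * per_cycle + FINAL.seconds

def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    seq = primer.upper()
    if not seq:
        raise ValueError("empty primer sequence")
    return sum(base in "GC" for base in seq) / len(seq)

# Primer sequences copied from the text.
PRIMERS = {
    "PG17.5": "CCGGGTGCCCAAGTGCTCCCGCTG",
    "PG17.6": "GATCGTCCCTCCACAGAGAAGAGA",
    "PG30.1": "GGAGGAAAGTGTCAGGAGTGTTGTGT",
    "PG30.2": "GCCGTCCCTCTGATGTACTCGTTCCT",
}

if __name__ == "__main__":
    print(total_hold_seconds() / 60, "min of programmed holds")
    for name, seq in PRIMERS.items():
        print(name, len(seq), round(gc_fraction(seq), 2))
```

Because PG17 was re-amplified under the same conditions, the programmed time for that gene would be twice the single-run figure.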
FISH mapping Metaphase cells from Hd-rR embryos were prepared by standard
cytogenetic methods25. FISH was performed using an SL2 fragment labelled by PCR9 with
digoxigenin-11-dUTP (Boehringer) and BAC clones mCON072N5 (containing SL1) and
mCON049E13 (SD region) labelled by nick translation with fluorescein and biotin,
respectively. Cells were co-hybridized with SL2 and SL1 or SL2, SL1 and SD, and
counterstained with 4',6-diamidino-2-phenylindole (DAPI). Probes were detected with
rhodamine-labelled anti-digoxigenin antibodies (Boehringer), Alexa Fluor 488-labelled
anti-fluorescein, and Alexa Fluor 660-labelled streptavidin (Molecular Probes). Metaphase
cells were examined using four filters (A4, L5, N3 and Y5) and images were captured using
a CoolSNAP charge-coupled device camera (Nippon Roper) and Openlab software
(Improvision). Hybridization was detected on the identical sex chromosome.
Shotgun sequencing BAC DNA was hydrodynamically sheared to average sizes of 1.5 and
4.5 kb, and the DNA was ligated into a pUC18 vector. We sequenced each BAC to 13×
coverage using dye-terminator chemistry. Individual BACs were assembled from the
shotgun sequences using phred Version 0.000925.c, crossmatch Version 0.990319 and
phrap Version 0.990319 (Codon Code), and PCP Version 2.1.6 and CAP4 Version 2.1.6
(Paracel). The gaps in each BAC were closed using a combination of BAC walking,
directed PCR and re-sequencing of individual clones.
Analyses of wild mutants We crossed wild mutant females (XY; -/PG17wAwr or XY; -/PG17wSrn) with Hd-rR males (XY; -/PG17Hd-rR) to obtain the following sex chromosomes
and PG17 genotypes in the offspring: XX; -/-, XY; -/PG17wAwr, XY; -/PG17Hd-rR, and YY;
PG17wAwr/PG17Hd-rR. Genotypes of these offspring were determined by genomic PCR of
PG17 and SL1. PG17 expression was examined by RT–PCR (see above). Sexes were
confirmed by histological examination of gonads (fry and adult fish) and secondary sex
characteristics (adult fish). At hatching, germ cell numbers were counted to determine
gonadal phenotype26. Exon sequences of the mutant PG17 genes (PG17wAwr and PG17wSrn)
were determined using DNA extracted from the mutants and their progeny.
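The cross described above can be enumerated as a simple Punnett square. The sketch below is illustrative: the chromosome labels are hypothetical shorthand, and the phenotype rule (male if at least one Y carries a functional DMY allele) is an inference from the phenotypes reported for the Awara mutant, not a general model. Equal transmission of each gamete type is assumed.

```python
from collections import Counter
from itertools import product
from typing import Dict, Iterable, Tuple

# Hypothetical labels: Y_HdrR carries functional DMY, Y_wAwr the frameshift allele.
FUNCTIONAL_Y = {"Y_HdrR"}

def cross(mother: Iterable[str], father: Iterable[str]) -> Counter:
    """Count offspring genotypes from every pairing of parental gametes."""
    return Counter(tuple(sorted(pair)) for pair in product(mother, father))

def phenotype(genotype: Tuple[str, ...]) -> str:
    """Male if any chromosome carries a functional DMY allele, else female."""
    return "male" if any(ch in FUNCTIONAL_Y for ch in genotype) else "female"

def offspring_phenotypes(mother: Iterable[str], father: Iterable[str]) -> Dict[Tuple[str, ...], str]:
    return {genotype: phenotype(genotype) for genotype in cross(mother, father)}

# XY(wAwr) mutant female crossed with an Hd-rR XY male.
MOTHER = ("X", "Y_wAwr")
FATHER = ("X", "Y_HdrR")

if __name__ == "__main__":
    for genotype, sex in offspring_phenotypes(MOTHER, FATHER).items():
        print(genotype, sex)
```

Under these assumptions the four genotype classes are expected in equal proportions, two female and two male, matching the pattern of phenotypes described for this cross.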
In situ hybridization Gonads of 0–5 d.a.h. Hd-rR fry were dissected with trunk of the
body and fixed in 4% paraformaldehyde in 0.1 M phosphate buffer (pH 7.4) at 4 °C
overnight. After fixation, gonads were embedded in paraffin and cross-sectioned at 5 µm.
In situ hybridization was performed using published methods27. Genetic sex of fry was
confirmed by PCR using the SL1 genotype11.
Received 22 February 2002;
accepted 24 April 2002
References
1. Sinclair, A. H. et al. A gene from the human sex-determining region encodes a protein with
homology to a conserved DNA-binding motif. Nature 346, 240-244 (1990)
2. Raymond, C. S. et al. Evidence for evolutionary conservation of sex-determining genes. Nature
391, 691-695 (1998)
3. Sakaizumi, M., Moriwaki, K. & Egami, N. Allozymic variation and regional differentiation in wild
populations of the fish Oryzias latipes. Copeia 1983, 311-318 (1983)
4. Matsuda, M., Yonekawa, H., Hamaguchi, S. & Sakaizumi, M. Geographic variation and
diversity in the mitochondrial DNA of the medaka, Oryzias latipes, as determined by restriction
endonuclease analysis. Zool. Sci. 14, 517-526 (1997)
5. Hyodo-Taguchi, Y. & Sakaizumi, M. List of inbred strains of the medaka, Oryzias latipes,
maintained in the Division of Biology, National Institute of Radiological Sciences. Fish Biol. J.
MEDAKA 5, 29-30 (1993)
6. Aida, T. On the inheritance of colour in a freshwater fish, Aplocheilus latipes Temminck and
Schlegel, with special reference to sex-linked inheritance. Genetics 6, 554-573 (1921)
7. Uwa, H. & Ojima, Y. Detailed and banding karyotype analysis of the medaka, Oryzias latipes,
in cultured cells. Proc. Jpn Acad. B 57, 39-43 (1981)
8. Yamamoto, T. Progenies of sex-reversal females mated with sex-reversal males in the
medaka, Oryzias latipes. J. Exp. Zool. 146, 163-179 (1961)
9. Matsuda, M., Sotoyama, S., Hamaguchi, S. & Sakaizumi, M. Male-specific restriction of
recombination frequency in the sex chromosomes of the medaka, Oryzias latipes. Genet. Res.
73, 225-231 (1999)
10. Kondo, M., Nagao, E., Mitani, H. & Shima, A. Differences in recombination frequencies during
female and male meioses of the sex chromosomes of the medaka, Oryzias latipes. Genet.
Res. 78, 23-30 (2001)
11. Matsuda, M. et al. Isolation of a sex chromosome-specific DNA sequence in the medaka,
Oryzias latipes. Genes Genet. Syst. 72, 263-268 (1997)
12. Sato, T., Yokomizo, S., Matsuda, M., Hamaguchi, S. & Sakaizumi, M. Gene-centromere
mapping of medaka sex chromosomes using triploid hybrids between Oryzias latipes and O.
luzonensis. Genetica 111, 71-75 (2001)
13. Raymond, C. S., Kettlewell, J. R., Hirsch, B., Bardwell, V. J. & Zarkower, D. Expression of
Dmrt1 in the genital ridge of mouse and chicken embryos suggests a role in vertebrate sexual
development. Dev. Biol. 215, 208-220 (1999)
14. Smith, C. A., McClive, P. J., Western, P. S., Reed, K. J. & Sinclair, A. H. Conservation of a
sex-determining gene. Nature 402, 601-602 (1999)
15. De Grandi, A. et al. The expression pattern of a mouse doublesex-related gene is consistent
with a role in gonadal differentiation. Mech. Dev. 90, 323-326 (2000)
16. Kettlewell, J. R., Raymond, C. S. & Zarkower, D. Temperature-dependent expression of turtle
Dmrt1 prior to sexual differentiation. Genesis 26, 174-178 (2000)
17. Marchand, O. et al. DMRT1 expression during gonadal differentiation and spermatogenesis in
the rainbow trout, Oncorhynchus mykiss. Biochim. Biophys. Acta 1493, 180-187 (2000)
18. Guan, G., Kobayashi, T. & Nagahama, Y. Sexually dimorphic expression of two types of DM
(Doublesex/Mab-3)-domain genes in a teleost fish, the tilapia (Oreochromis niloticus).
Biochem. Biophys. Res. Commun. 272, 662-666 (2000)
19. Brunner, B. et al. Genomic organization and expression of the doublesex-related gene cluster
in vertebrates and detection of putative regulatory regions for dmrt1. Genomics 1, 8-17
(2001)
20. Moniot, B., Berta, P., Scherer, G., Sudbeck, P. & Poulat, F. Male specific expression suggests
role of DMRT1 in human sex determination. Mech. Dev. 91, 323-325 (2000)
21. Schepers, G. & Koopman, P. Phylogeny of the SOX family of developmental transcription
factors based on sequence and structural indicators. Dev. Biol. 227, 239-255 (2000)
22. Matsuda, M. et al. Construction of a BAC library derived from the inbred Hd-rR strain of the
teleost fish, Oryzias latipes. Genes Genet. Syst. 76, 61-63 (2001)
23. Asakawa, S. et al. Human BAC library: construction and rapid screening. Gene 191, 69-79
(1997)
24. Ragoussis, J. & Olavesen, M. G. in Genome Mapping: A Practical Approach (ed. Dear, P. H.)
253-260 (Oxford Univ. Press, New York, 1997)
25. Matsuda, M., Matsuda, C., Hamaguchi, S. & Sakaizumi, M. Identification of the sex
chromosomes of the medaka, Oryzias latipes, by fluorescence in situ hybridization. Cytogenet.
Cell Genet. 82, 257-262 (1998)
26. Hamaguchi, S. A light- and electron-microscopic study on the migration of primordial germ
cells in the teleost, Oryzias latipes. Cell Tissue Res. 227, 139-151 (1982)
27. Kobayashi, T., Kajiura-Kobayashi, H. & Nagahama, Y. Differential expression of vasa
homologue gene in the germ cells during oogenesis and spermatogenesis in a teleost fish,
tilapia, Oreochromis niloticus. Mech. Dev. 99, 139-142 (2000)
Acknowledgements. We are grateful to P. Koopman for advice; G. Young for critical
reading of the manuscript; and M. Takeda, E. Uno and R. Hayakawa for technical
assistance. This work was supported in part by Grants-in-Aid for Research for the Future
from the Japan Society for the Promotion of Science, Scientific Research of Priority Area,
Environmental Endocrine Disrupter Studies from the Ministry of the Environment, Bio
Design from the Ministry of Agriculture, Forestry and Fisheries, and Japan Society for the
Promotion of Science Research Fellowships for Young Scientists.
Competing interests statement. The authors declare that they have no competing financial
interests.
Figure 1 Positional cloning strategy of the sex-determining region and subsequent identification of
PG17/DMY. a, Genetic maps of the sex-determining regions and Y chromosomes of the
recombinant strains. Pink indicates regions derived from an HNI Y chromosome; yellow represents
Hd-rR X-chromosome-derived regions. b, BAC contigs and a physical map of the sex-determining
region. Horizontal bars indicate BAC clones. Blue blocks show the sequenced regions. c, d,
Cytogenetical mapping of the sex-determining region of medaka. c, FISH of metaphase
chromosomes. SL2 (red) localizes on the short arms of the sex chromosomes, whereas a BAC clone
containing SL1 (mCON072N5, yellow) localizes on the long arms. Arrowheads indicate sex
chromosomes. d, FISH of one sex chromosome with three different probes (SL2, SL1, SD). Signals
of a BAC clone containing the sex-determining region (mCON049E13) are light blue. The
arrowhead shows the centromere region. e, The Y chromosome of medaka lacking a part of the sex-determining region. DNA markers were not detected between PG31C and A314482, but were
detected on the centromere side of PG04 and the telomere side of PG31T. Respective location of
PG17, PG21 and PG30 within the sex-determining region between PG04 and PG31T is shown. f,
RT–PCR analysis for PG17, PG21 and PG30 mRNA expression in whole Hd-rR.YHNI embryos at
hatching. Sex chromosome types were determined by SL1 PCR. PG17 expression was detected by
nested PCR (see Methods). g, h, Structure and sequences of PG17/DMY. g, Structural analysis of
the PG17/DMY showing exons (open boxes), the DM domain (grey boxes) and introns (horizontal
lines) of PG17. Translation start and stop sites are indicated by ATG and STOP, respectively.
Numbers represent nucleotide sequence length (base pairs, bp). h, Amino-acid sequence of wild-type PG17/DMY and the PG17wAwr mutation. The DM domain is boxed. A frame shift (italicized)
occurs in the region from amino acid 110 to premature termination (amino acid 139).
Figure 2 Characteristics of medaka lacking a part of the Y chromosome. a, Phenotypes of the
congenic strain (XX, XY) and medaka lacking a part of the Y chromosome (XY-). XX (top), XY- and XY (bottom) with their secondary sex characters. Males (XY) have larger anal and dorsal fins
than females (XX, XY-). XX have white bodies (XrXr), whereas the others are orange-red (XrYR).
b, DNA types (SL1, PG17 and 51H7.F) of sex chromosomes. PCR products of SL1 and PG17 were
electrophoresed in a 1% agarose gel with a 1-kb DNA ladder. AluI-digested PCR products of
51H7.F were electrophoresed in a 3% agarose gel with a 100-bp DNA ladder. SL1 and 51H7.F are
homozygous in XX and YY (Hd-rR or HNI type), but are heterozygous in XY- and XY (Hd-rR/HNI type). PG17 is present in XY and YY but absent in XX and XY-. These results indicate that
PG17 is specific to the Y chromosome but absent from the Y chromosome of XY-. Specific primers
for PG17 and 51H7.F were as follows: PG17: PG17.19,
GAACCACAGCTTGAAGACCCCGCTGA; PG17.20,
GCATCTGCTGGTACTGCTGGTAGTTG; and 51H7.F: 51H7.F2,
CAGGCCTTGAAGATCAACGAGT; 51H7.F3, AGTGCATCTAGTGTACATGGGT. c,
Histological cross-sections of medaka fry sampled 30 d.a.h. Sex chromosome types were
determined by PCR using DNA extracted from the head. Black arrowheads indicate gonads. XX
and XY- individuals have ovaries with several oocytes, whereas XY specimens have testes with
spermatogonia. Scale bars, 50 µm.
Figure 3 cDNA PCR of progeny from XYwAwr and XYwSrn female × Hd-rR male crosses. A
PG17/DMY band was detected in XYHd-rR and XYwAwr embryos but not in XYwSrn embryos. Total
RNA (150 ng each) was extracted from whole embryos (at hatching) and used for cDNA synthesis
and amplification using the SMART PCR cDNA Synthesis Kit (Clontech) according to the
manufacturer's protocol. Amplified cDNA and genomic DNA were used as PCR templates of
PG04 and PG17 (PG17.19–20 primer set). Specific primers for PG04 were as follows: PG04.1,
CCAGCGGTTTGAGGATAGGTTTG; PG04.2, GAGCTTTCTGCAGGGCGACTTTC. Products
were electrophoresed in a 2% agarose gel with a 100-bp DNA ladder.
Figure 4 PG17/DMY mRNA expression in fry gonads shown by digoxigenin-labelled in situ
hybridization of larval sections. a, XY gonads on hatching day; antisense probe. Strong signals for
PG17/DMY were seen in cells surrounding the germ cells. G, germ cell; CE, coelomic epithelium;
ND, nephric duct. b, XX gonad on hatching day; antisense probe. PG17/DMY signal was
undetectable in XX individuals. c, XY gonad on hatching day; sense probe. Control hybridization
showed no signal. Scale bar, 20 µm.
doi:10.1038/90046
volume 28 no. 3 pp 216 - 217
Sox9 induces testis development in
XX transgenic mice
Valerie P.I. Vidal1, 4, Marie-Christine Chaboissier1, 4, Dirk G. de Rooij2 &
Andreas Schedl1, 3
1. MDC for Molecular Medicine, Robert-Rössle-Strasse 10, 13092 Berlin, Germany.
2. Department of Cell Biology, University Medical Center Utrecht, Utrecht, Netherlands.
3. Present address: Institute of Human Genetics, University of Newcastle upon Tyne, UK.
4. These authors contributed equally to this work.
Correspondence should be addressed to A Schedl. e-mail: andreas.schedl@ncl.ac.uk
Mutations in SOX9 are associated with male-to-female sex reversal in
humans1, 2. To analyze Sox9 function during sex determination, we
ectopically expressed this gene in XX gonads. Here, we show that Sox9 is
sufficient to induce testis formation in mice, indicating that it can substitute
for the sex-determining gene Sry.
Sex determination in mice is initiated at embryonic day 10.5 (E10.5) with
expression of the Y chromosome-linked gene Sry (ref. 3). One of the first genes
induced in male gonads after Sry expression is the Sry homolog Sox9 (refs. 4,5).
The DNA-binding domains of both proteins are highly conserved and can
functionally substitute for each other6. SOX9/Sox9 is required for expression of
the Mullerian inhibiting substance (MIS/Mis; refs. 7,8), but additional functions
during sex determination remain elusive.
To clarify the role of Sox9 during sex determination, we
ectopically expressed it in undifferentiated gonads of
transgenic mice. The mouse gene Sox9 (including introns) is
expressed under control of regulatory regions of the Wilms'
tumor suppressor gene Wt1, which is expressed at E10.5 in
the urogenital ridge of both XX and XY animals. Because
regulatory regions of Wt1 are poorly characterized, we used a
yeast artificial chromosome (YAC) 'knock-in' approach9, in
which Sox9 was fused with the start codon of the Wt1 locus
encoded by a 620-kilobase mouse YAC (ref. 10; Fig. 1a) and
used the resulting construct to generate transgenic animals11.
Two XX animals showed a female-to-male sex reversal and
were infertile. Moreover, founder 92, a fertile XY male,
transmitted the transgene to its offspring (Fig. 1b). All XX
transgenic animals (21/21) were phenotypically male (Fig.
2a–d) and had normal mounting behavior.
In situ hybridization and reverse transcription polymerase
chain reaction (PCR) analysis from E10.5 to adulthood
demonstrate expression of Sox9 from the transgene in the
developing gonads (Fig. 2e–h; data not shown). Expression of
Mis at E13.5 (data not shown) and E16.5 (Fig. 2i–l) is seen
within the developing sex cords. Moreover, male reproductive
ducts develop normally, which indicates the presence of
functional Sertoli cells. The slightly less organized pattern of
the seminiferous tubules is probably caused by ectopic
expression of Sox9 driven by the Wt1 promoter.
At E13.5, the testes of XX transgenic animals are
histologically indistinguishable from those of XY wildtype
littermates. Adult testes contain seminiferous tubules showing
a lumen lined by apparently normal Sertoli cells (Fig. 2m–p).
Many Leydig cells are present in the interstitial tissue. As
expected for a male with two X chromosomes, germ cells
are absent12, which explains the reduced testis size.
A mouse line with a transgene inserted 1 Mb upstream of
Sox9 shows a comparable female-to-male sex-reversed
phenotype13. Sox9 is expressed in XX gonads, but it is
impossible to conclude whether the expression of Sox9
triggers the sex reversal or whether Sox9 is activated as part
of the sex determination cascade. Our data clearly
demonstrate that Sox9 is sufficient to induce male
development and indicate that it can substitute for Sry
function. Hence, the sex reversal found in a human with a
duplication of the region containing SOX9 (ref. 14) may be caused
by ectopic activation of the duplicated SOX9 in XX gonads.
Therefore, SRY may act only as a molecular switch15 to
activate the evolutionarily more conserved SOX9, which in
turn initiates the male differentiation program.
Received 20 March 2001; Accepted 25 May 2001.
REFERENCES
1. Wagner, T. et al. Cell 79, 1111-1120 (1994).
2. Foster, J.W. et al. Nature 372, 525-530 (1994).
3. Swain, A. & Lovell-Badge, R. Genes Dev. 13, 755-767
(1999).
4. Morais da Silva, S. et al. Nature Genet. 14, 62-68
(1996).
5. Kent, J., Wheatley, S.C., Andrews, J.E., Sinclair, A.H. &
Koopman, P. Development 122, 2813-2822 (1996).
6. Bergstrom, D.E., Young, M., Albrecht, K.H. & Eicher, E.M.
Genesis 28, 111-124 (2000).
7. De Santa Barbara, P. et al. Mol. Cell Biol. 18, 6653-6665
(1998).
8. Arango, N.A., Lovell-Badge, R. & Behringer, R.R. Cell 99,
409-419 (1999).
9. Moore, A.W. et al. Mech. Dev. 79, 169-184 (1998).
10. Scholz, H. et al. J. Biol. Chem. 272, 32836-32846
(1997).
11. Schedl, A., Montoliu, L., Kelsey, G. & Schutz, G. Nature
362, 258-261 (1993).
12. Hunt, P.A. et al. Mol. Reprod. Dev. 49, 101-111
(1998).
13. Bishop, C.E. et al. Nature Genet. 26, 490-494
(2000).
14. Huang, B., Wang, S., Ning, Y., Lamb, A.N. & Bartley, J.
Am. J. Med. Genet. 87, 349-353 (1999).
15. Pask, A. & Graves, J.A. Cell Mol. Life Sci. 55, 864-875
(1999).
ACKNOWLEDGMENTS
We thank U. Ziegler, S. Schmidt, S. Lützkendorf and
D. Landrock for excellent technical assistance. This
work was supported by the Volkswagen Stiftung and
EC-grant QLRT00741.
Figure 1: Construct design and analysis of transgenic animals.
a, Schematic of the transgene. The Sox9 genomic locus (exons and introns) was
fused in-frame with the major ATG of Wt1 and introduced into the YAC620mWt1 by
homologous recombination in yeast. E1, exon 1; IVS-pA, intervening sequence
fused to polyadenylation signal; LYS2, lysine 2 yeast marker; 5'UTR, 5'
untranslated region. b, Molecular analysis of transgenic animals and comparison
with their phenotypes. Wt1-Sox9, a region overlapping the Wt1 promoter and Sox9
open reading frame, was amplified by PCR to identify transgenic animals (primers
sWTp 5'–CATCCGAGCCGCACCTCATG–3', SS2 5'–
GCTGGAGCCGTTGACGCG–3'). Zfy/Pax6, the Y chromosome-linked Zfy, was
amplified by PCR (oligonucleotides ZFY5 5'–
GACTAGACATGTCTTAACATCTGTCC–3' and ZFY3 5'–
CCTATTGCATGGACTGCAGCTTATG–3'); as an internal control, oligonucleotides
specific for Pax6 (H499 5'–CTTTCTCCAGAGCCTCAATCTG–3' and H500 5'–
GCAACAGGAAGGAGGGGGAGA–3') were added. F0, founders; F1, offspring of
founder 92.
Figure 2: Analysis of gonads in wildtype and transgenic animals.
a–d, Urogenital systems of 2-day-old animals. In contrast with wildtype females (c),
XX mice carrying the transgene (d) have descended testes (arrow). B, bladder; T,
testis; O, ovary; K, kidney; A, adrenal. Whole-mount in situ hybridization with
antisense Sox9 (e–h) and Mis (i–l) probes demonstrates a male-specific expression
pattern in sex-reversed animals. m–p, Histological analysis of adult gonads (2-µm
sections; hematoxylin and eosin staining). Testes of sex-reversed animals (p)
contain Sertoli and Leydig cells but lack germ cells because of the presence of two
X chromosomes.
doi:10.1038/82652
volume 26 no. 4 pp 490 - 494
A transgenic insertion upstream of
Sox9 is associated with dominant XX
sex reversal in the mouse
Colin E. Bishop1, 5, Deanne J. Whitworth2, Yanjun Qin1, Alexander I. Agoulnik1,
Irina U. Agoulnik1, 4, Wilbur R. Harrison3, 4, Richard R. Behringer2 & Paul A.
Overbeek4, 5
1. Department of Obstetrics & Gynecology, Baylor College of Medicine, Houston, Texas, USA.
2. Department of Molecular Genetics, University of Texas M.D. Anderson Cancer Center,
Houston, Texas, USA.
3. Department of Pathology and Laboratory Medicine, University of Texas, Houston Health
Science Center, Houston, Texas, USA.
4. Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas,
USA.
5. Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas,
USA.
Correspondence should be addressed to C E Bishop. e-mail: bishop@bcm.tmc.edu
In most mammals, male development is triggered by the transient
expression of the Y-chromosome gene, Sry, which initiates a cascade of
gene interactions ultimately leading to the formation of a testis from the
indifferent fetal gonad1-4. Several genes5-8, in particular Sox9, have a crucial
role in this pathway9-14. Despite this, the direct downstream targets of Sry
and the nature of the pathway itself remain to be clearly established15, 16. We
report here a new dominant insertional mutation, Odsex (Ods), in which XX
mice carrying a 150-kb deletion (approximately 1 Mb upstream of Sox9)
develop as sterile XX males lacking Sry. During embryogenesis, wild-type
XX fetal gonads downregulate Sox9 expression, whereas XY and XX Ods/+
fetal gonads upregulate and maintain its expression13, 14. We propose that
Ods has removed a long-range, gonad-specific regulatory element that
mediates the repression of Sox9 expression in XX fetal gonads. This
repression would normally be antagonized by Sry protein in XY embryos.
Our data are consistent with Sox9 being a direct downstream target of Sry
and provide genetic evidence to support a general repressor model of sex
determination in mammals17, 18.
We generated transgenic mice on the albino, inbred FVB
genetic background by microinjection of a Dct-tyrosinase (Tyr)
minigene construct to rescue the albinism19. The transgenic
founder male, OVE1220, had light pigmentation and
microphthalmia. When this founder was bred to FVB females,
only male progeny inherited the transgene and the eye
phenotype (Fig. 1a,b). Among the first 140 mice born, there
was an excess of male progeny: 102 (73%) were male and 38
(27%) were female. We found that 77 of the males were
pigmented with microphthalmia, indicating that they carried
the transgene, and 25 males were albino and non-transgenic,
as were all of the female progeny. This sex-ratio distortion
could be explained if a proportion of the transgenic males
were actually sex-reversed XX males. To investigate this
possibility, we typed transgenic males with ten sequence-tagged
site (STS) markers spanning the Y chromosome,
including primers specific for the genes RbmY1a1, Smcy,
Ube1y1, Zfy1, Usp9y and Sry located on Yp (ref. 20).
Transgenic males were either positive for all STS markers,
indicating the presence of an intact Y chromosome, or
negative for all the markers, indicating its complete absence
(Fig. 1c, and data not shown). These results indicated that the
transgene insertion was causing dominant microphthalmia
with XX sex reversal, and the mutation was designated Odsex
(Ods, ocular degeneration with sex reversal).
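The arithmetic behind the sex-ratio argument can be checked with a short sketch. The counts are those reported above; the exact two-sided binomial test against a 1:1 expectation is an illustrative choice of ours, not an analysis described by the authors, and the function name is hypothetical.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value (sum of outcomes no more likely than k)."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return sum(x for x in pmf if x <= threshold)


# Counts reported for the first 140 progeny of the founder male.
males, females = 102, 38
transgenic_males, albino_males = 77, 25
total = males + females

# Internal consistency: the two classes of males add up to all males.
assert transgenic_males + albino_males == males

print(f"male fraction: {males / total:.0%}")
print(f"p-value, males vs 1:1 sex ratio: {binomial_two_sided_p(males, total):.1e}")

# If the transgenic "males" include sex-reversed XX animals, the transgene itself
# should still be transmitted to about half of all progeny (illustrative check).
print(
    "p-value, transgenic vs 1:1 transmission: "
    f"{binomial_two_sided_p(transgenic_males, total):.2f}"
)
```

Under these assumptions the excess of males is far from a 1:1 ratio, whereas transmission of the transgene (77 of 140) is compatible with ordinary Mendelian inheritance, which is what the sex-reversal explanation predicts.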
Except in testes and eyes (unpublished data), no obvious
histological differences were found in the adult visceral
organs or newborn skeletons of XY, XY Ods/+ and sex-reversed
XX Ods/+ males compared with controls (Fig. 1d,
and data not shown). Adult XX Ods/+ testes were
approximately one-third the size of those of their XY or XY
Ods/+ littermates. Histological analysis showed the presence
of seminiferous tubules and Sertoli cells, but no evidence of
spermatogenesis (Fig. 1d). Sterility was expected in XX Ods/+
males because they lack fertility genes located on the Y
chromosome and the presence of two X chromosomes in the
germ line has been shown to be incompatible with the mitotic
proliferation of spermatogonia21, 22. XY Ods/+ males showed
normal testis histology and were fertile. As Sox9 is known to
have a role in chondrogenesis23, we examined skeletal
development in more than 30 XX and XX Ods/+ newborns.
No differences were found, indicating that Sox9 was correctly
regulated in this tissue (data not shown).
Restriction and Southern-blot analysis showed that a single
insertion of two copies of the transgene has occurred in
a typical head-to-tail pattern (Fig. 2a,b). The left end sequence
flanking the transgene showed approximately 90% identity
over 550 bp to a sequenced human BAC located on
chromosome 17. This allowed us to extend an extensive,
well-characterized human chromosome 17 contig24 by
approximately 250 kb, physically placing the homologous
human sequences approximately 1.3 Mb upstream of SOX9.
Using primers internal to the left end breakpoint (5sqendF/R),
we showed that the predicted 180-bp band is present in DNA
from normal and Ods/+ mice, but absent from Ods/Ods DNA,
indicating that the transgene has caused a deletion (Fig. 2c).
Preliminary data using pulsed-field gel analysis of a single,
200-kb BAC (RP23 455N13), which spans both transgenic
integration sites, indicate that the size of the deletion is
approximately 150 kb.
Fluorescence in situ hybridization (FISH) analysis revealed a
single integration site on distal chromosome 11, band E2 (Fig.
3a, top), near Sox9. Ods and Sox9 were resolved only in
interphase cells (Fig. 3a, bottom left and right), indicating that
Ods maps approximately 1–2 Mb from Sox9. Backcross
analysis placed Ods approximately 1 cM proximal to Sox9 on
chromosome 11 (Fig. 3b), consistent with the 1.3-Mb physical
distance determined in human.
The sex-reversal phenotype of XX Ods/+ mice and the
location of the Ods deletion upstream of Sox9 indicated that
the transgenic mice might have altered regulation of Sox9
expression during embryogenesis. XX Ods/+ gonads are
histologically indistinguishable from those of normal XY
littermates at 14.5 days post coitum (d.p.c.; Fig. 4a,d,g) and
11.5 d.p.c. (Fig. 4j,m,p). There was no evidence for ovotestis
at these stages. We did not detect Sox9 expression in normal
XX gonads at either 14.5 d.p.c. (Fig. 4e,f) or 11.5 d.p.c. (Fig.
4n,o), but we did detect Sox9 expression in XX Ods/+ gonads
at levels comparable to those found in their normal XY
littermates (Fig. 4h,i,q,r). Anti-Müllerian hormone (AMH) was
correctly expressed, suggesting that functional Sertoli cells
were present (data not shown). Abnormal expression of Sox9
was not seen in any other tissue at either embryonic stage.
Preliminary data indicate that both alleles of Sox9 are
expressed in newborn XX Ods/+ testes. Due to the presence
of a proposed auto-regulatory loop15, however, expression
data in the early embryonic gonad will be needed to assess
the significance of this point.
One explanation for the sex reversal is that the transgene has
deleted a novel gene acting between Sry and Sox9 in the
sex-determination pathway; however, we have found no evidence
for such a deleted gene in the draft sequence data of BAC
455N13, or in the homologous human sequence (unpublished
data). Another possibility is that the deletion has directly
altered the expression of Sox9 by some form of long-range
effect involving chromatin structure25. Although we cannot rule
out this possibility at present, the map position of the Ods
deletion upstream of Sox9 and the absence of detectable
skeletal malformations indicate that the deletion of a
gonad-specific regulator of Sox9 expression may underlie the
sex-reversal phenotype. We interpret our results in the context of
a repressor model of mammalian sex determination18. In our
model, Sox9 would induce Sertoli-cell differentiation and
represent a possible direct downstream target of Sry action
(Fig. 5). Normally, XX females would synthesize repressor
molecules that specifically extinguish Sox9 expression in the
XX fetal gonad, by binding to cis-acting regulatory elements
located upstream of the Sox9 coding sequences. These
regulatory elements are predicted to influence the activity of a
separate and distinct gonad-specific enhancer. In XY males,
Sry protein would interfere with the binding or activity of the
repressor molecule. Sox9 would therefore be expressed,
inducing Sertoli-cell differentiation and consequent male
development. In the Ods mutant, the transgene insertion has
deleted the regulatory element; thus the repressor complex
can no longer exert its long-range activity on the
gonad-specific enhancer. This would permit specific upregulation of
Sox9 in the gonad in the absence of Sry, causing dominant
sex-reversal in XX Ods/+ mice.
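The logic of the proposed repressor model can be written as a small Boolean sketch. This is only a toy restatement of the model described above: the class and function names are hypothetical, and the rule that expression from a single derepressed Sox9 allele is enough for testis formation is our simplifying assumption, chosen to reflect the dominance of Ods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sox9Allele:
    """One copy of the Sox9 region; Ods deletes the upstream regulatory element."""

    has_repressor_element: bool = True


def allele_expressed(allele: Sox9Allele, sry_present: bool) -> bool:
    # The repressor acts only if its binding element exists and Sry is absent.
    repressor_active = allele.has_repressor_element and not sry_present
    return not repressor_active


def gonad_fate(alleles: tuple, sry_present: bool) -> str:
    # Assumption: one expressed Sox9 allele suffices to induce Sertoli cells.
    expressed = any(allele_expressed(a, sry_present) for a in alleles)
    return "testis" if expressed else "ovary"


WILD_TYPE = Sox9Allele(has_repressor_element=True)
ODS = Sox9Allele(has_repressor_element=False)

for label, alleles, sry in [
    ("XX +/+", (WILD_TYPE, WILD_TYPE), False),
    ("XY +/+", (WILD_TYPE, WILD_TYPE), True),
    ("XX Ods/+", (ODS, WILD_TYPE), False),
]:
    print(f"{label}: {gonad_fate(alleles, sry)}")
```

The sketch reproduces the three cases discussed in the text: repression in wild-type XX gonads, relief of repression by Sry in XY gonads, and Sry-independent expression when the regulatory element is deleted.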
Ods represents the first example of XX (Sry-negative) sex reversal
reported in the mouse. These data, although derived from a
single mutant, provide strong support for a general repressor
system of sex determination in mammals. They identify Sox9
as a possible direct downstream target of Sry action and
suggest that specific gene expression is mediated by
repressor binding to long-range, gonad-specific regulatory
sequences.
Methods
Transgenic mice. We generated transgenic founder mice by
microinjection of FVB fertilized eggs with a 3-kb tyrosinase
minigene construct (Dct-Tyr) containing a mouse tyrosinase
(Tyr) cDNA under the control of a tyrosinase-related protein-2
(Dct) promoter26. We identified seven pigmented founders.
One male (a founder for family OVE1220) had
microphthalmia. This male and his transgenic progeny were
mated to FVB females and the progeny analysed for
inheritance of the transgene by eye phenotype, coat
pigmentation and PCR using primers specific for the
transgenic construct (TyEx1, 5'–
CTGTCCAGTGCACCATCTGGACC–3', and TyEx2, 5'–
GATTACGTAATAGTGGTCCCTCAG–3'). In more than 1,000
FVB mice tested so far, the pigmented mice have all been
males with microphthalmia.
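As a basic sanity check on the genotyping primers quoted above, the following sketch computes their length, GC fraction and reverse complement. The sequences are copied from the text; the helper names are hypothetical, and nothing here reflects how the authors designed or validated the primers.

```python
# Primer sequences as given in the Methods (5' to 3').
PRIMERS = {
    "TyEx1": "CTGTCCAGTGCACCATCTGGACC",
    "TyEx2": "GATTACGTAATAGTGGTCCCTCAG",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement, i.e. the opposite strand read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    # Reject anything other than unambiguous DNA bases (e.g. copy-paste debris).
    assert set(seq) <= set("ACGT"), f"{name} contains non-DNA characters"
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
          f"reverse complement {reverse_complement(seq)}")
```

A check of this kind is mainly useful for catching transcription errors when primer sequences are copied out of a typeset article.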
Cloning the transgene insertion site. A phage library was
constructed from XY Ods/+ DNA cloned into phage EMBL3
(Promega). We screened the library by hybridization with the
Dct-Tyr minigene construct. Plaques representing
endogenous Tyr were distinguished from the minigene by
interexonic PCR using primers TyEx1/TyEx2. A restriction
map of the insert was then generated according to standard
procedures and the mouse DNA flanking the transgene
insertion was sequenced. A single BAC (455N13) spanning
the transgene integration sites was identified in the reference
RPCI-23 mouse library by screening with probes flanking the
insertion breakpoints.
Generation of Ods/Ods mice. In more than 1,000 mice
maintained on the FVB background, no transgenic females
have been observed. When an FVB XY Ods/+ male was
crossed to a normal female of the outbred ICR strain,
however, approximately 10% of the XX Ods/+ F1 progeny
developed as adult females (with microphthalmia). The other
90% developed as typical XX Ods/+ males. XX Ods/+ F1
females were backcrossed to XY Ods/+ FVB males and found
to be fertile, giving normal litter sizes. In this way,
homozygous XX Ods/Ods and XY Ods/Ods mice were
produced. All of the Ods/Ods mice produced so far have been
males. This phenomenon shows that genetic background is
critical to Ods sex reversal and is currently being investigated
further.
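The expected genotype classes from the XX Ods/+ female by XY Ods/+ male backcross can be enumerated with a short sketch. It assumes ordinary Mendelian segregation, equal viability of all classes, and independent assortment of the chromosome 11 Ods locus and the sex chromosomes; the function and variable names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(dam_gametes, sire_gametes):
    """Enumerate equally likely gamete pairings and return genotype frequencies."""
    counts = Counter()
    for (dam_sex, dam_allele), (sire_sex, sire_allele) in product(dam_gametes, sire_gametes):
        karyotype = "".join(sorted(dam_sex + sire_sex))
        # Write the Ods allele first so that "Ods/+" has one canonical spelling.
        alleles = sorted([dam_allele, sire_allele], key=lambda a: a != "Ods")
        counts[(karyotype, "/".join(alleles))] += 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Gametes of an XX Ods/+ female and an XY Ods/+ male.
dam = [("X", "Ods"), ("X", "+")]
sire = [("X", "Ods"), ("X", "+"), ("Y", "Ods"), ("Y", "+")]

offspring = cross(dam, sire)
for (karyotype, genotype), freq in sorted(offspring.items()):
    print(f"{karyotype} {genotype}: {freq}")
```

Under these assumptions one quarter of the progeny are Ods/Ods, split evenly between XX and XY, which is the class the cross was designed to produce.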
Mapping of Sox9 and Ods. We typed DNA from the Jackson
Laboratory interspecific backcross panel BSS (ref. 27) for
Sox9 using primers Sox9pA (5'–TTCACCATCCCAGCCAAG–
3') and Sox9pB (5'–CCAGTCGGCCAGGTAATC–3'), located
20 kb proximal to Sox9 (ref. 28). The 374-bp amplified product
was digested with BanII yielding the following sizes: C57BL/6
(B6), 20 bp, 154 bp and 200 bp; Mus spretus, 20 bp and 354
bp. The Ods flanking sequences were mapped using the
SQA1F (5'–ATTCCAGCCTTCACTGCTTC–3') and SQA2R
(5'–GGGGCTGGATAAGAACATT–3') primers, which
generate a 500-bp product. After restriction with HaeII, the M.
spretus allele remains uncut, whereas the C57BL/6 (B6) allele
is cleaved to yield 200-bp and 300-bp fragments. All products
were separated on a 3.5% agarose gel.
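The two restriction-digest assays described above can be summarized in a short sketch that checks the fragment sizes against the undigested product and calls which strain alleles are present from a set of observed bands. Fragment sizes are taken from the text; the data layout and function names are hypothetical, and scoring only strain-specific bands is our simplification.

```python
# Expected digest fragments (bp) for each assay and parental strain.
ASSAYS = {
    "Sox9/BanII": {"product": 374, "B6": (20, 154, 200), "spretus": (20, 354)},
    "Ods/HaeII": {"product": 500, "B6": (200, 300), "spretus": (500,)},
}


def fragments_consistent(assay: dict) -> bool:
    """Digest fragments of each allele must add up to the undigested product."""
    return all(sum(assay[strain]) == assay["product"] for strain in ("B6", "spretus"))


def call_alleles(assay: dict, observed_bands) -> list:
    """Return the strains whose diagnostic (strain-specific) bands are all observed."""
    observed = set(observed_bands)
    present = []
    for strain, other in (("B6", "spretus"), ("spretus", "B6")):
        diagnostic = set(assay[strain]) - set(assay[other])
        if diagnostic and diagnostic <= observed:
            present.append(strain)
    return present


for name, assay in ASSAYS.items():
    print(name, "fragment sizes consistent:", fragments_consistent(assay))

# Example: a heterozygous animal shows the diagnostic bands of both alleles.
print(call_alleles(ASSAYS["Sox9/BanII"], [154, 200, 354]))
```

Only strain-specific bands are scored, so the shared 20-bp BanII fragment, which would be hard to see on an agarose gel, does not affect the call.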
FISH. We prepared chromosome spreads according to
standard protocols. Digoxigenin-labelled (Boehringer) Tyr
cDNA was used to probe G-banded XX Ods/+ metaphase
spreads. Similarly, digoxigenin-labelled phage DNA spanning
Sox9 and biotin-labelled Ods BAC 455N13 DNA were used to
probe wild-type female mouse metaphase spreads.
Digoxigenin-labelled DNA was detected with an FITC-conjugated
anti-digoxigenin antibody (Boehringer) and biotin-labelled
DNA with Cy3-conjugated streptavidin (Amersham,
Pharmacia). After counterstaining (0.2 µg/ml DAPI or 0.2
µg/ml propidium iodide), images were captured using a
PowerGene probe analysis system (Perceptive Scientific
Instruments).
RNA in situ hybridization. Timed matings of FVB females
with XY Ods/+ FVB males were used to generate fetuses at
11.5 and 14.5 d.p.c. Fetuses were dissected from the uterus,
a portion of the head was taken for DNA extraction and
genotyping, and the body was fixed in 4% paraformaldehyde
in PBS. The presence of the transgene was indicated by eye
pigmentation and confirmed using Dct-Tyr minigene primers
TyEx1/TyEx2. The presence of the Y chromosome was
assessed using Y-specific primers 207F (5'–
TGTAGACAGTCTTTCTGT–3') and 207R (5'–
CACAGGCTCTCCTGATTT–3'; ref. 20). Fetuses were
embedded in paraffin and serially sectioned (8 µm). We
stained a subset of sections from each fetus with
haematoxylin and eosin according to standard protocols.
[35S]-UTP-labelled riboprobes were transcribed from a 255-bp
fragment of Sox9 as described29. Sections were exposed to
emulsion for 1 week, developed and counterstained with
haematoxylin.
Accession numbers. Draft sequence data for mouse BAC
455N13, AC069019; human BAC containing homologous
sequence data, 005771.
Received 7 June 2000; Accepted 11 September 2000.
REFERENCES
1. Sinclair, A.H. et al. A gene from the human sex-determining
region encodes a protein with homology to a
conserved DNA-binding motif. Nature 346, 240-244
(1990).
2. Gubbay, J. et al. A gene mapping to the sex-determining
region of the mouse Y chromosome is a member of a
novel family of embryonically expressed genes. Nature
346, 245-250 (1990).
3. Koopman, P., Gubbay, J., Vivian, N., Goodfellow, P. &
Lovell-Badge, R. Male development of chromosomally
female mice transgenic for Sry. Nature 351, 117-121
(1991).
4. Berta, P. et al. Genetic evidence equating SRY and the
testis-determining factor. Nature 348, 448-450
(1990).
5. Foster, J.W. et al. Evolution of sex determination and the
Y chromosome: SRY-related sequences in marsupials.
Nature 359, 531-533 (1992).
6. Moniot, B., Berta, P., Scherer, G., Sudbeck, P. & Poulat,
F. Male specific expression suggests role of DMRT1 in
human sex determination. Mech. Dev. 91, 323-325
(2000).
7. De Grandi, A. et al. The expression pattern of a mouse
doublesex-related gene is consistent with a role in
gonadal differentiation. Mech. Dev. 90, 323-326
(2000).
8. Raymond, C.S. et al. Evidence for evolutionary
conservation of sex-determining genes. Nature 391,
691-695 (1998).
9. Foster, J.W. et al. Campomelic dysplasia and autosomal
sex reversal caused by mutations in an SRY-related gene.
Nature 372, 525-530 (1994).
10. Wagner, T. et al. Autosomal sex reversal and campomelic
dysplasia are caused by mutations in and around the
SRY-related gene SOX9. Cell 79, 1111-1120
(1994).
11. Mansour, S., Hall, C.M., Pembrey, M.E. & Young, I.D. A
clinical and genetic study of campomelic dysplasia. J.
Med. Genet. 32, 415-420 (1995).
12. Huang, B., Wang, S., Ning, Y., Lamb, A. & Bartley, J.
Autosomal XX sex reversal caused by duplication of
SOX9. Am. J. Med. Genet. 87, 349-353
(1999).
13. Kent, J., Wheatley, S.C., Andrews, J.E., Sinclair, A.H. &
Koopman, P. A male-specific role for SOX9 in vertebrate
sex determination. Development 122, 2813-2822
(1996).
14. Morais da Silva, S. et al. Sox9 expression during gonadal
development implies a conserved role for the gene in
testis differentiation in mammals and birds. Nature
Genet. 14, 62-68 (1996).
15. Swain, A. & Lovell-Badge, R. Mammalian sex
determination: a molecular drama. Genes Dev. 13,
755-767 (1999).
16. Capel, B. Sex in the 90s: SRY and the switch to the male
pathway. Annu. Rev. Physiol. 60, 497-523
(1998).
17. Graves, J.A. Interactions between SRY and SOX genes in
mammalian sex determination. Bioessays 20, 264-269
(1998).
18. McElreavey, K., Vilain, E., Abbas, N., Herskowitz, I. &
Fellous, M. A regulatory cascade hypothesis for
mammalian sex determination: SRY represses a negative
regulator of male development. Proc. Natl Acad. Sci. USA
90, 3368-3372 (1993).
19. Yokoyama, T. et al. Conserved cysteine to serine
mutation in tyrosinase is responsible for the classical
albino mutation in laboratory mice. Nucleic Acids Res. 18,
7293-7298 (1990).
20. Bishop, C.E. & Mitchell, M.J. Encyclopedia of the mouse
genome VII. Mouse chromosome Y. Mamm. Genome 8,
S378-381 (1998).
21. Burgoyne, P.S. The mammalian Y chromosome: a new
perspective. Bioessays 20, 363-366
(1998).
22. Sutcliffe, M.J. & Burgoyne, P.S. Analysis of the testes of
H-Y negative XOSxrb mice suggests that the
spermatogenesis gene (Spy) acts during the
differentiation of the A spermatogonia. Development 107,
373-380 (1989).
23. Bi, W., Deng, J.M., Zhang, Z., Behringer, R.R. & de
Crombrugghe, B. Sox9 is required for cartilage formation.
Nature Genet. 22, 85-89 (1999).
24. Pfeifer, D. et al. Campomelic dysplasia translocation
breakpoints are scattered over 1 Mb proximal to SOX9:
evidence for an extended control region. Am. J. Hum.
Genet. 65, 111-124 (1999).
25. Capel, B. et al. Deletion of Y chromosome sequences
located outside the testis determining region can cause
XY female sex reversal. Nature Genet. 5, 301-307
(1993).
26. Zhao, S. & Overbeek, P.A. Tyrosinase-related protein 2
promoter targets transgene expression to ocular and
neural crest-derived tissues. Dev. Biol. 216, 154-163
(1999).
27. Rowe, L.B. et al. Maps from two interspecific backcross
DNA panels available as a community genetic mapping
resource. Mamm. Genome 5, 253-274 (1994).
28. Hustert, E., Scherer, G., Olowson, M., Guenet, J.L. &
Balling, R. Rbt (Rabo torcido), a new mouse skeletal
mutation involved in anteroposterior patterning of the
axial skeleton, maps close to the Ts (tail-short) locus and
distal to the Sox9 locus on chromosome 11. Mamm.
Genome 7, 881-885 (1996).
29. Zhao, Q., Eberspaecher, H., Lefebvre, V. & De
Crombrugghe, B. Parallel expression of Sox9 and Col2a1
in cells undergoing chondrogenesis. Dev. Dyn. 209,
377-386 (1997).
ACKNOWLEDGMENTS
We thank G. Schuster for microinjections; L. Vien for
animal husbandry and PCR assays; B. de
Crombrugghe for the Sox9 in situ hybridization probe;
H. Boettger-Tong for advice on several aspects of the
work; and B. Capel and P. Koopman for discussion on
the manuscript. Supported by grants from the
National Institutes of Health (to C.E.B., R.R.B. and
P.A.O.).
Figure 1: Phenotype of Ods-mutant mice.
a, A four-week-old wild-type FVB XX female (left) and an XX Ods/+ male (right) are
shown. The Ods mouse has pigmented ears, light coat pigmentation, pigmented
eyes with cataracts and male external genitalia. b, Pedigree chart of the OVE1220
transgenic mouse line. All of the transgenic mice showed co-inheritance of the
Dct-Tyr minigene, coat colour, microphthalmia and sexual phenotype. Squares, males;
circles, females; diamonds, died at birth; filled symbols, transgenic mice. Mice in
the second generation were tested by PCR for the presence (XY) or absence (XX)
of a Y chromosome. c, PCR genotyping for Sry, RbmY1a1 and the Dct-Tyr (Tyr)
minigene. Results for six pigmented, microphthalmic males (1–6), along with
wild-type FVB male (M) and female (F) controls, are shown. The pigmented,
microphthalmic males were either positive for all Y-chromosome markers (2–5) or
negative for all Y markers (1, 6). All were positive for the transgene. d, Histology of
6-week adult testes of an XY Ods/+ male (left) and an XX Ods/+ male (right). The
XY Ods/+ testis is histologically normal, showing all stages of spermatogenesis,
whereas the XX Ods/+ testis contains Sertoli cells, but is devoid of germ cells.
Vacuolated Sertoli cells are present in many tubules.
Figure 2: Ods has created a deletion upstream of Sox9.
a, Map of the transgenic insertion site on mouse chromosome 11. Restriction sites
for BamHI (B) and SalI (S) (from the vector) are indicated. Two copies of the
Dct-Tyr minigene were found to have integrated in a head-to-tail alignment. Hatching
shows the region and orientation of flanking DNA homologous to human
chromosome 17 BAC. The human chromosome 17 BAC contig upstream of SOX9
is also shown. b, Ods contains a single transgene insertion. Southern-blot analysis
of EcoRI-restricted DNA from wild-type XY males, XX Ods/+ and XY Ods/+ males
probed with a 700-bp Tyr probe. In addition to the expected 5.5-kb endogenous Tyr
locus, a single 17-kb hybridizing band was detected, consistent with the transgene
having inserted at a single locus. c, PCR analysis of homozygous XX and XY
Ods/Ods DNA. The 180-bp product is present in normal and heterozygous DNA,
but is absent from Ods/Ods mice, indicating that the transgene insertion has
created a deletion.
Figure 3: Mapping the Ods insertion site.
a, FISH mapping shows that the Dct-Tyr transgene has integrated on distal chromosome 11, band
E2 (top left and right). Probes for Ods (red) and Sox9 (green) were not resolved on metaphase
chromosome spreads (bottom left), but can be resolved in interphase nuclei (bottom right),
indicating that they map 1–2 Mb apart. b, Interspecific backcross mapping places Ods and Sox9 1
cM apart, at 69 cM on mouse chromosome 11 (full data can be obtained at
http://www.informatics.jax.org).
Figure 4: Histological analysis and Sox9 in situ hybridizations of fetal
gonads.
a, Normal testicular development in a wild-type XY embryo at 14.5 d.p.c. showing
seminiferous cords (arrow) and the testis-specific blood vessel (arrowhead). b,c,
Sox9 transcripts are localized to Sertoli cells (arrow). d, Normal ovarian
development in a wild-type XX embryo at 14.5 d.p.c. e,f, Sox9 is not expressed by
the developing ovary. g, Testicular development in an XX Ods/+ embryo at 14.5
d.p.c. Seminiferous cords (arrow) and the testis-specific blood vessel (arrowhead)
are present. h,i, Sox9 is expressed by the Sertoli cells (arrow) of the XX Ods/+
gonad. j, The wild-type XY male gonad (g) at 11.5 d.p.c. is at the indifferent stage
(m, mesonephros). k,l, Strong Sox9 expression is localized to the wild-type XY
male gonad. m, Wild-type XX female indifferent gonad at 11.5 d.p.c. n,o, Sox9
transcripts are not detectable above background in the indifferent female gonad. p,
The gonad of an XX Ods/+ embryo at 11.5 d.p.c. is similarly morphologically
indifferent. q,r, Sox9 is expressed in the XX Ods/+ 11.5 d.p.c. gonad. Scale bar,
100 µm. All figures are of the same scale.
Figure 5: A double-repressor model of mammalian sex determination.
At 10.5 d.p.c., Sox9 is expressed in the genital ridges of both male and female
embryos. This expression is mediated by a genital ridge-specific enhancer located
upstream or downstream of Sox9 (not shown). In wild-type XX gonads at 11.5
d.p.c. (left), Sox9 expression is downregulated by the binding of a repressor or
repressor complex to gonad-specific regulatory elements (filled box) located 1.3
Mb upstream of Sox9. In wild-type XY gonads at 11.5 d.p.c. (middle), this repressor
binding is predicted to be antagonized by Sry protein, leading to upregulation of
Sox9 expression, followed by Sertoli-cell differentiation and testis formation. In the
XX Ods/+ gonads at 11.5 d.p.c. (right), Sox9 cannot be repressed, as the gonad-specific elements for repressor binding have been deleted by the transgene
insertion. As a result, Sox9 is expressed at sufficient levels to induce testis
formation and male development.
Published online: 20 August 2001, doi:10.1038/ng711
volume 29 no. 1 pp 20 - 21
Human mtDNA and Y-chromosome
variation is correlated with matrilocal
versus patrilocal residence
Hiroki Oota1, 2, Wannapa Settheetham-Ishida3, Danai Tiwawech4, Takafumi
Ishida5 & Mark Stoneking1
1. Max Planck Institute for Evolutionary Anthropology, Inselstrasse 22, D-04103 Leipzig,
Germany.
2. Present address: Department of Genetics, Yale University School of Medicine, New Haven,
Connecticut, USA.
3. Khon Kaen University, Khon Kaen, Thailand.
4. National Cancer Institute, Bangkok, Thailand.
5. Department of Biological Sciences, School of Science, University of Tokyo, Tokyo, Japan.
Correspondence should be addressed to M Stoneking. e-mail: stoneking@eva.mpg.de
Genetic differences among human populations are usually larger for the Y
chromosome than for mtDNA1-3. One possible explanation is the higher rate
of female versus male migration due to the widespread phenomenon of
patrilocality, in which the woman moves to her mate's residence after
marriage. To test this hypothesis, we compare mtDNA and Y-chromosome
variation in three matrilocal (in which the man moves to his mate's
residence after marriage) and three patrilocal groups among the hill tribes
of northern Thailand. Genetic diversity in these groups shows a striking
correlation with residence pattern, supporting the role of sex-specific
migration in influencing human genetic variation.
Patrilocality, in which men stay in their birthplace and women move, occurs in
about 70% of human societies4. Patrilocality has been invoked to explain the
usual patterns observed in human populations: high mtDNA and low Y-chromosome diversity within groups, large between-group differences for the Y
chromosome, and small between-group differences for mtDNA. If patrilocality is
responsible for these patterns, then matrilocal groups (in which the women stay in
their birthplace and the men move) might show the opposite patterns: high Y-chromosome and low mtDNA diversity within groups, large between-group
differences for mtDNA, and small between-group differences for the Y
chromosome. Here we compare Y-chromosome and mtDNA diversity in three
matrilocal groups (Lahu, Red Karen and White Karen; the two Karen groups were
sampled from multiple villages that were 5–25 km apart) and three patrilocal
groups (Akha and two groups of Lisu, one sampled near Chiang Rai and one
sampled about 220 km away, near Mae Hong Son) from northern Thailand.
We obtained blood samples between 1996 and 1998, with
informed consent. From these we prepared transformed cell
lines from which we subsequently extracted DNA5. We
analyzed 360 bp of the first hypervariable segment (HV1) of
the mtDNA control region, corresponding to positions 16024–
16383 (ref. 6), using standard methods (Oota, unpublished
data). To assess Y-chromosome variation, we carried out
multiplex typing of nine short tandem repeat (STR) loci
(DYS385a/b, DYS389I, DYS389II, DYS390, DYS391,
DYS392, DYS393 and DYS394) as previously described
(http://www.ystr.org/usa). The HV1 sequences have been
submitted to the HVRbase database7 and are also available
from the authors, as are the Y-STR haplotypes. We estimated
haplotype diversity within groups8 for the HV1 sequences and
the Y-STR haplotypes, and calculated the genetic distances
between groups based on dA values8 for the HV1 sequences
and Rst values9 for the Y-STR haplotypes.
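The within-group haplotype diversity referred to above can be sketched in a few lines. This is a minimal illustration that assumes the standard unbiased estimator from Nei (ref. 8), h = n/(n-1) × (1 - Σp_i²); the function name and the example haplotypes are hypothetical and are not data from this study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity: h = n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical Y-STR haplotypes (repeat counts at four loci), for illustration only.
sample = [
    (14, 13, 29, 24),
    (14, 13, 29, 24),
    (15, 12, 28, 23),
    (16, 13, 30, 25),
]
print(round(haplotype_diversity(sample), 3))
```

The estimator is 0 when every sampled individual carries the same haplotype and 1 when every haplotype in the sample is unique, which is why values close to 1 indicate high within-group diversity.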
The haplotype diversity for mtDNA was higher in all of the
patrilocal groups than in any of the matrilocal groups (Fig. 1).
The mean mtDNA diversity in the patrilocal groups was 0.937,
which is significantly greater than the mean mtDNA diversity
of 0.860 in the matrilocal groups (Mann-Whitney U-test,
P<0.05). Conversely, the Y-STR haplotype diversity was
higher in all of the matrilocal groups than in any of the
patrilocal groups (Fig. 1). The mean Y-STR diversity was
0.965 in the matrilocal groups, significantly greater than the
mean Y-STR diversity of 0.863 in the patrilocal groups (Mann-Whitney U-test, P<0.05). In addition, the average genetic
distance based on mtDNA HV1 sequences was significantly
higher among the matrilocal groups than among the patrilocal
groups, while the average genetic distance based on Y-STR
haplotypes was significantly higher among the patrilocal
groups than among the matrilocal groups (Table 1).
Genetic variation in the north Thailand hill tribes thus shows a
striking correlation with residence pattern. Matrilocal groups
have high within-group diversity for the Y chromosome and
large between-group distances for mtDNA, whereas patrilocal
groups have high within-group diversity for mtDNA and large
between-group distances for the Y chromosome. All of the
groups studied come from the same geographic region, speak
related Sino-Tibetan languages and practice similar
subsistence modes of agriculture. These shared attributes
make it unlikely that the correlation is accounted for by some
other factor that differs between the matrilocal and patrilocal
groups, such as reproductive success. Moreover, although
our results do not rule out a role for variance in male
reproductive success in influencing patterns of genetic
variation, theoretical considerations suggest at best a minor
effect of such variance1. We conclude that patrilocality does
appear to be primarily responsible for the higher between-population genetic differences consistently observed for the Y
chromosome as opposed to mtDNA or autosomal loci. Our
results also provide evidence for the importance of social
structure in influencing human genetic diversity10-12.
Received 1 May 2001; Accepted 25 July 2001; Published online
20 August 2001.
REFERENCES
1. Seielstad, M.T., Minch, E. & Cavalli-Sforza, L.L. Nature
Genet. 20, 278-280 (1998).
2. Perez-Lezaun, A. et al. Am. J. Hum. Genet. 65, 208-219
(1999).
3. Jorde, L.B. et al. Am. J. Hum. Genet. 66, 979-988
(2000).
4. Burton, M.L., Moore, C.C., Whiting, J.W.M. & Romney,
A.K. Curr. Anthropol. 37, 87-123 (1996).
5. Kimura, M. et al. Hum. Biol. 70, 993-1000 (1998).
6. Anderson, S. et al. Nature 290, 457-465 (1981).
7. Burckhardt, F., von Haeseler, A. & Meyer, S. Nucleic Acids
Res. 27, 138-142 (1999).
8. Nei, M. Molecular Evolutionary Genetics (Columbia Univ.
Press, New York, 1987).
9. Slatkin, M. Genetics 139, 457-462 (1995).
10. Salem, A.H., Badr, F.M., Gaballah, M.F. & Pääbo, S. Am.
J. Hum. Genet. 59, 741-743 (1996).
11. Bamshad, M.J. et al. Nature 395, 651-652 (1998).
12. Bamshad, M. et al. Genome Res. 11, 994-1004 (2001).
ACKNOWLEDGMENTS
We thank S. Brauer and H. Schädlich for technical
assistance, and B. Pakendorf, R. Cordaux, M. Kayser,
I. Nasidze, S. Pääbo and A. Ryan for helpful
discussion. H.O. was supported by a fellowship from
the Japanese Society for the Promotion of Science
(JSPS). Sample collection was supported by funds
from the JSPS and the Ministry of Education, Science,
and Culture, Japan; research was supported by funds
from the Max Planck Society, Germany.
Figure 1: Diversity in mtDNA (red) and Y-STR (green) haplotypes in three
matrilocal and three patrilocal groups from northern Thailand.
From left to right, the matrilocal groups (mtDNA sample size, Y-STR sample size)
are Lahu (39, 17), Red Karen (39, 30), and White Karen (40, 20); the patrilocal
groups are Akha (91, 21), Lisu from near Chiang Rai (53, 9), and Lisu from near
Mae Hong Son (42, 22). The fourth (shaded) bar in each group indicates the mean
diversity (and standard error) for the group.
Table 1: Average genetic distances and standard errors based on mtDNA
HV1 sequences and Y-STR haplotypes among matrilocal and patrilocal
groups
doi:10.1038/7771
volume 21 no. 4 pp 429 - 433
There is an erratum (June 1999) associated with this Letter.
Retroposition of autosomal mRNA
yielded testis-specific gene family on
human Y chromosome
Bruce T Lahn & David C Page
Howard Hughes Medical Institute, Whitehead Institute and Department of Biology, Massachusetts
Institute of Technology, 9 Cambridge Center, Cambridge, Massachusetts 02142, USA.
Correspondence should be addressed to D C Page. e-mail: dcpage@wi.mit.edu
Most genes in the human NRY (non-recombining portion of the Y
chromosome) can be assigned to one of two groups: X-homologous genes
or testis-specific gene families with no obvious X-chromosomal
homologues1, 2. The CDY genes have been localized to the human Y
chromosome1, and we report here that they are derivatives of a
conventional single-copy gene, CDYL (CDY-like), located on human
chromosome 6 and mouse chromosome 13. CDY genes retain CDYL exonic
sequences but lack its introns. In mice, whose evolutionary lineage
diverged before the appearance of the Y-linked derivatives, the autosomal
Cdyl gene produces two transcripts; one is expressed ubiquitously and the
other is expressed in testes only. In humans, autosomal CDYL produces
only the ubiquitous transcript; the testis-specific transcript is the province
of the Y-borne CDY genes. Our data indicate that CDY genes arose during
primate evolution by retroposition of a CDYL mRNA and amplification of the
retroposed gene. Retroposition contributed to the gene content of the
human Y chromosome, together with two other molecular evolutionary
processes: persistence of a subset of genes shared with the X
chromosome3, 4 and transposition of genomic DNA harbouring intact
transcription units5.
We had previously identified a single full-length cDNA clone from CDY but had
mapped homologous sequences to two different locations on the human Y
chromosome1. We explored the possibility that there might be multiple functional
CDY genes by isolating 25 additional cDNA clones from a human testis library.
Sequence analysis of the 25 clones revealed two distinct species, which we
designated CDY1 (17 clones) and CDY2 (8 clones). The sequence of CDY1 was
as previously reported1. Like CDY1, CDY2 appears to encode a protein with a
combination of chromatin-binding and catalytic domains (Fig. 1a). The predicted
coding regions of CDY1 and CDY2 were 99% identical in nucleotide sequence,
and the amino acid sequences of the predicted proteins were 98% identical. We
observed an alternative 3´ region in 4 of 17 CDY1 cDNA clones. Most of the
putative protein encoded by this minor CDY1 transcript is identical to that
encoded by the major transcript; its carboxy terminus is divergent (Fig. 1a,b).
We had previously localized CDY-homologous sequences to
Y chromosome deletion intervals 5L and 6F (ref. 1), but it was
not apparent which of these map locations corresponded to
which gene. Exploiting sequence differences between CDY1
and CDY2, we designed PCR assays specific to each of the
two genes. Using genomic DNAs from individuals carrying
partial Y chromosomes as templates6, we mapped CDY1 to
interval 6F, and CDY2 to interval 5L (data not shown). We
conclude that at least two distinct functional CDY genes exist
on the human Y chromosome, and that they encode protein
isoforms. We cannot exclude the possibility that there are
multiple CDY1 genes in interval 6F, multiple CDY2 genes in
interval 5L, or additional CDY species that we failed to
identify.
To determine the intron/exon structures of CDY1 and CDY2,
we partially sequenced genomic BAC clones corresponding to
each of the genes. CDY1 and CDY2 genomic sequences
were found to be collinear with the major CDY1 and CDY2
transcripts as captured in cDNA clones, with no evidence that
any intronic sequences had been excised. In the case of the
minor CDY1 transcript, it appeared that a single intron had
been excised; the splice donor site was located within, and
near the 3´ end of, the single exon of the major transcript (Fig.
1b). Further evidence of the atypical nature of CDY genes
arose from examining their polyadenylation sites. For 13 of 17
CDY1 and 5 of 8 CDY2 cDNA clones, the poly(A) tract was
located immediately 3´ of the predicted stop codon (Fig. 1b).
CDY1 and CDY2 may be the only mammalian nuclear genes
in which polyadenylation is known to occur so close to the
stop codon.
To address the evolutionary origin of CDY on the human Y
chromosome, we sought homologues that might exist
elsewhere in the genomes of humans or mice. We identified
an autosomal homologue in both humans and mice by
incubating a CDY-derived probe, at low stringency, with testis
cDNA libraries from the two species. We will refer to the
human and mouse autosomal homologues as CDYL and
Cdyl, respectively. We determined the sequence of the
human CDYL transcript by merging the sequences of four
incomplete but substantially overlapping cDNA clones; we
found no sequence differences among the four cDNA clones
in regions of overlap. Similarly, we derived the sequence of
mouse Cdyl by merging the sequences of ten incomplete but
overlapping cDNA clones. In both species, the autosomal
transcript, similar to human CDY, appears to encode a protein
with an amino-terminal chromo domain and a C-terminal
catalytic domain (Fig. 2). The predicted mouse and human
CDYL proteins have 93% amino acid identity overall, with
even greater similarity evident in the chromo and catalytic
domains (Fig. 2). In contrast, the predicted human CDYL and
CDY proteins have only 63% amino acid identity; again,
greater similarity is evident in the chromo and catalytic
domains (Fig. 2). These results suggest that human CDYL
and mouse Cdyl are orthologues, and human CDYL and CDY
are paralogues. They also suggest that evolutionary
constraints operating on the chromo and catalytic domains of
the CDYL and CDY proteins account for the relatively high
amino acid identities observed there.
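The percentage identities quoted above can be illustrated with a short sketch. It assumes two sequences that have already been aligned, and it assumes that gapped columns are excluded from the comparison (conventions differ, and the text does not state which was used); the function name and the example peptides are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over aligned, non-gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned peptide fragments, for illustration only.
seq_a = "MKTAYIAK-QR"
seq_b = "MKSAYIAKLQR"
print(percent_identity(seq_a, seq_b))
```

Computing the same statistic separately over the chromo domain, the catalytic domain and the remainder of the alignment is what allows the domain-level comparison described in the text.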
Chromosomal mapping of the human CDYL and mouse Cdyl
genes indicates that they are indeed orthologues. In humans,
we localized CDYL to the distal short arm of chromosome 6
(distal to marker NIB1876 and proximal to WI-4489) by
radiation hybrid mapping. In mice, we localized Cdyl to the
proximal portion of chromosome 13 (distal to Fim1 and
proximal to D13Mit18) by meiotic linkage mapping. These
portions of human chromosome 6 and mouse chromosome
13 were previously shown to be syntenic7. We conclude that a
gene much like human CDYL and mouse Cdyl existed in the
common ancestors of mice and humans.
Our experiments to this point left another evolutionary
question unanswered: which came first, CDY or CDYL? The
first indication came from studies in mice, in which we were
unable to identify a Y-linked homologue of human CDY by
either Southern-blot analysis (Fig. 3a) or screening of testis
cDNA libraries. These findings are consistent with CDYL
having given rise to CDY sometime after divergence of the
mouse and human lineages.
Additional evidence that CDYL gave rise to CDY came from
analysis of the human CDYL and mouse Cdyl gene
structures. We found that each of these autosomal genes
contains eight introns (Fig. 2). As one would expect for
orthologues, most introns were positioned at corresponding
sites in human CDYL and mouse Cdyl. The presence of
numerous introns in the CDYL/Cdyl orthologues, contrasted
with the paucity of introns in the human CDY genes, led us to
consider whether CDY had arisen by reverse transcription of
a spliced CDYL mRNA, with subsequent integration into the Y
chromosome.
Closer examination of the sequence data corroborated this
retroposition model. Excluding a few hundred nucleotides at
the 5´ terminus, the nucleotide sequence of the human CDY1
(or CDY2) genomic locus is essentially collinear with that of
the mature human CDYL (or mouse Cdyl) transcripts. Even
the single intron of the minor CDY1 transcript is collinear with,
and homologous to, CDYL exon 9 (Fig. 2). Thus, with respect
to evolutionary origins, all but the most 5´ portions of human
CDY1 and CDY2 genomic loci can be accounted for by
mature CDYL/Cdyl transcripts. These findings are as one
would predict if CDYL/Cdyl intronic sequences had been
excised before retroposition.
How long ago did the retroposition event occur? To address
this question, we incubated probes derived from human
CDYL and CDY with Southern blots of male and female
genomic DNAs from various mammalian species. Two
conclusions emerged from this analysis (Fig. 3a): (i) the
autosomal (male-female common) CDYL gene appears to be
widely conserved among the mammals tested, including
opossum, a marsupial; and (ii) the Y-chromosomal (male-specific) CDY family appears to be restricted to simian
primates, including apes, Old World monkeys (for example
macaques) and New World monkeys (for example squirrel
monkeys, whose sequences appear to have undergone
extraordinary amplification on the Y). We observed no male-specific hybridization fragments in prosimians (lemurs) or any
of the non-primate mammals tested. The simplest
interpretation of these findings is that the retroposition event
occurred in the simian lineage after its divergence from
prosimians but before the split between Old and New World
monkeys, perhaps 40-50 million years ago (Fig. 3b). Given
the age of the retroposed CDY genes, it is not unexpected
that certain hallmarks of retroposition are absent: we found no
evidence of target site duplication or of a poly(A) tract in CDY
genomic sequences. The observed 99% identity between the
coding sequences of human CDY1 and CDY2 (Fig. 1a; as
contrasted with 73% identity between human CDY and
human CDYL) suggests that their coexistence is the result of
subsequent amplification on the Y chromosome.
Do all members of the CDY/CDYL gene family serve male-specific functions? We had previously demonstrated by
northern-blot analysis that human CDY genes are abundantly
transcribed in adult testis but are not detectable in other adult
tissues1 (Fig. 4). In contrast, human CDYL appears to be
expressed at a modest and comparable level in all tissues
examined (Fig. 4), indicating that it may perform a
housekeeping function. In mice, Cdyl is characterized by two
transcripts: (i) a ubiquitous one that is similar in length to the
ubiquitous transcript of human CDYL; and (ii) a testis-specific
one that is similar in length to the testis-specific transcript of
human CDY (Fig. 4). These expression studies in humans
and mice suggest that a genomic partitioning of housekeeping
and testis-specific functions evolved in primates after
retroposition. We interpret the mouse, which possesses no
CDY homologue on its Y chromosome, as representing the
evolutionarily ancestral condition. In humans, the autosomal
gene CDYL appears to have retained the housekeeping function
but relinquished the testis-specific function, which seems
to have been transferred to the Y-linked CDY genes.
We propose, on the basis of these and other findings, that the
gene content of the human NRY was forged during evolution
through the interplay of three molecular processes:
retroposition, transposition and persistence (Fig. 5). In certain
respects, the process of CDY retroposition is reminiscent of
the transposition process by which the DAZ gene family arose
on the Y chromosome5. The net result of both evolutionary
processes was the duplicative transfer of an autosomal gene
to the Y chromosome, with retention of the progenitor
autosomal gene. Subsequent to retroposition (as exemplified
by CDY) and transposition (as exemplified by DAZ), the
genes transferred to the Y chromosome were therein
amplified. The expression pattern of the Y-linked CDY genes,
however, is very different to that of their autosomal progenitor,
Cdyl/CDYL (Fig. 4), whereas expression of the Y-linked DAZ
genes is quite similar to that of their autosomal progenitor,
DAZL (also known as DAZLA, DAZH or SPGYLA; refs 5, 8-11). In the case of CDY, the nascent Y-linked gene was
fashioned from a fully processed, reverse-transcribed cDNA
which, we speculate, fortuitously integrated near an existing
promoter or into an otherwise transcriptionally permissive
locale on the Y chromosome. In contrast, in the case of DAZ,
a segment of genomic DNA containing an entire transcription
unit was duplicated from an autosome onto the Y
chromosome5. The transposed genomic DNA of DAZ
probably included not only introns and exons but also the
promoter of the ancestral gene.
Retroposition has long been suspected to figure prominently
in the evolution of animal Y chromosomes—but solely as a
means of marking the decay of genes previously shared with
the X chromosome. In Drosophila miranda, molecular
studies have suggested that insertional mutagenesis via
retroposition was a major driving force in Y gene decay12-14.
An accumulation of retroposed elements on the mouse Y
chromosome15 suggested that this paradigm might extend to
mammals16. Thus, it is reasonable to speculate that
retroposition—especially of highly transposable elements—
may have had an important role in Y gene decay in many
animal lineages. As illustrated here by CDY, however,
retroposition has also provided a mechanism for gene
building during the evolution of the human Y chromosome.
Methods
Identification of cDNA clones. The cDNA insert of plasmid
pDP1660 contains the entire ORF of human CDY1, as
previously reported1. To identify additional CDY clones, we
labelled the plasmid insert with 32P-dCTP by random priming
and used this probe to screen a human adult testis cDNA
library (Clontech). We incubated blots overnight at 65 °C in
NaiPO4 (0.5 M), 7% SDS, followed by three washes of 15 min
each at 65 °C in 0.1× SSC, 0.1% SDS. We identified human
CDYL clones by rescreening the same library with the same
probe at lower stringency (incubation at 60 °C, as above;
washing at 55 °C in 0.1× SSC, 0.1% SDS). We also used this
probe and conditions to screen a mouse adult testis cDNA
library (Clontech), resulting in the identification of mouse Cdyl
clones.
PCR analysis of human genomic DNAs. We tested human
genomic DNAs for the presence or absence of CDY1 and
CDY2 using PCR assays specific to each of the genes. PCR
primers for CDY1 were 5'-GGCGAAAGCTGACAGCAA-3' and
5'-GGGTGAAAGTTCCAGTCAA-3'. PCR primers for CDY2
were 5'-GACCACAAGAAAACTGTGAG-3' and 5'-GATCTGCTGCAATAGGGTC-3'. Thermocycling conditions
were 30 cycles of 1 min at 94 °C, 45 s at 60 °C and 45 s at 72
°C.
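As a rough sanity check on gene-specific primers such as those listed above, one can tabulate their length, GC content and an approximate melting temperature. The sketch below uses the Wallace rule (Tm ≈ 2 °C per A/T plus 4 °C per G/C), a common rule of thumb for short oligonucleotides; it is an assumption of this example, not the procedure the authors report using to choose the 60 °C annealing step, and the function and dictionary names are hypothetical.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer):
    """Approximate melting temperature (deg C) by the Wallace rule."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Primer sequences as given in the Methods text; the labels are illustrative.
PRIMERS = {
    "CDY1 forward": "GGCGAAAGCTGACAGCAA",
    "CDY1 reverse": "GGGTGAAAGTTCCAGTCAA",
    "CDY2 forward": "GACCACAAGAAAACTGTGAG",
    "CDY2 reverse": "GATCTGCTGCAATAGGGTC",
}

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, Tm ~{wallace_tm(seq)} C")
```

Because CDY1 and CDY2 are 99% identical in coding sequence, the specificity of such assays rests on the few positions at which the two genes differ, which a summary like this does not capture.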
Characterization of intron/exon structures. We identified
BAC clones corresponding to the human CDY1, human
CDY2, human CDYL and mouse Cdyl genomic loci by
hybridization screening on high-density filters (Research
Genetics) of CITB BAC libraries prepared from human or
mouse genomic DNA. We then used these BAC clones to
characterize the intron/exon structures of the genes. We
designed a series of PCR assays, based on the cDNA
sequence of each gene, each yielding a cDNA product of
approximately 50 bp and collectively encompassing the entire
cDNA sequence. We then repeated each of these PCR
assays using the corresponding BAC as template. In some
cases, we obtained PCR products of identical length using
cDNA or genomic BAC as template, indicating that no introns
were located between the two primers. In other cases, a
larger PCR product was obtained using the BAC as template
(in some cases long-range PCR employing TaqExtender
enzyme (Stratagene) was required), indicating the presence
of an intron. We sequenced all intron-containing PCR
products to identify intron/exon boundaries precisely.
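The comparison logic described above (same product length from cDNA and BAC templates means no intron between the primers; a longer BAC product means an intron is present) can be sketched as follows. The function name, the sizing tolerance and the example product sizes are hypothetical and serve only to illustrate the inference.

```python
def infer_intron(cdna_product_bp, bac_product_bp, tolerance_bp=10):
    """Return (has_intron, extra_bp) from cDNA and genomic (BAC) product sizes.

    tolerance_bp is an assumed allowance for gel-sizing error.
    """
    extra = bac_product_bp - cdna_product_bp
    if extra < -tolerance_bp:
        raise ValueError("genomic product shorter than cDNA product; check the assay")
    if extra <= tolerance_bp:
        return False, 0  # products of equal length: no intron between the primers
    return True, extra  # longer genomic product: intron(s) totalling about `extra` bp


# Hypothetical assays: (name, cDNA product size, BAC product size) in bp.
assays = [("assay 1", 50, 50), ("assay 2", 50, 1250), ("assay 3", 52, 55)]
for name, cdna_bp, bac_bp in assays:
    print(name, infer_intron(cdna_bp, bac_bp))
```

Size comparison only locates an intron to the interval between two primers; as the text notes, sequencing the intron-containing products is what defines the boundaries precisely.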
Radiation hybrid mapping of human CDYL. Using PCR, we
tested DNAs from the 93 hybrid cell lines of the GeneBridge 4
panel17 (Research Genetics) for the presence of CDYL. PCR
primers were 5'-CTGAGCAGGAGAACATCACC-3' and 5'-GCTACGGGTGAGCTTGTTTC-3'. Thermocycling conditions
were as listed above. Analysis of the results positioned CDYL
with respect to the radiation hybrid map constructed at the
Whitehead/MIT Center for Genome Research18 (http://www-genome.wi.mit.edu/cgi-bin/contig/phys_map).
Genetic mapping of mouse Cdyl. We typed genomic DNA
from the BSS ([C57BL/6JEi × SPRET/Ei]F1 female × SPRET/Ei
male) backcross panel (Jackson Laboratory) for a Cdyl
polymorphism by PCR. PCR primers were 5'-AAAAATGACCCTCACTAAGTTAC-3' and 5'-AGGCTCCCTGCAGTAAGTA-3'. Thermocycling conditions
were as listed above. This PCR assay yielded a 53-bp
product when C57BL/6 genomic DNA was used as template,
but no product with Mus spretus DNA (because of sequence
polymorphism at priming sites). Analysis of the results
positioned Cdyl with respect to the genetic linkage map
maintained at the Jackson Laboratory
(http://www.informatics.jax.org/map.html).
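Positioning a locus on a backcross linkage map comes down to counting recombinants between its typing pattern and those of already-mapped markers. The sketch below assumes each backcross animal is scored "B" (C57BL/6 allele present, PCR product obtained) or "S" (no product), with "?" for an untyped animal; the typing strings, the marker and the function name are hypothetical, and distance is approximated as the percentage of recombinants, which is reasonable only for closely linked loci.

```python
import math


def recombination_estimate(typing_a, typing_b):
    """Return (recombinants, informative animals, distance in cM, standard error in cM)."""
    if len(typing_a) != len(typing_b):
        raise ValueError("both loci must be typed on the same panel")
    informative = [
        (a, b) for a, b in zip(typing_a, typing_b) if a in "BS" and b in "BS"
    ]
    n = len(informative)
    if n == 0:
        raise ValueError("no animals typed at both loci")
    recombinants = sum(1 for a, b in informative if a != b)
    r = recombinants / n
    se = math.sqrt(r * (1 - r) / n)  # binomial standard error of the fraction
    return recombinants, n, 100 * r, 100 * se


# Hypothetical typings for ten backcross animals, for illustration only.
cdyl_typing = "BSBBSSBSBS"
marker_typing = "BSBBSSBSSS"
print(recombination_estimate(cdyl_typing, marker_typing))
```

Repeating the comparison against each flanking marker and choosing the order that minimizes the number of double recombinants is the usual way such a locus is placed between two markers.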
Southern- and northern-blot analyses. Each Southern-blot
lane contained genomic DNA (7 µg). We incubated blots
overnight at 65 °C in NaiPO4 (0.5 M), 7% SDS, followed by
three washes of 15 min each at 60 °C in 1× SSC, 0.1% SDS.
For northern-blot analysis, commercial filters (Clontech)
doped with poly(A)+ RNA (2 µg/lane) were incubated under
the same conditions, then washed at 65 °C in 0.1× SSC, 0.1%
SDS.
GenBank accession numbers. Human CDY1 major
transcript, AF080597; human CDY1 minor transcript,
AF000981 (ref. 1); human CDY2, AF080598; human CDYL,
AF081258 and AF081259; mouse Cdyl, AF081260 and
AF081261.
Received 9 November 1998; Accepted 25 February 1999.
REFERENCES
1. Lahn, B.T. & Page, D.C. Functional coherence of the
human Y chromosome. Science 278, 675-680 (1997).
2. Vogt, P.H. et al. Report of the Third International
Workshop on Y chromosome mapping 1997. Cytogenet.
Cell Genet. 79, 1-20 (1997).
3. Graves, J.A.M. The origin and function of the mammalian
Y chromosome and Y-borne genes--an evolving
understanding. BioEssays 17, 311-321 (1995).
4. Jegalian, K. & Page, D.C. A proposed path by which genes
common to mammalian X and Y chromosomes evolve to
become X inactivated. Nature 394, 776-780 (1998).
5. Saxena, R. et al. The DAZ gene cluster on the human Y
chromosome arose from an autosomal gene that was
transposed, repeatedly amplified and pruned. Nature
Genet. 14, 292-299 (1996).
6. Vollrath, D. et al. The human Y chromosome: a 43-interval
map based on naturally occurring deletions.
Science 258, 52-59 (1992).
7. De Bry, R.W. & Seldin, M.F. Human/mouse homology
relationships. Genomics 33, 337-351 (1996).
8. Yen, P.H., Chai, N.N. & Salido, E.C. The human autosomal
gene DAZLA: testis specificity and a candidate for male
infertility. Hum. Mol. Genet. 5, 2013-2017 (1996).
9. Shan, Z. et al. A SPGY copy homologous to the mouse
gene Dazla and the Drosophila gene boule is autosomal
and expressed only in the human male gonad. Hum. Mol.
Genet. 5, 2005-2011 (1996).
10. Cooke, H.J., Lee, M., Kerr, S. & Ruggiu, M. A murine
homologue of the human DAZ gene is autosomal and
expressed only in male and female gonads. Hum. Mol.
Genet. 5, 513-516 (1996).
11. Reijo, R. et al. Mouse autosomal homolog of DAZ, a
candidate male sterility gene in humans, is expressed in
male germ cells before and after puberty. Genomics 35,
346-352 (1996).
12. Ganguly, R., Swanson, K.D., Ray, K. & Krishnan, R. A
BamHI repeat element is predominantly associated with
the degenerating neo-Y chromosome of Drosophila
miranda but absent in the Drosophila melanogaster
genome. Proc. Natl Acad. Sci. USA 89, 1340-1344 (1992).
13. Steinemann, M. & Steinemann, S. Degenerating Y
chromosome of Drosophila miranda: a trap for
retroposons. Proc. Natl Acad. Sci. USA 89, 7591-7595
(1992).
14. Steinemann, M. & Steinemann, S. The enigma of Y
chromosome degeneration: TRAM, a novel
retrotransposon is preferentially located on the Neo-Y
chromosome of Drosophila miranda. Genetics 145,
261-266 (1997).
15. Eicher, E.M., Hutchison, K.W., Phillips, S.J., Tucker, P.K.
& Lee, B.K. A repeated segment on the mouse Y
chromosome is composed of retroviral-related, Y-enriched
and Y-specific sequences. Genetics 122, 181-192 (1989).
16. Charlesworth, B., Sniegowski, P. & Stephan, W. The
evolutionary dynamics of repetitive DNA in eukaryotes.
Nature 371, 215-220 (1994).
17. Gyapay, G. et al. A radiation hybrid map of the human
genome. Hum. Mol. Genet. 5, 339-346 (1996).
18. Hudson, T.J. et al. An STS-based map of the human
genome. Science 270, 1945-1954 (1995).
19. Burge, C. & Karlin, S. Prediction of complete gene
structures in human genomic DNA. J. Mol. Biol. 268,
78-94 (1997).
20. Novacek, M.J. Mammalian phylogeny: shaking the tree.
Nature 356, 121-125 (1992).
21. Kumar, S. & Hedges, S.B. A molecular timescale for
vertebrate evolution. Nature 392, 917-920 (1998).
22. Pilbeam, D. The descent of hominoids and hominids. Sci.
Am. 250, 84-96 (1984).
23. Ohno, S. Sex Chromosomes and Sex-linked Genes
(Springer-Verlag, Berlin, 1967).
24. Watson, J.M., Spencer, J.A., Riggs, A.D. & Graves, J.A.
Sex chromosome evolution: platypus gene mapping
suggests that part of the human X chromosome was
originally autosomal. Proc. Natl Acad. Sci. USA 88,
11256-11260 (1991).
25. Foster, J.W. & Graves, J.A. An SRY-related sequence on
the marsupial X chromosome: implications for the
evolution of the mammalian testis-determining gene.
Proc. Natl Acad. Sci. USA 91, 1927-1931 (1994).
ACKNOWLEDGMENTS
We thank H. Skaletsky and F. Lewitter for assistance
with sequence analysis; the San Diego Zoo and the
Duke Primate Center for animal specimens; and P.
Bain, A. Bortvin, L. Brown, C. Burge, B. Charlesworth,
A. Chess, S. Gilbert, R. Jaenisch, T. Kawaguchi, K.
Kleene, D. Menke, R. Saxena, C. Sun, C. Tilford and J.
Wang for helpful discussions and comments on the
manuscript.
Figure 1: Human CDY proteins and transcripts.
a, Schematic representation of predicted CDY1 and CDY2 proteins. Positions of chromo and
putative catalytic domains1 are indicated, as are protein lengths in amino acid residues. CDY1 major
and minor isoforms differ only near their C termini (grey bar in C region of CDY1 minor isoform). b,
3´ portions of CDY1 (major and minor) and CDY2 transcripts. Shown are partial cDNA and
predicted amino acid sequences; coding sequence is placed against a black background (grey in
the case of differential portion of CDY1 minor transcript); terminal residues of predicted proteins are
numbered. Part of the intron (genomic) sequence of the CDY1 minor transcript is shown; this intron
evolved from what had been exon 9 of CDYL (Fig. 2). The fifth nucleotide of this intron is circled; the
appearance of a G at this position (where 84% of introns (ref. 19) have a G, but where human CDYL and
mouse Cdyl have a T) may have contributed to the evolution of this novel intron. Alternative poly(A)
tracts observed in some CDY1 minor and CDY2 cDNA clones are indicated. Below polyadenylation
signals (underlined) are human CDYL sequences at corresponding positions; polyadenylation sites
employed by human CDY genes apparently arose only after retroposition, through mutations that
generated appropriate signal sequences.
Figure 2: Comparison of transcripts and encoded proteins from mouse Cdyl,
human CDYL and human CDY1 genes.
Exons are numbered; positions of introns are indicated. Black and grey bars depict
translated regions. Percentage identities are shown in several regions for both
DNA and predicted amino acid sequences; na, not applicable. Note: 64%
nucleotide identity was observed between the 3´ portion of human CDYL transcript
and the intron/second exon of human CDY1 minor transcript. Sequence similarity
between CDY1 genomic locus and CDYL falls abruptly (to less than 50%
nucleotide identity) at both the 5´ and 3´ ends: immediately 5´ of CDYL exon 4, and
3´ of the polyadenylation site of CDY1 minor transcript. Poly(A) tracts, including
alternative locations observed in some cDNA clones, are indicated.
Figure 3: Homologues of CDYL and CDY in diverse mammalian species.
a, Southern blots of male and female genomic DNAs from 13 mammalian species
hybridized with cDNA fragments corresponding to the entire coding sequence of
either human CDYL (top, TaqI digest) or human CDY1 (bottom, BglII digest). At low
stringency, CDYL probe hybridized with male-female common (presumably
autosomal) homologues in all 13 species. It also cross-hybridized with Y-encoded
homologues in some primates, especially orangutan and squirrel monkey. CDY
probe identified Y-encoded homologues in all primates except lemurs. It also cross-hybridized weakly to the autosomal homologue in some species. Positions of size
markers are shown at right. The apparent smears in the squirrel monkey lanes are
composed of intense, discrete fragments, as revealed by shorter exposures. b,
Summary and interpretation of Southern-blot data. The 13 species tested are
arranged phylogenetically (refs 20-22).
Figure 4: Tissue distributions of mouse Cdyl, human CDYL and human CDY
transcripts.
Northern blots incubated with cDNA fragments corresponding to the entire coding sequence of either
mouse Cdyl (top), human CDYL (middle), or human CDY1 (bottom) reveal tissue expression
patterns; the phylogenetic relationship of these genes (illustrated on left) is based on data presented
in Fig. 3.
Figure 5: Schematic representation of three molecular evolutionary
processes that contributed genes to human NRY.
Persistence (top): an autosomal pair gave rise to the neo-Y and neo-X
(subsequently enlarged by fusion with other autosomes or autosomal segments;
data not shown; refs 23,24). SRY, RPS4Y and several other genes derived from
these ancestral autosomes persist as X-homologous genes in the human NRY
(refs 1,2,3,4,25). Transposition: the DAZ genes arose by transposition (and
subsequent amplification; data not shown) of autosomal genomic DNA containing
the entire DAZL transcription unit (ref. 5). Retroposition: the CDY genes arose by
integration (and subsequent amplification; data not shown) of a reverse-transcribed
copy of a processed mRNA derived from the autosomal CDYL gene. Gene sizes
are not to scale.
21 March 2002
Nature 416, 323 - 326 (2002); doi:10.1038/416323a
Reduced adaptation of a non-recombining neo-Y chromosome
DORIS BACHTROG AND BRIAN CHARLESWORTH
Institute of Cell, Animal and Population Biology, University of Edinburgh, West Mains Road, Edinburgh EH9 3JT, UK
Correspondence and requests for materials should be addressed to D.B. (e-mail: doris.bachtrog@ed.ac.uk).
Sex chromosomes are generally believed to have descended from a pair of homologous
autosomes. Suppression of recombination between the ancestral sex chromosomes led
to the genetic degeneration of the Y chromosome (ref. 1). In response, the X chromosome
may become dosage-compensated (refs 1, 2). Most proposed mechanisms for the degeneration
of Y chromosomes involve the rapid fixation of deleterious mutations on the Y (ref. 1).
Alternatively, Y-chromosome degeneration might be a response to a slower rate of
adaptive evolution, caused by its lack of recombination (ref. 3). Here we report patterns of
DNA polymorphism and divergence at four genes located on the neo-sex chromosomes
of Drosophila miranda. We show that a higher rate of protein sequence evolution of
the neo-X-linked copy of Cyclin B relative to the neo-Y copy is driven by positive
selection, which is consistent with the adaptive hypothesis for the evolution of the Y
chromosome (ref. 3). In contrast, the neo-Y-linked copies of even-skipped and roundabout
show an elevated rate of protein evolution relative to their neo-X homologues,
probably reflecting the reduced effectiveness of selection against deleterious mutations
in a non-recombining genome (ref. 1). Our results provide evidence for the importance of
sexual recombination for increasing and maintaining the level of adaptation of a
population.
Well-studied sex chromosome systems, such as those of humans or Drosophila
melanogaster, are very ancient and show few signs of their evolutionary origins (refs 4, 5). In D.
miranda, a neo-sex chromosome system has resulted from a recent chromosome fusion
between the Y chromosome and an autosome (Muller's element C (ref. 6); Fig. 1). The
fused autosome, the neo-Y chromosome, is transmitted patrilineally. Its homologue
segregates with the X chromosome, forming the neo-X chromosome. The closest relatives
of D. miranda, D. pseudoobscura and D. persimilis (from which it diverged about 2 million
years ago; ref. 7), lack the fusion, setting an upper limit to the age of the neo-sex chromosomes.
Because male Drosophila have achiasmate meiosis (ref. 8), the neo-Y never recombines, and so is
subject to the same evolutionary forces responsible for the degeneration of 'true' Y
chromosomes. Indeed, the neo-sex chromosomes of D. miranda show many characteristics
of the much older true sex chromosomes. The neo-Y has degenerated substantially (ref. 9),
whereas the neo-X chromosome has evolved partial dosage compensation (ref. 2). There is
compelling evidence that much of the human Y chromosome is derived from a single
autosomal region added to the original Y chromosome (ref. 10); that is, it is also a degenerate
neo-Y chromosome.
Figure 1 Phylogenetic relationships between the species
investigated, constructed from the coding region of CycB.
The different evolutionary processes that might be involved in Y-chromosome degeneration
leave different footprints of variation and evolution at the molecular level (ref. 1), and so can be
examined empirically. Here we report levels of polymorphism and divergence at four
genes, Cyclin B (CycB), roundabout (robo), engrailed (eng) and even-skipped (eve), located
on both the neo-X and neo-Y chromosomes of D. miranda. The use of degenerate primers
for PCR, and screening of a genomic library, allowed the isolation of these genes from D.
miranda (see Methods). In situ hybridization of probes to the polytene chromosomes
confirmed their expected position on the neo-sex chromosomes of D. miranda (their
positions on the polytene map (ref. 11) are: CycB, 32A; robo, 33C; eng, 31D; eve, 23A). For CycB,
robo and eve, both coding and non-coding sequence data were obtained; for eng, only the
intron sequence was investigated. The use of reverse-transcriptase-mediated polymerase
chain reaction (RT–PCR) with primers spanning an intron confirmed that, for CycB, eve
and robo, both alleles are transcribed and spliced in male flies, indicating that both the
neo-X and the neo-Y copies are probably under selective constraints. For outgroup
comparisons, all genes were isolated by PCR from D. pseudoobscura, in which Muller's
element C is autosomal (Fig. 1).
We investigated nucleotide variation in 12 wild-derived lines of D. miranda (Table 1). A
total of 5.2 kilobases per individual were surveyed for the homologous regions on the
neo-X and neo-Y chromosomes. Between the neo-X-linked and the neo-Y-linked genes
investigated, there are 106 fixed differences and no shared polymorphisms, which is
consistent with a lack of recombination between the neo-sex chromosomes. Net silent-site
divergence (ref. 12) between the neo-X and neo-Y alleles is 2.8%; combining this with previous
data (ref. 13), and assuming a silent substitution rate of 1.2 × 10⁻⁸ per site per year for
Drosophila (ref. 14), this yields a total divergence time of 2.2 million years, giving an age of about
1.1 million years for the neo-sex chromosome system. On the neo-X chromosome, 61
segregating sites were observed (27 silent single-nucleotide polymorphisms, 4
replacements, 25 non-coding sites and 5 insertion or deletion events). In contrast, the
homologous neo-Y-linked regions almost completely lacked variability: only two singleton
variants (one synonymous, one non-coding) were detected. Under the standard neutral
model, genetic diversity is proportional to the product of the effective population size, Ne,
and the mutation rate (ref. 15). If no selective forces are operating, Ne for the neo-Y chromosome
should be one-third that of the neo-X chromosome, assuming random variation in offspring
number for each sex (ref. 16). Our data show that the diversity level for the neo-Y chromosome is
reduced 30-fold compared with that for the neo-X. This result is consistent with our
previous finding of a 25-fold reduction in microsatellite variability on the neo-Y
chromosome of D. miranda (ref. 17). The overall rate of silent substitution on the neo-Y branch of
the tree connecting the neo-X and neo-Y of D. miranda and D. pseudoobscura (Fig. 1) is
similar to that for the neo-X branch, showing that the difference in diversity is not caused
by a lower mutation rate of the neo-Y-linked genes (Table 1). A likelihood method was
used to quantify the reduction in diversity of the neo-Y chromosome, combining both the
microsatellite and the sequence data. Coalescent simulations yield a maximum-likelihood
estimate of the ratio of Ne for the neo-Y chromosome to that for the neo-X of 0.04 (Fig. 2).
This is consistent with the expectation of a substantial reduction in Ne under various forms
of selection acting on a non-recombining genome (ref. 1). A similar magnitude of reduction in
nucleotide polymorphism has also been observed on an evolving plant Y chromosome (ref. 18).
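The arithmetic behind the age estimate and the neutral expectation for diversity can be sketched as follows. This is an illustrative calculation rather than the authors' analysis: the function names are hypothetical, and the divergence input uses only the 2.8% figure from this study, so the total comes out near 2.3 million years instead of the 2.2 million years reported after combining with earlier data.

```python
# Illustrative arithmetic only; function names are hypothetical.

def total_divergence_time(silent_divergence, rate_per_site_per_year):
    """Total time separating two lineages, summed over both branches."""
    return silent_divergence / rate_per_site_per_year

def neutral_ne_ratio_y_to_x():
    """With equal numbers of males and females and random variation in
    offspring number, there is one Y for every three X chromosomes,
    so Ne(Y)/Ne(X) = 1/3 under neutrality."""
    return 1.0 / 3.0

total_years = total_divergence_time(0.028, 1.2e-8)  # 2.8% net silent divergence
age_years = total_years / 2.0                       # each lineage carries half

observed_ratio = 1.0 / 30.0                         # 30-fold lower diversity on neo-Y
excess_reduction = neutral_ne_ratio_y_to_x() / observed_ratio

print(f"total divergence time: {total_years / 1e6:.2f} Myr")
print(f"age of neo-sex system: {age_years / 1e6:.2f} Myr")
print(f"reduction beyond neutral expectation: {excess_reduction:.0f}-fold")
```

On these numbers, the observed loss of diversity on the neo-Y is roughly ten times larger than the three-fold difference that neutrality alone would produce.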
Figure 2 Plots of the relative values of the log-likelihood
functions ln L(k | ΔV), ln L(k | ΔS) and ln L(k | ΔV, ΔS) as a function
of k, the reduction in the effective population size of the neo-Y
chromosome relative to the neo-X.
A large reduction in Ne for the neo-Y chromosome greatly reduces the rate of fixation of
beneficial mutations on the neo-Y chromosome (ref. 3), and increases the rate of fixation of
deleterious mutations (ref. 1). If loci on the neo-X chromosome continue to adapt and their
homologues on the neo-Y fail to do so, it is advantageous to upregulate X-linked genes and
downregulate their maladapted Y-linked homologues (ref. 3). Similarly, an accumulation of
deleterious alleles on the neo-Y favours increased activity of neo-X genes relative to their
neo-Y homologues (ref. 1). Both processes can therefore lead to the observed inactivation of
neo-Y-linked genes (ref. 9) and to dosage compensation (ref. 2).
Comparisons with the outgroup species reveal that the neo-X-linked copy of CycB has a
very high rate of protein evolution (Table 1 and Fig. 1). Twenty-three amino-acid
replacement substitutions were observed on the branch of the phylogeny leading to the
neo-X, but only nine occurred on the neo-Y branch. This difference is not the result of an
increased mutation rate on the neo-X, as the proportion of synonymous substitutions per
synonymous site, Ks, is very similar on the neo-X and neo-Y lineages (Ks ≈ 0.06 for both
branches (Table 1)). For phylogenetic analysis of the nucleotide substitutions, CycB was
also amplified from D. persimilis, D. affinis and D. subobscura, which are all members of
the obscura subgroup of Drosophila, to which D. miranda belongs (ref. 6). Likelihood analysis (ref. 19)
indicates that the ratio of the non-synonymous to synonymous substitution rate (Ka/Ks)
differs significantly between branches (2ΔL = 31.9, d.f. = 8, P < 0.0001 (Fig. 1)). A model
of a single Ka/Ks ratio for all branches was compared with a model of a specific ratio for the
neo-X branch plus a second rate common to all other branches (two-rate model). The latter
fitted the data significantly better (2ΔL = 18.4, d.f. = 1, P < 0.00002), whereas a model with
a different Ka/Ks ratio for each branch did not significantly improve the likelihood over the
two-rate model (2ΔL = 13.4, d.f. = 7, P ≈ 0.06). This suggests that most of the
heterogeneity in the Ka/Ks ratio between branches of the phylogeny reflects an elevated rate
on the branch leading to the neo-X chromosome (Fig. 1).
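The P values quoted for these model comparisons follow from referring twice the log-likelihood difference to a chi-square distribution with the stated degrees of freedom. A minimal sketch using only the Python standard library is given here; the helper name chi2_sf and the comparison labels are hypothetical, and this checks the reported statistics rather than re-running the codeml analysis.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df.

    Uses the recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / gamma(a + 1)
    for the regularized upper incomplete gamma function, with z = x / 2.
    """
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)             # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))  # Q(1/2, z)
    while a < df / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

# (test statistic, degrees of freedom) as reported in the text
reported = {
    "single ratio vs ratio per branch": (31.9, 8),
    "single ratio vs two-rate model": (18.4, 1),
    "two-rate model vs ratio per branch": (13.4, 7),
}
for label, (stat, df) in reported.items():
    print(f"{label}: statistic = {stat}, d.f. = {df}, P = {chi2_sf(stat, df):.2g}")
```

The degrees of freedom equal the number of extra Ka/Ks parameters in the richer model, which is why the two nested steps (1 and 7) add up to the 8 of the overall comparison.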
Heterogeneity in the Ka/Ks ratio between lineages might be caused either by positive
selection or by relaxed selective constraints in some lineages (ref. 12). If the elevated rate of
amino-acid replacements on the neo-X were solely a consequence of reduced functional
constraints, the ratio of synonymous to replacement substitutions would be similar for
polymorphisms within D. miranda and for fixed differences between D. miranda and its
relatives (ref. 20). However, the ratio of replacement to silent changes was significantly higher
for fixed differences than for polymorphisms in comparisons involving the neo-X-linked copy of
CycB (Table 2), suggesting that directional darwinian selection has driven the rapid
evolution of CycB on the neo-X chromosome (ref. 20). In addition, the recent fixation of an
advantageous mutation reduces neutral polymorphism around the site that is the target of
selection (ref. 21). Consistent with the selection hypothesis is a decrease in variability at CycB
(Table 1). The level of polymorphism is about 4-fold lower than estimates for the other
neo-X-linked genes investigated (see Table 1). By the HKA test (ref. 22), the level of neo-X silent
polymorphism relative to divergence from D. pseudoobscura at CycB is significantly lower
than that for all three other genes (eve, eng and robo (P < 0.01)). This result is consistent
with the idea that the neo-X chromosome experiences a faster rate of adaptive evolution
than the non-recombining neo-Y.
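The comparison of fixed differences with polymorphisms described in this paragraph is a 2 × 2 contingency test in the style of McDonald and Kreitman (ref. 20). A minimal sketch is shown here, assuming a two-sided Fisher exact test as the test statistic; the counts are invented for illustration and are not the values in Table 2, and the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical counts for illustration only (NOT the values in Table 2).
fixed_replacement, fixed_silent = 20, 10
poly_replacement, poly_silent = 2, 12

p_value = fisher_exact_two_sided(
    fixed_replacement, fixed_silent, poly_replacement, poly_silent
)
print(f"P = {p_value:.4f}")
```

An excess of replacement changes among fixed differences, as in these made-up counts, is the pattern expected under positive selection; relaxed constraint alone would raise the replacement fraction in both rows alike.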
In contrast, eve and robo both show a higher rate of amino-acid replacements on the neo-Y
chromosome than their neo-X homologues (Table 1). Maximum-likelihood analysis of the
combined sequence data set for eve and robo was used to compare a model in which Ka/Ks
was assumed to be the same on the neo-X and neo-Y branches with a model in which it was
allowed to differ; the latter fitted the data significantly better (2 L = 6.1, d.f. = 1, P = 0.01).
In addition, Ka/Ks on the neo-Y branch of CycB was higher than on the other branches of
the phylogeny (except for the neo-X branch; see Fig. 1). Several lines of evidence suggest
that the acceleration of the rate of amino acid substitution on the neo-Y chromosome is not
due to a lack of functional constraints on the coding region of eve and robo. Using RT–
PCR, we demonstrated that the neo-Y-linked copies of both genes are transcribed. The
entire coding sequences of eve and robo were analysed, and contained no stop codons or
frame-shift mutations. A ratio of Ka/Ks of 1 is expected for a gene that is evolving entirely
neutrally, as was found for the Lcp genes on the neo-Y chromosome of D. miranda (ref. 13),
whereas a lower ratio is expected for genes subject to functional constraints (ref. 12). Eve and robo
display a very low Ka/Ks ratio on the neo-Y chromosome (Table 1), suggesting that both
genes are under selective constraints. These results, together with the large reduction in Ne
for the neo-Y chromosome, suggest that many amino-acid substitutions in these genes are
slightly deleterious and that natural selection has not been able to prevent their
accumulation on the neo-Y branch (ref. 1). A similar effect has been reported for some other
non-recombining genomes (refs 23, 24).
The adaptive significance of sexual recombination is one of the most puzzling problems in
evolutionary biology (refs 25, 26). Most asexual lineages of eukaryotes seem to become extinct at a
higher rate than their sexual relatives (refs 25, 27). Similarly, the Y chromosome is prone to
degeneration once it stops recombining with the X (ref. 1). The dismal fates of clonally
transmitted species and genomes suggest that genetic recombination is necessary for their
persistence over long periods of evolutionary time. Theoretical models predict that sexual
populations should be more effective in incorporating new beneficial mutations and
preventing the accumulation of deleterious alleles (refs 25, 26). Our evidence for both faster
adaptation on the recombining X chromosome and the accumulation of deleterious
mutations on the non-recombining Y chromosome suggests that both processes may be
important in conferring an evolutionary advantage to sex and recombination.
Methods
Isolation of genes, sequencing and RT-PCR. CycB and robo were isolated from a
genomic library constructed from D. miranda males (strain 0101.3 (ref. 13)) using the
Lambda FixII Library kit (Stratagene). eve and eng were isolated by using degenerate
primers designed from regions conserved between D. melanogaster and D. virilis.
Allele-specific primers were used for PCR amplification of the neo-X-linked and neo-Y-linked
copies of CycB and robo from male genomic DNA, followed by direct sequencing of both
strands of the PCR products. The neo-Y-specific primers only amplify a product in males,
while the neo-X-specific primers amplify in both sexes. For eng and eve, PCR primers
amplifying both copies were used, and the amplified product was cloned with the TOPO
TA cloning kit (Invitrogen). Clones containing either the neo-X or neo-Y copy were
distinguished by a size difference in the cloned PCR products in the case of eng, or by a
HaeII restriction site for eve. To exclude PCR errors, at least three independent clones per
allele and individual were analysed (the PCR error rate was found to be 0.00092 per base
pair). For the population survey, 12 strains of D. miranda were sequenced (see ref. 13 for
description of the strains), using the ABI Prism BigDye Chemistry (Perkin-Elmer) on an
ABI 377 automated sequencer.
Total RNA from male D. miranda (strain MSH 38 (ref. 13)) was extracted with the RNA
STAT-60 kit, and first-strand cDNA synthesis was performed with random or dT primers
and the cDNA Cycle kit (Invitrogen). Primers spanning at least one intron were used to
PCR-amplify the neo-X and the neo-Y copies. Amplification products were cloned with the
TOPO TA cloning vector and sequenced.
Sequence analysis. Sequences of D. miranda and D. pseudoobscura were aligned
manually. Nucleotide diversity was calculated as described (ref. 28). To align the coding region of
CycB between the more diverged members of the obscura species group, the translated
proteins were aligned by using CLUSTAL X (ftp://ftp-igbmc.u-strasbg.fr/pub/ClustalX/)
and the nucleotide alignment was constructed from the protein alignment. The phylogeny
based on the coding region of CycB was derived by using PUZZLE
(ftp://ftp.ebi.ac.uk/pub/software/mac/puzzle/) and is shown in Fig. 1. Ka and Ks were
calculated with a maximum-likelihood method, which accounts for unequal transition and
transversion rates and unequal base and codon frequencies (ref. 19). A likelihood ratio test was
used to test different models of evolution by application of the codeml program within the
PAML software package (ref. 19). A model assuming the same Ka/Ks ratio for all branches was
compared with a model assuming a different Ka/Ks ratio for all or some branches of the
phylogeny. Twice the difference in log-likelihood between models is assumed to be
distributed approximately as χ², with degrees of freedom equal to the difference in the
numbers of parameters.
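For readers unfamiliar with the nucleotide diversity statistic mentioned in this section, the basic quantity is the average number of pairwise differences per site among the sampled sequences. The sketch below shows only that textbook definition on an invented toy alignment; it ignores alignment gaps and the separation of silent and replacement sites, and it is not claimed to reproduce the exact estimators of ref. 28.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Average pairwise differences per site among aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)

# Toy alignment, invented for illustration.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
pi = nucleotide_diversity(toy)
print(f"pi = {pi:.3f}")
```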
Coalescent process simulations. Coalescent simulations with the standard algorithm of
Hudson (ref. 29) were performed to obtain a maximum-likelihood estimate of the reduction in Ne
for the neo-Y chromosome compared with that of the neo-X, combining both the nucleotide
polymorphism data set presented here and data on microsatellite variability (ref. 17). Whereas all
loci on the neo-Y share one genealogy, the loci on the neo-X are probably independent. We
therefore generated 11 independent trees for a data set of 12 chromosomes for the neo-X (7
microsatellite loci and 4 genes), whereas a single tree with 11 completely linked loci was
computed for the neo-Y. Microsatellite mutations were superimposed on the trees following
the single stepwise mutation model described in ref. 17. In short, seven random estimates of
θ = 4Neµ (where µ is the mutation rate per microsatellite locus) were drawn from a gamma
distribution for each run, and mutations using the resulting values were laid down on the
trees in accordance with Poisson distributions, using the same values of θ for homologous
loci. The mean variance in repeat number per locus, V̄, across the seven loci was then
calculated. After each run, ΔVsim = (V̄X − V̄Y)/V̄X for the simulated data set was computed. For
each gene, the total number of segregating mutations (CycB, 6; robo, 20; eve, 28; eng, 9)
was assigned to the simulated neo-X and neo-Y trees in proportion to their lengths. After
each run, the total numbers of segregating sites on the neo-X (SX) and on the neo-Y (SY)
were determined and the quantity ΔSsim = (SX − SY)/SX was computed. To estimate the
reduction in Ne for the neo-Y loci, the tree length (in units of 2Ne generations) was
multiplied by a scaling factor k before mutations were laid down on the neo-Y tree. The
simulations can be used to determine the likelihood of a pair (ΔVsim, ΔSsim) for a given
value of k. For each value of k, n replicates were generated. The number of runs M where
|ΔVsim − ΔVobs| ≤ δ and |ΔSsim − ΔSobs| ≤ δ was determined, where δ is a preassigned mesh size for
the continuous variables ΔV and ΔS. Following ref. 30, the likelihood of the sample is
approximated by

L(k | ΔV, ΔS) ≈ M/n.

Here, n is 10⁵ and δ is 0.02.
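The simulation-based likelihood described in this section can be illustrated with a much reduced sketch that uses only the segregating-site component. Several points are assumptions made for the example and are not taken from the paper: mutations are assigned to the two trees at random with probability proportional to tree length, the microsatellite component is omitted, far fewer replicates are run, and all function and variable names are hypothetical.

```python
import random

def tree_length(n, rng):
    """Total branch length of a standard coalescent tree for n sampled
    chromosomes; with i lineages the waiting time is exponential with
    rate i*(i-1)/2 and contributes i times its duration."""
    return sum(i * rng.expovariate(i * (i - 1) / 2.0) for i in range(n, 1, -1))

def approx_likelihood(k, s_total_per_gene, delta_s_obs, n_reps, mesh, rng, n_chrom=12):
    """Fraction of replicates whose simulated (SX - SY)/SX lies within
    `mesh` of the observed value, for a neo-Y tree scaled by k."""
    hits = 0
    for _ in range(n_reps):
        y_len = k * tree_length(n_chrom, rng)   # one genealogy shared by all neo-Y loci
        sx = sy = 0
        for s_total in s_total_per_gene:
            x_len = tree_length(n_chrom, rng)   # independent genealogy per neo-X gene
            p_x = x_len / (x_len + y_len)       # assumed: assignment proportional to length
            on_x = sum(rng.random() < p_x for _ in range(s_total))
            sx += on_x
            sy += s_total - on_x
        if sx > 0 and abs((sx - sy) / sx - delta_s_obs) <= mesh:
            hits += 1
    return hits / n_reps

rng = random.Random(1)
gene_totals = [6, 20, 28, 9]     # CycB, robo, eve, eng (from the text)
delta_s_obs = (61 - 2) / 61      # 61 neo-X and 2 neo-Y segregating sites
for k in (0.01, 0.04, 0.1, 1.0):
    lik = approx_likelihood(k, gene_totals, delta_s_obs, 2000, 0.02, rng)
    print(f"k = {k}: approximate likelihood = {lik:.3f}")
```

Evaluating the function over a grid of k values and taking the maximum gives a point estimate; because the unit of time cancels in the ratio of tree lengths, only the scaling factor k affects the result.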
Received 21 September 2001;
accepted 18 December 2001
References
1. Charlesworth, B. & Charlesworth, D. The degeneration of Y chromosomes. Phil. Trans. R. Soc.
Lond. B 355, 1563-1572 (2000).
2. Marin, I., Siegal, M. L. & Baker, B. S. The evolution of dosage-compensation mechanisms.
BioEssays 22, 1106-1114 (2000).
3. Orr, H. A. & Kim, Y. An adaptive hypothesis for the evolution of the Y chromosome. Genetics
150, 1693-1698 (1998).
4. Lahn, B. T., Pearson, N. M. & Jegalian, K. The human Y chromosome, in the light of evolution.
Nature Rev. Genet. 2, 207-216 (2001).
5. Carvalho, A. B., Lazzaro, B. P. & Clark, A. G. Y chromosomal fertility factors kl-2 and kl-3 of
Drosophila melanogaster encode dynein heavy chain polypeptides. Proc. Natl Acad. Sci. USA
97, 13239-13244 (2000).
6. Powell, J. R. Progress and Prospects in Evolutionary Biology: The Drosophila Model (Oxford
Univ. Press, New York, 1997).
7. Schaeffer, S. W. & Miller, E. L. Molecular population genetics of an electrophoretically
monomorphic protein in the alcohol dehydrogenase region of Drosophila pseudoobscura.
Genetics 132, 163-178 (1992).
8. Gethmann, R. C. Crossing over in males of higher Diptera (Brachycera). J. Hered. 79, 344-350
(1988).
9. Steinemann, M. & Steinemann, S. Enigma of Y chromosome degeneration: neo-Y and neo-X
chromosomes of Drosophila miranda a model for sex chromosome evolution. Genetica
102-103, 409-420 (1998).
10. Waters, P. D., Duffy, B., Frost, C. J., Delbridge, M. L. & Graves, J. A. The human Y
chromosome derives largely from a single autosomal region added to the sex chromosomes
80-130 million years ago. Cytogenet. Cell. Genet. 92, 74-79 (2001).
11. Das, M., Mutsuddi, D., Duttagupta, A. K. & Mukherjee, A. S. Segmental heterogeneity in
replication and transcription of the X2 chromosome of Drosophila miranda and
conservativeness in the evolution of dosage compensation. Chromosoma 87, 373-388
(1982).
12. Li, W. Molecular Evolution (Sinauer Associates, Sunderland, Massachusetts, 1997).
13. Yi, S. & Charlesworth, B. Contrasting patterns of molecular evolution of the genes on the new
and old sex chromosomes of Drosophila miranda. Mol. Biol. Evol. 17, 703-717 (2000).
14. Wang, R. L. & Hey, J. The speciation history of Drosophila pseudoobscura and close relatives:
inferences from DNA sequence variation at the period locus. Genetics 144, 1113-1126
(1996).
15. Kimura, M. The Neutral Theory of Molecular Evolution (Cambridge Univ. Press, Cambridge,
1983).
16. Wright, S. Evolution and the Genetics of Populations (Univ. of Chicago Press, Chicago, 1969).
17. Bachtrog, D. & Charlesworth, B. Reduced levels of microsatellite variability on the neo-Y
chromosome of Drosophila miranda. Curr. Biol. 10, 1025-1031 (2000).
18. Filatov, D. A., Moneger, F., Negrutiu, I. & Charlesworth, D. Low variability in a Y-linked plant
gene and its implications for Y-chromosome evolution. Nature 404, 388-390 (2000).
19. Yang, Z. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput.
Appl. Biosci. 13, 555-556 (1997).
20. McDonald, J. H. & Kreitman, M. Adaptive protein evolution at the Adh locus in Drosophila.
Nature 351, 652-654 (1991).
21. Barton, N. H. Genetic hitchhiking. Phil. Trans. R. Soc. Lond. B 355, 1553-1562 (2000).
22. Hudson, R. R., Kreitman, M. & Aguade, M. A test of neutral molecular evolution based on
nucleotide data. Genetics 116, 153-159 (1987).
23. Lynch, M. & Blanchard, J. L. Deleterious mutation accumulation in organelle genomes.
Genetica 102-103, 29-39 (1998).
24. Fridolfsson, A. K. & Ellegren, H. Molecular evolution of the avian CHD1 genes on the Z and W
sex chromosomes. Genetics 155, 1903-1912 (2000).
25. Maynard Smith, J. The Evolution of Sex (Cambridge Univ. Press, Cambridge, 1978).
26. Barton, N. H. & Charlesworth, B. Why sex and recombination? Science 281, 1986-1990
(1998).
27. Bell, G. The Masterpiece of Nature (Univ. of California, Berkeley, 1982).
28. Tajima, F. Statistical analysis of DNA polymorphism. Jpn J. Genet. 68, 567-595 (1993).
29. Hudson, R. R. in Oxford Surveys in Evolutionary Biology Vol. 7 (eds Futuyma, D. &
Antonovics, J.) 1-44 (Oxford Univ. Press, Oxford, 1990).
30. Weiss, G. & von Haeseler, A. Inference of population history using a likelihood approach.
Genetics 149, 1539-1546 (1998).
Acknowledgements. We thank P. Andolfatto, N. Barton, D. Charlesworth, I. Gordo, P.
Keightley and S. Wright for helpful comments on the manuscript. D.B. is supported by a
Marie Curie fellowship and B.C. by the Royal Society.
Competing interests statement. The authors declare that they have no competing financial
interests.
Figure 1 Phylogenetic relationships between the species investigated, constructed from the coding
region of CycB. The numbers shown above the lines give the estimated Ka values (percentages) for
each lineage obtained by maximum likelihood19; the corresponding Ks values are given below the
lines. The karyotypes of the species are also illustrated. The letters A–E indicate the five major
chromosomes of the basic Drosophila karyotype6. In D. miranda, element C is fused to the true Y
chromosome (neo-Y, shown in grey), and its unfused homologue, the neo-X, segregates with the X
chromosome. The relatives of D. miranda lack the Y–C fusion. (Note that D. subobscura lacks the
A–D fusion found in the other species6).
Figure 2 Plots of the relative values of the log-likelihood functions ln L(k| V), ln L(k| S) and ln
L(k| V, S) as a function of k, the reduction in the effective population size of the neo-Y
chromosome relative to the neo-X. All three curves have been normalized so that their maxima are
at 0. The horizontal line indicates the two-unit support interval.
11 April 2002
Nature 416, 624 - 626 (2002); doi:10.1038/416624a
Strong male-driven evolution of DNA sequences in humans and apes
KATERYNA D. MAKOVA AND WEN-HSIUNG LI
Department of Ecology and Evolution, University of Chicago, 1101 East 57th Street, Chicago, Illinois 60637, USA
Correspondence and requests for materials should be addressed to W.-H.L. (e-mail: whli@uchicago.edu).
Studies of human genetic diseases have suggested a higher mutation rate in males than in
females1 and the male-to-female ratio ($\alpha$) of mutation rate has been estimated from DNA
sequence and microsatellite data to be about 4–6 in higher primates2-5. Two recent studies,
however, claim that $\alpha$ is only about 2 in humans6, 7. This is even smaller than the
estimates ($\alpha > 4$) for carnivores and birds8, 9; humans should have a higher $\alpha$ than
carnivores and birds because of a longer generation time and a larger sex difference in the
number of germ cell cycles. To resolve this issue, we sequenced a noncoding fragment on Y of
about 10.4 kilobases (kb) and a homologous region on chromosome 3 in humans, greater apes, and
lesser apes. Here we show that our estimate of $\alpha$ from the internal branches of the
phylogeny is 5.25 (95% confidence interval (CI) 2.44 to $\infty$), similar to the previous
estimates2-5, but significantly higher than the two recent ones6, 7. In contrast, for the
external (short, species-specific) branches, $\alpha$ is only 2.23 (95% CI: 1.47–3.84). We
suggest that closely related species are not suitable for estimating $\alpha$, because of
ancient polymorphism and other factors. Moreover, we
provide an explanation for the small estimate of $\alpha$ in a previous study6. Our study
reinstates a high $\alpha$ in hominoids and supports the view that DNA replication errors are
the primary source of germline mutation.
An effective way to estimate $\alpha$ is to use highly similar noncoding sequences on different
types of chromosomes10. Noting that the DAZ locus was translocated from chromosome 3 to Y after
the split between Old and New World monkeys11, we sequenced three noncoding segments (about
10.4 kb in total) on chromosomes Y and 3 in the human, bonobo, gorilla, siamang and gibbon.
When we estimate the $\alpha$-values separately for the external and the internal branches of
the tree (Fig. 1), an intriguing pattern emerges (Table 1). The $\alpha$-values vary enormously
among the external branches, but their average is low. The sum of the external branch lengths
is 4.43% for the Y sequences (Y) and 3.22% for the chromosome 3 sequences (A), leading to
Y/A = 1.38. According to a previous study12, $Y/A = 2\alpha/(1 + \alpha)$, so $\alpha = 2.23$
(95% CI: 1.47–3.84), not significantly different from
the estimate (1.7) in ref. 6. In contrast, the sum of the internal branch lengths is 3.77% for the Y
sequences and 2.25% for the chromosome 3 sequences, leading to Y/A = 1.68 and $\alpha = 5.25$
(95% CI: 2.44 to $\infty$). This is similar to the earlier estimates of $\alpha$ in
primates2-5, but significantly higher than those in refs 6 and 7. Thus, estimates of $\alpha$
obtained from closely related species tend to be lower than those obtained from more distantly
related species.
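The conversion from a Y/A ratio to $\alpha$ can be checked numerically. The sketch below inverts the relation $Y/A = 2\alpha/(1 + \alpha)$ to $\alpha = (Y/A)/(2 - Y/A)$ and applies it to the branch-length sums quoted above. The function name `alpha_from_ya` is ours, and rounding Y/A to two decimals before inverting is an assumption made only because it reproduces the quoted figures.

```python
def alpha_from_ya(ya_ratio: float) -> float:
    """Invert Y/A = 2*alpha / (1 + alpha) to recover alpha."""
    if not 0.0 < ya_ratio < 2.0:
        raise ValueError("Y/A must lie strictly between 0 and 2 for a finite, positive alpha")
    return ya_ratio / (2.0 - ya_ratio)


# Sums of branch lengths (percent divergence) quoted in the text.
external_ya = round(4.43 / 3.22, 2)  # external branches: Y sum / chromosome 3 sum
internal_ya = round(3.77 / 2.25, 2)  # internal branches: Y sum / chromosome 3 sum

external_alpha = alpha_from_ya(external_ya)
internal_alpha = alpha_from_ya(internal_ya)

print(f"external: Y/A = {external_ya:.2f}, alpha = {external_alpha:.2f}")
print(f"internal: Y/A = {internal_ya:.2f}, alpha = {internal_alpha:.2f}")
```

Because $\alpha$ grows without bound as Y/A approaches 2, an upper confidence limit on Y/A at or above 2 translates into an unbounded upper limit on $\alpha$, which is presumably why the internal-branch interval is open-ended.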
Figure 1 Phylogenetic tree of nucleotide sequences.
We wondered how to explain the above discrepancy. The level of polymorphism on chromosome Y
is usually extremely low because of selective sweep and background selection13, 14 and because
the effective population size of Y is only one-quarter of that of an autosome. The average
divergence between two species is equal to $\pi + 2ut$, where $t$ is the divergence time, $u$ is
the mutation rate, and $\pi$ is the average divergence between two sequences in the ancestral
population (ancient nucleotide diversity) at the time of speciation15. The ratio of Y
divergence (Y) to autosome divergence (A) is $Y/A = (\pi_y + y)/(\pi_a + a)$, where $y$ is the
number of mutations on Y, $a$ is the number of mutations on an autosome, and $\pi_y$ and $\pi_a$
are the ancient nucleotide diversities at the point of speciation on Y and an autosome,
respectively. As $\pi_y$ is usually lower than $\pi_a$, $(\pi_y + y)/(\pi_a + a)$ can be
substantially lower than $y/a$ for closely related species. For instance, $\pi_a$ between the
two gorilla alleles at the chromosome 3 locus is 0.19% (Fig. 1). If we assume $\pi_a = 0.19\%$
and $\pi_y \approx 0$ in the common ancestor of the bonobo and the gorilla, then Y/A between the
bonobo and the gorilla increases from 2.11%/1.49% = 1.42 to 2.11%/(1.49% - 0.19%) = 1.62, and
$\alpha$ increases from 2.45 to 4.26. Clearly, ancient polymorphism can considerably reduce the
estimate of $\alpha$ for closely related species. If A is small, even a small error in the
estimate of A can have a strong effect on Y/A.
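The bonobo-gorilla example can be reproduced with a few lines of arithmetic. In the sketch below the ancestral diversity is subtracted from each observed divergence before the ratio is formed, and the ratio is then converted to $\alpha$ with $\alpha = (Y/A)/(2 - Y/A)$. The function names are ours, and rounding the ratio to two decimals before inverting is an assumption chosen because it matches the quoted values.

```python
def alpha_from_ya(ya_ratio: float) -> float:
    """Invert Y/A = 2*alpha / (1 + alpha)."""
    return ya_ratio / (2.0 - ya_ratio)


def corrected_ratio(y_div: float, a_div: float, pi_y: float = 0.0, pi_a: float = 0.0) -> float:
    """Y/A after removing ancestral diversity (pi_y, pi_a) from each observed divergence."""
    return (y_div - pi_y) / (a_div - pi_a)


# Bonobo-gorilla divergences and gorilla chromosome 3 diversity, in percent, from the text.
y_div, a_div, pi_a = 2.11, 1.49, 0.19

uncorrected = round(corrected_ratio(y_div, a_div), 2)           # no allowance for pi_a
corrected = round(corrected_ratio(y_div, a_div, pi_a=pi_a), 2)  # pi_a = 0.19%, pi_y taken as 0

print(f"uncorrected: Y/A = {uncorrected:.2f}, alpha = {alpha_from_ya(uncorrected):.2f}")
print(f"corrected:   Y/A = {corrected:.2f}, alpha = {alpha_from_ya(corrected):.2f}")
```

A shift of only 0.19 percentage points in the autosomal divergence nearly doubles the estimate, which illustrates how sensitive $\alpha$ is to small changes in A when Y/A is already well above 1.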
A more serious problem in ref. 6 is that their phylogeny (Fig. 2a) assumes that the Y-linked
sequence, which was derived from the transposition of an X-linked sequence in an ancestral human
population, has always evolved as a Y-linked sequence since its separation from the X-linked
sequence sampled in their study. This is unlikely. Rather, one of the schemes in Fig. 2b–d should
be true. In Fig. 2b, when chimpanzee X is used as a reference, we have $X = x_1$ and
$Y = x_2 + y$. If $y < x_2$, Y can be close to X and $\alpha$ can be seriously underestimated.
In Fig. 2c, $X = \pi_1 + x_1$, $Y = \pi_2 + x_2 + y$ and $\pi_1 > \pi_2$, so if $y$ is very
small, Y can even be smaller than X; that is, both Y/X and $\alpha$ can be smaller than 1. In
Fig. 2d, $\pi_1 < \pi_2$, so Y/X is greater than 1. However, Y/X can be close to 1 if $y$ is
small and if $\pi_2$ is close to $\pi_1$. Among the three schemes, Fig. 2b is most probable
because the observed human X–human Y divergence (1.18%) is smaller than the observed human
X–chimpanzee X divergence (1.56%)6. In any case, $\alpha$ is underestimated if Fig. 2a is used
as the phylogeny.
Figure 2 Phylogenetic schemes for the sequences of ref. 6.
As to the estimate of $\alpha = 2.1$ in ref. 7, corrections for multiple substitutions were not
made in the data analysis. This could have underestimated $\alpha$ for old repetitive elements.
For young elements, the low estimate of $\alpha$ could be in part due to low polymorphism on Y.
One additional problem was the false assumption that the repetitive elements of the same
subfamily were inserted into the genome at the same time16, introducing errors into the
estimation of $\alpha$.
We have provided a resolution for the controversy on the magnitude of $\alpha$ in primates. Most
previous studies compared X- and Y-linked sequences (see ref. 2), and some have argued that the
high $\alpha$ values could be due to a reduction in mutation rate in X rather than an elevated
rate in Y17. To avoid this possibility, we compared an autosomal and a Y-linked locus. The fact
that this study and all previous studies, with the exception of comparisons between closely
related species, give consistent estimates of $\alpha$ of about 4–6 provides conclusive evidence
for strong male-driven evolution in hominoids.
Methods
DNA sequencing We studied three homologous fragments on Y and chromosome 3 that
correspond to nucleotides 36,887–40,238, 49,037–54,082, and 55,171–57,328 of human Y contig
AC006983.4 and nucleotides 59,378–56,044, 47,226–42,245, and 40,284–38,159 of human
chromosome 3 contig AC010139.4. The 5' end of each fragment is located 1.5, 13.7 and 20.6 kb,
respectively, downstream from the 3' untranslated region of the DAZ gene; the fragments have no
homology with any expressed sequence tag and contain no known or predicted gene. Interspersed
and simple repeats were identified using RepeatMasker
(http://ftp.genome.washington.edu/RM/RepeatMasker.html). Searches employing GenScan and
BLAST failed to identify any genes or exons within these fragments.
Genomic DNAs of a female and a male human, bonobo (Pan paniscus), gorilla (Gorilla gorilla),
gibbon (Hylobates lar) and siamang (Hylobates syndactylus) were used separately as templates in
polymerase chain reaction (PCR). The Expand High Fidelity PCR system (Roche) was used to
minimize errors in PCR amplification. The PCR primers were designed from the human sequence
(available as Supplementary Information). Female DNA was used to amplify the chromosome 3
locus and male DNA to amplify the chromosome Y locus. The sequences were deposited in
GenBank under accession numbers AF483550–AF483579.
Statistical analyses The sequences of the fragments studied were concatenated and aligned using
the MegAlign module of DNAStar (Lasergene), and the alignment was adjusted manually. As the
divergence between the sequences studied is low (at most 11%), the alignment was rather simple.
We found evidence of at least two copies of the DAZ locus on chromosome Y in human (there were
five 'polymorphic' sites in a 10.4-kb sequence), gorilla (two polymorphic sites), and siamang (four
polymorphic sites). Thus, the copies were almost identical, suggesting that the duplications
happened recently.
To take the presence of polymorphisms on chromosome 3 into account, two pseudosequences were
generated for each of the individuals studied. For example, if the site was heterozygous for A and
G, one of the pseudosequences was assigned 'A' and the other was assigned 'G'. This assignment
was done randomly by DAMBE18. Insertion/deletion polymorphisms (indels) in the alignment, as
well as heterozygous indel sites, were not included in the calculations. These indels included one
deletion of 106 base pairs (bp) in gibbon Y and one insertion of 33 bp in human Y, although most
of the others were short indels, many of which were in mononucleotide microsatellites. One
segment of 121 bp in gorilla chromosome 3 was difficult to sequence and was excluded from
analysis. So, in total, 1,148 nucleotide sites were excluded from analysis. Tajima and Nei's genetic
distances19 were estimated using DAMBE18. The distances were apportioned among the branches
of the phylogeny using the FITCH module of PHYLIP20. The ratio of substitution rates (Y/A) was
calculated from the sum of the branches on Y (Y) and chromosome 3 (A). The average of the two
values for the chromosome 3 pseudosequences was used in the calculations. This procedure was
also used for treating the 'polymorphic' sites in Y-linked sequences. The formula from ref. 12,
$Y/A = 2\alpha/(1 + \alpha)$, was used to calculate $\alpha$. To calculate the 95% CI for
$\alpha$, we used two methods. First, we derived the variance of Y/A. If $L$ is the length of
the sequence, $V(Y) = Y(1 - Y)/[L(1 - 4Y/3)^2]$, $V(A) = A(1 - A)/[L(1 - 4A/3)^2]$, and
$V(Y/A) = V(Y)/E(A)^2 + E(Y)^2 V(A)/E(A)^4$. Second, we used the bootstrap. The results for the
two methods were similar. The second method was used in the text.
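The first (analytical) method can be sketched as follows. This is a minimal illustration, not the authors' implementation: it plugs the internal-branch sums into the variance expressions above, forms a normal-approximation interval on Y/A, and maps its limits to $\alpha$. The sequence length `SITES` is an assumed round number (the exact count of analysed sites is not restated here), and applying the variance formula to summed branch lengths is a simplification.

```python
import math


def distance_variance(d: float, length: int) -> float:
    """V(d) = d(1 - d) / [L (1 - 4d/3)^2], the per-distance variance given in the text."""
    return d * (1.0 - d) / (length * (1.0 - 4.0 * d / 3.0) ** 2)


def ratio_variance(y: float, a: float, length: int) -> float:
    """Variance of Y/A: V(Y)/A^2 + Y^2 V(A)/A^4, with Y and A treated as independent."""
    return distance_variance(y, length) / a**2 + y**2 * distance_variance(a, length) / a**4


def alpha_from_ya(ratio: float) -> float:
    """Invert Y/A = 2*alpha/(1 + alpha); alpha is unbounded once Y/A reaches 2."""
    return ratio / (2.0 - ratio) if ratio < 2.0 else math.inf


# Internal-branch sums from the text, as proportions. SITES is an ASSUMED length.
Y_SUM, A_SUM, SITES = 0.0377, 0.0225, 9000

ratio = Y_SUM / A_SUM
half_width = 1.96 * math.sqrt(ratio_variance(Y_SUM, A_SUM, SITES))  # normal approximation
point_alpha = alpha_from_ya(ratio)
lower_alpha = alpha_from_ya(ratio - half_width)
upper_alpha = alpha_from_ya(ratio + half_width)

print(f"alpha = {point_alpha:.2f}, approximate 95% CI {lower_alpha:.2f} to {upper_alpha:.2f}")
```

Because the mapping from Y/A to $\alpha$ is strongly nonlinear near Y/A = 2, a symmetric interval on the ratio becomes a very asymmetric interval on $\alpha$.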
Supplementary information accompanies this paper.
Received 8 October 2001;
accepted 25 January 2002
References
1. Crow, J. F. The origins, patterns and implications of human spontaneous mutation. Nature Rev. Genet.
1, 40-47 (2000).
2. Huang, W., Chang, B. H.-J., Gu, X., Hewett-Emmett, D. & Li, W.-H. Sex differences in mutation rate in
higher primates estimated from AMG intron sequences. J. Mol. Evol. 44, 463-465 (1997).
3. Agulnik, A. I. et al. Evolution of the DAZ gene family suggests that Y-linked DAZ plays little, or a limited,
role in spermatogenesis but underlines a recent African origin for human populations. Hum. Mol. Genet.
7, 1371-1377 (1998).
4. Nachman, M. W. & Crowell, S. L. Estimate of the mutation rate per nucleotide in humans. Genetics
156, 297-304 (2000).
5. Ellegren, H. Heterogeneous mutation processes in human microsatellite DNA sequences. Nature
Genet. 24, 400-402 (2000).
6. Bohossian, H. B., Skaletsky, H. & Page, D. C. Unexpectedly similar rates of nucleotide substitution
found in male and female hominids. Nature 406, 622-625 (2000).
7. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human
genome. Nature 409, 860-921 (2001).
8. Pecon Slattery, J. & O'Brien, S. J. Patterns of Y and X chromosome DNA sequence divergence during
the Felidae radiation. Genetics 148, 1245-1255 (1998).
9. Ellegren, H. & Fridolfsson, A. K. Male-driven evolution of DNA sequences in birds. Nature Genet. 17,
182-184 (1997).
10. Shimmin, L. C., Chang, B. H.-J., Hewett-Emmett, D. & Li, W.-H. Potential problems in estimating the
male-to-female mutation rate ratio from DNA sequence data. J. Mol. Evol. 37, 160-166 (1993).
11. Saxena, R. et al. Four DAZ genes in two clusters found in the AZFc region of the human Y
chromosome. Genomics 67, 256-267 (2000).
12. Miyata, T., Hayashida, H., Kuma, K., Mitsuyasu, K. & Yasunaga, T. Male-driven molecular evolution: a
model and nucleotide sequence analysis. Cold Spring Harbor Symp. Quant. Biol. 52, 863-867 (1987).
13. Begun, D. J. & Aquadro, C. F. Levels of naturally occurring DNA polymorphism correlate with
recombination rates in D. melanogaster. Nature 356, 519-520 (1992).
14. Charlesworth, B. & Charlesworth, D. The degeneration of Y chromosomes. Phil. Trans. R. Soc. Lond. B
355, 1563-1572 (2000).
15. Li, W.-H. Distribution of nucleotide differences between two randomly chosen cistrons in a finite
population. Genetics 85, 331-337 (1977).
16. Erlandsson, R., Wilson, J. F. & Paabo, S. Sex chromosomal transposable element accumulation and
male-driven substitutional evolution in humans. Mol. Biol. Evol. 17, 804-812 (2000).
17. McVean, G. T. & Hurst, L. D. Evidence for a selectively favourable reduction in the mutation rate of the
X chromosome. Nature 386, 388-392 (1997).
18. Xia, X. Data Analysis in Molecular Biology and Evolution (Kluwer Academic, Boston, 2000).
19. Tajima, F. & Nei, M. Estimation of evolutionary distance between nucleotide sequences. Mol. Biol. Evol.
1, 269-285 (1984).
20. Felsenstein, J. PHYLIP--Phylogeny Inference Package (version 3.2). Cladistics 5, 164-166 (1989).
Acknowledgements. The DNA samples were purchased from San Diego Zoological Society and
the gibbon sample was given by M. Jensen-Seaman. We thank J. Crow and D. Page for comments.
This study was supported by NIH grants.
Competing interests statement. The authors declare that they have no competing financial
interests.
Figure 1 Phylogenetic tree of nucleotide sequences. Branch lengths were estimated from the
pairwise evolutionary distances (substitutions per 100 sites). The pseudosequences of the two
alleles at the chromosome 3 locus in each species are labelled a and b.
Figure 2 Phylogenetic schemes for the sequences of ref. 6. a, The original scheme (modified from
ref. 6). b–d, Three possible schemes: one of them is likely to be true. X1 was sampled in ref. 6 and
part of X2 was transposed to Y. $\pi_1$, the ancient nucleotide diversity between the ancestral
human X1 and chimpanzee X chromosomes; $\pi_2$, the ancient nucleotide diversity between the
ancestral human X2 and chimpanzee X chromosomes; $x_1$, the number of nucleotide substitutions on
human X1 since the common ancestor of X1 and X2 or since the speciation; $x_2$, the number of
nucleotide substitutions on human X2 between the common ancestor of X1 and X2 (or the speciation)
and the transposition; $c_x$, the number of nucleotide substitutions on chimpanzee X chromosome;
$y$, the number of nucleotide substitutions on the branch for the human Y chromosome locus from
the time of transposition to present. Note that a assumes X2=X1.
15 February 2001
Nature 409, 943 - 945 (2001); doi:10.1038/35057170
A physical map of the human Y chromosome
CHARLES A. TILFORD*, TOMOKO KURODA-KAWAGUCHI*, HELEN SKALETSKY*,
STEVE ROZEN*, LAURA G. BROWN*, MICHAEL ROSENBERG*, JOHN D. MCPHERSON†,
KRISTINE WYLIE†, MANDEEP SEKHON†, TAMARA A. KUCABA†, ROBERT H. WATERSTON† &
DAVID C. PAGE*
* Howard Hughes Medical Institute, Whitehead Institute, and Department of Biology, Massachusetts Institute of Technology, 9
Cambridge Center, Cambridge, Massachusetts 02142, USA
† Genome Sequencing Center, Department of Genetics, Washington University School of Medicine, 4444 Forest Park Boulevard, St.
Louis, Missouri 63108, USA
Correspondence and requests for materials should be addressed to D.C.P. (e-mail: dcpage@wi.mit.edu).
The non-recombining region of the human Y chromosome (NRY), which comprises
95% of the chromosome, does not undergo sexual recombination and is present only
in males. An understanding of its biological functions has begun to emerge from DNA
studies of individuals with partial Y chromosomes, coupled with molecular
characterization of genes implicated in gonadal sex reversal, Turner syndrome, graft
rejection and spermatogenic failure1, 2. But mapping strategies applied successfully
elsewhere in the genome have faltered in the NRY, where there is no meiotic
recombination map and intrachromosomal repetitive sequences are abundant3. Here
we report a high-resolution physical map of the euchromatic, centromeric and
heterochromatic regions of the NRY and its construction by unusual methods,
including genomic clone subtraction4 and dissection of sequence family variants5. Of
the map's 758 DNA markers, 136 have multiple locations in the NRY, reflecting its
unusually repetitive sequence composition. The markers anchor 1,038 bacterial
artificial chromosome clones, 199 of which form a tiling path for sequencing.
A low-resolution physical map of the Y chromosome was previously assembled by testing
naturally occurring deletions and yeast artificial chromosome (YAC) clones for the
presence or absence of 182 Y-chromosomal sequence-tagged sites (STSs)3, 6. These STS
markers were generated from Y DNA sequences selected at random, which promoted
representative sampling of the entire chromosome6. Nonetheless, most randomly selected
sequences proved unusable as map landmarks because they corresponded either to
interspersed repetitive elements found throughout the genome or to male-specific repetitive
sequences dispersed to many locations in the NRY.
To construct a high-resolution map, we generated additional STSs in a directed manner. To
enrich for the single-copy sequences most useful as map landmarks, we systematically
applied genomic clone subtraction, whereby a 'tracer' clone's DNA is depleted of sequences
shared with a set of 'driver' clones4. We identified a tiling path of 57 YACs that collectively
spanned the euchromatic NRY3, and then carried out, in parallel, 57 subtractions, each
employing one YAC as tracer and the remaining YACs (minus those overlapping the tracer
YAC) as drivers. We sequenced a random sample of products from each of the 57
subtractions, identifying 308 additional STSs that proved useful in map assembly.
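The subtraction design described above, in which each tiling-path YAC serves once as tracer and every YAC that does not overlap it serves as a driver, can be sketched as below. The YAC names and map coordinates are invented for illustration (the real tiling path comprised 57 YACs), and `plan_subtractions` is a hypothetical helper, not part of any published pipeline.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end) in arbitrary map units


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share any map position."""
    return a[0] < b[1] and b[0] < a[1]


def plan_subtractions(yacs: Dict[str, Interval]) -> Dict[str, List[str]]:
    """For each tracer YAC, list the driver YACs: all others that do not overlap it."""
    plan: Dict[str, List[str]] = {}
    for tracer, span in yacs.items():
        plan[tracer] = [
            name for name, other in yacs.items()
            if name != tracer and not overlaps(span, other)
        ]
    return plan


# Toy tiling path of four overlapping YACs (hypothetical coordinates).
toy_path = {"yA": (0, 100), "yB": (80, 180), "yC": (160, 260), "yD": (240, 340)}
plan = plan_subtractions(toy_path)

for tracer, drivers in plan.items():
    print(f"tracer {tracer}: drivers {drivers}")
```

Excluding overlapping YACs from the driver set matters because they share single-copy sequence with the tracer; including them would deplete exactly the sequences the subtraction is meant to recover.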
We used radiation hybrid mapping to integrate and order the random and subtraction-derived
STSs. Large-fragment radiation hybrid panels offering long-range connectivity
have been used to assemble human genome maps at 500–1,000-kilobase (kb) resolution7-10.
To obtain greater resolution in the NRY, where the number of STSs appeared sufficient to
cover the euchromatic region at an average spacing of 50 kb, we used a small-fragment
radiation hybrid panel that had been used to build detailed maps of limited autosomal
segments11. We tested this panel for all random and subtraction-derived NRY STSs.
Nascent radiation hybrid linkage groups were ordered and oriented with respect to the
centromere by positioning selected STSs on the existing map of natural deletions6.
Additional STSs generated in our project's later phases were also tested. Ultimately, 513
STSs were positioned, at a resolution of around 50 kb, on a radiation hybrid map
encompassing nearly the entire euchromatic NRY.
To prepare for sequencing the NRY, we used the radiation hybrid map as a scaffold for
assembling contigs of bacterial artificial chromosome (BAC) clones. We screened a BAC
library of human male genomic DNA with hybridization probes derived from NRY STSs.
Through subsequent polymerase chain reaction tests of STS content, we assembled 1,038
BACs into contigs that, except for four small gaps, represented the whole NRY (see Fig. 1
in Supplementary Information). Many portions of this BAC map could be assembled only
after the sequence of selected BACs had been determined and compared. Also, many NRY
genes and extragenic sequences are known to have closely related counterparts on the X
chromosome12. In many cases, it was initially unclear whether BACs identified using
X-homologous NRY STSs, especially those from a 4-Mb region of 99% X–Y identity13, 14,
derived from the NRY or from the X chromosome. We resolved these ambiguities by
resequencing STSs from the BACs in question and comparing them to X- or Y-derived
reference sequences. The resulting map of overlapping BACs and ordered STSs (Fig. 1 in
Supplementary Information) was extensively cross-checked against the radiation hybrid
map and was further reinforced by restriction fingerprinting of all mapped BACs15.
Figure 1 Repetitive structure of euchromatic NRY.
The greatest challenges were posed by massive, NRY-specific amplified regions (or
amplicons), which comprise about one-third of the euchromatic NRY. Of the 758 STSs on
which the map is built, 136 are present at two or more locations in the NRY. Although we
avoided such repetitive STSs in favour of single-copy STSs wherever possible, substantial
portions of the euchromatic NRY contained little or no single-copy sequence. For many
such amplicons, BACs derived from different copies could not be distinguished by STS
content or restriction fingerprinting15. In many cases, we distinguished among amplicon
copies (and the BACs corresponding to them) by typing 'sequence family variants' (SFVs)5.
SFVs are subtle differences (for example, single-nucleotide substitutions or dinucleotide
repeat length alterations) between closely related but non-allelic sequences. We were
analysing BACs from only one male's Y chromosome, so these subtle sequence differences
could not represent allelic variants. In general, we identified SFVs only after comparing the
DNA sequences of BACs that originated from distinct copies, despite having similar STS
content. Thus, mapping and sequencing were inseparable, iterative activities in
amplicon-rich regions.
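As a rough illustration of how candidate SFVs might be listed once BACs from two amplicon copies have been sequenced and aligned, the sketch below reports mismatched positions between two aligned strings. The sequences and the function name `candidate_sfvs` are hypothetical, and only single-nucleotide substitutions are handled; repeat-length differences would need separate treatment.

```python
from typing import List, Tuple


def candidate_sfvs(copy_1: str, copy_2: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, base in copy 1, base in copy 2) for each substitution.

    Both inputs must already be aligned to the same length. Gap ('-') and
    ambiguous ('N') positions are skipped because they are not clean substitutions.
    """
    if len(copy_1) != len(copy_2):
        raise ValueError("sequences must be aligned to equal length")
    skipped = {"-", "N"}
    return [
        (index + 1, base_1, base_2)
        for index, (base_1, base_2) in enumerate(zip(copy_1.upper(), copy_2.upper()))
        if base_1 != base_2 and base_1 not in skipped and base_2 not in skipped
    ]


# Two toy aligned sequences standing in for BACs from different amplicon copies.
copy_a = "ACGTACGTAC"
copy_b = "ACGTTCGTAC"
variants = candidate_sfvs(copy_a, copy_b)
print(variants)
```

Because all BACs came from a single Y chromosome, any reproducible difference found this way marks distinct copies rather than alleles, which is what makes such sites usable as map landmarks.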
The euchromatic NRY amplicons are diverse in composition, size, copy number and
orientation (Fig. 1), with some occurring as tandem repeats, others as inverted repeats, and
still others dispersed throughout both arms of the chromosome. The euchromatic amplicons
are well populated with testis-specific gene families that may be critical in spermatogenesis
(see Fig. 1 in Supplementary Information)1, 2.
One pair of amplicons is of particular interest in the context of human variation.
Highlighted in Fig. 1 (arrows) are two units, each 300 kb long, that exist in opposite
orientations on the short arm. These inverted repeats bound a region of around 3.5 Mb that
occurs in one orientation (Fig. 1 in Supplementary Information) in the male from whom the
BAC library was constructed, but in the opposite orientation in the existing map of
naturally occurring deletions6. This may reflect variation among men for a 3.5-Mb
inversion1, 16, perhaps the result of homologous recombination between the 300-kb inverted
repeats flanking the inverted segment. Large Y-chromosome inversions are postulated to
have been crucial in the evolution of the human sex chromosomes12, and this 3.5-Mb
inversion may be one of many massive NRY variants that exist in modern populations.
Supplementary information accompanies this paper.
Received 16 November 2000;
accepted 21 December 2000
References
1. Vogt, P. H. et al. Report of the Third International Workshop on Y Chromosome Mapping 1997.
Cytogenet. Cell Genet. 79, 1-20 (1997).
2. Lahn, B. T. & Page, D. C. Functional coherence of the human Y chromosome. Science 278,
675-680 (1997).
3. Foote, S., Vollrath, D., Hilton, A. & Page, D. C. The human Y chromosome: Overlapping DNA
clones spanning the euchromatic region. Science 258, 60-66 (1992).
4. Reijo, R. et al. Diverse spermatogenic defects in humans caused by Y chromosome deletions
encompassing a novel RNA-binding protein gene. Nature Genet. 10, 383-393 (1995).
5. Saxena, R. et al. Four DAZ genes in two clusters found in the AZFc region of the human Y
chromosome. Genomics 67, 256-267 (2000).
6. Vollrath, D. et al. The human Y chromosome: A 43-interval map based on naturally occurring
deletions. Science 258, 52-59 (1992).
7. Hudson, T. J. et al. An STS-based map of the human genome. Science 270, 1945-1954 (1995).
8. Gyapay, G. et al. A radiation hybrid map of the human genome. Hum. Mol. Genet. 5, 339-346
(1996).
9. Stewart, E. A. et al. An STS-based radiation hybrid map of the human genome. Genome Res.
7, 422-433 (1997).
10. Deloukas, P. et al. A physical map of 30,000 human genes. Science 282, 744-746 (1998).
11. Lunetta, K. L., Boehnke, M., Lange, K. & Cox, D. R. Selected locus and multiple panel models
for radiation hybrid mapping. Am. J. Hum. Genet. 59, 717-725 (1996).
12. Lahn, B. T. & Page, D. C. Four evolutionary strata on the human X chromosome. Science 286,
964-967 (1999).
13. Mumm, S., Molini, B., Terrell, J., Srivastava, A. & Schlessinger, D. Evolutionary features of the
4-Mb Xq21.3 XY homology region revealed by a map at 60-kb resolution. Genome Res. 7,
307-314 (1997).
14. Schwartz, A. et al. Reconstructing hominid Y evolution: X-homologous block, created by X-Y
transposition, was disrupted by Yp inversion through LINE-LINE recombination. Hum. Mol.
Genet. 7, 1-11 (1998).
15. The International Human Genome Mapping Consortium. A physical map of the human
genome. Nature 409, 934-941 (2001).
16. Jobling, M. A. et al. A selective difference between human Y-chromosomal DNA haplotypes.
Curr. Biol. 8, 1391-1394 (1998).
Acknowledgements. We thank C. Nusbaum and T. Hudson for materials and advice on
radiation hybrid mapping; J. Crockett, J. Fedele, N. Florence, H. Grover, C. McCabe, N.
Mudd, S. Sasso, D. Scheer, R. Seim and P. Shelby for technical contributions; and D.
Berry, J. Bradley, Y. Lim, A. Lin, D. Menke, M. Royce-Tolland, J. Saionz and J. Wang for
comments on the manuscript. Supported in part by NIH.
Figure 1 Repetitive structure of euchromatic NRY. Bottom, schematic of the Y chromosome,
comprising large NRY flanked by pseudoautosomal regions (yellow). NRY is divided into
euchromatic and heterochromatic (tan, shown truncated) portions, roughly 24 and 30 Mb,
respectively. pter, short-arm telomere; cen, centromere; qter, long-arm telomere. Within
euchromatic NRY, regions rich in NRY-specific amplicons (blue) or sequence similarity to X
chromosome (red) are shown. Above chromosome schematic are positions of some NRY genes;
most are found in amplicons (blue) or have X-linked homologues (red). Above genes is a plot of
the average number of NRY BACs that contain each of the 758 STSs mapped (136 of these STSs at
two or more locations) along euchromatic NRY. As expected, STSs in amplicon regions tended to
be present in more BACs than STSs in X-homologous or unshaded regions. (Plotted values are
local averages within sliding window of five consecutive STSs; values reflect all NRY BACs
containing those STSs, not just BACs assigned to site indicated.) Some amplicon regions were
under-represented in the BAC library; four gaps remain (red diamonds; 100 kb each) in BAC
coverage of NRY. Top, STS-based dot plot of euchromatic NRY. Each dot reflects occurrence of a
particular STS at two points in map (complete map shown in Fig. 1 in Supplementary Information).
Dots fall almost exclusively within amplicon regions. Many repeats of entire groups of STSs are
apparent, with lines parallel to light grey diagonal indicating direct repeats and lines perpendicular
to light grey diagonal indicating inverted repeats. Green arrows, inverted repeats flanking 3.5-Mb
inversion (see text). Pale red lines, centromere.
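The local averaging mentioned in the legend (a sliding window of five consecutive STSs) can be illustrated with a short sketch. The BAC counts below are invented, and the legend does not say how the window is aligned to each STS, so this version simply returns one value per full window.

```python
from typing import List, Sequence


def sliding_mean(values: Sequence[float], window: int = 5) -> List[float]:
    """Mean of each run of `window` consecutive values, one result per full window."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the number of values")
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


# Hypothetical number of BACs containing each of ten consecutive STSs;
# the run of high counts mimics an amplicon region.
bacs_per_sts = [2, 2, 3, 8, 9, 10, 9, 3, 2, 2]
local_means = sliding_mean(bacs_per_sts, window=5)
print(local_means)
```

Smoothing over five STSs damps marker-to-marker noise while still showing the rise in BAC counts across repeated regions.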