Supplementary Information
Rat History.
The rat remains a major pest contributing to famine (rodents eat about one-fifth of the
world’s food supply annually) yet its contribution to human health cannot be
overestimated, from new drugs (most are tested in rats), to understanding essential
nutrients, to increasing knowledge of the pathobiology of human disease. Humans invest
billions of dollars each year to exterminate rats, yet raising rats for research is a $1 billion
per year industry. In many parts of the world the rat remains a source of meat. We loathe
seeing them, but are fascinated by their feats: they survived nuclear testing in Engebi
atoll, their bite exerts 24,000 lbs per square inch, a single pair could produce 15,000
descendants in a year, and an adult can squeeze through a hole the size of a quarter.
The laboratory rat (Rattus norvegicus) originates from central Asia and its success at
conquering the world can be directly attributed to its relationship with humans1. The rat
eats what we do and travels well with us. Opening trade routes spread the various species
of rat and its scourge around the world. Yet today’s plethora of rat clubs, breeders,
newsletters, and commercial goods bearing rat logos indicates that many people are fans of
the rat.
The history of the rat is obscured by confusing nomenclature. For many years no
distinction was made between mice and rats. In Latin, the word for ‘rat’ is mus; modern
Chinese similarly does not differentiate between the two. Samuel Johnson’s 18th century
dictionary2 defines rat as “an animal of the mouse kind that infests houses and ships”.
Clearly, rats and mice were considered virtually synonymous. Even the history of the
laboratory rat is confusing. “Norways have norvegicus as their species name because, in
his ‘Outline of the Natural History of Great Britain’ of 1769, J. Berkenhout used it in the
first formal, Linnaean description of the species. The full, formal name of the species is
therefore: Rattus norvegicus Berkenhout.” He was mistaken in thinking the species came
from Norway, but this does not invalidate the name. While the black rat (Rattus rattus)
was part of the European landscape from at least the third century and is the species
associated with the spread of the bubonic plague, Rattus norvegicus probably originated
in northern China and migrated to Europe sometime around the mid-1500s3. The rats may
have entered Europe as hordes crossing the Volga River, a phenomenon observed by the
naturalist Pallas in 1727. The species is a strong swimmer and exhibits horde migration
when its population density outstrips its food supply.
The rat in research.
The use of animal models for research into human disease is a recent innovation in the
history of medicine4. The first recorded breeding colony for rats was established in 1856
(Hedrich chapter). Historically, rat genetics had a surprisingly early start. Hugo De
Vries, Karl Correns and Erich Tschermak rediscovered Mendel’s laws at the turn of the
twentieth century, and Bateson used these concepts in 1903 to demonstrate that rat coat color is a
Mendelian trait. These inbred strains of rats are used for research in numerous areas.
Current rat research is very prolific, with 28,049 manuscripts
published in 2003 (search terms: rat NOT mouse AND 2003), exceeding the 27,059
publications using mouse in 2003 (search terms: mouse NOT rat AND 2003).
The Rat Genome Project
Prior to the decision to sequence the rat genome, there was much discussion about the
value of having the rat genome sequence, as well as the utility of the rat as a major model
organism. A major obstacle was the naïve belief that the rat and mouse were so similar
morphologically, and so close evolutionarily, that it was redundant to sequence both
rodents. Nevertheless, wisdom prevailed, affording the first three-way whole-genome
comparison as well as the first whole-genome comparison of near evolutionary neighbors.
The Rat Genome Sequencing Project Consortium (RGSPC) was formed in response to
this commitment of resources. A network of centers took responsibility for data and
resource generation led by the Baylor College of Medicine Human Genome Sequencing
Center (BCM-HGSC) and including Celera Genomics, Genome Therapeutics
Corporation, British Columbia Cancer Agency Genome Sciences Centre, The Institute for
Genomic Research, University of Utah, Medical College of Wisconsin, The Children’s
Hospital of Oakland Research Institute, and Max Delbrück Center for Molecular
Medicine (Berlin). After assembly of the genome at the BCM-HGSC, analysis was
performed by an international team representing over 20 groups in 6 countries and
relying largely on gene and protein predictions produced at Ensembl. In this paper and
the companion papers, investigators have taken an initial slice through the analysis of the
rat genome sequence, both by itself and in context with the mouse and human genome
sequences. Although the rat is not a member of the Security Council of Model Genetic
Organisms (Fink ref), this publication of the draft sequence and the companion papers
(Genome Research volume) demonstrate the value of this organism.
Overview of sequencing strategy
The goal of the RGSP was to produce a draft sequence of the rat genome without the
intention of moving to a final higher quality ‘finished’ sequence5. The quality of the draft
rat sequence was thus more critical than in the comparable human and mouse genome
projects, where errors were ultimately corrected in a finished sequence. Despite the
considerable progress in assembling draft sequences of large genomes6-14, the question of
which method produced the highest quality draft sequence was unresolved. The most
significant issue was the choice between logistically simpler whole genome shotgun
(WGS) approaches and more complex approaches employing BAC clones. At the
extreme is the BAC-by-BAC approach, as used for the NIH Human Genome
Project6, which requires individual sequencing of a set of BACs comprising a tiling path
covering the whole genome.
The principal challenge in assembling large genomes is dealing correctly with repeated
sequences, which comprise up to half of the genome. In pure WGS assembly the problem is
compounded by having to align reads from all over the genome correctly, whereas in a
BAC-based assembly it is confined to only those reads in the
region covered by a BAC. Thus BACs reduce the assembly problem to a local one,
simplifying the repeat problem. Although the point is still debated15-18, the sense from the
human genome project was that there was considerable benefit from either sequencing individual
BAC clones or including BAC clones in a mixed assembly with WGS sequences. Although the draft
mouse genome sequence was produced by a pure WGS approach7, the project planned full use of BAC
clones in constructing the final finished sequence. The loss of segmental duplication
regions due to ‘collapses’ in the draft mouse genome assembly7,19-21 suggested serious
limitations on the quality of a draft sequence based only on WGS sequences, and this
type of defect could not be tolerated in a draft sequence that would not be taken to the
higher finished grade.
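The argument that BACs turn repeat resolution into a local problem can be illustrated with a toy sketch. This is not the consortium's assembly method. It uses a made-up random "genome" with one repeat present at two distant loci, and treats k-mers that occur at more than one position as a stand-in for reads that cannot be placed unambiguously. The sequence lengths, the k-mer size, the window size, and all function names are hypothetical choices made for illustration only.

```python
import random


def repeated_kmers(seq, k):
    """Return the set of k-mers that occur at more than one position in seq."""
    seen, repeated = set(), set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in seen:
            repeated.add(kmer)
        else:
            seen.add(kmer)
    return repeated


def random_dna(rng, n):
    """Return a random DNA string of length n (toy data, not real sequence)."""
    return "".join(rng.choice("ACGT") for _ in range(n))


# Hypothetical sizes, chosen only to keep the example small.
K = 25
UNIQUE_LEN = 2000
REPEAT_LEN = 300

rng = random.Random(0)  # fixed seed so the toy genome is reproducible
repeat = random_dna(rng, REPEAT_LEN)

# Toy genome: the same repeat sits at two loci separated by unique sequence.
genome = (random_dna(rng, UNIQUE_LEN) + repeat +
          random_dna(rng, UNIQUE_LEN) + repeat +
          random_dna(rng, UNIQUE_LEN))

# Whole-genome view: every k-mer inside the repeat has two possible origins.
genome_wide = repeated_kmers(genome, K)

# BAC-like view: a window that happens to contain only the first repeat copy.
bac_window = genome[:3000]
within_bac = repeated_kmers(bac_window, K)

print("ambiguous k-mers, whole genome:", len(genome_wide))
print("ambiguous k-mers, single window:", len(within_bac))
```

In this toy setting the repeat is ambiguous only when both copies are considered together. Within a window that spans a single copy, the same k-mers are expected to be unique, which is the sense in which a BAC confines the repeat problem to a local region.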
Mammalian X chromosome evolution
The assignment of the accelerated activity to the rodent branch, following the primate-rodent
divergence, is consistent with previous studies at significantly lower resolution,
showing complete conservation of marker order between the X chromosomes of human
and cat22, human and dog23, and human and lemur24, as well as similar karyotypes of the
X chromosomes in human, chimpanzees, gorillas, and orangutans25 (the karyotypes have
deletions in telomeres, but no rearrangements). Other studies showed only small
rearrangements between the X chromosomes of human and pig26, and human and horse27.
All of these species except the primates serve as evolutionary outgroups28 to human,
mouse, and rat, and all the primates29 have consistent order in the X chromosome, thus
suggesting, independently of the current report, that the marker order on the human X
chromosome is ancestral for the primate-rodent ancestor. Indeed, Lahn and Page30
showed evidence that the marker order on the human X chromosome has not changed
since the ancestor 240-320 million years ago.
Rat single nucleotide polymorphisms
The rat cSNP pilot is based upon sequences generated from randomly chosen clones from
cDNA libraries from three different rat strains: spontaneously hypertensive rat stroke-prone
(SHRsp), Wistar-Kyoto (WKY), and Sprague-Dawley (SD). The data so far
show that the average density of cDNA-derived SNPs between the reference Brown Norway (BN)
sequence and each
of the three strains is approximately 1 SNP/1,100 bp31. To date, over 10,000 unique
SNPs have been identified and will be publicly available from the Ensembl web site.
This collection is expected to grow over the coming year, to include cSNPs in the
majority of rat genes.
The value of this dataset is illustrated in an ongoing study that is searching for genes
involved in blood pressure regulation in the SHRsp.127,32,33, an important model for
identifying genetic factors responsible for predisposition to cardiovascular disease. We
screened all transcripts containing non-synonymous changes to identify potential
candidate genes for blood pressure regulation. Among these was a variant (R300H) in the
transcript for kallistatin, a protease inhibitor that acts as a potent arterial vasodilator via
endothelium-independent mechanisms34. Subsequent genetic analysis showed that the variant
correlated strongly (p<0.0002) with diastolic blood pressure in response to dietary sodium
loading. This observation was further linked to biochemical observations of reduced
binding of kallistatin to its substrate kallikrein35. This variant therefore likely represents
the underlying cause for this diminished protein-protein interaction, and may thus be
causally related to enhanced sodium sensitivity and elevated blood pressure in genetically
hypertensive rats.
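The screening step described above, selecting transcripts that carry non-synonymous changes, rests on comparing the amino acid encoded before and after a substitution. The following is a minimal sketch of that comparison using the standard genetic code. It is not the pipeline used in the study. The function name `classify_snp`, the 0-based position convention, and the nine-base toy coding sequence are assumptions made for illustration. The toy sequence is not kallistatin, and the source does not state the nucleotide change behind R300H.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds, pos, alt):
    """Classify a single-base substitution in a coding sequence.

    cds: coding sequence starting at the first base of the start codon.
    pos: 0-based position of the substituted base (hypothetical convention).
    alt: the alternative base at that position.
    Returns (label, change), where change is e.g. 'R2H' with 1-based codon number.
    """
    codon_index = pos // 3
    start = codon_index * 3
    ref_codon = cds[start:start + 3]
    offset = pos - start
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    label = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return label, f"{ref_aa}{codon_index + 1}{alt_aa}"


# Toy coding sequence (hypothetical): Met, Arg, stop.
toy_cds = "ATGCGTTAA"

# Second base of the Arg codon changed: CGT -> CAT, arginine becomes histidine.
print(classify_snp(toy_cds, 4, "A"))

# Third base of the Arg codon changed: CGT -> CGC, still arginine.
print(classify_snp(toy_cds, 5, "C"))
```

A substitution of the first kind changes the protein and would be retained by such a screen, as the R300H variant was. A substitution of the second kind leaves the protein unchanged and would be set aside.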
Further studies of kallistatin through transgenic approaches36,37 are underway. In the
meantime these data demonstrate the general power of a comprehensive genetic
discovery approach that couples a genomic reference sequence and subsequent
sequence-based discovery of genetic variation to precise phenotyping of an
important disorder. The analysis of other genetic determinants for blood pressure control
in rats is an ongoing effort.
