Haplotype Structure of the Mouse Genome

Jianmei Wang, Guochun Liao, Janet Cheng,
Anh Nguyen, Jingshu Guo, Christopher Chou, Steven Hu, Sharon
Jiang, John Allard, Steve Shafer, Anne Puech, John D. McPherson,
Dorothee Foernzler, Gary Peltz, and Jonathan Usuka
1. INTRODUCTION
Commonly available inbred mouse strains can be used to genetically
model traits that vary in the human population, including those
associated with disease susceptibility. In order to understand how
genetic differences regulate trait variation in humans, we must first
develop a detailed understanding of how genetic variation in the mouse
produces the phenotypic differences among inbred mouse strains. The
information obtained from analysis of experimental murine genetic
models can direct biological experimentation, clinical research, and
human genetic analysis. This “mouse to man” approach will increase
our knowledge of the genes and pathways regulating important
biological processes and disease susceptibility. The availability of the
complete sequence of the mouse genome (1) enables the genetic
differences among commonly studied inbred strains to be characterized.
This will facilitate identification of the genetic basis for phenotypic trait
differences among the inbred strains. To do this, we have analyzed the
pattern of genetic variation among 18 inbred mouse strains and have
produced a high-resolution haplotypic map of the inbred mouse
genome. This haplotypic map covers 75 Mb of the mouse genome. An
additional 99 Mb of the mouse genome, which was not polymorphic
among the 16 Mus musculus strains, was also analyzed. Analysis of the
genetic distance between inbred strains and of the haplotypic blocks
generated using different strains demonstrated that inclusion of only
the 16 M. musculus strains
produced balanced haplotypic block structures that reflected extensive
allele sharing among closely related inbred strains. Although haplotypic
blocks in the inbred mouse genome have similarities with those
described in humans, there are important differences that increase the
likelihood that genetic variants underlying phenotypic trait differences
can be successfully identified in the mouse.
Computational Genetics and Genomics. Edited by: G. Peltz © Humana Press Inc., Totowa, NJ
2. CHARACTERIZATION OF GENETIC VARIATION AMONG INBRED STRAINS
Polymorphisms were identified by resequencing targeted genomic
regions in 1672 genes across 18 inbred mouse strains (2): 129/Sv, A/HeJ,
A/J, AKR/J, B10.D2-H2/oSnJ, BALB/cByJ, BALB/cJ, C3H/HeJ,
C57BL/6J, CAST/Ei, DBA/2J, LG/J, LP/J, MRL/MpJ, NZB/BinJ,
NZW/LaC, SM/J, and SPRET/Ei. Single nucleotide polymorphisms (SNPs)
were identified using methods that have been described previously (2).
For genes that were less than 5 kb in size, the entire gene was analyzed
for polymorphisms. For genes greater than 5 kb in size, a 1-kb region
surrounding each exon, a 2-kb region 5′ of the transcriptional start
site, and a 500-bp segment downstream of the 3′ end of the transcript
were analyzed. Both strands of a selected genomic region were
sequenced, and sequence waveforms were analyzed using Phred and
Phrap (3,4). Potential polymorphisms were identified, and sequence
quality was assessed in an automated fashion. Only SNPs supported by
high-quality sequence were accepted: those with either single-stranded
sequence with Phred scores of 30 or above or (more commonly)
double-stranded sequence with Phred scores of 20 or above
on both strands. The mouse SNP database used in this study contained
105,064 unique SNPs, and a total of 1,440,349 alleles were characterized
for these 18 strains. The number of SNPs per chromosome ranged
from 1083 on chromosome 18 to 16,615 on
chromosome 7 (Table 1). The genetic distance between the inbred
mouse strains was assessed using this allelic information. For each
pair of strains, the percent allelic difference was calculated as the ratio of the
number of SNPs identified using only that pair of strains (that is,
sites at which the two strains carry different alleles) to the
total number of SNPs identified among all 18 inbred strains. The
CAST/Ei and SPRET/Ei strains were derived from wild mice of Asian
and European origin, respectively. The 16 other M. musculus strains
were bred from a small group of mice at the beginning of the last
century (reviewed in ref. 5). Consistent with their independent origin,
the CAST/Ei and SPRET/Ei strains have more than 39% and 70%
allelic differences, respectively, when compared with any one of the 16 other M.
musculus strains (Table 2). In contrast, the 16 other M. musculus strains
were far more genetically similar. The allelic differences among M.
musculus strain pairs ranged from 0.8% (A/HeJ:A/J) to 16.4%
(NZW/LaC:BALB/cJ) (Table 2). The genetic distance revealed by SNP
allelic information is consistent with published genealogies of mouse
inbred strains (5).
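The two rules described in this section, the sequence quality criterion and the percent allelic difference, can be illustrated with a short Python sketch. This is not the pipeline used in the study. The function names, the data layout (a dictionary mapping each SNP to the alleles called per strain), and the toy genotypes are assumptions made for this example. Two behaviors in particular are interpretations rather than documented details: a double-stranded call is rejected when either strand falls below 20, and a SNP is not counted as a difference when one strain of the pair has no call.

```python
def accept_call(forward_q, reverse_q=None):
    """Return True if a SNP call meets the quality criterion in the text.

    Single-stranded sequence needs a Phred score of at least 30;
    double-stranded sequence needs at least 20 on both strands.
    """
    if reverse_q is None:  # only one strand was sequenced
        return forward_q >= 30
    return forward_q >= 20 and reverse_q >= 20


def percent_allelic_difference(genotypes, strain_a, strain_b):
    """Percent of all SNPs in the panel at which two strains differ.

    genotypes: dict mapping SNP id -> dict mapping strain -> allele.
    The denominator is every SNP in the panel, following the definition
    in the text. SNPs with a missing call for either strain are not
    counted as differences (an assumption of this sketch).
    """
    total = len(genotypes)
    if total == 0:
        raise ValueError("no SNPs supplied")
    differing = sum(
        1
        for alleles in genotypes.values()
        if strain_a in alleles
        and strain_b in alleles
        and alleles[strain_a] != alleles[strain_b]
    )
    return 100.0 * differing / total


# Invented toy panel of five SNPs; these are not data from the study.
toy_genotypes = {
    "snp1": {"A/J": "A", "A/HeJ": "A", "CAST/Ei": "G"},
    "snp2": {"A/J": "C", "A/HeJ": "C", "CAST/Ei": "T"},
    "snp3": {"A/J": "G", "A/HeJ": "T", "CAST/Ei": "G"},
    "snp4": {"A/J": "T", "A/HeJ": "T", "CAST/Ei": "C"},
    "snp5": {"A/J": "A", "CAST/Ei": "G"},  # no A/HeJ call at this site
}

print(accept_call(35))       # True: single strand, score >= 30
print(accept_call(25, 22))   # True: both strands >= 20
print(percent_allelic_difference(toy_genotypes, "A/J", "A/HeJ"))    # 20.0
print(percent_allelic_difference(toy_genotypes, "A/J", "CAST/Ei"))  # 80.0
```

Because the denominator is the full set of SNPs found across all strains, a closely related pair yields a small percentage even when the panel contains many SNPs contributed by more divergent strains.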