Haplotype-Based Computational Genetic Analysis In Mice

1. INTRODUCTION

A number of significant discoveries have resulted from genetic analysis of model experimental organisms. Improved methods for quantitative trait analysis, a process referred to as quantitative trait locus (QTL) mapping, have enabled investigators to make genetic discoveries. This mapping method requires the experimental generation of intercross progeny derived from two selected parental strains, chosen because they differ in a trait of interest. Through correlative analysis of the measured phenotype and genotype at multiple positions in the genome for each intercross progeny, regions of the genome responsible for the differences in the trait are identified. The genomic regions that quantitatively contribute to the trait are referred to as QTL. QTL analysis has been successfully used to map important traits in crop plants, cattle, fruit flies, mice, and many other model organisms. The statistical basis for QTL mapping has been thoroughly investigated (reviewed in ref. 1). Based on this statistical underpinning, experimental crosses using model organisms can be designed to reliably detect QTLs, even when the involved regions make a relatively small contribution to the trait being studied. Many traits of biomedical importance are now routinely studied by genetic analysis of mammalian experimental models, primarily using inbred mouse or rat strains.

However, there are significant liabilities associated with QTL analysis, especially when applied to mammalian organisms. First, the resolution of QTL mapping is limited. An implicated region identified by QTL analysis typically ranges from 10 to 100 Mb in size. Because of inherent limitations within QTL mapping methods, the resolution does not increase significantly as the density of the markers and the number of intercross progeny analyzed are increased (2). The 95% confidence interval is often greater than half a chromosome for a genetic locus of moderate effect, even when identified by analysis of 500 or more intercross progeny (3). Second, a significant amount of time and cost is required for generating and analyzing mouse or rat intercross progeny.

Computational Genetics and Genomics:

Edited by: G. Peltz © Humana Press Inc., Totowa, NJ

The process of generating, genotyping, and phenotyping 200–1000 mice or rats required for analysis of a selected trait usually requires a 2-yr period. The genetic interval can be further narrowed by analysis of an experimentally produced congenic strain. The congenic mouse is produced by introgressing the involved segment of the genome from one strain onto the genetic background of the other strain. Generation of congenic strains requires an additional 2 yr, adding more time onto an already long process. Other independent methods of analysis are then used to identify the genetic variant(s) within the QTL interval causing the trait difference. To overcome the cost, time, and resolution issues associated with QTL analysis, we have developed a computational method for genetic mapping that correlates phenotypic differences among a set of inbred mouse strains with genotypic differences (4).

Although this method was developed for analysis of genetic traits in mice, it can be applied to any experimental organism. However, to use an organism other than the mouse, there must be well-maintained inbred strains, a physical map of the genome, and a database characterizing the pattern of genetic variation among the strains analyzed. The set of genetic markers must be dense enough to cover all genes of interest and should characterize all polymorphic patterns for all the inbred strains selected for analysis. Most importantly, computational mapping by this method does not require generation of intercross progeny. Phenotypic analysis is performed on only a selected set of available parental strains. Although establishing the genotypic database for the inbred strain panel is costly, the cost is well justified, because it is amortized across all subsequently performed experiments.
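The general idea of correlating phenotypic differences among inbred strains with genotypic differences can be illustrated with a small sketch. It is not the model used in this chapter, which is not specified in this section. As a stand-in, it assumes that strains sharing the same allele pattern across a block of markers are grouped together, and it scores each block with a one-way ANOVA F statistic. The strain names, allele strings, trait values, and function names are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def haplotype_groups(genotypes):
    """Group strain names by their allele pattern across one block of markers.

    genotypes: dict mapping strain name -> sequence of alleles (hypothetical format).
    """
    groups = defaultdict(list)
    for strain, alleles in genotypes.items():
        groups[tuple(alleles)].append(strain)
    return dict(groups)


def anova_f(groups, phenotype):
    """One-way ANOVA F statistic of the trait across haplotype groups.

    This is an assumed scoring rule for illustration, not the chapter's model.
    """
    values = [[phenotype[s] for s in strains] for strains in groups.values()]
    n = sum(len(v) for v in values)
    k = len(values)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more strains than groups")
    grand = mean([x for v in values for x in v])
    ss_between = sum(len(v) * (mean(v) - grand) ** 2 for v in values)
    ss_within = sum((x - mean(v)) ** 2 for v in values for x in v)
    if ss_within == 0:
        return float("inf")  # groups separate the trait perfectly
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical trait values for six inbred strains.
phenotype = {"S1": 1.0, "S2": 1.2, "S3": 0.9, "S4": 3.1, "S5": 2.8, "S6": 3.0}

# Hypothetical allele patterns for two genomic blocks.
block_a = {"S1": "AAG", "S2": "AAG", "S3": "AAG", "S4": "GCT", "S5": "GCT", "S6": "GCT"}
block_b = {"S1": "AC", "S2": "GT", "S3": "AC", "S4": "GT", "S5": "AC", "S6": "GT"}

f_a = anova_f(haplotype_groups(block_a), phenotype)
f_b = anova_f(haplotype_groups(block_b), phenotype)
print(f"block_a F = {f_a:.1f}, block_b F = {f_b:.1f}")
```

In this toy data, the strain grouping in `block_a` matches the split in trait values, so it scores far higher than `block_b`. No intercross progeny are involved: only the parental strains and a genotype database are needed.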

The computational method maps traits at high resolution using a relatively small number of inbred strains, usually to an interval that is below the size of a single gene. This increased precision is possible because the density of genotypic markers is very high, and the computational mapping method does not depend on recombinations occurring over two generations, which are relatively rare events.

Furthermore, homozygosity of the genome of the inbred strains eliminates confounding effects caused by allelic heterogeneity at a locus, and modeling the effects of dominance and additivity is not required. In addition to its low cost and precision, the computational mapping method has one other significant advantage. Because genetically identical and widely available inbred strains are analyzed, it enables the results to be repeated and widely replicated. The ability to analyze complex genetic traits remains an advantage for conventional QTL mapping approaches at this time. The computational mapping methods have relatively low power for analyzing genetic traits regulated by a large number of different genetic loci, each of small effect size (Table 1). However, as the number of strains that are genetically and phenotypically analyzed is increased, the complexity of the genetic traits that can be analyzed by the computational mapping method will increase. Although we are currently analyzing 20 or fewer inbred strains, it is likely that the number of characterized inbred mouse strains will increase to nearly 100 within 3–5 yr.

Wang and Peltz

Previously, a genome-wide computational mapping method was developed and referred to as digital disease (5). This method utilized a relatively crude calculation of the correlation coefficient between trait values measured among inbred mouse strains and single nucleotide polymorphism (SNP) alleles within chromosome regions. It correctly mapped selected traits to 30-cM chromosomal regions. In this chapter, we outline a haplotype-based computational method for genetic analysis of phenotypic traits using inbred mouse strains. The haplotype-based method utilizes the same principle of finding patterns of genetic variation that correlate with phenotypic differences among the strains.

However, the haplotype-based computational mapping method is radically different; it is based on a highly quantitative model. This method correctly identified known genetic loci for previously characterized traits and was used to discover a novel allele-specific enhancer element in the mouse genome (4).
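To make the simpler, correlation-based idea concrete, the sketch below computes a Pearson correlation coefficient between strain trait values and a 0/1 coding of each SNP allele. It illustrates only the general principle of correlating trait values with SNP alleles. It is not the published digital disease algorithm, whose details are not given here. The allele coding, SNP identifiers, strain names, and data are assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation among strains: treated as uninformative
    return sxy / sqrt(sxx * syy)


def snp_trait_correlations(snps, trait):
    """Absolute correlation between each SNP and the trait across strains.

    snps:  dict mapping SNP id -> {strain: allele} (hypothetical format).
    trait: dict mapping strain -> trait value.
    Alleles are coded 0 if they match the first strain's allele, else 1
    (an assumed coding that presumes biallelic SNPs).
    """
    strains = sorted(trait)
    y = [trait[s] for s in strains]
    scores = {}
    for snp_id, alleles in snps.items():
        reference = alleles[strains[0]]
        x = [0 if alleles[s] == reference else 1 for s in strains]
        scores[snp_id] = abs(pearson(x, y))
    return scores


# Hypothetical trait values and SNP alleles for six inbred strains.
trait = {"S1": 1.0, "S2": 1.2, "S3": 0.9, "S4": 3.1, "S5": 2.8, "S6": 3.0}
snps = {
    "snp1": {"S1": "A", "S2": "A", "S3": "A", "S4": "G", "S5": "G", "S6": "G"},
    "snp2": {"S1": "A", "S2": "G", "S3": "A", "S4": "G", "S5": "A", "S6": "G"},
    "snp3": {"S1": "C", "S2": "C", "S3": "C", "S4": "C", "S5": "C", "S6": "C"},
}

scores = snp_trait_correlations(snps, trait)
for snp_id, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{snp_id}: |r| = {score:.3f}")
```

A per-SNP score of this kind treats each marker independently. A haplotype-based approach instead considers the pattern of variation across neighboring markers.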
