the consequences of genetic variation. Often, polymorphism scanning has been confined to genomic sequences within or near exons (20). However, genetic variation is not introduced into the genome by processes that respect boundaries between genes, nor does it select for coding sequences. Furthermore, the effect of a sequence change may not be confined to a single gene. Therefore, it is likely to be more efficient and productive to evaluate sequence variation over a chromosomal region when assessing the effect of genotype on phenotype.
Although SNPs at different sites within a chromosomal region can be identified and characterized individually, alleles at different positions along a chromosome can be associated. The presence or absence of an allele at one site can provide information about alleles at other sites; this association between alleles is called linkage disequilibrium. A haplotype can be defined as the relationship between deoxyribonucleic acid (DNA) sequence variants in the same gene, region, or chromosome. Analysis of polymorphisms over large segments of the human genome has indicated that polymorphic variation at multiple sites within a chromosomal region can be grouped into patterns (“haplotypes”) with high linkage disequilibrium (20–23). Regions with low linkage disequilibrium separate the haplotypic blocks. The size of the genomic sequence contained within a haplotypic block can range from a few kilobases to over 100 kb. Analysis of human chromosome 21 indicated that about half the haplotypic blocks identified, each with an average size of 7.8 kb, could be defined by fewer than three SNPs, and fewer than three different haplotypes within a block encompassed most of the population (80%) (23). Unfortunately, the structure of the haplotypic blocks can only be determined empirically. A large amount of detailed sequence information must be available to identify the haplotype-defining SNPs and the conserved blocks.
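The association between alleles at two sites can be quantified with standard linkage disequilibrium statistics. The following is a minimal sketch, assuming phased haplotype data for two biallelic sites; it computes the coefficient D, its normalized form D′, and the squared correlation r². The function name and input format are hypothetical, and the text does not state which measure the cited studies used.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, D_prime, r_squared) for two biallelic sites.

    `haplotypes` is a list of two-character strings, one per phased
    chromosome; "AG" means allele A at site 1 and allele G at site 2.
    This input format is an assumption made for illustration.
    """
    n = len(haplotypes)
    alleles_1 = sorted({h[0] for h in haplotypes})
    alleles_2 = sorted({h[1] for h in haplotypes})
    if len(alleles_1) != 2 or len(alleles_2) != 2:
        raise ValueError("both sites must be biallelic and polymorphic")

    # Pick one reference allele per site; |D|, D' and r^2 do not
    # depend on which allele is chosen.
    a, b = alleles_1[0], alleles_2[0]
    p_a = sum(h[0] == a for h in haplotypes) / n
    p_b = sum(h[1] == b for h in haplotypes) / n
    p_ab = sum(h == a + b for h in haplotypes) / n

    # D is the excess of the observed haplotype frequency over the
    # frequency expected if the two sites were independent.
    d = p_ab - p_a * p_b

    # D' rescales |D| by its maximum possible value given the
    # allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max

    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared


# Invented example: two sites whose alleles always travel together.
complete = ["AG"] * 6 + ["CT"] * 4
d, d_prime, r_squared = linkage_disequilibrium(complete)
```

In this invented example the allele at one site fully predicts the allele at the other, so D′ and r² both reach their maximum of 1. A sample in which all four allele combinations are equally frequent gives D = 0.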
It is also likely that the size and structure of the haplotypic blocks will change as sequence information from more individuals is obtained and analyzed. However, analysis of the haplotypic patterns enables individuals within a population to be segmented into a finite number of small groups sharing the same haplotype for a particular chromosomal region. Identification of haplotypes to segment the human population has great potential. This will decrease the amount of genotyping required to characterize a genomic region, and the haplotypic information will enable a human population to be segregated into a finite number of different groups. The frequency of disease or disease-associated traits can be compared among the groups with different genetic haplotypes within a chromosomal region. It is hoped that this will provide a more efficient method for identification of disease-susceptibility regions in human populations.
One of the first examples of linkage disequilibrium mapping using haplotypes was the identification of a 250-kb region of human chromosome 5q31 associated with Crohn’s disease susceptibility (24). There were 11 SNPs with strong linkage disequilibrium in the 5q31 region associated with Crohn’s disease susceptibility, and it was not possible to identify an individual disease-associated SNP. These results are consistent with the possibility that a set of polymorphisms within a chromosomal region, which may affect more than a single gene, contributes to the disease susceptibility.
Similarly, inbred mouse strains are particularly useful for genetic analysis because the entire genome of an inbred strain is effectively in linkage disequilibrium. The parental origin of DNA segments in intercross progeny over entire chromosomes can be inferred by analysis of only a few polymorphic markers (25). Furthermore, there is extensive linkage disequilibrium among polymorphisms in the genome of inbred strains.
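The comparison of disease frequency among haplotype groups described above can be illustrated with a contingency-table test. The following is a minimal sketch, assuming each individual has already been assigned to one haplotype group for the region; it computes the Pearson chi-square statistic for independence between group and disease status. The function name, group labels, and counts are hypothetical and invented for illustration, and this is not presented as the method used in the cited studies.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` maps a haplotype group label to a tuple of counts
    (affected, unaffected). The layout is an assumption made
    for illustration.
    """
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            # Count expected if disease status were independent
            # of haplotype group.
            expected = row_total * col_total / grand_total
            if expected == 0:
                raise ValueError("empty row or column in table")
            statistic += (observed - expected) ** 2 / expected

    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Invented counts for three haplotype groups of 100 individuals each.
counts = {
    "haplotype_1": (30, 70),
    "haplotype_2": (15, 85),
    "haplotype_3": (15, 85),
}
statistic, dof = chi_square_statistic(counts)
```

A large statistic relative to the chi-square distribution with `dof` degrees of freedom would suggest that disease frequency differs among the haplotype groups. Because only a few groups are compared, far fewer tests are needed than when every SNP in the region is examined individually.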
Analogous to the human population, SNPs among the inbred strains can be organized into haplotypic blocks (26). Analysis of regions linked to susceptibility to complex disease-related traits in mouse models has also indicated that genetic changes across chromosomal regions affecting multiple genes, rather than within a particular gene, may contribute to susceptibility. For example, a region on chromosome 1 that controls autoantibody production in a mouse model of systemic lupus erythematosus (SLE) was analyzed. Polymorphisms within a set of co-linear interferon-inducible genes in this region were responsible for differential autoantibody production in this model (27). This result appears to be applicable to other mouse models of human disease-related traits. Fine-mapping analysis often identifies several distinct subloci within a linked chromosomal region that independently contribute to the phenotypic trait. Additional analysis of a linked chromosomal region regulating autoantibody production and nephritis in the murine model of SLE demonstrated that the interval consisted of at least four distinct genetic loci (28). Similarly, the ability of our “digital disease” computer program to identify chromosomal regions regulating complex traits in mice is likely to result from recognition of patterns of genetic variation over large (10 cM) regions within the mouse genome (29). Analysis of the patterns of variation over larger regions is likely to be informative in situations when analysis of a single SNP does not reveal a genotype–phenotype correlation.
4. Integrative approaches must be utilized to efficiently analyze complex biological processes.
Only a limited amount of resolution can be achieved with the use of any single approach for analyzing a complex biological .