Chapter 5 The Content of the Genome

5.1 Introduction
• genome
  – The complete set of sequences in the genetic material of an organism.
  – It includes the sequence of each chromosome plus any DNA in organelles.
• transcriptome
  – The complete set of RNAs present in a cell, tissue, or organism.
  – Its complexity is due mostly to mRNAs, but it also includes noncoding RNAs.
• proteome
  – The complete set of proteins that is expressed by the entire genome.
  – The term is sometimes used to describe the complement of proteins expressed by a cell at any one time.
• interactome
  – The complete set of protein complexes/protein–protein interactions present in a cell, tissue, or organism.

5.2 Genomes Can Be Mapped at Several Levels of Resolution
• Linkage maps are based on the frequency of recombination between genetic markers.
• Restriction maps are based on the physical distances between markers.
• Molecular characterization of mutations can be used to reconcile linkage maps with physical restriction maps.

5.3 Individual Genomes Show Extensive Variation
• Polymorphism may be detected at the phenotypic level when a sequence affects gene function, at the restriction fragment level when it affects a restriction enzyme target site, and at the sequence level by direct analysis of DNA.
• The alleles of a gene show extensive polymorphism at the sequence level, but many sequence changes do not affect function.
Figure 05.01: A point mutation that affects a restriction site is detected by a difference in restriction fragment lengths.
• restriction fragment length polymorphism (RFLP)
  – Inherited differences in sites for restriction enzymes (for example, caused by base changes in the target site) that result in differences in the lengths of the fragments produced by cleavage with the relevant restriction enzyme.
  – They are used for genetic mapping to link the genome directly to a conventional genetic marker.
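As a rough illustration of the RFLP idea (a point mutation in a restriction site changes the fragment lengths), the following Python sketch digests two alleles of a short linear sequence. The sequences and the `digest` helper are hypothetical and invented for this example; the only real-world detail assumed is that EcoRI recognizes GAATTC and cuts after the first G.

```python
def digest(sequence, site, cut_offset):
    """Return the fragment lengths produced by cutting a linear sequence
    at every occurrence of a restriction site.

    cut_offset is the position of the cut within the site
    (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
ECORI_CUT = 1          # cut falls between G and AATTC

# Hypothetical allele A: two EcoRI sites in a 56-bp sequence.
allele_a = "A" * 10 + ECORI_SITE + "C" * 20 + ECORI_SITE + "T" * 14

# Hypothetical allele B: a single A->G change destroys the second site
# (GAATTC becomes GAGTTC).
allele_b = allele_a[:38] + "G" + allele_a[39:]

fragments_a = digest(allele_a, ECORI_SITE, ECORI_CUT)
fragments_b = digest(allele_b, ECORI_SITE, ECORI_CUT)

print("Allele A fragments:", fragments_a)  # [11, 26, 19]
print("Allele B fragments:", fragments_b)  # [11, 45]
```

In allele B the 26-bp and 19-bp fragments are replaced by a single 45-bp fragment, which is the kind of length difference that would be scored as a restriction polymorphism.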
Figure 05.03: A restriction polymorphism can be used as a genetic marker to measure recombination distance from a phenotypic marker.
Figure 05.04: If a restriction marker is associated with a phenotypic characteristic, the restriction site must be located near the gene for the phenotype.
• single nucleotide polymorphism (SNP)
  – A polymorphism (variation in sequence between individuals) caused by a change in a single nucleotide.
  – SNPs are responsible for most of the genetic variation between individuals.

5.4 RFLPs and SNPs Can Be Used for Genetic Mapping
• RFLPs and SNPs can be the basis for linkage maps and are useful for establishing parent–offspring relationships.
• haplotype
  – The particular combination of alleles in a defined region of some chromosome; in effect, the genotype in miniature.
  – Originally used to describe combinations of major histocompatibility complex (MHC) alleles, it now may be used to describe particular combinations of RFLPs, SNPs, or other markers.
• DNA fingerprinting
  – A technique for analyzing the differences between individuals in the fragments generated by using restriction enzymes to cleave regions that contain short repeated sequences, or by PCR.
  – The lengths of the repeated regions are unique to every individual.
  – The presence of a particular subset in any two individuals can be used to define their common inheritance (e.g., a parent–child relationship).

5.5 Eukaryotic Genomes Contain Both Nonrepetitive and Repetitive DNA Sequences
• The kinetics of DNA reassociation after a genome has been denatured distinguish sequences by their frequency of repetition in the genome.
• Polypeptides are generally encoded by sequences in nonrepetitive DNA.
• Larger genomes within a taxonomic group do not contain more genes but have large amounts of repetitive DNA.
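The reassociation-kinetics point can be made concrete with a small Python sketch. It assumes the standard ideal second-order model, which the slides do not state: the fraction of DNA still single-stranded is C/C0 = 1 / (1 + k·C0·t), so half of a component has reassociated at Cot½ = 1/k. The three components, their genome fractions, and their rate constants are hypothetical values chosen only to show how repeated sequences reassociate at lower Cot than nonrepetitive DNA.

```python
def fraction_single_stranded(cot, k):
    """Ideal second-order reassociation: C/C0 = 1 / (1 + k * Cot).

    cot is C0 * t (mol·s/L); k is the rate constant (L/(mol·s)).
    """
    return 1.0 / (1.0 + k * cot)


def genome_curve(cot, components):
    """Single-stranded fraction for a genome made of several components.

    components is a list of (genome_fraction, k) pairs; each component
    is assumed to reassociate independently.
    """
    return sum(frac * fraction_single_stranded(cot, k) for frac, k in components)


# Hypothetical genome: (fraction of genome, rate constant).
# A larger k means more copies per genome and faster reassociation.
components = [
    (0.25, 1e3),   # highly repetitive
    (0.30, 1.0),   # moderately repetitive
    (0.45, 1e-3),  # nonrepetitive
]

for cot in (1e-4, 1e-2, 1.0, 1e2, 1e4):
    remaining = genome_curve(cot, components)
    print(f"Cot = {cot:8.0e}  single-stranded fraction = {remaining:.3f}")
```

Plotting the single-stranded fraction against log(Cot) for such a mixture gives a stepped curve, with one step per frequency class.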
Figure 05.05: The proportions of different sequence components vary in eukaryotic genomes.
• A large part of moderately repetitive DNA may be made up of transposons.

5.6 Eukaryotic Protein-Coding Genes Can Be Identified by the Conservation of Exons
• Conservation of exons can be used as the basis for identifying coding regions by identifying fragments whose sequences are present in multiple organisms.
• zoo blot
  – The use of Southern blotting to test the ability of a DNA probe from one species to hybridize with the DNA from the genomes of a variety of other species.
• Human disease genes are identified by mapping and sequencing DNA of patients to find differences from normal DNA that are genetically linked to the disease.
• exon trapping
  – Inserting a genomic fragment into a vector whose function depends on the provision of splicing junctions by the fragment.
Figure 05.06: A special splicing vector is used for exon trapping.

5.7 The Conservation of Genome Organization Helps to Identify Genes
• Methods for identifying functional genes are not perfect and many corrections must be made to preliminary estimates.
• Pseudogenes must be distinguished from functional genes.
Figure 05.07: Exons of protein-coding genes are identified as coding sequences flanked by appropriate signals.
• There are extensive syntenic relationships between the mouse and human genomes, and most functional genes are in a syntenic region.
• synteny
  – A relationship between chromosomal regions of different species where homologous genes occur in the same order.
Figure 05.08: Mouse chromosome 1 has 21 segments 1–25 Mb in length syntenic with regions corresponding to parts of six human chromosomes.
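The definition of synteny (homologous genes in the same order) can be sketched as a simple search for shared runs of genes. The Python example below is a minimal illustration, not a real synteny-detection method: the gene names and orders are hypothetical, homologs are assumed to share a name, and inverted or interrupted blocks are ignored.

```python
def syntenic_blocks(order_a, order_b, min_genes=2):
    """Return runs of genes that are adjacent and in the same order
    in both gene lists (a simplified notion of a syntenic block)."""
    position_b = {gene: i for i, gene in enumerate(order_b)}
    blocks = []
    current = []
    for gene in order_a:
        extends_run = (
            gene in position_b
            and current
            and position_b[gene] == position_b[current[-1]] + 1
        )
        if extends_run:
            current.append(gene)
        else:
            # Close the previous run and start a new one if the gene is shared.
            if len(current) >= min_genes:
                blocks.append(current)
            current = [gene] if gene in position_b else []
    if len(current) >= min_genes:
        blocks.append(current)
    return blocks


# Hypothetical gene orders along one chromosome in each of two species.
species_a = ["g1", "g2", "g3", "g7", "g8", "g5"]
species_b = ["g7", "g8", "x1", "g1", "g2", "g3", "g5"]

blocks = syntenic_blocks(species_a, species_b)
print(blocks)  # [['g1', 'g2', 'g3'], ['g7', 'g8']]
```

Here the two species share two blocks whose internal gene order is conserved even though the blocks themselves sit in different places, which is the pattern summarized for mouse chromosome 1 in Figure 05.08.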
• expressed sequence tag (EST)
  – A short sequenced fragment of a cDNA sequence that can be used to identify an actively expressed gene.

5.8 Some Organelles Have DNA
• Mitochondria and chloroplasts have genomes that show non-Mendelian inheritance. Typically they are maternally inherited.
• Organelle genomes may undergo somatic segregation in plants.
Figure 05.10: DNA from the sperm enters the oocyte to form the male pronucleus in the egg, but all the mitochondria are provided by the oocyte.
• extranuclear genes
  – Genes that reside outside the nucleus, in organelles such as mitochondria and chloroplasts.
• Comparisons of human mitochondrial DNA suggest that it is descended from a single population that existed ~200,000 years ago in Africa.

5.9 Organelle Genomes Are Circular DNAs That Encode Organelle Proteins
• Organelle genomes are usually (but not always) circular molecules of DNA.
  – Mitochondrial DNA (mtDNA)
  – Chloroplast DNA (cpDNA or ctDNA)
• Organelle genomes encode some, but not all, of the proteins used in the organelle.
Figure 05.11: Mitochondrial genomes have genes encoding (mostly complex I–IV) proteins, rRNAs, and tRNAs.
• Animal cell mtDNA is extremely compact and typically encodes 13 proteins, 2 rRNAs, and 22 tRNAs.
• D loop
  – A region of the animal mitochondrial DNA molecule that is variable in size and sequence and contains the origin of replication.
• Yeast mtDNA is 5× longer than animal cell mtDNA because of the presence of long introns.
Figure 05.12: Human mitochondrial DNA has 22 tRNA genes, two rRNA genes, and 13 protein-coding regions.

5.10 The Chloroplast Genome Encodes Many Proteins and RNAs
• Chloroplast genomes vary in size, but are large enough to encode 50 to 100 proteins as well as rRNAs and tRNAs.
Figure 05.14: The chloroplast genome in land plants encodes 4 rRNAs, 30 tRNAs, and ~60 proteins.

5.11 Mitochondria and Chloroplasts Evolved by Endosymbiosis
• Both mitochondria and chloroplasts are descended from bacterial ancestors.
• Most of the genes of the mitochondrial and chloroplast genomes have been transferred to the nucleus during the organelle’s evolution.
Figure 05.15: Mitochondria originated by an endosymbiotic event when a bacterium was captured by a eukaryotic cell.