RNA Splicing

Processing of Primary RNA Transcripts into mRNA

Possible post-transcriptional controls on gene expression. Only a few of these controls are likely to be used for any one gene.

Amount of DNA in the genomes of various organisms.

The Relationship Between Gene Size and mRNA Size

Species                     Average exon number   Average gene length (kb)   Average mRNA length (kb)
Haemophilus influenzae      1                     1.0                        1.0
Methanococcus jannaschii    1                     1.0                        1.0
S. cerevisiae               1                     1.6                        1.6
Filamentous fungi           3                     1.5                        1.5
C. elegans                  4                     4.0                        3.0
D. melanogaster             4                     11.3                       2.7
Chicken                     9                     13.9                       2.4
Mammals                     7                     16.6                       2.2

SOURCE: Based on B. Lewin, Genes 5, Table 2-2. Oxford University Press.

Synthesis of a primary RNA transcript (an mRNA precursor) by RNA polymerase II. This diagram starts with a polymerase that has just begun synthesizing an RNA chain. Recognition of a poly-A addition signal in the growing RNA transcript causes the chain to be cleaved and then polyadenylated as shown. In yeasts the polymerase terminates its RNA synthesis almost immediately thereafter, but in higher eukaryotes it often continues transcription for thousands of nucleotides.

The reactions that cap the 5' end of each RNA molecule synthesized by RNA polymerase II. The final cap contains a novel 5'-to-5' linkage between the positively charged 7-methyl G residue and the 5' end of the RNA transcript. Polymerase I and III transcripts are not capped. The indicated reaction occurs almost immediately following initiation of each RNA chain. The letter N is used here to represent any one of the four ribonucleotides, although the nucleotide that starts an RNA chain is usually a purine (an A or a G). (After A.J. Shatkin, Bioessays 7:275-277, 1987. © ICSU Press.)
The first two reactions are catalyzed by a capping enzyme that associates with the phosphorylated CTD of RNA polymerase II shortly after transcription initiation. Two different methyltransferases catalyze reactions 3 and 4. S-adenosylmethionine (S-Ado-Met) is the source of the methyl (CH3) group for the two methylation steps; the guanylate (G) is methylated first, then the 2’ hydroxyl of the first one or two nucleotides (N) in the transcript. [See S. Venkatesan and B. Moss, 1982, Proc. Nat’l. Acad. Sci. USA 79:304.]

Capping enzyme: Phosphatase + Guanyl transferase; Methyltransferases

Structure of the 5’ methylated cap of eukaryotic mRNA. The distinguishing chemical features are the 5’-to-5’ linkage of 7-methylguanylate to the initial nucleotide of the mRNA molecule and the methyl group on the 2’ hydroxyl of the ribose of the first nucleotide (base 1). Both these features occur in all animal cells and in cells of higher plants; yeasts lack the methyl group on base 1. The ribose of the second nucleotide (base 2) also is methylated in vertebrates. [See A. J. Shatkin, 1976, Cell 9:645.]

Model for cleavage and polyadenylation of pre-mRNAs in mammalian cells. Cleavage-and-polyadenylation specificity factor (CPSF) binds to an upstream AAUAAA polyadenylation signal. CStF interacts with a downstream GU- or U-rich sequence and with bound CPSF, forming a loop in the RNA; binding of CFI and CFII helps stabilize the complex. Binding of poly(A) polymerase (PAP) then stimulates cleavage at a poly(A) site, which usually is 10–35 nucleotides 3’ of the upstream polyadenylation signal. The cleavage factors are released, as is the downstream RNA cleavage product, which is rapidly degraded. Bound PAP then adds ≈12 A residues at a slow rate to the 3’-hydroxyl group generated by the cleavage reaction. Binding of poly(A)-binding protein II (PABII) to the initial short poly(A) tail accelerates the rate of addition by PAP.
After 200–250 A residues have been added, PABII signals PAP to stop polymerization.

Overview of RNA processing in eukaryotes using the β-globin gene as an example. The β-globin gene contains three protein-coding exons (red) and two intervening noncoding introns (blue). The introns interrupt the protein-coding sequence between the codons for amino acids 31 and 32 and between those for amino acids 105 and 106. Transcription of this and many other genes starts slightly upstream of the 5’ exon and extends downstream of the 3’ exon, resulting in noncoding regions (gray) at the ends of the primary transcript. These regions, referred to as untranslated regions (UTRs), are retained during processing. The 5’ 7-methylguanylate cap (m7Gppp; green dot) is added during formation of the primary RNA transcript, which extends beyond the poly(A) site. After cleavage at the poly(A) site and addition of multiple A residues to the 3’ end, splicing removes the introns and joins the exons. The small numbers refer to positions in the 147-aa sequence of β-globin.

Early evidence for the existence of introns in eukaryotic genes. The evidence was provided by the "R-loop technique," in which a base-paired complex between mRNA and DNA molecules is visualized in the electron microscope. An unusually abundant mRNA molecule, such as β-globin mRNA or ovalbumin mRNA, is readily purified from the specialized cells that produce it. When this single-stranded mRNA preparation is annealed in a suitable solvent to a cloned double-stranded DNA molecule containing the gene that encodes the mRNA, the RNA can displace a DNA strand wherever the two sequences match and form regions of RNA-DNA helix. Regions of DNA where no match to the mRNA sequence is possible are clearly visible as large loops of double-stranded DNA. Each of these loops (numbered 1 to 6) represents an intron in the gene sequence.

Consensus sequences for RNA splicing in higher eukaryotes.
The sequence given is that for the RNA chain; the nearly invariant GU and AG dinucleotides at either end of the intron sequence are highlighted in red, as is the conserved A at the branch point. The numbers below the nucleotides represent percent conservation.

Splicing of exons in pre-mRNA occurs via two transesterification reactions. In the first reaction, the ester bond between the 5’ phosphorus of the intron and the 3’ oxygen (red) of exon 1 is exchanged for an ester bond with the 2’ oxygen (dark blue) of the branch-site A residue. In the second reaction, the ester bond between the 5’ phosphorus of exon 2 and the 3’ oxygen (light blue) of the intron is exchanged for an ester bond with the 3’ oxygen of exon 1, releasing the intron as a lariat structure and joining the two exons. Arrows show where the activated hydroxyl oxygens react with phosphorus atoms.

Structure of the branched RNA chain that forms during nuclear RNA splicing. The nucleotide shown in yellow is the A nucleotide at the branch site. The branch is formed in step 1 of the splicing reaction, when the 5' end of the intron sequence couples covalently to the 2'-OH ribose group of the A nucleotide, which is located about 30 nucleotides from the 3' end of the intron sequence. The branched chain remains in the final excised intron sequence and is responsible for its lariat form.

Analysis of RNA products formed in an in vitro splicing reaction. A nuclear extract from HeLa cells was incubated with a 497-nucleotide radiolabeled RNA (bottom) that contained portions of two exons (orange and tan) from human β-globin mRNA separated by a 130-nucleotide intron (blue). After incubation for various times, the RNA was purified and subjected to electrophoresis and autoradiography, along with RNA markers (lane M). The number of nucleotides in the various species is indicated. Much of the slower-migrating starting RNA (497) was correctly spliced, yielding a 367-nucleotide product.
The excised intron (130*) migrated slower than expected based on its molecular weight, indicating that it is not a linear molecule. Likewise, one of the reaction intermediates (339*) exhibited an anomalously slow electrophoretic mobility. Additional analysis indicated that in both cases the intron had a lariat structure resulting in the slow mobility. The 252** band, an aberrant product of the in vitro reaction, is greatly reduced in reactions in which the RNA is capped. [From B. Ruskin et al., 1984, Cell 38:317; photograph courtesy of Michael R. Green. See also R. A. Padgett et al., 1984, Science 225:898.]

The spliceosomal splicing cycle. The splicing snRNPs (U1, U2, U4, U5, and U6) associate with the pre-mRNA and with each other in an ordered sequence to form the spliceosome. Although ATP hydrolysis is not required for the transesterification reactions, it is thought to provide the energy necessary for rearrangements of the spliceosome structure that occur during the cycle. The branch-point A in pre-mRNA is indicated in boldface. [See S. W. Ruby and J. Abelson, 1991, Trends Genet. 7:79; adapted from M. J. Moore et al., 1993, in R. Gesteland and J. Atkins, eds., The RNA World, Cold Spring Harbor Press, pp. 303-357.]

Diagram of interactions between pre-mRNA, U1 snRNA, and U2 snRNA early in the splicing process. The 5’ region of U1 snRNA initially base-pairs with nucleotides at the 5’ end of the intron (blue) and 3’ end of the 5’ exon (dark red) of the pre-mRNA; U2 snRNA base-pairs with a sequence that includes the branch-point A, although this residue is not base-paired. The yeast branch-point sequence is shown here. Secondary structures in the snRNAs that are not altered during splicing are shown in diagrammatic line form. The purple rectangles represent sequences that bind snRNP proteins recognized by anti-Sm antibodies. For unknown reasons, antisera from patients with the autoimmune disease systemic lupus erythematosus (SLE) contain these antibodies.
Such antisera have been useful in characterizing components of the splicing reaction. [See E. J. Sontheimer and J. A. Steitz, 1993, Science 262:1989; adapted from M. J. Moore et al., 1993, in R. Gesteland and J. Atkins, eds., The RNA World, Cold Spring Harbor Press, pp. 303-357.]

The RNA components of snRNPs are essential for mRNA splicing

The two known classes of self-splicing intron sequences. The group I intron sequences bind a free G nucleotide to a specific site to initiate splicing, while the group II intron sequences use a specially reactive A nucleotide in the intron sequence itself for the same purpose. The two mechanisms have been drawn in a way that emphasizes their similarities. Both are normally aided by proteins that speed up the reaction, but the catalysis is nevertheless mediated by the RNA in the intron sequence. The mechanism used by group II intron sequences forms a lariat and resembles the pathway catalyzed by the spliceosome. (After T.R. Cech, Cell 44:207-210, 1986. © Cell Press.)

Major Points

1. Primary eukaryotic RNA transcripts are processed by 5’ capping, 3’ polyadenylation, and removal of internal introns by RNA splicing.
2. Many of the post-transcriptional steps that form mRNA can be regulated.
3. The 5’ cap influences protein translation, and the 3’ poly(A) tail affects stability.
4. Exon coding sequences are interrupted by intron sequences that must be spliced out of primary transcripts to form mature mRNA.
5. Intron removal involves self-splicing RNA sequences and spliceosomes: protein/RNA complexes (RNPs) containing special small U RNAs.
6. Splicing occurs via two transesterification reactions and involves a branched RNA intermediate, formed by linking the 5’ phosphate of the intron to the 2’ OH of an A residue near the 3’ end of the intron.
7. There are two classes of self-splicing introns: Group I, which uses a free G (G-OH), and Group II, which uses an internal A residue to form a lariat and branch.
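
The two-step splicing pathway summarized above can be sketched in a few lines of Python. This is a minimal illustration only, not a model of the spliceosome: the sequences are synthetic, the names (`find_branch_a`, `splice`, `SpliceProducts`) are made up for this example, and the search window for the branch-point A is an assumption chosen to sit near the "about 30 nucleotides from the 3' end" figure quoted earlier. The exon lengths (158 and 209 nucleotides) are not stated in the captions; they are inferred from the 497-nucleotide substrate, the 130-nucleotide intron, and the 339-nucleotide lariat intermediate, on the assumption that the intermediate is the intron still attached to exon 2.

```python
from dataclasses import dataclass


@dataclass
class SpliceProducts:
    free_exon1: str           # exon 1, released by step 1
    lariat_intermediate: str  # intron + exon 2, left after step 1
    lariat_intron: str        # excised intron, released by step 2
    mrna: str                 # exon 1 joined to exon 2
    branch_index: int         # position of the branch A within the intron


def find_branch_a(intron: str, window: tuple[int, int] = (18, 40)) -> int:
    """Return the index of a candidate branch-point A.

    Searches a region roughly window[0] to window[1] nucleotides
    upstream of the intron's 3' end. The window is an illustrative
    assumption, not a measured rule.
    """
    lo, hi = window
    start = max(0, len(intron) - hi)
    end = len(intron) - lo
    idx = intron.rfind("A", start, end)
    if idx == -1:
        raise ValueError("no candidate branch-point A in the search window")
    return idx


def splice(pre_mrna: str, exon1_len: int, intron_len: int) -> SpliceProducts:
    """Apply the two transesterification steps to a one-intron pre-mRNA."""
    exon1 = pre_mrna[:exon1_len]
    intron = pre_mrna[exon1_len:exon1_len + intron_len]
    exon2 = pre_mrna[exon1_len + intron_len:]

    # The intron must carry the nearly invariant GU ... AG boundaries.
    if not (intron.startswith("GU") and intron.endswith("AG")):
        raise ValueError("intron lacks GU...AG boundary dinucleotides")
    branch = find_branch_a(intron)

    # Step 1: the 2'-OH of the branch A attacks the 5' splice site,
    # freeing exon 1 and leaving a lariat of intron + exon 2.
    lariat_intermediate = intron + exon2

    # Step 2: the 3'-OH of exon 1 attacks the 3' splice site,
    # joining the exons and releasing the intron as a lariat.
    mrna = exon1 + exon2

    return SpliceProducts(exon1, lariat_intermediate, intron, mrna, branch)


# Synthetic substrate sized like the in vitro experiment (497 nt total).
EXON1 = "AUG" + "C" * 155                          # 158 nt (inferred length)
INTRON = "GU" + "C" * 96 + "A" + "C" * 29 + "AG"   # 130 nt
EXON2 = "G" * 209                                  # 209 nt (inferred length)

products = splice(EXON1 + INTRON + EXON2, len(EXON1), len(INTRON))
print(len(products.mrna), len(products.lariat_intermediate),
      len(products.lariat_intron))
```

The lengths of the three products correspond to the 367, 339*, and 130* bands in the gel described above. Note that the string model says nothing about mobility: the lariat species run anomalously slowly because of their branched shape, which a linear sequence cannot capture.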