Evidence of Two-Step Recursive Pre-mRNA Splicing In Vitro

Ken Seldeen
Department of Biochemistry and Molecular Biology
University of Miami School of Medicine

A rotation student’s project in Dr. Mayeda’s laboratory
February 11, 2005

Approved by ____________________________________________________
Mentor Akila Mayeda, Ph.D.

ABSTRACT

Precise removal of introns from nascent gene transcripts, or pre-mRNAs, by splicing is a crucial step in eukaryotic gene expression. Large genes containing very long introns are prevalent in the human genome; however, little is known about the splicing mechanism of extremely long introns. Recursive splicing, i.e., the stepwise removal of introns by sequential re-splicing at spliced junctions, was discovered in a long intron of a fruit fly (Drosophila melanogaster) gene in 1998. We assume that recursive splicing is one of the plausible splicing mechanisms for long introns in human (Homo sapiens) genes; however, no evidence for it has been reported to date. To study the mechanism of recursive splicing, we have been trying to reconstitute this phenomenon in vitro with a minimal model pre-mRNA. The Mayeda lab previously constructed a modified β-globin mini-gene that contains one recursive splice site (RSS; a 3’ splice site adjacent to a 5’ splice site). Using this transcript as a minimal substrate, in vitro splicing with HeLa cell nuclear extracts was performed. (1) Besides the final spliced mRNA, we observed an upstream partially spliced product (spliced between the authentic 5’ splice site and the RSS). However, no downstream partially spliced product (spliced between the RSS and the authentic 3’ splice site) was detected, even by sensitive RT-PCR assays. (2) A point mutation in the 3’ splice site sequence of the RSS (AG→GG) abolished the upstream partially spliced product. (3) A point mutation in the 5’ splice site sequence of the RSS (GU→AU) increased the amount of the upstream partially spliced product.
(4) The substrate corresponding to the upstream partially spliced RNA was further spliced to the final mRNA in vitro. Together, these data strongly indicate that at least a portion of the final spliced mRNA is generated in a two-step sequential process mediated by recursive splicing at the RSS. This is the first reconstitution of recursive splicing in vitro, and our system provides a useful tool to identify and characterize the trans-acting factors that are specific to recursive splicing.

INTRODUCTION

Most eukaryotic messenger RNA precursors (pre-mRNAs) are interrupted by introns, which are not included in the final spliced mRNA. Pre-mRNA splicing is the essential process that removes introns and generates mature mRNA, the template for translation. Pre-mRNA splicing requires both small nuclear ribonucleoprotein particles (snRNPs) and many non-snRNP protein factors. These factors assemble on the pre-mRNA into a large complex known as the spliceosome, in which the splicing reaction is catalyzed (reviewed in Krämer, 1996; Burge et al., 1999). The splicing machinery must recognize exon/intron boundaries with high fidelity, so that cleavage and re-joining occur precisely at the right positions. However, the important signal sequences around the intron/exon junctions, namely the 5’ and 3’ splice sites and the branch site (reviewed in Burge et al., 1999), are not highly conserved, and thus they are not sufficient by themselves for accurate splicing.

Fig. 1. Two-step catalytic process for pre-mRNA splicing. A pre-mRNA with a single intron is shown at left, with two exons shown as boxes and the intron shown as a line (modified from Burge et al., 1999).

The biochemical steps of splicing involve two consecutive transesterification reactions (Fig. 1; reviewed in Burge et al., 1999). The first step is a cleavage at the 5’ splice site, in which the phosphate at the 5’ end of the intron is linked to the 2’-OH of an adenosine within the intron, which is called the branch site.
The branch site is usually located 20–40 nucleotides (nt) upstream of the 3’ splice site. The second step is a cleavage at the 3’ splice site, in which the 3’-OH of the upstream exon is joined to the 5’ phosphate of the downstream exon. As a result, the two exons are joined and the intron is released in lariat form (reviewed in Krämer, 1996; Burge et al., 1999).

In eukaryotes, very large genes with long introns are prevalent (International Human Genome Sequencing Consortium, 2001; 2004; Venter et al., 2001). For instance, many genes that play important roles in development and human disease, such as those associated with cystic fibrosis, retinoblastoma, muscular dystrophy, and neurofibromatosis, are very large and contain many long introns. Little is known about the mechanisms that facilitate accurate and efficient splicing of extremely long introns, and it is technically very demanding to assay pre-mRNA splicing of long introns. Our project is one possible approach to elucidating the splicing mechanism of such long introns.

Fig. 2. Three splicing pathways for a long intron. (A) Conventional one-step splicing. (B) Recursive multi-step splicing, which was originally found in D. melanogaster. (C) Nested-intron multi-step splicing. 5’ ss: 5’ splice site; 3’ ss: 3’ splice site; 3’ ss/5’ ss: re-splicing site (RSS).

Recursive splicing, i.e., the stepwise removal of an intron by sequential re-splicing at spliced exon-exon junctions, was discovered in the very long intron of the Ultrabithorax (Ubx) gene of D. melanogaster (Fig. 2B; Hatton et al., 1998). Members of the Mayeda lab have been examining possible recursive splicing in intron 7 (109,574 bp) of the human dystrophin (DMD) gene. During the course of this research, they unexpectedly found evidence of a novel mechanism, termed nested-intron splicing: two putative nested-intron lariats were detected in intron 7 by reverse transcription-polymerase chain reaction (RT-PCR; H. Suzuki & A. Mayeda, unpublished data).
Nested splicing is the multiple sequential splicing of internal introns, followed by the eventual authentic splicing via the authentic 5’ and 3’ splice sites (Fig. 2C). We assume that short introns can be easily spliced out by conventional one-step splicing (Fig. 2A). Thus, if there are many short nested introns, or potential 5’ and 3’ splice sites, within a long intron, a series of nested splicing events can gradually shorten the whole intron until it becomes short enough to be spliced via the authentic 5’ and 3’ splice sites (Fig. 2C). We predict that nested splicing, together with possible recursive splicing, may be a general mechanism for splicing of extremely long introns in human genes. It is conceivable that the mechanism and factors involved are distinct from those of authentic splicing, since nested splicing should be a multi-step sequential splicing event, whereas conventional splicing essentially completes its process by re-joining the exons.

As a rotation student’s project, I focused on possible recursive splicing in a human gene. To elucidate the mechanism of recursive splicing, it is crucial to reconstitute in vitro splicing with a minimal model pre-mRNA that can be spliced in two steps through one inserted recursive splice site. Building on the studies of two previous rotation students, we eventually obtained the first evidence that recursive splicing takes place in vitro.

MATERIALS AND METHODS

Transformation of E. coli

Plasmid (1 µg) was mixed into 200 µl of competent E. coli cells (strain DH5α) and incubated on ice for 30 min, followed immediately by heat shock at 42ºC for 1 min. The mixture was then put on ice for 2 min, and 150 µl of SOC medium (2% tryptone, 0.5% yeast extract, 10 mM NaCl, 10 mM MgSO4, 10 mM MgCl2) was added. The transformed E. coli was spread on LB agar plates containing 50 µg/ml ampicillin and incubated at 37ºC overnight.
Preparation of plasmids and templates

A single colony from the transformants was inoculated into LB medium containing 50 µg/ml ampicillin and incubated at 37ºC overnight for plasmid preparation. Plasmids were prepared with a plasmid midi-preparation kit according to the manufacturer’s protocol (Qiagen). The plasmid DNA yield was determined by measuring the UV absorbance at 260 nm with a spectrophotometer (Ultraspec 2100, Amersham).

To prepare template DNA for in vitro transcription, the plasmid was digested at 36ºC for 1 h with BamHI in the appropriate reaction buffer (New England Biolabs). The digested plasmid was checked by agarose gel electrophoresis to confirm that it was the correct plasmid. Digested DNA samples were analyzed on 1% agarose mini-gels. Electrophoresis was performed at 100 V for 20 min in 0.5 X Tris-acetate-EDTA (TAE) buffer. The agarose gel was stained with 0.5 µg/ml ethidium bromide and visualized under UV light.

To purify the digested DNA, 1/10 volume of 3 M sodium acetate and an equal volume of phenol/chloroform were added to the restriction digest, and the mixture was vortexed for 10 sec. After micro-centrifugation for 5 min at 12,000 rpm (Tomy), the supernatant was transferred to a fresh tube. An equal volume of chloroform was added and the sample was centrifuged again for 5 min. The supernatant was transferred to a fresh tube, 2.5 volumes of chilled ethanol were added, and the tube was kept at –80ºC for 15 min to precipitate the DNA. After micro-centrifugation at 12,000 rpm for 15 min at 4ºC, the supernatant was carefully removed and the pellet was dissolved in water to a concentration of 1 µg/µl.

Plasmid construction for model splicing substrates

Previous lab work showed that the cryptic 5’ splice sites upstream of the authentic 5’ splice site (in the first exon of the β-globin mini-gene; Krainer et al., 1984) are activated in the model substrate with only the 5’ half of the intron.
To avoid activation of the two upstream cryptic 5’ splice sites, I introduced GT→AT point mutations into the original plasmid constructs. PCR was performed with 100 ng of each of the four plasmids (all from previous lab work): pSP64-HUX2-AG/GT, pSP64-HUX2-AG/AT, pSP64-HUX2-GG/GT, and pSP64-HUX2-AG/GT∆5’intron (a modified plasmid with the 5’ half of the intron removed). Custom DNA primers for PCR were purchased from Qiagen. The primers were designed according to the sequence upstream and downstream of the first exon:

Mutation Primer-S: 5’-GTGGGGCAAGaTGAACGTGGATGAAGTTGGTGaTGAGGCCCTG-3’
Mutation Primer-AS: 5’-CAGGGCCTCAtCACCAACTTCATCCACGTTCAtCTTGCCCCAC-3’

Lower-case letters indicate the positions of the base mutations.

The reaction mixture (50 µl) contained 100 pmol of primers Mutation Primer-S and Mutation Primer-AS (dissolved in 1 X TE buffer), 0.5 mM dNTP mixture (all four dNTPs), 100 ng of plasmid template (e.g., pSP64-HUX2-AG/GT), and 20 units of Pfu Turbo polymerase with Pfu DNA polymerase buffer (Stratagene). Each PCR cycle consisted of 94ºC for 30 sec, 60ºC for 30 sec, and 68ºC for 10 min, and a total of 19 cycles were performed. After PCR was completed, dNTPs and primers were removed with an S-300 microspin column (Amersham) according to the manufacturer’s protocol. The PCR product was digested with 40 units of DpnI at 37ºC for 1 h, followed by phenol/chloroform extraction.

The PCR product was treated with T4 polynucleotide kinase (New England Biolabs) and subcloned into the pSP64 vector (Promega). Ligation was performed with T4 DNA ligase (New England Biolabs) at 16ºC for 1–2 h. After ligation, 5 µl of the reaction was used directly for the transformation of E. coli. Medium-scale plasmid preparation was carried out to obtain the plasmids termed pSP64-AGGT, pSP64-AGAT, pSP64-GGGT, and pSP64-2nd Splice. The concentration of each plasmid was measured by UV absorbance. Approximately 1 µg of each plasmid was digested with 2 units of BamHI and HindIII at 37ºC for 1 h.
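Because the two mutagenic primers listed above anneal to opposite strands of the same region, Mutation Primer-AS should be the exact reverse complement of Mutation Primer-S, with the mutated bases at mirrored positions. The short Python sketch below checks this for the sequences as printed in the text. The function names are illustrative only and are not part of the lab's protocol.

```python
# Mutagenic primers as printed in the text; lower-case letters mark mutated bases.
PRIMER_S = "GTGGGGCAAGaTGAACGTGGATGAAGTTGGTGaTGAGGCCCTG"
PRIMER_AS = "CAGGGCCTCAtCACCAACTTCATCCACGTTCAtCTTGCCCCAC"

# Translation table that complements each base and preserves upper/lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, keeping letter case."""
    return seq.translate(_COMPLEMENT)[::-1]


def mutated_positions(seq):
    """Return the 1-based positions of lower-case (mutated) bases."""
    return [i for i, base in enumerate(seq, start=1) if base.islower()]


if __name__ == "__main__":
    # The antisense primer should equal the reverse complement of the sense primer.
    print("AS is reverse complement of S:", reverse_complement(PRIMER_S) == PRIMER_AS)
    # Report primer length and where the two point mutations sit.
    print("length:", len(PRIMER_S), "mutations at:", mutated_positions(PRIMER_S))
```

In the sense primer, each lower-case "a" is followed by a T, which is consistent with the intended GT→AT change at the two cryptic 5’ splice sites.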
The digested samples were checked by agarose gel electrophoresis to confirm that each plasmid contained the correct extended intron. Digested plasmid of the previous construct was used as a control. The DNA size markers were 0.5 µg each of 100 bp and 1 kbp DNA ladders (New England Biolabs). The plasmids were further verified by sequencing (DNA Core Lab or Michigan University).

Preparation of 32P-labeled pre-mRNA

50 µg of the pSP64-AGGT, pSP64-AGAT, pSP64-GGGT, and pSP64-AG/GT∆5’intron plasmids were digested with BamHI (1.5 units) at 37ºC overnight. DNA was extracted with phenol/chloroform and precipitated with ethanol. The DNA pellets were air-dried and dissolved in 1 X TE buffer (to a final concentration of 1 µg/µl). The linearized plasmids were used as templates for run-off transcription in vitro (Mayeda & Krainer, 1999b). The transcription reaction (25 µl) was performed with 0.5 mM ATP/CTP mixture, 50 µM GTP/UTP mixture, 2 mM GpppG cap analog (New England Biolabs), 1 unit of RNase inhibitor (PRIME), 1 µg of DNA template, 25 µCi [α-32P]UTP, and 10 units of SP6 RNA polymerase with SP6 polymerase buffer (New England Biolabs). The reaction mixture was incubated at 40ºC for 1.5 h, followed by a 10 min incubation with 15 units of RQ1 DNase (Promega) to degrade the template DNA.

After incubation, 150 µl of H2O, 50 µl of 7.5 M ammonium acetate, and 150 µl of Tris-saturated phenol were added and the mixture was immediately vortexed for 2–3 min. After micro-centrifugation for 5 min at 12,000 rpm, the aqueous phase was transferred to a fresh tube. To precipitate the RNA, 0.5 ml of 100% ethanol and 15 µg of glycogen were added and the tube was kept at 4ºC for at least 10 min. After micro-centrifugation at 12,000 rpm for 15 min at 4ºC, the supernatant was carefully removed and the pellet was washed with 80% ethanol. The RNA pellet was dried for 5 min under vacuum and resuspended in 100 µl of 10 mM Tricine-HCl (pH 7.6). The yield and concentration of the RNA transcript were estimated by scintillation counting of TCA-insoluble material.
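The report does not spell out how the TCA-insoluble counts are converted into a transcript yield. One common approach is sketched below in Python, purely as an illustration and not as the lab's actual procedure. It assumes that the fraction of label recovered as TCA-insoluble material equals the fraction of UTP incorporated, that the unlabeled UTP is at 50 µM in the 25 µl reaction, and that the mass of labeled UTP is negligible. The counts and the number of U residues per transcript used in the example are hypothetical placeholders.

```python
def transcript_yield_pmol(insoluble_cpm, total_cpm, utp_uM, reaction_ul, u_per_transcript):
    """Estimate transcript yield (pmol) from the TCA-insoluble fraction of the label.

    Assumes the labeled UTP contributes negligibly to the total UTP pool.
    """
    if total_cpm <= 0 or u_per_transcript <= 0:
        raise ValueError("total_cpm and u_per_transcript must be positive")
    if not 0 <= insoluble_cpm <= total_cpm:
        raise ValueError("insoluble_cpm must lie between 0 and total_cpm")
    fraction_incorporated = insoluble_cpm / total_cpm
    utp_pmol = utp_uM * reaction_ul  # µM x µl = pmol of UTP in the reaction
    return fraction_incorporated * utp_pmol / u_per_transcript


if __name__ == "__main__":
    # Hypothetical counts and U content; only 50 µM UTP and 25 µl come from the text.
    yield_pmol = transcript_yield_pmol(
        insoluble_cpm=20_000, total_cpm=100_000,
        utp_uM=50, reaction_ul=25, u_per_transcript=100,
    )
    # Concentration after resuspension in 100 µl, expressed in fmol/µl.
    conc_fmol_per_ul = yield_pmol * 1000 / 100
    print(f"{yield_pmol:.2f} pmol transcript, {conc_fmol_per_ul:.1f} fmol/µl")
```

With these placeholder numbers, 20% incorporation of 1,250 pmol UTP into a transcript with 100 U residues corresponds to 2.5 pmol of RNA, or 25 fmol/µl in the 100 µl resuspension volume.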
In vitro splicing assays

Preparation of HeLa cell nuclear and cytosolic S100 extracts and the in vitro splicing reaction were described previously (Mayeda & Krainer, 1999a; 1999b). The splicing buffer mixture was made in a batch according to the desired number of reactions. Per reaction, it contained 1 µl of 12.5 mM ATP / 0.5 M creatine phosphate mixture, 1 µl of 80 mM MgCl2, 1.25 µl of 0.4 M Hepes-KOH (pH 7.3), 5.0 µl of 13% polyvinyl alcohol (added last because of its viscosity), 20 fmol of 32P-labeled pre-mRNA, and H2O to 10 µl total. The splicing reaction mixture (25 µl) was prepared on ice with 8 µl of HeLa cell nuclear extract (or 8 µl of S100 extract plus 10 pmol or 20 pmol of recombinant SF2/ASF). Buffer D [20 mM Hepes-NaOH (pH 8.0), 100 mM KCl, 0.2 mM EDTA, 20% (v/v) glycerol, 0.5 mM PMSF, 1 mM DTT] was added to bring the volume to 15 µl. Then 10 µl of splicing buffer mixture was added and mixed gently, and the reaction was incubated at 30ºC for 1–4 h unless otherwise stated. The reactions were terminated by adding 175 µl of splicing stop solution [0.3 M sodium acetate (pH 5.2), 0.1% (w/v) SDS, 62.5 µg/ml tRNA], immediately followed by phenol extraction. After micro-centrifugation for 5 min at 12,000 rpm, the upper aqueous layer was carefully recovered and the RNA was precipitated with 0.5 ml of ethanol (kept at –80ºC for at least 10 min). Human β-globin pre-mRNA, transcribed from BamHI-cleaved pSP64-HβΔ6 (Krainer et al., 1984), was used as a positive control.

Analyses of splicing products

The ethanol-precipitated samples were micro-centrifuged for 15 min and the ethanol was carefully removed. The RNA pellet was dissolved in 3.5 µl of RNA dye mixture [90% (v/v) formamide, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.1% bromophenol blue, 0.1% xylene cyanol FF], heated at 80ºC for 5–10 min, and immediately loaded onto a 5.5% polyacrylamide/7 M urea gel (denaturing PAGE).
100 bp and 1 kbp DNA markers (New England Biolabs), radio-labeled with [γ-32P]ATP and T4 polynucleotide kinase, were used as size markers. Electrophoresis was performed at a constant voltage of 700 V for 100 min. The gel was transferred to presoaked 3MM paper (Whatman) and then dried in a gel dryer at 85ºC for 1.5–2 h. Autoradiography was performed with an intensifying screen at –80ºC for 2–3 h.

RESULTS

Preparation of model recursive splicing constructs

The original model β-globin mini-gene contains a strong RSS, which was derived from intron 7 of the dystrophin gene (G.R. Screaton & A. Mayeda, unpublished). In vitro splicing of this pre-mRNA generates two spliced products: the partially spliced product lacking the 5’ half of the intron (spliced between the authentic 5’ splice site and the 3’ splice site of the RSS) and the final spliced product (Fig. 3). There are two possible pathways to this final product, one from the pre-mRNA via the authentic 5’ and 3’ splice sites (Fig. 3, “Conventional Splicing”) and the other from the partially spliced product (Fig. 3, “Resplicing?”). To demonstrate sequential recursive splicing in two steps, it is important to show, at the least, that the partially spliced pre-mRNA (455 nt) can be spliced to the final spliced product (365 nt).

Fig. 3. In vitro splicing pathway of the original model pre-mRNA.

Fig. 4. RNA substrates used for in vitro splicing and products of in vitro splicing. Observed in vitro splicing events are marked by black lines. Red X’s mark sites where splicing was not observed.

Preliminary in vitro splicing of the first spliced substrate (Fig. 4, “First Splice Substrate”) showed that cryptic 5’ splice sites upstream of the authentic 5’ splice site were activated (T. Venkataraman & A. Mayeda). This unexpected activation of the cryptic 5’ splice sites (without any mutation in the authentic 5’ splice site) may be due to the shortness of the intron (88 nt).
Thus, the two cryptic 5’ splice sites were each abrogated by a single base change from GT to AT using site-directed mutagenesis. This was successful, and no evidence of use of the cryptic 5’ splice sites was found in the subsequent experiments.

In vitro splicing analysis

Transcripts from the plasmid constructs were spliced in vitro with HeLa cell nuclear extracts. The splicing products were analyzed by denaturing PAGE followed by autoradiography. Splicing of the wild-type (AGGT) pre-mRNA was followed over a time course (1, 2, 3, 4 h; Fig. 5). Besides unspliced pre-mRNA, two discrete bands were visible, corresponding to the first spliced product and the final spliced product. With the wild-type substrate (WT), both of these bands increase over time, while the unspliced pre-mRNA decreases.

Fig. 5. Results of in vitro splicing of the wild-type AGGT substrate. The splicing products generated are indicated by their schematic structures (see Fig. 4).

Using the same splicing conditions, the mutant substrates (MUT1, carrying the 5’ splice site mutation in the RSS, and MUT2, carrying the 3’ splice site mutation in the RSS) were also analyzed in vitro over a time course (1, 3 h; Fig. 6). With the MUT2 substrate, the band corresponding to the first spliced product is missing. This is due to the mutation in the 3’ splice site of the RSS, which prevents splicing of the 5’ half of the intron. With the MUT1 substrate, the first spliced product accumulates to a higher level than with the WT substrate. This observation suggests that the sequential second splicing actually takes place with the WT substrate (in addition to conventional direct splicing to the final product; see Fig. 3), because the accumulation of the first spliced product in this mutant can be explained by a block of the second splicing (the 5’ splice site mutation in the RSS prevents the second splicing).

Fig. 6. Results of in vitro splicing of the mutant substrates. The predicted splicing products are indicated by their schematic structures (see Fig. 4).
Finally, it is necessary to demonstrate that the first spliced product can be spliced to generate the final spliced product. Therefore, in vitro splicing was performed with the substrate corresponding to the first spliced product (“First Splice Substrate” in Fig. 4). The substrate was spliced in vitro over a time course (1, 2, 3, 4 h), and we detected increasing amounts of the final spliced product, which has the same mobility as the spliced product of the control β-globin pre-mRNA (Fig. 7). The bands just below the final spliced product are not products of splicing via the cryptic 5’ splice sites but rather non-specific cleavage products, since they were present at constant levels throughout the reaction from 1 h through 4 h.

Fig. 7. Results of in vitro splicing of the first spliced substrate. The predicted splicing products are indicated by their schematic structures (see Fig. 4).

DISCUSSION

Two essential findings from this research provide convincing evidence for the recursive splicing model (Fig. 8A). (1) The first splicing takes place exclusively between the authentic 5’ splice site and the 3’ splice site of the inserted RSS (whereas splicing between the 5’ splice site of the inserted RSS and the authentic 3’ splice site does not occur). (2) The substrate corresponding to the first spliced product can be spliced to the final spliced product. These two results indicate, although they do not prove, two-step recursive splicing through the inserted RSS (Fig. 8A). We cannot rule out the possibility that the second splicing takes place independently of, rather than sequentially after, the first splicing. To provide definitive proof of sequential splicing via the RSS, it is essential to detect the lariat RNA arising from the subsequent splicing of the first spliced product using the wild-type (AGGT) substrate (Fig. 8A).
Since the first splicing does not occur in the 3’ half of the intron, we can assume that this lariat RNA is generated exclusively by the second splicing of the first spliced product. We will first try direct detection of these lariat RNAs (either lariat intermediate or lariat intron) by PAGE. Since a higher percentage of polyacrylamide (e.g., 9%) facilitates the detection of lariat RNAs (whose gel mobility is much lower than that of linear RNAs), we will use both the usual 5% and the higher 9% denaturing PAGE to detect them. If the lariat products cannot be detected by this direct method (because of their low yield), we will use sensitive RT-PCR detection. It was shown previously that lariat RNA can be selectively detected by RT-PCR across the branch site (Fig. 8B; Lorsch et al., 1995; Vogel et al., 1997). If an RT-PCR product of the expected size is obtained and its sequence is verified, this would provide final proof of our recursive splicing hypothesis.

Fig. 8. (A) The pathway of two-step recursive splicing and the expected generation of lariat products. (B) The RT-PCR detection of lariat RNA.

ACKNOWLEDGEMENTS

I thank Dr. Hitoshi Suzuki for his helpful advice and patience in training me in the various experiments needed to complete this project. I am grateful to my advisor, Dr. Akila Mayeda, for supporting my research during my rotation period.

REFERENCES

Burge, C.B., Tuschl, T. & Sharp, P.A. (1999). Splicing of precursors to mRNAs by the spliceosomes. In Gesteland, R.F., Cech, T.R. and Atkins, J.F. (ed.), The RNA world, Second edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, pp. 525-560.

Hatton, A.R., Subramaniam, V. & Lopez, A.J. (1998). Generation of alternative Ultrabithorax isoforms and stepwise removal of a large intron by resplicing at exon-exon junctions. Mol. Cell 2, 787-796.

International Human Genome Sequencing Consortium. (2001). Initial sequencing and analysis of the human genome. Nature 409, 860-921.
International Human Genome Sequencing Consortium. (2004). Finishing the euchromatic sequence of the human genome. Nature 431, 931-945.

Krainer, A.R., Maniatis, T., Ruskin, B. & Green, M.R. (1984). Normal and mutant human β-globin pre-mRNAs are faithfully and efficiently spliced in vitro. Cell 36, 993-1005.

Krämer, A. (1996). The structure and function of proteins involved in mammalian pre-mRNA splicing. Annu. Rev. Biochem. 65, 367-409.

Lorsch, J.R., Bartel, D.P. & Szostak, J.W. (1995). Reverse transcriptase reads through a 2'-5' linkage and a 2'-thiophosphate in a template. Nucleic Acids Res. 23, 2811-2814.

Mayeda, A. & Krainer, A.R. (1999a). Preparation of HeLa cell nuclear and cytosolic S100 extracts for in vitro splicing. Methods Mol. Biol. 118, 309-314.

Mayeda, A. & Krainer, A.R. (1999b). Mammalian in vitro splicing assays. Methods Mol. Biol. 118, 315-321.

Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J., Sutton, G.G., Smith, H.O., Yandell, M., Evans, C.A., Holt, R.A., Gocayne, J.D., Amanatides, P., Ballew, R.M., Huson, D.H., Wortman, J.R., Zhang, Q., Kodira, C.D., et al. (2001). The sequence of the human genome. Science 291, 1304-1351.

Vogel, J., Hess, W.R. & Börner, T. (1997). Precise branch point mapping and quantification of splicing intermediates. Nucleic Acids Res. 25, 2030-2031.