RNA
Vol. 418, No. 6894 (11 July 2002)

Cover illustration: Electron micrograph shows nascent pre-rRNA attached to its DNA template from a lysed yeast cell. Image courtesy of Y. Osheim, K. Wehner, A. Beyer and S. Baserga.

RNA. RICHARD TURNER
The antiquity of RNA-based evolution. GERALD F. JOYCE
The chemical repertoire of natural ribozymes. JENNIFER A. DOUDNA AND THOMAS R. CECH
The involvement of RNA in ribosome function. PETER B. MOORE AND THOMAS A. STEITZ
Alternative pre-mRNA splicing and proteome expansion in metazoans. TOM MANIATIS AND BOSILJKA TASIC
RNA interference. GREGORY J. HANNON
Emerging clinical applications of RNA. BRUCE A. SULLENGER AND ELI GILBOA

RNA
RICHARD TURNER, Senior Editor

The DNA molecule, as the primary repository of genetic information in living systems, is constrained to be stable and predictably structured. RNA differs little from DNA in chemical terms, but by contrast contrives to exhibit remarkable conformational flexibility and functional versatility, playing the Oscar Wilde to DNA's Marquess of Queensberry. The past few decades of intensive research have revealed, for example, that RNA physically conveys and interprets the genetic blueprint of every living cell; it performs essential structural roles in a number of molecular machines; its ability to form transient duplexes allows it to work as a switch; and it is likely to work as an essential catalyst in several biologically important reactions.

This Insight comprises an eclectic series of articles on different facets of RNA chemistry and biology. The starting point, appropriately enough, is a discussion of the pivotal role that RNA seems to have played in the evolution of life on Earth, which likely accounts for its wide distribution in present-day organisms. The catalytic properties of RNA enzymes (ribozymes) are then surveyed, and the role of RNA in ribosome structure and function is described as perhaps the best understood example of an RNA–protein assemblage. Moving gradually from chemistry into biology, eukaryotic pre-mRNA splicing is discussed in the context of how it contributes to the diversity of proteins and, ultimately, the cells and tissues they make up.
Exciting recent work on RNA-based gene regulation is then described, hinting that much remains to be learned about the role of RNA in eukaryotic regulatory networks. Finally, current efforts towards the goal of using RNA molecules as therapeutic agents are reviewed. Space constraints mean that many fascinating aspects of RNA structure and function have had to be omitted (the organization and behaviour of certain RNA viruses, for one), but we nonetheless hope that readers will find the articles stimulating and enjoyable.

11 July 2002
Nature 418, 214–221 (2002); doi:10.1038/418214a

The antiquity of RNA-based evolution

GERALD F. JOYCE
Departments of Chemistry and Molecular Biology and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, USA (e-mail: gjoyce@scripps.edu)

All life that is known to exist on Earth today and all life for which there is evidence in the geological record seems to be of the same form, one based on DNA genomes and protein enzymes. Yet there are strong reasons to conclude that DNA- and protein-based life was preceded by a simpler life form based primarily on RNA. This earlier era is referred to as the 'RNA world', during which the genetic information resided in the sequence of RNA molecules and the phenotype derived from the catalytic properties of RNA.

The RNA molecule has a pervasive role in contemporary biology, especially with regard to the most fundamental and highly conserved cellular processes. It is involved as a primer in DNA replication, a messenger that carries genetic information to the translation machinery, and a catalyst that lies at the heart of the ribosome. RNA instructs the processing of precursor messenger RNAs during splicing and editing, and mediates numerous other transactions of RNA and proteins in the cell. Catalytic RNAs (ribozymes) assist in RNA processing events and the replication of viral genomes.
Individual nucleotides serve as important signalling molecules and their coenzyme derivatives participate in most of the reactions of central metabolism. It is as if a primitive civilization had existed prior to the start of recorded history, leaving its mark in the foundation of a modern civilization that followed. Although there may never be direct physical evidence of an RNA-based organism, because the RNA world is likely to have been extinct for almost four billion years, molecular archaeologists have uncovered artefacts of this ancestral era, none more pronounced than the recently reported crystal structure of the ribosome¹⁻³. This structure reveals the face of the RNA world in the active role that RNA has in protein synthesis.

In the laboratory, biochemists have come to appreciate the remarkable structural and functional versatility of RNA. Despite containing only four different chemical subunits, RNA folds into a variety of complex tertiary structures, analogous to structured proteins, and catalyses a broad range of chemical transformations (see review in this issue by Doudna and Cech, pages 222–228). RNA evolution in the laboratory, which can be viewed as a model of RNA evolution in the RNA world, has been used to obtain many new RNA enzymes. These include RNAs that catalyse nucleotide synthesis⁴, RNA polymerization⁵, aminoacylation of transfer RNA⁶ and peptide bond formation⁷. It seems likely that RNA has the capability to support life based on RNA genomes that are copied and maintained through the catalytic function of RNA. There are substantial gaps, however, in scientific understanding concerning how the RNA world arose, the degree of metabolic complexity that it attained, and the way that it led to DNA genomes and protein enzymes.

The dawn of darwinian evolution

The worlds of prebiotic chemistry and primitive biology lie on opposite sides of the defining moment for life, when darwinian evolution first began to operate (Fig. 1).
Before that time, chemical processes may have led to a substantial level of complexity. Depending on the nature of the prebiotic environment, available building blocks may have included amino acids, hydroxy acids, sugars, purines, pyrimidines and fatty acids. These could have combined to form polymers of largely random sequence and mixed stereochemistry (handedness). Some of the polymers may have had special properties, such as adherence to a particular mineral surface, unusual resistance to degradation, or the propensity to form supramolecular aggregates. Eventually every polymer, no matter how stable, would have succumbed to degradation.

Figure 1. Timeline of events pertaining to the early history of life on Earth, with approximate dates in billions of years before the present.

A special class of polymers are those that are capable of self-replication. Although polymer self-replication is often interpreted as involving residue-by-residue copying of the polymer (a view biased by familiarity with nucleic acid replication in biology), all that is actually required is that the polymer gives rise to additional polymer molecules of the same sequence. If the rate of production of new copies exceeds the rate of degradation of existing copies, then a particular polymer sequence will persist over time.

Natural environments are subject to fluctuating conditions, ranging from diurnal and seasonal variation to unpredictable and potentially cataclysmic events. When the environment is altered, the special properties associated with a particular polymer may no longer apply and the capacity for self-replication may be lost. Persistence in a changing environment requires a more general mechanism for self-replication that allows the polymer sequence to change somewhat over time, but retain its heritage in most of the sequence that is unchanged.
The polymer must be replicated in essentially the same manner regardless of its sequence. Variation will arise owing to inevitable copying errors, and those variants too must be amenable to replication.

Once a general mechanism existed for self-replication, allowing the introduction of variation and the ability to replicate those variants, darwinian evolution began to operate. This marked the beginning of life. The special properties of a particular polymer sequence then were defined by its net rate of accumulation (rate of production minus rate of degradation), and sequences that were associated with the most favourable survival rates would have come to dominate their locale. From that point onward, the natural history of life on Earth played out as a succession of dominant polymer sequences and their associated functional properties.

A cluttered path to RNA

RNA is a polymer of variable sequence that is amenable to self-replication by a templating mechanism⁸. Different sequences have different chemical properties, but almost all sequences are able to form Watson–Crick duplex structures that facilitate the production of new copies. If the building blocks of RNA were available in the prebiotic environment, if these combined to form polynucleotides, and if some of the polynucleotides began to self-replicate, then the RNA world may have emerged as the first form of life on Earth⁹,¹⁰. But based on current knowledge of prebiotic chemistry, this is unlikely to have been the case. Ribose, phosphate, purines and pyrimidines all may have been available, although the case for pyrimidines is less compelling¹¹,¹². These may have combined to form nucleotides in very low yield¹³,¹⁴, complicated by the presence of a much larger amount of various nucleotide analogues.
The nucleotides (and their analogues) may even have joined to form polymers, with a combinatorial mixture of 2',5'-, 3',5'- and 5',5'-phosphodiester linkages, a variable number of phosphates between the sugars, D- and L-stereoisomers of the sugars, α- and β-anomers at the glycosidic bond, and assorted modifications of the sugars, phosphates and bases (Fig. 2). It is difficult to visualize a mechanism for self-replication that either would be impartial to these compositional differences or would treat them as sequence information in a broader sense and maintain them as heritable features.

Figure 2. Prebiotic clutter surrounding RNA.

The chief obstacle to understanding the origin of RNA-based life is identifying a plausible mechanism for overcoming the clutter wrought by prebiotic chemistry. Several avenues of investigation are being pursued. Perhaps there were special conditions that led to the preferential synthesis of activated β-D-nucleotides or the preferential incorporation of these monomers into polymers. For example, the prebiotic synthesis of sugars from formaldehyde can be biased by starting from glycolaldehyde phosphate, leading to ribose 2,4-diphosphate as the predominant pentose sugar¹⁵. This reaction can occur starting from dilute aqueous solutions of reactants at near-neutral pH when carried out in the presence of certain metal-hydroxide minerals¹⁶. The polymerization of adenylate, activated as the 5'-phosphorimidazolide, yields 2',5'-linked products in solution, but mostly 3',5'-linked products in the presence of a montmorillonite clay¹⁷. Thus, through a series of biased syntheses, fractionations and other enrichment processes, there may have been a special route to a warm little pond of RNA.

Another approach is to hypothesize that life did not begin with RNA; some other genetic system preceded RNA, just as it preceded DNA and protein (Fig. 1).
This approach has met with substantial progress in recent years, despite the lack of guidance from known metabolic pathways in biology regarding the chemical nature of a precursor to RNA. A systematic investigation of potentially natural nucleic acid analogues containing various sugars and linkage isomers has led to the recognition of some intriguing pairing systems¹⁸. Most notable is the threose nucleic acid (TNA) analogue based on α-L-threofuranosyl units joined by 3',2'-phosphodiester linkages¹⁹ (Fig. 3a). This analogue forms stable Watson–Crick pairs with itself and with RNA. From the point of view of overcoming the clutter of prebiotic chemistry, TNA is more advantageous than RNA because of its relative chemical simplicity. Threose is one of only two aldotetroses (four-carbon sugars) and can only be joined at the 2' and 3' positions. Additionally, it is not difficult to imagine how a 'TNA world' might have made the transition to an RNA world while preserving the continuity of genetic information (see below).

Figure 3. Candidate precursors to RNA during the early history of life on Earth.

There are other interesting candidates for a potential predecessor to RNA. Peptide nucleic acid (PNA) consists of a peptide-like backbone of N-(2-aminoethyl)glycine units with the bases attached through a methylenecarbonyl group²⁰ (Fig. 3b). Aminoethylglycine has been synthesized in spark discharge reactions from nitrogen, ammonia, methane and water²¹, although the prebiotic synthesis of an entire PNA monomer has not been achieved. PNA forms Watson–Crick-like duplex structures with itself and with RNA. Even though it is non-chiral, PNA is susceptible to cross-inhibition of the opposing enantiomers when directing the polymerization of activated D,L-ribonucleotides²²,²³. Furthermore, PNA monomers can undergo an intramolecular N-acyl transfer reaction that would prevent any conventional mechanism for their polymerization²⁴.
Two other proposals for what might have come before RNA are glycerol-derived nucleic acid analogues²⁵⁻²⁹ (Fig. 3c) and pyranosyl-RNA (containing 4',2'-linked β-D-ribopyranosyl units)³⁰,³¹ (Fig. 3d), although neither has garnered sufficient experimental support to be considered a strong candidate.

It is also possible that RNA-based life was preceded by a replicating, evolving polymer that bore no resemblance to nucleic acids. Self-replication without darwinian evolution has been demonstrated for certain peptides³² and even small organic compounds³³. Why not cast the net broadly and consider any polymer that is capable of self-replication? A critical issue then becomes whether there is a sufficient diversity of polymer sequences that can be replicated faithfully to provide the basis for darwinian evolution. Nucleic acids have the great advantage that their potential to act as a template is sequence independent, but the templating properties of a particular nucleic acid molecule are highly sequence specific. Peptide replication based on templating within a complex of α-helices offers more restricted choices of distinct self-replicating entities, but perhaps enough to sustain a lineage of compounds in the face of a changing environment.

A more radical suggestion is that the first form of life was not based on organic polymers at all, but rather on inorganic clays³⁴. Information would be represented by the distribution of charges or shapes along the surface of the clay, and replication would involve copying that information to newly formed clay layers. Suggestions of this kind challenge chemists to think more broadly about the nature of heritable chemical information and to devise experiments to test these ideas.

The transition to RNA from whatever might have preceded it would have had a very different character depending on whether the predecessor was a nucleic acid-like molecule.
If the predecessor was able to cross-pair with RNA then the transition may have been a gradual one. Genetic information could have been preserved by 'transcription' of the pre-RNA to RNA, conferring selective advantage based on the function of the transcribed molecules. Once the RNA became self-replicating, it could have usurped the role of genetic material and the pre-RNA would have become expendable. If the predecessor was not a nucleic acid-like molecule, the appearance of RNA might have involved either a 'translation' process, adapting pre-RNA-based information to RNA-based information, or a 'genetic takeover'³⁵ in which none of the genetic information in pre-RNA was passed on to RNA.

Catalytic activity that resided in a pre-RNA molecule, even if that molecule was very similar to RNA, would not be expected to carry over to RNA without further evolutionary refinement. However, a pre-RNA catalyst that resembled RNA might be pre-adapted to evolve a specific function when prepared as the corresponding RNA because of preserved features of its secondary and tertiary structure. The catalytic potential of TNA, PNA and other proposed precursors to RNA has not yet been explored, but any cogent hypothesis regarding pre-RNA life must consider whether that prior genetic system could have facilitated the appearance of RNA.

Once RNA appeared and became beneficial to a system undergoing darwinian evolution, further evolutionary innovation pertaining to the synthesis and utilization of RNA would be expected to follow. In this way, pre-RNA life may have helped to overcome the problems of clutter in prebiotic synthesis by providing solutions discovered through natural selection. Eventually RNA molecules would have become responsible for ensuring the availability and replicability of RNA, ushering in the era of RNA-based darwinian evolution.
RNA-catalysed RNA replication

The general features of RNA-based life can be inferred by considering the requirements for darwinian evolution and the biochemical properties of RNA. The central process of the RNA world was the replication of RNA, presumably catalysed by RNA. The most widely studied, but by no means exclusive, model for RNA replication involves template-directed polymerization of activated mononucleotides. Alternatively, replication may have involved the joining of oligonucleotides or even larger subunits³⁶, perhaps by modular assembly rather than organization along a linear template. The standard model, however, is most congruent with known biological systems and illustrates the requirements for RNA-catalysed RNA replication.

Nucleotides can be activated in several ways. The biological strategy of using nucleoside 5'-triphosphates (NTPs) is especially appealing because the linkage between the α- and β-phosphates provides a strong thermodynamic driving force (standard free energy of hydrolysis at pH 7 of about −10 kcal mol⁻¹), yet is kinetically stable in typical aqueous environments (k_hydrolysis ≈ 10⁻¹⁰ min⁻¹ at pH 7 and 37 °C)³⁷,³⁸. Furthermore, polymerization of NTPs is accompanied by release of inorganic pyrophosphate, a small molecule that can readily diffuse away from the reaction centre, avoiding product inhibition. Because of its kinetic stability, however, the α,β-phosphoester reacts slowly with the 3'-hydroxyl of RNA unless the reaction is catalysed. The uncatalysed rate of joining two adjacent template-bound oligonucleotides, one bearing a 2',3'-hydroxyl and the other a 5'-triphosphate, is only 10⁻⁷ min⁻¹ at pH 7 and 37 °C (ref. 39). This is comparable to the rate of hydrolysis of a single RNA phosphodiester under the same reaction conditions⁴⁰.

There is no known ribozyme in biology that catalyses the template-directed polymerization of NTPs, but such molecules have been obtained using test-tube evolution.
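
To put the kinetic figures quoted above on a more familiar scale, the following minimal sketch converts a rate constant in min⁻¹ into a half-life in years. It assumes simple first-order kinetics (half-life = ln 2 / k), which is an illustrative simplification and not a statement from the text; the function name `half_life_years` is hypothetical.

```python
import math

# Minutes in an average (Julian) year.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def half_life_years(rate_per_min):
    """Half-life of a first-order process, ln(2)/k, converted to years.

    Assumption: the quoted rate behaves as a first-order rate constant.
    """
    return math.log(2) / rate_per_min / MINUTES_PER_YEAR


# Rates quoted in the text (pH 7, 37 degrees C).
ntp_hydrolysis = half_life_years(1e-10)       # NTP hydrolysis
uncatalysed_joining = half_life_years(1e-7)   # uncatalysed ligation

print(f"NTP hydrolysis half-life: about {ntp_hydrolysis:,.0f} years")
print(f"Uncatalysed joining half-life: about {uncatalysed_joining:.1f} years")
```

Under this assumption the triphosphate would persist for thousands of years, while uncatalysed joining would take over a decade per half-life, which illustrates why a catalyst is needed for polymerization to proceed on a useful timescale.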
Like the evolution of organisms in nature, evolution of RNA in the laboratory involves repeated rounds of selective amplification, linking the survival of an RNA species to its fitness⁴¹⁻⁴³. In the laboratory, fitness is defined by the experimenter, for example, based on the ability of RNA to catalyse a particular chemical reaction. Molecules that have been selected as a consequence of their function are then amplified using standard molecular biology techniques, typically reverse transcription followed by amplification using the polymerase chain reaction and then forward transcription. Random mutations may be introduced during the amplification process in order to maintain variation in the population. Through repeated rounds of selective amplification and mutation, a population of RNA molecules can be evolved to perform a defined task, provided that they have the capacity to do so.

There are several examples of in vitro-evolved ribozymes that catalyse the template-directed joining of an oligonucleotide 3'-hydroxyl and oligonucleotide 5'-triphosphate. The best studied of these is the class I ligase, first isolated almost ten years ago⁴⁴ (Fig. 4a). It contains an internal template region that binds a complementary RNA substrate, and directs attack of the 3'-hydroxyl of the substrate on the 5'-triphosphate of the ribozyme, forming a 3',5'-phosphodiester⁴⁵. The ribozyme contains 120 nucleotides and operates with a catalytic rate of 100 min⁻¹, corresponding to a rate enhancement of 10⁹-fold compared to the uncatalysed reaction. The class I ligase also catalyses the polymerization of NTPs, adding up to three residues to the 3' end of an RNA primer in a template-directed manner⁴⁶ (Fig. 4b).

Figure 4. Successive phases in the in vitro evolution of an RNA polymerase ribozyme.
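
The cycle of selection, amplification and mutation described above can be illustrated with a toy simulation. This is only a sketch under loud assumptions: fitness is a made-up score (the number of positions matching an arbitrary target sequence), selection simply keeps the top-scoring fraction of the pool, and amplification is modelled as error-prone copying. It does not model reverse transcription, PCR, transcription or any real ribozyme chemistry, and all names and parameter values are hypothetical.

```python
import random

BASES = "ACGU"


def random_sequence(length, rng):
    """A random RNA sequence of the given length."""
    return "".join(rng.choice(BASES) for _ in range(length))


def fitness(seq, target):
    """Hypothetical fitness: number of positions that match the target."""
    return sum(a == b for a, b in zip(seq, target))


def copy_with_errors(seq, error_rate, rng):
    """Copy a sequence; each base is redrawn at random with probability
    error_rate (a redraw may return the same base by chance)."""
    return "".join(
        rng.choice(BASES) if rng.random() < error_rate else base
        for base in seq
    )


def evolve(pool_size=200, length=20, rounds=10, keep_fraction=0.2,
           error_rate=0.01, seed=0):
    """Run repeated rounds of selection and error-prone amplification.

    Returns the mean fitness of the pool before the first round and
    after each subsequent round.
    """
    rng = random.Random(seed)
    target = random_sequence(length, rng)
    pool = [random_sequence(length, rng) for _ in range(pool_size)]
    mean_fitness = [sum(fitness(s, target) for s in pool) / pool_size]
    n_keep = max(1, int(pool_size * keep_fraction))
    for _ in range(rounds):
        # Selection: keep the molecules that score best on the chosen task.
        ranked = sorted(pool, key=lambda s: fitness(s, target), reverse=True)
        survivors = ranked[:n_keep]
        # Amplification with mutation: refill the pool from the survivors.
        pool = [copy_with_errors(rng.choice(survivors), error_rate, rng)
                for _ in range(pool_size)]
        mean_fitness.append(sum(fitness(s, target) for s in pool) / pool_size)
    return mean_fitness


history = evolve()
print("mean fitness per round:", [round(x, 2) for x in history])
```

In this toy setting the mean score of the pool rises over successive rounds, mirroring the idea that a population can be evolved towards a task defined by the experimenter. Raising `error_rate` substantially would be expected to erode the gains, which connects to the fidelity requirements discussed later in the text.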
The class I ligase was used as a starting point for further evolution experiments, resulting in a ribozyme with much more robust NTP polymerization activity⁵ (Fig. 4c). The final evolved ribozyme contains 200 nucleotides and catalyses extension of an RNA primer on an external RNA template, adding up to 14 successive nucleotides in 24 hours. It is general with respect to the template sequence, yet operates with an average fidelity of 97% per nucleotide in copying the template sequence to that of a complementary product. This activity is not sufficient to support the RNA-catalysed replication of RNAs that are as large as the catalyst itself. However, there does not seem to be any fundamental obstacle to achieving the required level of activity, provided there exists a sufficiently powerful evolutionary search procedure.

There are likely to be many different RNA molecules that are capable of catalysing the template-directed polymerization of NTPs. The hc ligase ribozyme, also obtained by in vitro evolution⁴⁷, has no significant structural or sequence similarity to the class I ligase, but also catalyses the extension of an RNA primer on an external RNA template⁴⁸. It adds only one or two nucleotides, but so far has been selected for ligation rather than polymerization activity. Two other in vitro-evolved ribozymes, the L1 and R3 ligases, catalyse formation of a 3',5'-phosphodiester on an internal, but not external, template⁴⁹,⁵⁰. The latter ribozyme is notable because it contains only three of the four nucleotides, completely lacking cytidine. Other ligases have been obtained that catalyse formation of a 2',5'- rather than 3',5'-phosphodiester, the ligases themselves being composed of 3',5'-linked RNA⁴⁵,⁵¹. A thorough exploration of the vast number of possible RNA sequences would be expected to produce numerous 3',5' ligases, many of which could be evolved into NTP polymerases.
Each of these would be tolerant of substantial sequence variation by replacing base pairs in stem regions with other Watson–Crick pairs and substituting non-critical residues within loop regions by different nucleotides.

Although a very large number of RNA polymerase ribozymes might be possible, collectively they would comprise only a tiny fraction of the huge number of possible RNA sequences. For RNA molecules that contain 100 nucleotides, there are 4¹⁰⁰ (≈10⁶⁰) possible sequences. A pool of one copy each of these molecules would have a mass greater than 10¹³ times that of the Earth. A pool of one copy each of all possible 40mers, with a mass of 26 kg, just might be achievable, but it is not clear if 40 nucleotides are sufficient to provide robust RNA polymerase activity.

The ribozyme would be required not only to perform the chemistry of polymerization, but also to do so with sufficient fidelity to maintain the selected sequence information over successive generations. Occasional mutations are needed to maintain variability in an evolving population, but too many mutations make it impossible to retain an advantageous genotype. There is a well-established theoretical framework for assessing the effect of genome size, replication rate and replication fidelity on the ability to maintain heritable genetic information⁵². As a rule of thumb, the error rate of replication per nucleotide must be no more than about the inverse of genome length, corresponding to 99% fidelity for replication of a 100mer and 97.5% fidelity for replication of a 40mer. There may be polymerase ribozymes that meet these requirements, although such molecules have not yet been demonstrated.
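
The sequence-space and fidelity figures above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the average nucleotide residue mass (330 g/mol) and the mass of the Earth (5.97 × 10²⁴ kg) are assumed round values not given in the text, the error-free probability assumes independent errors at each position, and the function names are hypothetical.

```python
AVOGADRO = 6.022e23            # molecules per mole
EARTH_MASS_KG = 5.97e24        # assumed approximate mass of the Earth
# Assumption: average mass of one RNA nucleotide residue, in g/mol.
NUCLEOTIDE_MASS_G_PER_MOL = 330.0


def sequence_count(length):
    """Number of distinct RNA sequences of a given length (four bases)."""
    return 4 ** length


def pool_mass_kg(length):
    """Mass of a pool holding exactly one copy of every possible sequence."""
    mass_per_molecule_g = length * NUCLEOTIDE_MASS_G_PER_MOL / AVOGADRO
    return sequence_count(length) * mass_per_molecule_g / 1000.0


def min_fidelity(length):
    """Rule of thumb from the text: error rate per nucleotide no more
    than about 1/length, so fidelity of at least 1 - 1/length."""
    return 1.0 - 1.0 / length


def error_free_probability(fidelity, length):
    """Chance that a copy of `length` nucleotides has no errors,
    assuming errors occur independently at each position."""
    return fidelity ** length


for n in (40, 100):
    print(f"{n}mer: {float(sequence_count(n)):.2e} sequences, "
          f"pool mass {pool_mass_kg(n):.2e} kg "
          f"({pool_mass_kg(n) / EARTH_MASS_KG:.2e} Earth masses), "
          f"minimum fidelity {min_fidelity(n):.3f}")

# The 97% per-nucleotide fidelity reported for the evolved polymerase,
# applied to a 14-nucleotide extension and to a 200-nucleotide copy.
print(f"error-free 14-nt extension: {error_free_probability(0.97, 14):.2f}")
print(f"error-free 200-nt copy: {error_free_probability(0.97, 200):.4f}")
```

With these assumed constants the 40mer pool comes out at roughly 26 kg and the 100mer pool at more than 10¹³ Earth masses, consistent with the figures quoted. The same rule of thumb applied to a 200-nucleotide ribozyme would call for about 99.5% fidelity, well above the 97% reported for the evolved polymerase.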
The above discussion ignores other obstacles to RNA-catalysed RNA replication, such as maintaining a supply of activated mononucleotides, ensuring that the ribozyme will recognize its corresponding genomic RNA while ignoring other RNAs in the environment, overcoming stable self-structure within the template strand, separating the template and product strands, and operating in a similar manner on the product strand to generate new copies of the template. Additional genetic information might be required to overcome these obstacles, but a longer genome would necessitate an even higher fidelity of replication. Mitigating these demands is the likelihood that RNA polymerase activity first arose in a pre-RNA world. The earliest RNA polymerases need not have been responsible for replicating entire RNA genomes, but merely for generating RNAs that enhanced the fitness of pre-RNA-based life. Further evolutionary innovation could have occurred by exploring sequences related to these functional polymerases, rather than a much broader search of all possible sequences.

Metabolic function in the RNA world

Although the central process of the RNA world was the replication of RNA genomes, some form of metabolism must have supported the process. In keeping with the second law of thermodynamics, the increase in order that occurs in a genetic system is achieved through the expenditure of high-energy starting materials that are converted to lower-energy products. The incorporation of NTPs into an RNA polymer would qualify as a simple metabolism, although one would need to account for the source of the high-energy NTPs. Some of the starting materials may have been provided by the environment, for example, ribose and other sugars, inorganic polyphosphate, and the building blocks of purines and pyrimidines. Chemical processes in the environment may have led to more complex compounds, drawing on natural energy sources such as sunlight, electric discharges and geothermal activity.
These reactions would not be considered part of metabolism, because they would not be carried out by genetically encoded catalysts. The final touches, however, leading to specific chemical organization, probably would have required the assistance of evolved catalysts. In the RNA world, those catalysts are assumed to have been ribozymes.

Ribozymes that catalyse some of the steps of nucleotide synthesis have been obtained by in vitro evolution (Fig. 5). One such ribozyme catalyses the formation of a nucleotide from a pyrimidine and activated ribose⁴. This is a notoriously difficult reaction in prebiotic chemistry⁵³, but was achieved starting with a pool of random-sequence RNAs that were tethered to 5-phosphoribosyl-1-pyrophosphate and allowed to react with 4-thiouracil. The evolved ribozyme performs the reaction with a catalytic rate of 0.1 min⁻¹ and a rate enhancement of >10⁷-fold compared to the uncatalysed reaction. Another in vitro-evolved ribozyme catalyses 5'-phosphorylation of polynucleotides, using ATP-γ-S (or ATP) as the phosphate donor⁵⁴. It operates with a catalytic rate of 0.2 min⁻¹ (or 0.003 min⁻¹ with ATP) and a catalytic rate enhancement of 10⁹-fold. Yet another ribozyme catalyses activation of the 5'-phosphate by attachment of a 5',5'-pyrophosphate-linked nucleotide⁵⁵. This linkage is less energetic than the α,β-phosphoanhydride of an NTP. However, ribozymes have been obtained that catalyse template-directed ligation of RNA driven by release of adenylate from a terminal adenosine-5',5'-pyrophosphate⁵⁶.

Figure 5. Hypothetical pathway for RNA-catalysed synthesis of RNA.

There are several important reactions in nucleotide synthesis that have not yet been carried out with a ribozyme (Fig. 5). The formation of ribose from simple aldehydes would be a significant achievement, requiring a ribozyme that catalyses a substrate-specific aldol condensation.
The formation of alkylphosphates, such as glycerol phosphate or phosphorylethanolamine, would be notable, especially if the source of phosphate was a mineral or some other compound that was abundant in the environment. RNA is adept at catalysing phosphoryl transfer reactions, so it is not difficult to imagine how the phosphate, once mobilized, then would be transferred to ribose or other compounds. The synthesis of purines (for example, from cyanates) and pyrimidines (for example, from carbamoyl phosphate and aspartate) would be important as well.

The possibility of a more complex RNA-based metabolism is purely conjectural. That said, one could imagine that all of the reactions of central metabolism, now catalysed by protein enzymes, were once catalysed by ribozymes. It has been suggested that the nucleotide-derived coenzymes, which have a prominent role in most of these reactions today, are remnants of an earlier RNA-based metabolism⁵⁷. Another extreme but opposite point of view is that the only catalytic function of RNA in the RNA world was to direct the synthesis of encoded polypeptides. The synthesis of nucleotides and the template-directed polymerization of RNA may have been the responsibility of pre-RNA catalysts, with RNA merely serving as a messenger, aminoacyl adaptor and peptidyl transferase catalyst, just as it does today.

How does one assess the likelihood that a putative RNA-based function did in fact exist in the RNA world? First, it must fall within the capabilities of RNA, preferably bolstered by an experimental demonstration. Ribozymes have been obtained through in vitro evolution that catalyse a broad range of chemical reactions, including acyl transfer⁵⁸,⁵⁹, N- and S-alkylation⁶⁰,⁶¹, carbon–carbon bond formation⁶²,⁶³, amide bond formation⁶⁴ and Michael addition⁶⁵.
But controlling a free radical within a hydrophobic pocket is likely to be beyond the capabilities of RNA, leading some to suggest that ribozymes never were responsible for the conversion of ribose to deoxyribose66, 67. Second, the RNA-catalysed reaction must have provided some selective advantage that was not contingent on future evolutionary developments. For example, a ribozyme that catalysed formation of ribosyl-1-amine from ribosyl-1-pyrophosphate and glutamine may have been selected based on the utility of the amino-sugar, but not with regard to its eventual role as an intermediate in purine biosynthesis. If both of these requirements are met, then one should consider whether there is any evidence for the RNA-based function in contemporary biology or the geological record. It is possible, of course, that the function existed in the RNA world, but left no trace that can be detected today. Conversely, an RNA-based function that exists in contemporary biology need not have arisen in the RNA world. If, however, that function is widely distributed across all three kingdoms of life and uses RNA in a way that is not uniquely beholden to the chemical properties of RNA, then the argument that it is a remnant of the RNA world becomes more persuasive68. The instructed synthesis of proteins is a strong candidate for a function that existed in the RNA world. A similar rationale has been applied to suggest that tetrapyrrole biosynthesis may have arisen in the RNA world68. The C5 pathway for the synthesis of 5-aminolevulinic acid leading to the tetrapyrroles is well represented in all three kingdoms. It uses glutamyl-transfer RNA for the synthesis of glutamate-1-semialdehyde69, although other glutamyl esters would do just as well. In contrast, self-splicing introns are broadly distributed in contemporary biology but take special advantage of the base-pairing properties of RNA, and thus should be viewed more agnostically.
Other RNA-based functions for which there is no evidence in biology, such as nucleotide synthesis and RNA polymerization, are assumed to have existed in the RNA world based on first principles, but it is important to recognize that this assumption is not supported by available historical evidence.

It is often said, again based on first principles rather than historical evidence, that RNA-based life must have entailed some form of cellular compartmentalization70, 71. This would be advantageous for keeping together an RNA replicase ribozyme and its corresponding genomic RNA, and more generally for retaining the fruits of an RNA-based metabolism for the benefit of the system that produced them. The notion of cellular compartmentalization should not be taken too literally: although all cells in contemporary biology are surrounded by a membrane composed of amphipathic lipids, there are other ways to achieve the preferential association of an ensemble of compounds. It has been proposed, for example, that there were organizing centres, analogous to modern ribosomes, where various RNAs and small molecules came together through non-covalent or transient covalent interactions72. Small organic molecules could have been esterified to RNA and these carrier-linked metabolites could have been held in close proximity through RNA–RNA interactions or organization along a surface. The same outcome could be achieved by passive compartmentalization on the surface of fine particulate matter, within aerosol particles in the atmosphere73, or within the pores of a rock. Even if the RNA world (or pre-RNA world) synthesized compartments, those compartments might have been assembled from something other than complex phospholipids, such as nucleic acids, alternating polypeptides that form β-sheets74, or simple terpenoids75.
Transition to the DNA–protein world

Although RNA is well suited as a genetic molecule and can evolve to perform a broad range of catalytic tasks, it has limited chemical functionality and thus may not be equipped to meet certain challenges and opportunities that arise in the environment. An important innovation of life on Earth was the development of a separate macromolecule that would be responsible for most catalytic functions, even though that molecule contained subunits that were poorly suited for replication. The invention of protein synthesis, instructed and catalysed by RNA, was the crowning achievement of the RNA world, but also began its demise.

RNA is capable of performing all of the reactions of protein synthesis (Fig. 6). The messenger, transfer and ribosomal RNA molecules that exist in all known organisms direct the assembly of specific polypeptide sequences, instructed by corresponding RNA sequences. The activation of amino acids in the form of aminoacyl adenylates, and subsequent transfer of the amino acids to the 2'(3') terminus of tRNAs, are catalysed in modern biology by the set of 20 aminoacyl-tRNA synthetase proteins. These reactions also have been achieved with in vitro-evolved ribozymes. A ribozyme that contains 110 nucleotides catalyses addition of either leucine or phenylalanine to its own 5' terminus, forming an aminoacyl-nucleotide anhydride76. Another ribozyme catalyses aminoacylation of its own 2'(3') terminus, using various aminoacyl-adenylate substrates77, 78. Yet another ribozyme catalyses aminoacylation of tRNAs that are either covalently attached to the ribozyme or provided as a separate substrate79.

Figure 6 Hypothetical pathway for RNA-catalysed protein synthesis.
The final step of protein synthesis involves binding aminoacyl and peptidyl oligonucleotides at adjacent positions along an RNA template and catalysing peptide bond formation through attack of the α-amine of the amino acid on the carbonyl of the peptidyl ester. From a chemical perspective this is the easiest step, proceeding spontaneously once the reactants are brought into close proximity. It has been shown, for example, that when 2'(3')-glycyl adenosine is bound to a complementary template, peptide bond formation ensues, giving rise to diglycine, which then cyclizes to form a diketopiperazine80. The modern ribosome achieves high template occupancy and precise orientation of the aminoacyl- and peptidyl-tRNAs, and may use additional catalytic strategies in promoting peptide bond formation81. It is not difficult to visualize how RNA alone could carry out this reaction; in fact, the crystal structure of the ribosome reveals a peptidyl transferase site that is composed entirely of RNA (ref. 82, and see review in this issue by Moore and Steitz, pages 229–235).

In vitro evolution has been used to develop ribozymes that catalyse peptide bond formation. Beginning with a pool of random-sequence RNAs with phenylalanine tethered to their 5' end, molecules were selected based on their ability to react with a methionyl-adenylate substrate to form a tethered dipeptide7. One of the evolved ribozymes contains 190 nucleotides and has a catalytic rate of 0.1 min⁻¹. It accepts several different aminoacyl adenylates, preferring methionine, and can be made to operate with multiple turnover by providing the phenylalanine substrate tethered to a short oligonucleotide rather than to the ribozyme itself83. Just as the class I ligase ribozyme has been evolved to function as an RNA polymerase, this peptidyl transferase ribozyme might be evolved to form multiple peptide bonds in succession.
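The catalytic rates quoted in this section are easier to compare when converted to half-times. The following is a minimal sketch in Python, assuming that each quoted rate can be treated as a first-order rate constant (so that t1/2 = ln 2 / k) and that dividing it by the quoted rate enhancement gives the implied uncatalysed rate. The ribozyme labels and function names are illustrative choices, not terms taken from the cited studies; the numerical inputs are the values given in the text.

```python
import math

# (label, rate in min^-1, fold rate enhancement or None if none is quoted).
# Labels are illustrative; the numbers are those quoted in the text.
RIBOZYMES = [
    ("nucleotide synthesis", 0.1, 1e7),        # quoted as ">10^7-fold", a lower bound
    ("5'-phosphorylation (ATP-gamma-S)", 0.2, 1e9),
    ("peptide bond formation", 0.1, None),     # no enhancement quoted
]


def half_time_min(k_per_min):
    """Half-time of a first-order process, t1/2 = ln 2 / k (assumed kinetics)."""
    return math.log(2) / k_per_min


def uncatalysed_rate(k_cat, enhancement):
    """Uncatalysed rate implied by a catalysed rate and a fold enhancement."""
    return k_cat / enhancement


for label, k, enhancement in RIBOZYMES:
    summary = f"{label}: t1/2 = {half_time_min(k):.1f} min"
    if enhancement is not None:
        k_uncat = uncatalysed_rate(k, enhancement)
        summary += f", implied uncatalysed rate = {k_uncat:.0e} per min"
    print(summary)
```

Because the enhancement for the nucleotide-forming ribozyme is quoted only as a lower bound, the uncatalysed rate computed for that entry should be read as an upper bound.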
It is not known whether the invention of protein synthesis preceded or followed the invention of DNA genomes. The primary advantage of DNA over RNA as a genetic material is the greater chemical stability of DNA, allowing much larger genomes based on DNA. Protein synthesis may require more genetic information than can be maintained by RNA. However, the original aminoacyl adaptor molecules may have been smaller in size and fewer in number than contemporary tRNAs84, and if the aminoacylation and peptidyl transfer activities were far less sophisticated than in the modern ribosome, an RNA genome of only a few thousand nucleotides might have been sufficient for protein synthesis. The chief argument in favour of proteins before DNA is that ribozymes seem to be incapable of catalysing the reduction of ribose to deoxyribose through the same mechanism used by all known ribonucleotide reductase proteins66, 67. It is conceivable, however, that RNA used a different mechanism, perhaps involving reduction of an attached purine followed by acid-catalysed elimination of the ribose 2'-hydroxyl (A. Eschenmoser, personal communication).

The template-directed polymerization of DNA is more difficult than for RNA because the 3'-hydroxyl of DNA has substantially lower acidity than that of RNA85. Nonetheless, ribozymes are capable of deprotonating a DNA 3'-hydroxyl, allowing nucleophilic attack on phosphate to form a 3',5'-phosphodiester linkage86. It is not difficult to imagine that a ribozyme could function as a DNA polymerase. Such a molecule might arise in nature or in the laboratory as an evolutionary descendant of an RNA polymerase ribozyme. RNA-based information could be reverse transcribed to DNA for safe keeping, then read back to RNA by a DNA-dependent RNA polymerase. Eventually the DNA molecules became the objects of replication, completing the transition to the DNA–protein world.

A largely open question concerns the origin of the genetic code.
The aminoacylation of RNA initially must have provided some selective advantage unrelated to the eventual development of a translation machinery. It has been proposed, for example, that aminoacylation protected RNA from degradation87, anchored RNA in an advantageous environment87, marked genomic RNAs for replication87, 88, or enhanced the catalytic properties of RNA89. Some amino acids would have been especially useful in these roles, leading to selective aminoacylation with one or perhaps a few related amino acids. Different RNAs may have been aminoacylated with different amino acids, providing the basis for a family of precursors to the modern aminoacyl-tRNAs. The RNA component of these precursor molecules may have been as simple as a stem-loop structure84, 90 or as elaborate as a ribozyme that catalyses its own aminoacylation77, 79. The next step towards the origin of the genetic code was the formation of peptide bonds between amino acids that were attached to RNA. The products of this reaction must have conferred some selective advantage, even though the peptides probably would have been too small and too heterogeneous in sequence to function as catalysts. Instead, they might have served as cofactors for ribozymes91, 92 or been more effective than amino acids for any of the roles suggested above. RNA-catalysed peptide bond formation would have resulted in a large number of possible peptide sequences, and even this mixture may have been useful87. However, the development of a crude mechanism for controlling the diversity of possible peptides would have been advantageous, and progressive refinement of that mechanism would have provided further selective advantage. It is reasonable to postulate that, like the modern translation apparatus, the ancestral translation system made use of messenger-like RNA molecules to gather aminoacyl-RNAs in a specific order through Watson–Crick pairing interactions. 
It is not clear, however, how the detailed assignments of the genetic code were made.

RNA has several features that make it suitable as the basis for a simple darwinian system: it contains only four different subunits with very similar chemical properties, its subunits polymerize readily when activated and bound to a complementary template, it is a polyanion that is readily soluble in water almost irrespective of sequence, it forms simple secondary structures that are highly tolerant of sequence variation, and it can adopt entirely different structures following the acquisition of a few critical mutations93, 94. These same features make it less sophisticated compared to its DNA and protein successors. The lower reactivity but greater stability of DNA makes it a better choice for the genetic material, whereas the greater chemical diversity of the subunits of proteins, including anionic, cationic and hydrophobic groups, makes protein a better choice as the basis for catalytic function. However, those more sophisticated molecules could not have arisen without the foundation that had been laid by RNA.

Outlook

The reign of the RNA world on Earth probably began no more than about 4.2 billion years ago and ended no less than about 3.6 billion years ago95. It may have occupied only a small portion of that interval, with the pre-RNA world having come before. Insight into the origin and operation of the RNA world is largely inferential, based on the known chemical and biochemical properties of RNA. In the best of circumstances those inferences are supported by examining the role of RNA in contemporary biology. Without that support one must be careful not to draw detailed conclusions regarding these historical events. Future studies will sharpen the picture of ancestral RNA-based life through combined efforts in prebiotic chemistry, in vitro evolution, biochemical analysis and molecular phylogenetics.
It should be possible to formulate more precise boundary conditions regarding the environmental conditions of the early Earth and the types of chemical reactions that would have occurred under those conditions. Additional catalytic RNAs are likely to be found in biology and undoubtedly many more will be discovered through test-tube evolution. The construction of artificial RNA-based life from synthetic oligonucleotides is a distinct possibility71, and there even is a chance that a remnant of the RNA world will be found lurking in some special contemporary microenvironment.

References

1. Ban, N., Nissen, P., Hansen, J., Moore, P. B. & Steitz, T. A. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289, 905-920 (2000).
2. Wimberly, B. T. et al. Structure of the 30S ribosomal subunit. Nature 407, 327-338 (2000).
3. Yusupov, M. et al. Crystal structure of the ribosome at 5.5 Å resolution. Science 292, 883-896 (2001).
4. Unrau, P. J. & Bartel, D. P. RNA-catalysed nucleotide synthesis. Nature 395, 260-263 (1998).
5. Johnston, W. K., Unrau, P. J., Lawrence, M. S., Glasner, M. E. & Bartel, D. P. RNA-catalyzed RNA polymerization: accurate and general RNA-templated primer extension. Science 292, 1319-1325 (2001).
6. Lee, N., Bessho, Y., Wei, K., Szostak, J. W. & Suga, H. Ribozyme-catalyzed tRNA aminoacylation. Nature Struct. Biol. 7, 28-33 (2000).
7. Zhang, B. & Cech, T. R. Peptide bond formation by in vitro selected ribozymes. Nature 390, 96-100 (1997).
8. von Kiedrowski, G. A self-replicating hexadeoxynucleotide. Angew. Chem. 25, 932-935 (1986).
9. Gilbert, W. The RNA world. Nature 319, 618 (1986).
10. Joyce, G. F. RNA evolution and the origins of life. Nature 338, 217-224 (1989).
11. Ferris, J. P., Sanchez, R. A. & Orgel, L. E. Studies in prebiotic synthesis III. Synthesis of pyrimidines from cyanoacetylene and cyanate. J. Mol. Biol. 33, 693-704 (1968).
12. Robertson, M. P. & Miller, S. L. An efficient prebiotic synthesis of cytosine and uracil. Nature 375, 772-774 (1995).
13. Lohrmann, R. & Orgel, L. E. Prebiotic synthesis: phosphorylation in aqueous solution. Science 161, 64-66 (1968).
14. Fuller, W. D., Sanchez, R. A. & Orgel, L. E. Studies in prebiotic synthesis. VI. Synthesis of purine nucleosides. J. Mol. Biol. 67, 25-33 (1972).
15. Müller, D. et al. Chemie von α-Aminonitrilen. Aldomerisierung von Glykolaldehydphosphat zu racemischen Hexose-2,4,6-triphosphaten und (in Gegenwart von Formaldehyd) racemischen Pentose-2,4-diphosphaten: rac.-Allose-2,4,6-triphosphat und rac.-Ribose-2,4-diphosphat sind die Reaktionshauptprodukte. Helv. Chim. Acta 73, 1410-1468 (1990).
16. Krishnamurthy, R., Pitsch, S. & Arrhenius, G. Mineral induced formation of pentose-2,4-bisphosphates. Origins Life Evol. Biosph. 29, 139-152 (1999).
17. Ferris, J. P. & Ertem, G. Oligomerization of ribonucleotides on montmorillonite: reaction of the 5'-phosphorimidazolide of adenosine. Science 257, 1387-1389 (1992).
18. Eschenmoser, A. Chemical etiology of nucleic acid structure. Science 284, 2118-2124 (1999).
19. Schöning, K.-U. et al. Chemical etiology of nucleic acid structure: the α-threofuranosyl-(3'→2') oligonucleotide system. Science 290, 1347-1351 (2000).
20. Nielsen, P. E., Egholm, M., Berg, R. H. & Buchardt, O. Sequence-selective recognition of DNA by strand displacement with a thymine-substituted polyamide. Science 254, 1497-1500 (1991).
21. Nelson, K. E., Levy, M. & Miller, S. L. Peptide nucleic acids rather than RNA may have been the first genetic molecule. Proc. Natl Acad. Sci. USA 97, 3868-3871 (2000).
22. Joyce, G. F. et al. Chiral selection in poly(C)-directed synthesis of oligo(G). Nature 310, 602-604 (1984).
23. Schmidt, J. G., Nielsen, P. E. & Orgel, L. E. Enantiomeric cross-inhibition in the synthesis of oligonucleotides on a nonchiral template. J. Am. Chem. Soc. 119, 1494-1495 (1997).
24. Eriksson, M. et al. Sequence dependent N-terminal rearrangement and degradation of peptide nucleic acid (PNA) in aqueous solution. New J. Chem. 22, 1055-1059 (1998).
25. Spach, G. Chiral versus chemical evolutions and the appearance of life. Origins Life Evol. Biosph. 14, 433-437 (1984).
26. Joyce, G. F., Schwartz, A. W., Miller, S. L. & Orgel, L. E. The case for an ancestral genetic system involving simple analogues of the nucleotides. Proc. Natl Acad. Sci. USA 84, 4398-4402 (1987).
27. Schneider, K. C. & Benner, S. A. Oligonucleotides containing flexible nucleoside analogues. J. Am. Chem. Soc. 112, 453-455 (1990).
28. Merle, Y., Bonneil, E., Merle, L., Sagi, J. & Szemzo, A. Acyclic oligonucleotide analogues. Int. J. Biol. Macromol. 17, 239-246 (1995).
29. Chaput, J. C. & Switzer, C. Nonenzymatic oligomerization on templates containing phosphoester-linked acyclic glycerol nucleic acid analogues. J. Mol. Evol. 51, 464-470 (2000).
30. Pitsch, S., Wendeborn, S., Jaun, B. & Eschenmoser, A. Why pentose- and not hexose nucleic acids? Pyranosyl-RNA ('p-RNA'). Helv. Chim. Acta 76, 2161-2183 (1993).
31. Pitsch, S. et al. Pyranosyl-RNA ('p-RNA'): base-pairing selectivity and potential to replicate. Helv. Chim. Acta 78, 1621-1635 (1995).
32. Lee, D. H., Granja, J. R., Martinez, J. A., Severin, K. & Ghadiri, M. R. A self-replicating peptide. Nature 382, 525-528 (1996).
33. Tjivikua, T., Ballester, P. & Rebek, J. Jr A self-replicating system. J. Am. Chem. Soc. 112, 1249-1250 (1990).
34. Cairns-Smith, A. G. The origin of life and the nature of the primitive gene. J. Theor. Biol. 10, 53-88 (1966).
35. Cairns-Smith, A. G. & Davies, C. J. in Encyclopaedia of Ignorance (eds Duncan, R. & Weston-Smith, M.) 391-403 (Pergamon, Oxford, 1977).
36. James, K. D. & Ellington, A. D. The fidelity of template-directed oligonucleotide ligation and the inevitability of polymerase function. Origins Life Evol. Biosph. 29, 375-390 (1999).
37. Kirby, A. J. & Younas, M. The reactivity of phosphate esters. Diester hydrolysis. J. Chem. Soc. B 510-513 (1970).
38. Admiraal, S. J. & Herschlag, D. Catalysis of phosphoryl transfer from ATP by amine nucleophiles. J. Am. Chem. Soc. 121, 5837-5845 (1999).
39. Rohatgi, R., Bartel, D. P. & Szostak, J. W. Kinetic and mechanistic analysis of nonenzymatic, template-directed oligoribonucleotide ligation. J. Am. Chem. Soc. 118, 3332-3339 (1996).
40. Li, Y. & Breaker, R. R. Kinetics of RNA degradation by specific base catalysis of transesterification involving the 2'-hydroxyl group. J. Am. Chem. Soc. 121, 5364-5372 (1999).
41. Robertson, D. L. & Joyce, G. F. Selection in vitro of an RNA enzyme that specifically cleaves single-stranded DNA. Nature 344, 467-468 (1990).
42. Tuerk, C. & Gold, L. Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 249, 505-510 (1990).
43. Ellington, A. D. & Szostak, J. W. In vitro selection of RNA molecules that bind specific ligands. Nature 346, 818-822 (1990).
44. Bartel, D. P. & Szostak, J. W. Isolation of new ribozymes from a large pool of random sequences. Science 261, 1411-1418 (1993).
45. Ekland, E. H., Szostak, J. W. & Bartel, D. P. Structurally complex and highly active RNA ligases derived from random RNA sequences. Science 269, 364-370 (1995).
46. Ekland, E. H. & Bartel, D. P. RNA-catalysed RNA polymerization using nucleoside triphosphates. Nature 382, 373-376 (1996).
47. Jaeger, L., Wright, M. C. & Joyce, G. F. A complex ligase ribozyme evolved in vitro from a group I ribozyme domain. Proc. Natl Acad. Sci. USA 96, 14712-14717 (1999).
48. McGinness, K. E. & Joyce, G. F. RNA-catalyzed RNA ligation on an external RNA template. Chem. Biol. 9, 297-307 (2002).
49. Robertson, M. P. & Ellington, A. D. In vitro selection of an allosteric ribozyme that transduces analytes into amplicons. Nature Biotechnol. 17, 62-66 (1999).
50. Rogers, J. & Joyce, G. F. The effect of cytidine on the structure and function of an RNA ligase ribozyme. RNA 7, 395-404 (2001).
51. Landweber, L. F. & Pokrovskaya, I. D. Emergence of a dual-catalytic RNA with metal-specific cleavage and ligase activities: the spandrels of RNA evolution. Proc. Natl Acad. Sci. USA 96, 173-178 (1999).
52. Eigen, M. Selforganization of matter and the evolution of biological macromolecules. Naturwissenschaften 58, 465-523 (1971).
53. Orgel, L. E. & Lohrmann, R. Prebiotic chemistry and nucleic acid replication. Acc. Chem. Res. 7, 368-377 (1974).
54. Lorsch, J. & Szostak, J. W. In vitro evolution of new ribozymes with polynucleotide kinase activity. Nature 371, 31-36 (1994).
55. Huang, F. & Yarus, M. Versatile 5' phosphoryl coupling of small and large molecules to an RNA. Proc. Natl Acad. Sci. USA 94, 8965-8969 (1997).
56. Hager, A. J. & Szostak, J. W. Isolation of novel ribozymes that ligate AMP-activated RNA substrates. Chem. Biol. 4, 607-617 (1997).
57. White, H. B. III Coenzymes as fossils of an earlier metabolic state. J. Mol. Evol. 7, 101-104 (1976).
58. Lohse, P. A. & Szostak, J. W. Ribozyme-catalysed amino-acid transfer reactions. Nature 381, 442-444 (1996).
59. Jenne, A. & Famulok, M. A novel ribozyme with ester transferase activity. Chem. Biol. 5, 23-34 (1998).
60. Wilson, C. & Szostak, J. W. In vitro evolution of a self-alkylating ribozyme. Nature 374, 777-782 (1995).
61. Wecker, M., Smith, D. & Gold, L. In vitro selection of a novel catalytic RNA: characterization of a sulfur alkylation reaction and interaction with a small peptide. RNA 2, 982-994 (1996).
62. Tarasow, T. M., Tarasow, S. L. & Eaton, B. E. RNA-catalysed carbon-carbon bond formation. Nature 389, 54-57 (1997).
63. Seelig, B. & Jäschke, A. A small catalytic RNA motif with Diels-Alderase activity. Chem. Biol. 6, 167-176 (1999).
64. Wiegand, T. W., Janssen, R. C. & Eaton, B. E. Selection of RNA amide synthases. Chem. Biol. 4, 675-683 (1997).
65. Sengle, G., Eisenführ, A., Arora, P. S., Nowick, J. S. & Famulok, M. Novel RNA catalysts for the Michael reaction. Chem. Biol. 8, 459-473 (2001).
66. Freeland, S. J., Knight, R. D. & Landweber, L. F. Do proteins predate DNA? Science 286, 690-692 (1999).
67. Stubbe, J., Ge, J. & Yee, C. S. The evolution of ribonucleotide reduction revisited. Trends Biochem. Sci. 26, 93-99 (2001).
68. Benner, S. A., Ellington, A. D. & Tauer, A. Modern metabolism as a palimpsest of the RNA world. Proc. Natl Acad. Sci. USA 86, 7054-7058 (1989).
69. Schön, A. et al. The RNA required in the first step of chlorophyll biosynthesis is a chloroplast glutamate tRNA. Nature 322, 281-284 (1986).
70. Luisi, P. L. About various definitions of life. Origins Life Evol. Biosph. 28, 613-622 (1998).
71. Szostak, J. W., Bartel, D. P. & Luisi, P. L. Synthesizing life. Nature 409, 387-390 (2001).
72. Gibson, T. J. & Lamond, A. I. Metabolic complexity in the RNA world and implications for the origin of protein synthesis. J. Mol. Evol. 30, 7-15 (1990).
73. Dobson, C. M., Ellison, G. B., Tuck, A. F. & Vaida, V. V. Atmospheric aerosols as prebiotic chemical reactors. Proc. Natl Acad. Sci. USA 97, 11864-11868 (2000).
74. Brack, A. & Orgel, L. E. β structures of alternating polypeptides and their possible prebiotic significance. Nature 256, 383-387 (1975).
75. Ourisson, G. & Nakatani, Y. The terpenoid theory of the origin of cellular life: the evolution of terpenoids to cholesterol. Chem. Biol. 1, 11-23 (1994).
76. Kumar, R. K. & Yarus, M. RNA-catalyzed amino acid activation. Biochemistry 40, 6998-7004 (2001).
77. Illangasekare, M., Sanchez, G., Nickles, T. & Yarus, M. Aminoacyl-RNA synthesis catalyzed by an RNA. Science 267, 643-647 (1995).
78. Illangasekare, M. & Yarus, M. Small-molecule-substrate interactions with a self-aminoacylating ribozyme. J. Mol. Biol. 268, 631-639 (1997).
79. Saito, H., Kourouklis, D. & Suga, H. An in vitro evolved precursor tRNA with aminoacylation activity. EMBO J. 20, 1797-1806 (2001).
80. Weber, A. L. & Orgel, L. E. Poly(U)-directed peptide bond formation from the 2'(3')-glycyl esters of adenosine derivatives. J. Mol. Evol. 16, 1-10 (1980).
81. Barta, A. et al. Mechanism of ribosomal peptide bond formation. Science 291, 203a (2001) (published online at http://www.sciencemag.org/cgi/content/full/291/5502/203a).
82. Nissen, P., Hansen, J., Ban, N., Moore, P. B. & Steitz, T. A. The structural basis of ribosome activity in peptide bond synthesis. Science 289, 920-930 (2000).
83. Zhang, B. & Cech, T. R. Peptidyl-transferase ribozymes: trans reactions, structural characterization and ribosomal RNA-like features. Chem. Biol. 5, 539-553 (1998).
84. Schimmel, P. & Henderson, B. Possible role of aminoacyl-RNA complexes in noncoded peptide synthesis and origin of coded synthesis. Proc. Natl Acad. Sci. USA 91, 11283-11286 (1994).
85. Izatt, R. M., Hansen, L. D., Rytting, J. H. & Christensen, J. J. Proton ionization from adenosine. J. Am. Chem. Soc. 87, 2760-2761 (1965).
86. Sugimoto, N., Tomka, M., Kierzek, R., Bevilacqua, P. C. & Turner, D. H. Effects of substrate structure on the kinetics of circle opening reactions of the self-splicing intervening sequence from Tetrahymena thermophila: evidence for substrate and Mg2+ binding interactions. Nucleic Acids Res. 17, 355-371 (1989).
87. Orgel, L. E. The origin of polynucleotide-directed protein synthesis. J. Mol. Evol. 29, 465-474 (1989).
88. Weiner, A. M. & Maizels, N. 3' terminal tRNA-like structures tag genomic RNA molecules for replication: implications for the origin of protein synthesis. Proc. Natl Acad. Sci. USA 84, 7383-7387 (1987).
89. Wong, J.-T. Origin of genetically encoded protein synthesis: a model based on selection for RNA peptidation. Origins Life Evol. Biosph. 21, 165-176 (1991).
90. Schimmel, P. & Ribas de Pouplana, L. Transfer RNA: from minihelix to genetic code. Cell 81, 983-986 (1995).
91. Roth, A. & Breaker, R. R. An amino acid as a cofactor for a catalytic polynucleotide. Proc. Natl Acad. Sci. USA 95, 6027-6031 (1998).
92. Joyce, G. F. Nucleic acid enzymes: playing with a fuller deck. Proc. Natl Acad. Sci. USA 95, 5845-5847 (1998).
93. Fontana, W. & Schuster, P. Continuity in evolution: on the nature of transitions. Science 280, 1451-1455 (1998).
94. Schultes, E. A. & Bartel, D. P. One sequence, two ribozymes: implications for the emergence of new ribozyme folds. Science 289, 448-452 (2000).
95. Joyce, G. F. The rise and fall of the RNA world. New Biol. 3, 399-407 (1991).

Acknowledgements. I thank T. Cech, A. Eschenmoser, R. Krishnamurthy, L. Orgel, N. Paul, P. Schimmel and W. Shih for helpful comments on the manuscript. I acknowledge funding from the National Aeronautics and Space Administration and the Skaggs Institute for Chemical Biology.

Figure 1 Timeline of events pertaining to the early history of life on Earth, with approximate dates in billions of years before the present.

Figure 2 Prebiotic clutter surrounding RNA. Each of the four components of RNA (coloured green, red, purple and blue) would have been accompanied by several closely related analogues (listed in black type), which could have assembled in almost any combination. All possible building blocks for each of the components should be regarded as sorting independently; for example, the phosphodiester linkage may have comprised either a 3',5' linkage involving a phosphate or a 2',5' linkage involving a pyrophosphate.

Figure 3 Candidate precursors to RNA during the early history of life on Earth. a, Threose nucleic acid; b, peptide nucleic acid; c, glycerol-derived nucleic-acid analogue; d, pyranosyl-RNA. B, nucleotide base.

Figure 4 Successive phases in the in vitro evolution of an RNA polymerase ribozyme. a, Class I ligase ribozyme catalyses template-directed joining of the 3' end of an RNA primer (open line) to the 5' end of the ribozyme44. b, Class I ligase also catalyses addition of three nucleoside 5'-triphosphates (NTPs) to the 3' end of the primer, directed by an internal template46. c, Class I-derived polymerase catalyses addition of up to 14 NTPs on an external RNA template5.

Figure 5 Hypothetical pathway for RNA-catalysed synthesis of RNA. A circled letter indicates reactions that have been demonstrated experimentally. a, Aldol condensation of glycolaldehyde and glyceraldehyde to form ribose. b, Transfer of the carbamoyl group of carbamoyl phosphate to aspartate and subsequent cyclization to form a pyrimidine.
c, Pentamerization of HCN to form a purine. d, Addition of a purine or pyrimidine (B) to ribose to form a nucleoside. e, Phosphorylation of a nucleoside to form a nucleotide. f, Activation of a nucleotide by transfer of the nucleotide portion of NTP. g, Addition of a nucleotide to the 3' end of an RNA primer (open line). The two RNA substrates are bound at adjacent positions on a complementary template (not shown). h, Successive nucleotide additions resulting in further primer extension. i, Phosphoryl transfer from an alkyl polyphosphate to NDP, regenerating NTP. NMP is converted to NDP in a similar manner. The ultimate source of phosphate is a polyphosphate mineral.

Figure 6 Hypothetical pathway for RNA-catalysed protein synthesis. A circled letter indicates reactions that have been demonstrated experimentally. a, Activation of an amino acid by formation of an aminoacyl-nucleotide anhydride. b, Transfer of an activated amino acid to the 2'(3') terminus of tRNA. The semicircle between the 2'- and 3'-oxygens indicates that the amino acid migrates rapidly between these two positions. c, Peptidyl transfer resulting in formation of a dipeptide. The two aminoacyl-tRNA substrates are bound at adjacent positions on a complementary template (not shown). d, Successive peptidyl transfer reactions resulting in formation of a polypeptide.

11 July 2002
Nature 418, 222-228 (2002); doi:10.1038/418222a

The chemical repertoire of natural ribozymes

JENNIFER A. DOUDNA* AND THOMAS R. CECH†

* Department of Molecular and Cell Biology, and Howard Hughes Medical Institute, University of California, Berkeley, California 94720, USA (e-mail: doudna@uclink.berkeley.edu)
† Howard Hughes Medical Institute, 4000 Jones Bridge Road, Chevy Chase, Maryland 20815, USA (e-mail: thomas.cech@colorado.edu)

Although RNA is generally thought to be a passive genetic blueprint, some RNA molecules, called ribozymes, have intrinsic enzyme-like activity: they can catalyse chemical reactions in the complete absence of protein cofactors. In addition to the well-known small ribozymes that cleave phosphodiester bonds, we now know that RNA catalysts probably effect a number of key cellular reactions. This versatility has lent credence to the idea that RNA molecules may have been central to the early stages of life on Earth.

How life began on Earth is one of the great scientific mysteries. Molecular biologists have long suspected that RNA molecules were key to the process, in part because RNA has essential roles in a most fundamental process, protein synthesis, within all cells. The first example of an RNA molecule that forms a catalytic active site for a series of precise biochemical reactions was reported 20 years ago: the self-splicing pre-ribosomal RNA (rRNA) of the ciliate Tetrahymena. Although there was only one example, the word 'ribozyme' was coined for the general concept of an RNA molecule with enzyme-like activity1. The following year catalytic activity was discovered in the RNA component of a ribonucleoprotein enzyme, ribonuclease (RNase) P, providing the first example of a multiple-turnover enzyme using RNA-based catalysis2. These findings lent increased credibility to the hypothesis of an RNA world, where RNA served both as the genetic material and the principal cellular enzyme, probably assisted in the latter role by metal ions, amino acids and other small-molecule cofactors.
The RNA world hypothesis posits that as cellular metabolism became more sophisticated, increasing demands on biocatalysts provided the impetus for the transition to protein enzymes. Descendants from this proposed RNA-dominated era inhabit today's world in the form of naturally occurring ribozymes present in organisms ranging from bacteria to humans (Table 1). Although the known natural cellular and viral ribozymes catalyse only phosphodiester transfer chemistry, ribozymes obtained through in vitro selection techniques can exhibit the sort of biochemical sophistication necessary to support cellular metabolism. Starting with a pool of random RNA sequences, molecules possessing a desired activity are isolated through successive cycles of activity selection, reverse transcription of the 'winners' into DNA and amplification of those sequences by the polymerase chain reaction. This methodology has allowed identification of ribozymes that form a nucleotide from a base plus a sugar3, synthesize amide bonds4, 5, form Michael adducts such as those involved in the methylation of uridine monophosphate to give thymidine monophosphate6, and form acyl-coenzyme A, which is found in many protein enzymes7. It is tantalizing to think that these ribozymes are analogues of missing links in a transition from an RNA world to contemporary biology (ref. 8, and see review in this issue by Joyce, pages 214–221). Because the structures and chemical mechanisms of in vitro-selected ribozymes are largely unknown at present, we focus here on the more extensively studied natural ribozymes.

RNA-based catalysis

How do RNA catalysts compare to their better-known protein enzyme counterparts? First, ribozyme rate enhancements can be substantial. For example, the rate constant for chemistry of the self-cleaving hepatitis delta virus (HDV) ribozyme is estimated at 10^2–10^4 s^-1, which is close to the maximal cleavage rate of RNase A (1.4 × 10^3 s^-1 at 25 °C).
The Tetrahymena group I self-splicing intron also has a rate constant for its chemical step that is comparable to those of protein enzymes. Second, ribozymes can use cofactors such as imidazole during catalysis9, 10, and they can be switched on and off by the binding of small-molecule allosteric effectors11. Finally, molecular structures have revealed that ribozymes, like protein catalysts, fold into specific three-dimensional shapes that can harbour deep grooves and solvent-inaccessible active sites. These tertiary structures facilitate catalysis in part by orienting substrates adjacent to catalytic groups and metal ions.

To facilitate chemical transformations, catalysts stabilize the transition state between substrate and product. Both protein and RNA catalysts may achieve this by adding or removing protons during a reaction, orienting substrates so that they are optimally positioned to react, and using binding interactions away from the reaction site to 'force' an unfavourable contact that is relieved in the transition state. Here we discuss the structural and chemical basis for RNA catalysis as determined for several of the naturally occurring ribozymes.

The lack of diverse functional groups in RNA molecules and the propensity for RNA to bind metal ions led to early hypotheses that all ribozymes might act as metalloenzymes, positioning metal ions for direct roles in catalysis. This seems to be true for group I and group II self-splicing introns and RNase P, and in the Tetrahymena group I intron these metal ions and their specific functions have been identified. Surprisingly, however, those ribozymes that perform site-specific strand scission, namely the hammerhead, hairpin, Neurospora Varkud satellite (VS) and HDV ribozymes, may use diverse catalytic mechanisms, and none has emerged clearly as a metalloenzyme.
Site-specific RNA self-cleavage

The hammerhead, HDV, hairpin and VS ribozymes are small RNA structures of 40–160 nucleotides that catalyse site-specific self-cleavage (Table 1). Found in viral, virusoid or satellite RNAs, they process the multimeric products of rolling-circle replication into genome-length strands. Although the reaction catalysed by these ribozymes is the same as that of many protein RNases (Fig. 1a), they act only at specific phosphodiester bonds by using base-pairing and other interactions to align the cleavage site within the ribozyme active site. The evolutionary maintenance of these sequences may result from the relative simplicity and efficiency of RNA-catalysed RNA strand scission.

Figure 1 Mechanism of RNA-catalysed self-cleavage.

Hammerhead ribozyme

The hammerhead, at 40 nucleotides, is the smallest of the naturally occurring ribozymes, and mediates rolling-circle replication within circular virus-like RNAs that infect plants. Recent experiments show that the hammerhead motif (Fig. 1b) is the most efficient self-cleaving sequence that can be isolated from randomized pools of RNA, suggesting that it may have arisen multiple times during the evolution of functional RNA molecules12. Consisting of three short helices connected at a conserved sequence junction, the hammerhead catalyses site-specific cleavage of one of its own phosphodiester bonds via nucleophilic attack of the adjacent 2'-oxygen at the scissile phosphate (Fig. 1a). The simplicity of the hammerhead secondary structure lent itself to the design of two-piece constructs in which the strand containing the cleavage site was separated from the rest of the self-cleaving RNA.
By treating one strand as the substrate and the other as the enzyme, multiple-turnover cleavage occurred with a typical rate of 1 molecule per minute at physiological salt concentrations, consistent with a substantial 10^9-fold rate enhancement over the uncatalysed rate of nonspecific RNA hydrolysis. Initial studies revealed a requirement for a divalent metal ion for catalysis, leading to the idea that site-specific positioning of a metal ion such as magnesium might enable efficient deprotonation of the attacking 2'-hydroxyl nucleophile. Later studies at much higher ionic strength (4 M monovalent salt) showed that the hammerhead as well as the hairpin and VS ribozymes could react nearly as fast in the absence of divalent ions13. This discovery suggested two distinct possibilities: either these ribozymes use a different catalytic mechanism in the presence of high, non-physiological concentrations of monovalent salts, or the divalent metal ion requirement at low salt concentrations serves a structural rather than a purely chemical function.

The unveiling of the crystal structure of the hammerhead ribozyme in 1994, the first ribozyme structure to be determined, revealed a Y-shaped conformation in which nucleotides essential for catalysis were clustered at the junction of the three helical arms14 (Fig. 1c). Since then, additional crystal structures of the hammerhead ribozyme have provided 'snapshots' of the RNA at several steps along the catalytic reaction pathway: the initial pre-cleaved state, two sequential conformational changes that precede cleavage and rotate the scissile phosphate to be in line with the attacking 2'-oxygen nucleophile, and the post-cleavage product state15-19. Together these structures have led to a model of ribozyme cleavage involving precise positioning of the reactive groups by the structure of the ribozyme. However, it has been difficult to ascertain from these structures what the role of bound divalent metal ions might be.
Although several divalent ions were identified unambiguously in the crystal structures, they were not situated close enough to the site of catalysis to support a direct role in RNA cleavage. Site-specific substitution of phosphate oxygens with sulphur atoms, which is readily achieved by solid-phase synthetic methods, enabled direct analysis of the effects of disrupting divalent metal-ion binding sites that were potentially involved in catalysis20. These investigations led to evidence for the direct simultaneous coordination of a single metal ion by the scissile phosphate and a second phosphate oxygen located 20 Å away in the crystal structure21, 22. It was proposed that the crystal structures might represent the 'ground state' conformation of the hammerhead ribozyme, and that prior to catalysis the RNA conformation changed significantly but transiently to bring the critical catalytic metal ion proximal to the cleavage site. More recently, molecular modelling and kinetic analysis of the hammerhead cleavage reaction in the presence of monovalent versus divalent salts support the idea that divalent metal ions are not essential to the catalytic step, but instead stabilize the active ribozyme structure19, 23-25. Whether hammerhead catalysis requires a global conformational change or merely a local rearrangement is not yet resolved. In either case, orientation of reactants within the ribozyme active site probably contributes significantly to the rate of site-specific strand scission in the hammerhead. This is achieved through the unique structure of this RNA motif, conferred by the secondary structure and the presence of multiple conserved nucleotides in the active site. Whether these nucleotides provide anything else, such as general acid–base catalysis, is unknown. 
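As a rough check on the hammerhead figures quoted in this section, the sketch below back-calculates the uncatalysed cleavage rate implied by a turnover of about 1 per minute and a 10^9-fold rate enhancement. It is an illustrative calculation only: treating the turnover rate as a first-order rate constant is an assumption, and the function names are hypothetical.

```python
import math

MINUTES_PER_YEAR = 60 * 24 * 365.25


def uncatalysed_rate(k_cat_per_min: float, enhancement: float) -> float:
    """Rate constant (per minute) implied for the uncatalysed reaction."""
    return k_cat_per_min / enhancement


def half_life_years(k_per_min: float) -> float:
    """Half-life of a first-order process, converted from minutes to years."""
    return math.log(2) / k_per_min / MINUTES_PER_YEAR


# Values taken from the text; the first-order treatment is an assumption.
k_cat = 1.0          # about 1 cleavage per minute
enhancement = 1e9    # about 10^9-fold rate enhancement

k_uncat = uncatalysed_rate(k_cat, enhancement)
print(f"implied uncatalysed rate: {k_uncat:.1e} per minute")
print(f"implied half-life: {half_life_years(k_uncat):.0f} years")
```

Under these assumptions, a given phosphodiester bond would take on the order of a thousand years to cleave spontaneously, compared with about a minute within the ribozyme.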
Hepatitis delta virus and hairpin ribozymes The HDV and hairpin ribozymes catalyse the same chemical reaction as that of the hammerhead, and they are likewise responsible for cleaving intermediates generated during rolling-circle replication of the HDV and a plant virus satellite RNA, respectively. Crystal structures of these ribozymes showed that in each case the RNA forms an enclosed cleft in which strand scission takes place26, 27. Furthermore, neither ribozyme seems to coordinate a divalent metal ion at the site of catalysis, but instead positions functionally essential nucleotides proximal to the substrate in a configuration suggesting the possibility of their direct role in catalysis. The potential for RNA to use general acid–base chemistry during catalysis was unexpected. Functional groups within proteins that have pKa values near neutrality, and thus can donate or accept a proton readily under physiological conditions, can act as general acids or general bases to shuttle protons during enzyme catalysis. But the lack of functional groups within RNA with pKa values near physiological pH (6–7) means that for RNA to function in this way, one or more of its functional groups must have a pKa significantly shifted towards neutrality. In RNA, adenine and cytosine have the potential for protonation of their ring nitrogens, N1 and N3, respectively, but the pKa values for the free nucleosides are 3.5 and 4.2. However, A and C residues with substantially shifted pKa values have been detected in small functional RNAs, presumably owing to the structural environment of the nucleotide28-32. In the HDV ribozyme, a cytosine base essential to catalytic function is positioned in a cleft adjacent to the site of cleavage in the RNA (Fig. 2a, b). A network of potential hydrogen bonds to this nucleotide is consistent with stabilization of a protonated form of the cytosine that might allow it to donate or accept a proton at some stage during catalysis26. 
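The effect of a shifted pKa can be made concrete with the Henderson–Hasselbalch relation, which gives the fraction of a base that is protonated at a given pH. The sketch below uses the free-nucleoside pKa values quoted above; the shifted value of 6.5 is a hypothetical number chosen for illustration, not a measurement, and the function name is an assumption.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of a base in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PH = 7.0
cases = {
    "adenine N1, free nucleoside (pKa 3.5)": 3.5,
    "cytosine N3, free nucleoside (pKa 4.2)": 4.2,
    "hypothetical shifted pKa (6.5)": 6.5,  # illustrative value only
}

for label, pka in cases.items():
    print(f"{label}: {fraction_protonated(pka, PH):.4f} protonated at pH {PH}")
```

At pH 7 the unperturbed bases are protonated well under 1% of the time, whereas a pKa shifted to within half a unit of neutrality leaves roughly a quarter of the molecules protonated. A protonated cytosine stabilized in this way is the kind of group that could take part in proton transfer.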
This feature could be useful for mediating catalysis by pulling a proton off the attacking 2'-oxygen nucleophile, or by providing a proton to the 5'-oxygen leaving group (Fig. 1a). But does this in fact occur? Although detecting the movement of protons within an enzyme, or ribozyme, active site is difficult to achieve directly, several clever experiments have provided indirect evidence that such a mechanism is at work in the HDV ribozyme.

Figure 2 Structure of the hepatitis delta virus (HDV) ribozyme.

In one set of experiments, catalytic activity of ribozymes with point mutations at the critical cytosine was partially restored in the presence of imidazole, the side chain of histidine that readily accepts or donates a proton in many protein enzyme active sites9. Furthermore, the measured pKa values of a series of restored reactions correlated with those of the imidazole analogues that promoted cleavage in the mutants33. In a second series of experiments, kinetic isotope effects and correlation of reaction pKa with the pKa of different bases placed at the position of the critical C supported a direct role of this nucleotide in proton shuttling during catalysis34, 35. However, the pKa of the active-site C is apparently transiently shifted during the reaction, as the shift is not detected within the product or precursor forms of the ribozyme36. Together, the current data support a model in which the C acts as a general acid during the reaction to donate a proton to the 5'-bridging oxygen (Fig. 2c). A hydrated metal ion coordinated near the ribozyme active site may abstract a proton from the 2'-hydroxyl nucleophile37 (Fig. 1a).

In the hairpin ribozyme the situation is less clear. Although the crystal structure of a precursor form of the RNA suggested that an active-site adenosine might adopt the role of a general acid or base27 (Fig.
3a, b), nucleotide analogue interference experiments have not provided evidence for a shifted pKa at this position38. However, the lack of a requirement for divalent metal ions during hairpin ribozyme cleavage implies that a metal ion-independent mechanism is at work39-41.

Figure 3 Structure of the hairpin ribozyme.

Self-splicing introns

Group I introns

Group I introns have been found to interrupt genes for rRNA, transfer RNA (tRNA) and messenger RNA (mRNA) in far-reaching corners of biology, including the nuclei of protozoa, the mitochondria of fungi, the chloroplasts of algae, and bacteria and their phages. They are defined as a group by their common core secondary structure, consisting of an array of nine base-paired elements (P1–P9), and by their common mechanism of self-splicing. They accomplish splicing by a two-step transesterification mechanism initiated by an exogenous molecule of guanosine or guanosine triphosphate (GTP; Fig. 4a).

Figure 4 Self-splicing intron mechanisms.

Because the excised intron still contains the active site for transesterification, it can be re-engineered to give a true catalyst that can cleave or ligate exogenous substrate molecules intermolecularly ('in trans'). The Tetrahymena version of this RNA enzyme, together with oligonucleotide substrates that can be synthesized with subtle chemical variations, has facilitated the dissection of the reaction pathway into its elemental steps. First, the RNA substrate base-pairs to an internal guide sequence within the ribozyme, forming the P1 helix. Second, specific ribose 2'-hydroxyl groups along the minor groove of the P1 helix promote its docking into the active site. Third, guanosine (or a 5'-phosphorylated analogue such as GTP) binds to the G site within P7. Fourth, the 3'-hydroxyl of the G acts as a nucleophile, cleaving the 5' splice-site phosphate with inversion of configuration.
Finally, products are released; the slow release of the product bound by base pairing plus tertiary interactions is rate limiting for multiple-turnover cleavage under typical conditions42.

Many protein enzymes that promote phosphoryl transfer reactions use metal ions for catalysis, and group I ribozymes use the same trick43. The number and location of the active-site metal ions have been investigated by substituting individual phosphate or ribose oxygen atoms with sulphur or with an amino group, and then testing for changes in metal-ion specificity (a procedure known as 'thiophilic metal-ion rescue'). The most highly supported current model is shown in Fig. 4b. One metal ion (MB) helps deprotonate the 3'-oxygen of the G nucleophile, while another (MA) stabilizes the developing negative charge on the leaving-group oxygen in the transition state. MC could assist in positioning the substrates with respect to one another and, along with MA, could help stabilize the trigonal bipyramidal transition state44.

The enzyme mechanism of the group I introns, involving a nucleotide-binding site, nucleophilic attack and metal-ion catalysis, is commonplace in the world of protein enzymes. But clearly the ribozyme active site must be constructed differently, given the charged and hydrophilic nature of the nucleic acid building blocks and the limited diversity of their side chains compared to amino acids. So how might a catalytic active site be built out of ribonucleotides? The first detailed view of active-site construction came with the crystal structure of the 160-nucleotide P4–P6 domain of the Tetrahymena intron, an attractive target because it folds into the same structure as an excised domain as it does in the context of the whole ribozyme. This structure revealed how long-range base triples and divalent cation-mediated structures can fold an RNA molecule into a globular structure with an interior that is relatively inaccessible to solvent45, 46.
The way in which this domain and the G-site-containing domain combine to create a concave active site was seen at modest resolution in the crystal structure of a 240-nucleotide active ribozyme47, a structure whose general features had been predicted by modelling based on comparative phylogenetic analysis48. Finally, the way the structure embraces the P1 substrate helix was modelled by mapping sites of chemical modification that perturb the reaction49. Future goals are to obtain a high-resolution crystal structure of an entire group I ribozyme with bound substrates, and to locate the proposed three catalytic metals within that structure.

Group II introns

Group II introns, found in bacteria and in organellar genes of eukaryotic cells, catalyse precise self-excision and ligation of the flanking RNA sequences to form a mature transcript. In a mechanism distinct from that of the group I introns, the group II reaction involves nucleophilic attack by the 2'-hydroxyl of a specific adenosine within the intron (the 'branch site') to form a branched or lariat-type structure (Fig. 4a). Magnesium ions coordinated within the intron are thought to have a direct role in catalysis50-52, and several studies have revealed aspects of the intron tertiary structure that are essential to catalytic function53-56. Models of group II intron architecture have been proposed based on chemical probing, phylogenetic covariation and mutagenesis results57, 58. Interestingly, some group II introns encode proteins that assist RNA splicing and can also enable efficient integration of the intron RNA into double-stranded DNA by reverse splicing and reverse transcription59. This reverse splicing activity promotes intron mobility by enabling insertion into targeted genes.

Ribonuclease P

RNase P, found in all cells, catalyses site-specific hydrolysis of precursor RNA substrates including tRNA, 5S rRNA and the signal recognition particle RNA60, 61.
These substrates probably share structural features that enable efficient recognition by the RNase P substrate-binding site, positioning the reactive phosphate in each case for nucleophilic attack by a coordinated water molecule. The ribozyme is thought to be a metalloenzyme, a hypothesis supported by data from phosphorothioate substitution at the scissile phosphate of a substrate pre-tRNA62, 63. RNase P is in fact an RNA–protein complex whose activity, at least in bacteria, resides within the RNA component. Furthermore, it is a true catalyst in the sense that each ribozyme complex catalyses the cleavage of multiple substrate RNAs.

As with group I and II introns, much of the information about the secondary and tertiary structure of RNase P RNA has come from extensive phylogenetic covariation analysis of related sequences. Typically 300–400 nucleotides in length, it comprises two domains containing the substrate-recognition site and the ribozyme active site, respectively. Structural models of a bacterial RNase P RNA have been proposed64, 65, but a crystal structure is not yet available. In human cells, the RNase P complex is larger and includes multiple protein components in addition to the RNA66-68. The human RNase P RNA is not catalytically active in the absence of protein, which has made it challenging to determine whether its active site is composed of RNA, protein or some combination of the two.

Ribozyme activity, folding and dynamics

Protein enzymes that catalyse nucleophilic attack at a phosphate within RNA or a ribonucleotide apparently utilize different chemical mechanisms depending on the enzyme. For example, mammalian adenylyl cyclases function by a two-metal-ion mechanism69, RNase A uses two histidines for general acid–base catalysis70, and the anthrax adenylyl cyclase exotoxin uses one histidine and a coordinated metal ion to activate the attacking nucleophile and stabilize the leaving group, respectively71.
The fact that ribozymes also catalyse phosphodiester bond cleavage by a variety of mechanisms shows that RNA has a breadth of catalytic potential similar to protein enzymes. Furthermore, like protein enzymes, ribozymes must fold into specific three-dimensional structures to function catalytically. How do these RNAs reach their active conformations? This problem has been the subject of intense study using in vitro systems, and several themes are emerging. RNA structures generally fold via a cooperative, hierarchical pathway in which tertiary interactions follow the formation of a stable secondary structure. Rates of tertiary structure formation vary from tens of milliseconds to several minutes, and for the large ribozymes, can be dominated by folding intermediates that transiently trap the RNA in non-native or partially folded conformations72-75. Single-molecule experiments, in which individual ribozyme molecules are analysed by fluorescence microscopy or mechanical tethering, reveal that multiple folding pathways can exist for a particular RNA sequence76-78. It is not yet clear how these observations relate to RNA folding processes in vivo, because proteins may assist ribozyme assembly either through direct RNA binding79, 80 or through covalent modification of specific nucleotides. In addition, RNA tertiary structures and even secondary structures can undergo rapid conformational changes81-85. Such dynamics may be essential for progress through a catalytic cycle. On the other hand, too much structural flexibility may hamper catalysis, providing the incentive for evolution of ribonucleoprotein enzymes rather than pure ribozymes. A starring role for ribozymes? If the RNA world had a lengthy head-start over the protein catalyst world, why are RNA catalysts relatively minor players in modern cells? In fact, they may be much more central to cell biology than was previously believed. 
The ribosome, which is responsible for information-directed protein synthesis in all of life, is composed of three (or in some cases four) RNA molecules along with several dozen proteins. No protein subunit has ever been identified as a peptidyl transferase enzyme, and for more than 20 years, evidence for a primary role of RNA in this activity has accumulated86. The most direct evidence came with the deduction of the crystal structure of the large subunit87, in which the peptidyl transferase centre was precisely located by binding a small-molecule inhibitor that is an analogue of the anionic tetrahedral intermediate in amide bond formation88. Remarkably, only RNA and no protein lies in the vicinity of the reaction centre, so the catalysis must be ribozymic. The authors suggested one possible mechanism involving a conserved adenine acting as a general base to abstract an amino proton from the amino acid89, but subsequent mutagenesis of the key A has not provided strong support90, 91. Identifying the rRNA's catalytic strategy is an important direction for future research. Given the large size of the rRNA (Table 1), scientists are using in vitro evolution to find smaller peptidyl transferase ribozymes that might model the biological reaction92. Indeed, RNA catalysts have also been identified that accomplish two other steps of the protein synthesis pathway: formation of activated amino acid adenylates93 and transfer of an amino acid to the 3'-oxygen of a tRNA-like acceptor94, 95. Another ribonucleoprotein catalyst found in eukaryotic cells is the spliceosome, which assembles with nuclear pre-mRNAs and splices out the major class of introns (which are not self-splicing). For many years, a popular hypothesis has been that the spliceosome is also a molecular fossil from the RNA world — with several enzymatic RNAs acting intermolecularly via a mechanism analogous to that used intramolecularly by self-splicing group II introns96, 97. 
As with the ribosome, most of the evidence for this hypothesis has been circumstantial until very recently. Now Valadkhan and Manley98 have shown that two spliceosomal RNAs, the U2 and U6 small nuclear RNAs, can bind an RNA substrate containing the sequence of the intron branch site and promote a splicing-related reaction in the absence of any of the numerous spliceosomal proteins. The reaction product is not the natural 'branch' (consisting of a nucleotide forming both 2'-5' and 3'-5' phosphodiester bonds) but instead a new product consistent with a phosphotriester. (This is surprising from a chemical perspective, because it would require hydroxyl as the leaving group from a pentavalent phosphorus intermediate or transition state.) This new RNA-catalysed reaction will undoubtedly stimulate fresh investigations of the mechanism by which spliceosomal RNAs catalyse mRNA splicing, and adds weight to the proposition that remnants of the RNA world are still among us.

Future directions

As some of the first RNAs to be studied in structural and mechanistic detail, ribozymes have provided many important insights into RNA function at a fundamental chemical level. Although much progress has been made, many interesting questions remain to be addressed. Determining detailed reaction mechanisms for RNA catalysts, including large RNA–protein complexes such as the ribosome and the spliceosome, will be a priority, as well as exploring the chemical mechanisms of ribozymes identified by in vitro selection. In addition to revealing new aspects of RNA biology, these investigations may shed light on aspects of the proposed RNA world and the role of RNA in early evolution.

References

1. Kruger, K. et al. Self-splicing RNA: autoexcision and autocyclization of the ribosomal RNA intervening sequence of Tetrahymena. Cell 31, 147-157 (1982).
2. Guerrier-Takada, C., Gardiner, K., Marsh, T., Pace, N. & Altman, S.
The RNA moiety of ribonuclease P is the catalytic subunit of the enzyme. Cell 35, 849-857 (1983).
3. Unrau, P. J. & Bartel, D. P. RNA-catalysed nucleotide synthesis. Nature 395, 260-263 (1998).
4. Lohse, P. A. & Szostak, J. W. Ribozyme-catalysed amino-acid transfer reactions. Nature 381, 442-444 (1996).
5. Wiegand, T. W., Janssen, R. C. & Eaton, B. E. Selection of RNA amide synthases. Chem. Biol. 4, 675-683 (1997).
6. Sengle, G., Eisenfuhr, A., Arora, P. S., Nowick, J. S. & Famulok, M. Novel RNA catalysts for the Michael reaction. Chem. Biol. 8, 459-473 (2001).
7. Jadhav, V. R. & Yarus, M. Acyl-CoAs from coenzyme ribozymes. Biochemistry 41, 723-729 (2002).
8. Wilson, D. S. & Szostak, J. W. In vitro selection of functional nucleic acids. Annu. Rev. Biochem. 68, 611-647 (1999).
9. Perrotta, A. T., Shih, I. & Been, M. D. Imidazole rescue of a cytosine mutation in a self-cleaving ribozyme. Science 286, 123-126 (1999).
10. Santoro, S. W., Joyce, G. F., Sakthivel, K., Gramatikova, S. & Barbas, C. F. III. RNA cleavage by a DNA enzyme with extended chemical functionality. J. Am. Chem. Soc. 122, 2433-2439 (2000).
11. Tang, J. & Breaker, R. R. Rational design of allosteric ribozymes. Chem. Biol. 4, 453-459 (1997).
12. Salehi-Ashtiani, K. & Szostak, J. W. In vitro evolution suggests multiple origins for the hammerhead ribozyme. Nature 414, 82-84 (2001).
13. Murray, J. B., Seyhan, A. A., Walter, N. G., Burke, J. M. & Scott, W. G. The hammerhead, hairpin and VS ribozymes are catalytically proficient in monovalent cations alone. Chem. Biol. 5, 587-595 (1998).
14. Pley, H. W., Flaherty, K. M. & McKay, D. B. Three-dimensional structure of a hammerhead ribozyme. Nature 372, 68-74 (1994).
15. Scott, W. G., Finch, J. T. & Klug, A. The crystal structure of an all-RNA hammerhead ribozyme: a proposed mechanism for RNA catalytic cleavage. Cell 81, 991-1002 (1995).
16. Scott, W. G., Murray, J. B., Arnold, J. R., Stoddard, B. L. & Klug, A. Capturing the structure of a catalytic RNA intermediate: the hammerhead ribozyme. Science 274, 2065-2069 (1996).
17. Murray, J. B. et al. The structural basis of hammerhead ribozyme self-cleavage. Cell 92, 665-673 (1998).
18. Murray, J. B., Szoke, H., Szoke, A. & Scott, W. G. Capture and visualization of a catalytic RNA enzyme-product complex using crystal lattice trapping and X-ray holographic reconstruction. Mol. Cell 5, 279-287 (2000).
19. Murray, J. B., Dunham, C. M. & Scott, W. G. A pH-dependent conformational change, rather than the chemical step, appears to be rate-limiting in the hammerhead ribozyme cleavage reaction. J. Mol. Biol. 315, 121-130 (2002).
20. Scott, E. C. & Uhlenbeck, O. C. A re-investigation of the thio effect at the hammerhead cleavage site. Nucleic Acids Res. 27, 479-484 (1999).
21. Peracchi, A., Beigelman, L., Scott, E. C., Uhlenbeck, O. C. & Herschlag, D. Involvement of a specific metal ion in the transition of the hammerhead ribozyme to its catalytic conformation. J. Biol. Chem. 272, 26822-26826 (1997).
22. Wang, S., Karbstein, K., Peracchi, A., Beigelman, L. & Herschlag, D. Identification of the hammerhead ribozyme metal ion binding site responsible for rescue of the deleterious effect of a cleavage site phosphorothioate. Biochemistry 38, 14363-14378 (1999).
23. Murray, J. B. & Scott, W. G. Does a single metal ion bridge the A-9 and scissile phosphate groups in the catalytically active hammerhead ribozyme structure? J. Mol. Biol. 296, 33-41 (2000).
24. O'Rear, J. L. et al. Comparison of the hammerhead cleavage reactions stimulated by monovalent and divalent cations. RNA 7, 537-545 (2001).
25. Curtis, E. A. & Bartel, D. P. The hammerhead cleavage reaction in monovalent cations. RNA 7, 546-552 (2001).
26. Ferre-D'Amare, A. R., Zhou, K. & Doudna, J. A. Crystal structure of a hepatitis delta virus ribozyme. Nature 395, 567-574 (1998).
27. Rupert, P. B. & Ferre-D'Amare, A. R. Crystal structure of a hairpin ribozyme-inhibitor complex with implications for catalysis. Nature 410, 780-786 (2001).
28. Rajagopal, P. & Feigon, J. Triple-strand formation in the homopurine:homopyrimidine DNA oligonucleotides d(G-A)4 and d(T-C)4. Nature 339, 637-640 (1989).
29. Sklenar, V. & Feigon, J. Formation of a stable triplex from a single DNA strand. Nature 345, 836-838 (1990).
30. Connell, G. J. & Yarus, M. RNAs with dual specificity and dual RNAs with similar specificity. Science 264, 1137-1141 (1994).
31. Legault, P. & Pardi, A. In situ probing of adenine protonation in RNA by 13C NMR. J. Am. Chem. Soc. 116, 8390-8391 (1994).
32. Ravindranathan, S., Butcher, S. E. & Feigon, J. Adenine protonation in domain B of the hairpin ribozyme. Biochemistry 39, 16026-16032 (2000).
33. Shih, I. H. & Been, M. D. Involvement of a cytosine side chain in proton transfer in the rate-determining step of ribozyme self-cleavage. Proc. Natl Acad. Sci. USA 98, 1489-1494 (2001).
34. Nakano, S., Chadalavada, D. M. & Bevilacqua, P. C. General acid-base catalysis in the mechanism of a hepatitis delta virus ribozyme. Science 287, 1493-1497 (2000).
35. Nakano, S. & Bevilacqua, P. C. Proton inventory of the genomic HDV ribozyme in Mg2+-containing solutions. J. Am. Chem. Soc. 123, 11333-11334 (2001).
36. Luptak, A., Ferre-D'Amare, A. R., Zhou, K., Zilm, K. W. & Doudna, J. A. Direct pKa measurement of the active-site cytosine in a genomic hepatitis delta virus ribozyme. J. Am. Chem. Soc. 123, 8447-8452 (2001).
37. Nakano, S., Proctor, D. J. & Bevilacqua, P. C. Mechanistic characterization of the HDV genomic ribozyme: assessing the catalytic and structural contributions of divalent metal ions within a multichannel reaction mechanism. Biochemistry 40, 12022-12038 (2001).
38. Ryder, S. P. et al. Investigation of adenosine base ionization in the hairpin ribozyme by nucleotide analog interference mapping. RNA 7, 1454-1463 (2001).
39. Hampel, A. & Cowan, J. A. A unique mechanism for RNA catalysis: the role of metal cofactors in hairpin ribozyme cleavage. Chem. Biol. 4, 513-517 (1997).
40. Nesbitt, S., Hegg, L. A. & Fedor, M. J. An unusual pH-independent and metal-ion-independent mechanism for hairpin ribozyme catalysis. Chem. Biol. 4, 619-630 (1997).
41. Walter, N. G. & Burke, J. M. The hairpin ribozyme: structure, assembly and catalysis. Curr. Opin. Chem. Biol. 2, 303 (1998).
42. Cech, T. R. & Herschlag, D. (eds) Group I Ribozymes: Substrate Recognition, Catalytic Strategies and Comparative Mechanistic Analysis (Springer, Berlin, 1996).
43. Narlikar, G. J. & Herschlag, D. Mechanistic aspects of enzymatic catalysis: lessons from comparison of RNA and protein enzymes. Annu. Rev. Biochem. 66, 19-59 (1997).
44. Shan, S., Kravchuk, A. V., Piccirilli, J. A. & Herschlag, D. Defining the catalytic metal ion interactions in the Tetrahymena ribozyme reaction. Biochemistry 40, 5161-5171 (2001).
45. Cate, J. H. et al. Crystal structure of a group I ribozyme domain: principles of RNA packing. Science 273, 1678-1685 (1996).
46. Juneau, K., Podell, E., Harrington, D. J. & Cech, T. R.
Structural basis of the enhanced stability of a mutant ribozyme domain and a detailed view of RNA-solvent interactions. Structure (Camb.) 9, 221-231 (2001). | PubMed | ISI | 47. Golden, B. L., Gooding, A. R., Podell, E. R. & Cech, T. R. A preorganized active site in the crystal structure of the Tetrahymena ribozyme. Science 282, 259-264 (1998). | Article | PubMed | ISI | 48. Michel, F. & Westhof, E. Modelling of the three-dimensional architecture of group I catalytic introns based on comparative sequence analysis. J. Mol. Biol. 216, 585-610 (1990). | PubMed | ISI | 49. Szewczak, A. A. et al. An important base triple anchors the substrate helix recognition surface within the Tetrahymena ribozyme active site. Proc. Natl Acad. Sci. USA 96, 11183-11188 (1999). | PubMed | ISI | 50. Gordon, P. M., Sontheimer, E. J. & Piccirilli, J. A. Kinetic characterization of the second step of group II intron splicing: role of metal ions and the cleavage site 2'-OH in catalysis. Biochemistry 39, 12939-12952 (2000). | Article | PubMed | ISI | 51. Sigel, R. K., Vaidya, A. & Pyle, A. M. Metal ion binding sites in a group II intron core. Nature Struct. Biol. 7, 1111-1116 (2000). | Article | PubMed | ISI | 52. Gordon, P. M. & Piccirilli, J. A. Metal ion coordination by the AGC triad in domain 5 contributes to group II intron catalysis. Nature Struct. Biol. 8, 893-898 (2001). | Article | PubMed | ISI | 53. Jestin, J. L., Deme, E. & Jacquier, A. Identification of structural elements critical for inter-domain interactions in a group II self-splicing intron. EMBO J. 16, 2945-2954 (1997). | PubMed | ISI | 54. Boudvillain, M., de Lencastre, A. & Pyle, A. M. A tertiary interaction that links active-site domains to the 5' splice site of a group II intron. Nature 406, 315-318 (2000). | Article | PubMed | ISI | 55. Chu, V. T., Adamidi, C., Liu, Q., Perlman, P. S. & Pyle, A. M. Control of branch-site choice by a group II intron. EMBO J. 20, 6866-6876 (2001). | PubMed | ISI | 56. Zhang, L. 
& Doudna, J. A. Structural insights into group II intron catalysis and branch-site selection. Science 295, 2084-2088 (2002). | PubMed | ISI | 57. Costa, M., Michel, F. & Westhof, E. A three-dimensional perspective on exon binding by a group II self-splicing intron. EMBO J. 19, 5007-5018 (2000). | Article | PubMed | ISI | 58. Swisher, J., Duarte, C. M., Su, L. J. & Pyle, A. M. Visualizing the solvent-inaccessible core of a group II intron ribozyme. EMBO J. 20, 2051-2061 (2001). | Article | PubMed | ISI | 59. Yang, J., Zimmerly, S., Perlman, P. S. & Lambowitz, A. M. Efficient integration of an intron RNA into double-stranded DNA by reverse splicing. Nature 381, 332-335 (1996). | PubMed | ISI | 60. Frank, D. N. & Pace, N. R. Ribonuclease P: unity and diversity in a tRNA processing ribozyme. Annu. Rev. Biochem. 67, 153-180 (1998). | PubMed | ISI | 61. Morl, M. & Marchfelder, A. The final cut. The importance of tRNA 3'-processing. EMBO Rep. 2, 17-20 (2001). | PubMed | ISI | 62. Warnecke, J. M., Held, R., Busch, S. & Hartmann, R. K. Role of metal ions in the hydrolysis reaction catalyzed by RNase P RNA from Bacillus subtilis. J. Mol. Biol. 290, 433-445 (1999). | Article | PubMed | ISI | 63. Warnecke, J. M., Sontheimer, E. J., Piccirilli, J. A. & Hartmann, R. K. Active site constraints in the hydrolysis reaction catalyzed by bacterial RNase P: analysis of precursor tRNAs with a single 3'-S-phosphorothiolate internucleotide linkage. Nucleic Acids Res. 28, 720-727 (2000). | PubMed | ISI | 64. Westhof, E. & Altman, S. Three-dimensional working model of M1 RNA, the catalytic RNA subunit of ribonuclease P from Escherichia coli. Proc. Natl Acad. Sci. USA 91, 5133-5137 (1994). | PubMed | ISI | 65. Harris, M. E., Kazantsev, A. V., Chen, J. L. & Pace, N. R. Analysis of the tertiary structure of the ribonuclease P ribozyme-substrate complex by site-specific photoaffinity crosslinking. RNA 3, 561-576 (1997). | PubMed | ISI | 66. Frank, D. N., Adamidi, C., Ehringer, M. 
A., Pitulle, C. & Pace, N. R. Phylogenetic-comparative analysis of the eukaryal ribonuclease P RNA. RNA 6, 1895-1904 (2000). | Article | PubMed | ISI | 67. Li, Y. & Altman, S. A subunit of human nuclear RNase P has ATPase activity. Proc. Natl Acad. Sci. USA 98, 441-444 (2001). | PubMed | ISI | 68. Xiao, S., Houser-Scott, F. & Engelke, D. R. Eukaryotic ribonuclease P: increased complexity to cope with the nuclear pre-tRNA pathway. J. Cell. Physiol. 187, 11-20 (2001). | Article | PubMed | ISI | 69. Tesmer, J. J. et al. Two-metal-ion catalysis in adenylyl cyclase. Science 285, 756-760 (1999). | Article | PubMed | ISI | 70. Wyckoff, H. W. et al. The three-dimensional structure of ribonuclease-S. Interpretation of an electron density map at a nominal resolution of 2 Å. J. Biol. Chem. 245, 305-328 (1970). | PubMed | ISI | 71. Drum, C. L. et al. Structural basis for the activation of anthrax adenylyl cyclase exotoxin by calmodulin. Nature 415, 396-402 (2002). | Article | PubMed | ISI | 72. Treiber, D. K. & Williamson, J. R. Exposing the kinetic traps in RNA folding. Curr. Opin. Struct. Biol. 9, 339-345 (1999). | Article | PubMed | ISI | 73. Thirumalai, D. & Woodson, S. A. Maximizing RNA folding rates: a balancing act. RNA 6, 790794 (2000). | Article | PubMed | ISI | 74. Thirumalai, D., Lee, N., Woodson, S. A. & Klimov, D. Early events in RNA folding. Annu. Rev. Phys. Chem. 52, 751-762 (2001). | PubMed | ISI | 75. Treiber, D. K. & Williamson, J. R. Beyond kinetic traps in RNA folding. Curr. Opin. Struct. Biol. 11, 309-314 (2001). | PubMed | ISI | 76. Zhuang, X. et al. A single-molecule study of RNA catalysis and folding. Science 288, 2048-2051 (2000). | Article | PubMed | ISI | 77. Liphardt, J., Onoa, B., Smith, S. B., Tinoco, I. J. & Bustamante, C. Reversible unfolding of single RNA molecules by mechanical force. Science 292, 733-737 (2001). | PubMed | ISI | 78. Russell, R. et al. Exploring the folding landscape of a structured RNA. Proc. Natl Acad. Sci. 
USA 99, 155-160 (2002). | PubMed | ISI | 79. Caprara, M. G., Mohr, G. & Lambowitz, A. M. A tyrosyl-tRNA synthetase protein induces tertiary folding of the group I intron catalytic core. J. Mol. Biol. 257, 512-531 (1996). | Article | PubMed | ISI | 80. Weeks, K. M. & Cech, T. R. Assembly of a ribonucleoprotein catalyst by tertiary structure capture. Science 271, 345-348 (1996). | PubMed | ISI | 81. Chanfreau, G. & Jacquier, A. An RNA conformational change between the two chemical steps of group II self-splicing. EMBO J. 15, 3466-3476 (1996). | PubMed | ISI | 82. Cohen, S. B. & Cech, T. R. Dynamics of thermal motions within a large catalytic RNA investigated by cross-linking with thiol-disulfide interchange. J. Am. Chem. Soc. 119, 6259-6268 (1997). | Article | ISI | 83. Profenno, L. A., Kierzek, R., Testa, S. M. & Turner, D. H. Guanosine binds to the Tetrahymena ribozyme in more than one step, and its 2'-OH and the nonbridging pro-Sp phosphoryl oxygen at the cleavage site are required for productive docking. Biochemistry 36, 12477-12485 (1997). | Article | PubMed | ISI | 84. Murchie, A. I., Thomson, J. B., Walter, F. & Lilley, D. M. Folding of the hairpin ribozyme in its natural conformation achieves close physical proximity of the loops. Mol. Cell 1, 873-881 (1998). | PubMed | ISI | 85. Andersen, A. A. & Collins, R. A. Rearrangement of a stable RNA secondary structure during VS ribozyme catalysis. Mol. Cell 5, 469-478 (2000). | PubMed | ISI | 86. Noller, H. F., Hoffarth, V. & Zimniak, L. Unusual resistance of peptidyl transferase to protein extraction procedures. Science 256, 1416-1419 (1992). | PubMed | ISI | 87. Ban, N., Nissen, P., Hansen, J., Moore, P. B. & Steitz, T. A. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289, 905-920 (2000). | Article | PubMed | ISI | 88. Welch, M., Chastang, J. & Yarus, M. An inhibitor of ribosomal peptidyl transferase using transition-state analogy. Biochemistry 34, 385-390 (1995). 
| PubMed | ISI | 89. Nissen, P., Hansen, J., Ban, N., Moore, P. B. & Steitz, T. A. The structural basis of ribosome activity in peptide bond synthesis. Science 289, 920-930 (2000). | Article | PubMed | ISI | 90. Polacek, N., Gaynor, M., Yassin, A. & Mankin, A. S. Ribosomal peptidyl transferase can withstand mutations at the putative catalytic nucleotide. Nature 411, 498-501 (2001). | Article | PubMed | ISI | 91. Thompson, J. et al. Analysis of mutations at residues A2451 and G2447 of 23S rRNA in the peptidyltransferase active site of the 50S ribosomal subunit. Proc. Natl Acad. Sci. USA 98, 9002-9007 (2001). | Article | PubMed | ISI | 92. Murray, J. M. & Doudna, J. A. Creative catalysis: pieces of the RNA world jigsaw. Trends Biochem. Sci. 26, 699-701 (2001). | PubMed | ISI | 93. Kumar, R. K. & Yarus, M. RNA-catalyzed amino acid activation. Biochemistry 40, 6998-7004 (2001). | PubMed | ISI | 94. Illangasekare, M. & Yarus, M. Specific, rapid synthesis of Phe-RNA by RNA. Proc. Natl Acad. Sci. USA 96, 5470-5475 (1999). | PubMed | ISI | 95. Illangasekare, M. & Yarus, M. A tiny RNA that catalyzes both aminoacyl-RNA and peptidyl-RNA synthesis. RNA 5, 1482-1489 (1999). | Article | PubMed | ISI | 96. Collins, C. A. & Guthrie, C. The question remains: is the spliceosome a ribozyme? Nature Struct. Biol. 7, 850-854 (2000). | Article | PubMed | ISI | 97. Yean, S. L., Wuenschell, G., Termini, J. & Lin, R. J. Metal-ion coordination by U6 small nuclear RNA contributes to catalysis in the spliceosome. Nature 408, 881-884 (2000). | Article | PubMed | ISI | 98. Valadkhan, S. & Manley, J. L. Splicing-related catalysis by protein-free snRNAs. Nature 413, 701-707 (2001). | Article | PubMed | ISI | Acknowledgements. We acknowledge D. Battle for extensive help with figure preparation, and V. Rath for comments on the manuscript. Figure 1 Mechanism of RNA-catalysed self-cleavage. a, General mechanism of ribonucleases and small self-cleaving ribozymes. 
The 2'-hydroxyl adjacent to the scissile phosphate is activated for nucleophilic attack by abstraction of its proton. Concurrently, a proton is donated to stabilize the developing negative charge on the leaving group oxygen. b, Secondary structure of the hammerhead ribozyme. Nucleotides important for catalytic activity are indicated; the cleavage site is indicated by an arrow. c, Crystal structure of the hammerhead ribozyme. Coordinates are from ref. 14. The nucleotides flanking the scissile bond are shown in gold. Figure 2 Structure of the hepatitis delta virus (HDV) ribozyme. a, Secondary structure of the genomic form of the HDV ribozyme; the cytosine residue essential to catalytic activity (C75) is indicated, and the cleavage site is marked by the 5' end of the RNA strand. b, Crystal structure of the product form of the HDV ribozyme. The active-site cytosine is shown in red, the 5' nucleotide of the ribozyme in gold, and the U1a RNA-binding-domain protein and its cognate RNA-binding site in grey (this has been engineered into the construct to assist crystallization). c, Proposed mechanism of general acid catalysis by C75, in which the protonated form of the C donates a proton to the leaving group during catalysis (compare with Fig. 1a). Figure 3 Structure of the hairpin ribozyme. a, Secondary structure of the hairpin ribozyme; conserved, functionally important nucleotides are shown explicitly. Dots indicate non-canonical base pairings. b, Crystal structure of a precursor form of the hairpin ribozyme. Nucleotides flanking the scissile bond are shown in gold, whereas the grey structure is the U1a RNA-binding-domain protein and its cognate RNA-binding site, engineered into the construct to assist crystallization. Figure 4 Self-splicing intron mechanisms. a, Pathways for group I and II intron self-splicing, with exons shown as dashed lines and introns as solid lines. 
For group I introns, step 1 shows how an intron-bound guanosine or GTP (circled) cleaves the 5' splice site while becoming covalently attached to the 5' end of the intron. 'Conf.' indicates a conformational change whereby the G at the 3' end of the intron replaces the original G in the G-binding site. In step 2, the cleaved 5' exon, still held to the intron by base pairing (P1), then cleaves the 3' splice site; as a result, the exons are ligated and the intron excised. For group II introns, step 1 shows how an adenosine 2'-hydroxyl within domain 6 (D6) attacks the 5' splice site, which is identified by base-pairing interactions involving domain 1 (D1); this results in a branched 'lariat' RNA intermediate. In step 2, the cleaved 5' exon then attacks the 3' splice site, ligating the exons and excising the lariat intron. b, Three-metal-ion mechanism for RNA cleavage catalysed by the Tetrahymena group I intron (adapted from ref. 44). The step shown is the same as step 1 in a; the cleavage site phosphate (between U–1 and A1) is recognized in part by interactions with G22 in the internal guide sequence (IGS).

11 July 2002 Nature 418, 229-235 (2002); doi:10.1038/418229a

The involvement of RNA in ribosome function

PETER B. MOORE*† AND THOMAS A. STEITZ*†‡

* Department of Molecular Biophysics and Biochemistry, Yale University, PO Box 208107, New Haven, Connecticut 06520-8107, USA
† Department of Chemistry, Yale University, PO Box 208107, New Haven, Connecticut 06520-8107, USA
‡ Howard Hughes Medical Institute, New Haven, Connecticut 06520-8114, USA
(email: peter.moore@yale.edu)

The ribosome is a particle made of RNA and protein that is found in abundance in all cells that are actively making protein. It catalyses the messenger RNA-directed synthesis of proteins.
Recent structural work has demonstrated a profound involvement of the ribosome's RNA component in all aspects of its function, supporting the hypothesis that proteins were added to the ribosome late in its evolution. The discovery of the ribosome and the elucidation of its role in gene expression was one of the main achievements of molecular biology in the 1950s and '60s. Early investigations revealed that ribosomes invariably consist of a large and a small subunit, the former being roughly twice the molecular mass of the latter, and that both subunits are composed of 60% RNA by weight. The small subunit mediates the interactions between messenger RNA (mRNA) and transfer RNAs (tRNAs) that determine the sequences of the proteins ribosomes make. The large subunit catalyses peptide bond formation1. That ribosomes contain protein surprised no one in the 1950s, especially after it became clear that they catalyse protein synthesis. Everyone knew that enzymes are proteins. The surprise was that ribosomes contain RNA. At the time, three hypotheses could be entertained about the contribution RNA makes to ribosome function: (1) it is the substance that determines the sequences of the proteins ribosomes make; (2) it is the (inert) structural scaffold on which the catalytically active, proteinaceous parts of the protein synthetic apparatus are assembled; or (3) it contributes directly to the catalytic processes of protein synthesis. The discovery of mRNA around 1960 put an end to the first possibility, and left the field in a quandary. The second hypothesis was unappealing because it does not make evolutionary sense2, 3. If protein performs the key functions of the ribosome now, it probably did so when the ribosome first evolved, and if that is so, why does the ribosome contain RNA? Furthermore, if the first proteins were synthesized by proteins in the ribosome, what made the first ribosomal proteins? 
The third alternative was troubling because it challenged the doctrine that enzymes are proteins.

With the progression of time, the third hypothesis has gained increasing support. The idea that ribosomal RNAs (rRNAs) might participate directly in protein synthesis became entirely credible in the early 1980s, following the discovery that RNAs can indeed catalyse chemical reactions4, 5. In addition, genetic and biochemical evidence slowly accumulated that implicated rRNA in ribosome function6-8. Nevertheless, a full appreciation of the functional importance of rRNA in the ribosome has emerged only recently. The more than two-dozen crystal structures of ribosomes and their ligand complexes now available document the involvement of RNA in ribosome activity at a level that even the most ardent advocates of the functional importance of rRNA could hardly have anticipated. It is the proteins in the ribosome that perform a largely structural function, not the RNA.

Ribosome active sites

The functional importance of RNA in the ribosome, relative to protein, can best be evaluated by examining its prominence in the small fraction of the ribosome's total mass found in its active sites, rather than by measuring what it contributes to the particle as a whole. That said, it is almost certainly true that the overall abundance of RNA in the modern ribosome reflects an evolutionary history that started with an all-RNA particle. Presumably, this original proto-ribosome had decoding and peptide synthesis sites that were similar to the ones we see today, and proteins were added to it later to enhance its function. But before such issues can be approached intelligently, it is important to consider what has been learned from biochemical studies about the active sites of the eubacterial ribosome. Archaeal and eukaryotic ribosomes work in much the same way.

Aminoacyl-tRNAs are the substrates that ribosomes consume during protein synthesis.
Every cell contains a population of tRNAs that differ in sequence, but have similar relative molecular mass (about 25,000), similar secondary structures, and the same L-shaped tertiary structure. At the distal end of the longer arm of the L there is a three-base anticodon sequence in every tRNA that is complementary to one of the mRNA base triplets that encodes a specific amino acid. At the distal end of the short arm of the L of every tRNA is a 3'-terminal CCA sequence to which the amino acid specified by the anticodon is attached9. During protein synthesis, the anticodon ends of tRNAs interact with the mRNA bound to the small subunit and their aminoacylated CCA sequences interact with the large subunit.

Amino acids are attached to tRNAs by enzymes called aminoacyl-tRNA synthetases, of which there is one for each kind of amino acid. They catalyse the formation of ester bonds between the α-carboxylate groups of amino acids and the 2'- or 3'-hydroxyl groups of the 3'-terminal adenine residue of tRNAs. Synthetases translate the genetic code with great specificity by attaching one type of amino acid to all tRNAs bearing the appropriate anticodons. There is at least one kind of tRNA molecule for every amino acid used in protein synthesis.

Ribosomes assemble proteins one amino acid at a time, starting with the N-terminal residue10. The reaction that results in amino acid polymerization is the nucleophilic attack of the α-amino group of an amino acid esterified to one tRNA on the carbonyl carbon of the ester linking a peptide (or amino acid) to a second tRNA. Resolution of the resulting tetrahedral intermediate deacylates the tRNA that is the carbonyl group donor, and leaves the tRNA that provided the α-amino group esterified to a peptide that has been extended by one amino acid.
This reaction occurs at the peptidyl transferase centre of the large ribosomal subunit, which includes a sub-site to which the CCA-peptide moiety of peptidyl-tRNAs binds (the P site), and a sub-site that interacts with the amino acid-carrying CCA end of aminoacyl-tRNA (the A site). Before another amino acid can be added to a growing peptide chain, the deacylated tRNA in the P site must be replaced by the peptidyl-tRNA resident in the A site, and a new aminoacyl-tRNA must be placed in the A site. The required exchange of P-site tRNAs results from a still poorly understood conformational transformation called translocation, which involves both subunits (see below). It is mediated by a G-protein called elongation factor G (EF-G) in prokaryotes, and it is accompanied by the cleavage of guanosine triphosphate (GTP).

Delivery of the correct aminoacyl-tRNA to the ribosome is the sequence-determining step of protein synthesis. If the polypeptide being made by a ribosome is to have the sequence required by the mRNA bound to it, only one of the many different aminoacyl-tRNAs present in the cell can be accepted during each cycle of peptide chain elongation. The required selection is accomplished by the small subunit's decoding centre. Aminoacyl-tRNAs are delivered to the ribosome bound to a second G-protein factor, elongation factor Tu (EF-Tu), and GTP. These so-called ternary complexes bind tightly to the ribosome only if the anticodon sequences of their tRNA components are complementary to the mRNA codon presented in the decoding centre's A sub-site. If the match is satisfactory, acceptance occurs, a process that involves cleavage of GTP, followed by the release of EF-Tu·GDP from the ribosome, and a major change in the orientation of the tRNA on the ribosome.

The decoding centre also has a P sub-site. The codon in the small subunit's P site interacts with the anticodon of a peptidyl-tRNA bound in the large subunit's P site at the time that peptide bond formation occurs.
During translocation, the anticodon end of a tRNA in the A site moves into the small subunit's P site, displacing the deacylated tRNA already bound there (if any), and the mRNA bound to the small subunit advances in sympathy in the 5' direction by three nucleotides. This results in the presentation of a new codon in the A site so that the next cycle of aminoacyl-tRNA selection, peptide bond formation and translocation can occur.

In addition to its peptidyl transferase and decoding centres, the ribosome includes a factor-binding centre. It is part of the large subunit, and all of the G-protein factors involved in protein synthesis interact with it during at least part of their duty cycles. In some poorly understood way, it triggers the GTPase activities of these proteins, and mediates the conformational changes they all facilitate. That said, it is important to realize that these factors act catalytically to enhance properties the ribosomal system already possesses. Ribosomes programmed with mRNAs will select aminoacyl-tRNAs correctly in the absence of EF-Tu. Furthermore, the peptidyl transferase reaction occurs spontaneously on any ribosome that has appropriate substrates bound to it. Even translocation can occur in the absence of EF-G. In fact, ribosomes programmed with mRNAs of appropriate sequence will slowly synthesize oligopeptides in the absence of factors11, although in the presence of factors protein synthesis proceeds much faster than it otherwise would, and the accuracy with which mRNAs are translated increases.

Even though the chemical logic of protein synthesis would seem to require the existence of no more than two tRNA-binding sites on the ribosome, there is a third, the E site. During translocation deacylated tRNAs bound to the P site move into the E site12. Release of E site-bound tRNAs into solution accompanies the acceptance of the next aminoacyl-tRNA in the A site, not the translocation step of the next cycle of elongation.
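The elongation cycle described above (acceptance of a cognate aminoacyl-tRNA into the A site, peptidyl transfer, then translocation through the P and E sites) can be sketched as a toy state machine. The Python below is illustrative only. The names `TRNA`, `Ribosome`, `translate` and `anticodon_matches` are hypothetical, only Watson-Crick pairing is modelled (no wobble), factors and GTP are ignored, and initiation is simplified by letting the first tRNA enter through the A site like any other.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

# Watson-Crick pairs only; wobble pairing is deliberately left out of this sketch.
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_matches(codon: str, anticodon: str) -> bool:
    """True if the anticodon (5'->3') is the reverse complement of the codon (5'->3')."""
    if len(codon) != 3 or len(anticodon) != 3:
        return False
    return all(PAIRS.get(c) == a for c, a in zip(codon, reversed(anticodon)))


@dataclass
class TRNA:
    anticodon: str                                      # written 5'->3'
    amino_acid: str
    peptide: List[str] = field(default_factory=list)    # chain esterified to this tRNA


@dataclass
class Ribosome:
    mrna: str
    pos: int = 0                  # index of the first base of the A-site codon
    A: Optional[TRNA] = None
    P: Optional[TRNA] = None
    E: Optional[TRNA] = None

    def a_site_codon(self) -> str:
        return self.mrna[self.pos:self.pos + 3]

    def accept(self, trna: TRNA) -> bool:
        """Decoding: accept the tRNA only if its anticodon pairs with the A-site codon."""
        if self.A is not None or not anticodon_matches(self.a_site_codon(), trna.anticodon):
            return False
        trna.peptide = [trna.amino_acid]
        self.A = trna
        self.E = None             # E-site tRNA is released when the next tRNA is accepted
        return True

    def peptidyl_transfer(self) -> None:
        """The chain on the P-site tRNA is transferred to the A-site tRNA."""
        if self.P is not None and self.A is not None:
            self.A.peptide = self.P.peptide + self.A.peptide
            self.P.peptide = []   # the P-site tRNA is now deacylated

    def translocate(self) -> None:
        """P -> E, A -> P, and the mRNA advances by three nucleotides."""
        self.E, self.P, self.A = self.P, self.A, None
        self.pos += 3


def translate(mrna: str, pool: Sequence[Tuple[str, str]]) -> List[str]:
    """Run elongation cycles until the mRNA ends or no cognate tRNA is found."""
    ribosome = Ribosome(mrna)
    while ribosome.pos + 3 <= len(mrna):
        for anticodon, amino_acid in pool:
            if ribosome.accept(TRNA(anticodon, amino_acid)):
                break
        else:
            break                 # no tRNA in the pool decodes this codon
        ribosome.peptidyl_transfer()
        ribosome.translocate()
    return ribosome.P.peptide if ribosome.P else []


if __name__ == "__main__":
    # (anticodon 5'->3', amino acid) pairs for the codons AUG, UUU and GGC
    pool = [("AAA", "Phe"), ("CAU", "Met"), ("GCC", "Gly")]
    print(translate("AUGUUUGGC", pool))
```

In this sketch the E site is emptied inside `accept` rather than inside `translate`-level translocation, mirroring the statement that E-site release accompanies acceptance of the next aminoacyl-tRNA.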
Like the A and P sites, the E site has components on both subunits that adjoin the decoding centre and the peptidyl transferase centre13.

The peptidyl transferase centre

Much of what is known today about the structures and mechanisms of action of the decoding and peptidyl transferase centres has been deduced from crystals of isolated subunits. Ideally, this information would have been obtained from crystals of 70S ribosomes with appropriate ligands bound, but there are currently no crystals of the 70S ribosome that diffract to the resolution required for independent determination of atomic structure. The reason subunit crystals are useful in this context is that the peptidyl transferase and decoding centres are contained entirely within single, separated subunits; thus, activities that are obviously related to those that the two centres display in 70S ribosomes can be elicited from isolated subunits. The data obtained must be interpreted carefully, however, as the low resolution of the available 70S electron density maps means that small differences may exist between the structures of these centres in isolated subunits and in 70S ribosomes.

The crystal structures of the large subunit bound with substrates, substrate analogues and products show that the peptidyl transferase centre is at the bottom of a large cleft in its small-subunit-binding surface14, 15. Figure 1 shows that face of the large subunit with tRNAs placed in the E, P and A sites; their acceptor stems disappear into the cleft. As Fig. 2a reveals, the terminal CCA sequences of the A- and P-site-bound tRNAs, which must interact during peptide bond formation, meet at the small-subunit end of a tunnel that passes through the subunit to its back side. (Fig. 2a is derived from Fig.
1 by rotating the top of the subunit 90° away from the viewer, and then slicing the subunit vertically along a plane that includes the acceptor stems of the bound tRNAs and the axis of the tunnel; the 'hemisphere' of the subunit closest to the viewer has been removed, and the tRNAs left in place.) The two CCA sequences meet at the site where peptidyl transferase substrate analogues and products bind to the subunit (Fig. 2b).

Figure 1 The arrangement of tRNA-binding sites on the large ribosomal subunit.

Figure 2 Interactions of the CCA ends of ribosome-bound tRNA with the large ribosomal subunit.

None of these discoveries was a surprise. Electron microscopic evidence for the tunnel (appropriately named the 'peptide exit tunnel') first emerged in the 1980s16, 17, following the demonstration that nascent, ribosome-bound peptides first become accessible to solvent on the back side of the large ribosome18, 19. All doubts about its existence were resolved by the cryoelectron microscopic studies reported in the late 1990s that also demonstrated that the peptidyl transferase centre is located at its small subunit end20-22.

The peptidyl transferase centre is formed by nucleotides from domain V of the principal RNA component of the large subunit, 23S rRNA; most of these nucleotides are components of its central loop, consistent with earlier biochemical results23, 24. In addition to verifying these biochemical results, these crystal structures prove that there is no protein whatsoever in the peptidyl transferase centre of the ribosome. Thus the most fundamental of the ribosome's activities, its peptide bond-forming activity, is catalysed by a structure composed entirely of RNA14.

Precise substrate alignment is an important source of the catalytic power in the peptidyl transferase centre14, as indeed it is in all enzymes25-27.
It had been proposed earlier on theoretical grounds that substrate orientation might be important in the ribosome28, and much of the alignment required is ensured by base-pairing interactions between the CCA sequences of P-site- and A-site-bound tRNA with nucleotides in the so-called P loop (helix 80 of 23S rRNA)29 and A loop (helix 92)30, respectively (Fig. 2b). These interactions position the α-amino group of an aminoacyl-tRNA in the A site so that it can attack the carbonyl carbon of the ester linking a polypeptide to a tRNA bound in the P site14.

Structures of the large ribosomal subunit of the halophilic archaeon Haloarcula marismortui with substrate and transition-state analogues bound suggest that the peptidyl transferase centre may derive catalytic power from a second source. The N3 of residue A2451 (Escherichia coli numbering) of 23S rRNA is hydrogen bonded to the α-amino group of aminoacyl-tRNAs in the A site14. Not only does this interaction contribute to the positioning of that group, it could further enhance the rate of peptide bond formation by helping abstract a proton from that α-amino group (Fig. 3).

Figure 3 A possible mechanism for the involvement of A2486 (H. marismortui)/A2451 (E. coli) in peptide bond formation.

Biochemical data obtained recently using ribosomes mutated at position 2451 suggest that A2451 may indeed act as a general base during peptide bond formation. First, all such mutations are dominant lethal in E. coli, consistent with the view that A2451 is critical to ribosome function31-33. Second, recent kinetic studies show that the rate of the chemical step of peptide bond formation increases by a factor of 100–150 when a group in 70S ribosomes that has a pKa near 7.5 is deprotonated, and that this titration effect is not seen in ribosomes that have uracil at position 2451 instead of adenine34.
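The shape of such a titration effect can be illustrated with a simple two-state model. The sketch below assumes that the observed rate is a population-weighted average over a protonated (slow) and a deprotonated (fast) form of a single titratable group, and uses the Henderson-Hasselbalch relation with the reported pKa of 7.5 and a 100-fold enhancement (the lower end of the reported 100-150 range). The two-state assumption and the function names are illustrative; they are not taken from the kinetic studies cited.

```python
def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Henderson-Hasselbalch: fraction of a titratable group that is deprotonated."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


def relative_rate(pH: float, pKa: float = 7.5, enhancement: float = 100.0) -> float:
    """Rate relative to the fully protonated state in an assumed two-state model.

    The rate is 1 when the group is fully protonated and `enhancement`
    when it is fully deprotonated; in between it is a weighted average.
    """
    f = fraction_deprotonated(pH, pKa)
    return 1.0 + (enhancement - 1.0) * f


if __name__ == "__main__":
    for pH in (6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0):
        print(f"pH {pH:.1f}: fraction deprotonated {fraction_deprotonated(pH, 7.5):.3f}, "
              f"relative rate {relative_rate(pH):.1f}")
```

In this model the rate is half-maximal, to a good approximation, when the pH equals the pKa, which is how a pKa can be read off a plot of rate against pH; it does not by itself identify which chemical group is being titrated.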
These observations contradict conclusions published previously that were also based on measurements of the pH dependence of the rate of peptide bond formation, but under the conditions used in these earlier experiments, the rate of the chemical step of the reaction seems not to have been rate limiting32, 35. The simplest interpretation of these observations may be the right one; A2451 may be the group whose titration affects the rate of peptide bond formation, and it may do so because it functions as a general base during peptide bond formation. However, it would be premature to conclude that the issue is settled. It has not been proven that it is the titration of A2451 that affects the rate of peptide bond formation, only that the nucleotide at position 2451 must be an adenine for the titration effect to be seen. The pKa of the N3 of A2451 would have to be 7.5 both to function as a general base under physiological conditions, as proposed14, and to explain the titration data. But the pKa of the N3 of an unperturbed adenosine is only about 1.0. At the time the A2451 hypothesis was advanced, chemical data existed that seemed to show that the pKa of A2451 is unusually high31, but it is now clear that those data are irrelevant to the properties of A2451 in ribosomes competent in peptide bond formation33, 36, 37. Finally, mutation of one of the bases postulated earlier to be crucial for perturbing the pKa of A2451 (ref. 14) does not affect viability and is also reported not to affect the rate of peptide bond formation32, 33. For all these reasons, it remains possible that a pH-dependent conformational change of some kind that depends on the identity of the nucleotide at position 2451 affects the rate of peptide bond synthesis.

The decoding centre

A remarkable amount of information about the decoding centre has been generated from crystal structures of the small subunit, in part as a result of a fortuitous accident38.
The small subunit of Thermus thermophilus ribosomes has a protruding stem–loop structure, called the spur, whose conformation resembles that of the anticodon stem–loop of a tRNA. In addition, the 3' end of 16S rRNA folds back into the P site, where it binds as though it were a piece of mRNA. One of the intermolecular interactions that stabilizes crystals of these subunits involves the insertion of the spur of one subunit into the P site of a neighbour. The resulting interaction is similar to that of a tRNA bound to the P site of 70S ribosomes with its small subunit component39, 40, and thus much can be learned about the tRNA/P-site interaction from crystals of small subunits that contain no tRNA or mRNA. Unlike the peptidyl transferase centre, the decoding centre includes protein. The minor groove of the anticodon stem of a tRNA bound to the P site contacts not only nucleotides belonging to the 3' major domain and the central domain of the RNA component of the small subunit, 16S rRNA, but also amino acids belonging to ribosomal proteins S13 and S9. The anticodon of tRNA bound to the P site interacts with the mRNA codon exposed at the bottom of that site as well as the penultimate helix (helix 44). The A site is also a 'hybrid' structure38, 41. It includes residues from ribosomal protein S13, and a loop of S12 abuts the mRNA codon exposed there. The elements of 16S rRNA represented in the A site include the 530 loop, helix 44 and the 3' major domain. But despite these interactions between proteins and tRNA, the decoding centre, like the peptidyl transferase centre, is basically an RNA machine. One gets the impression that if the proteins in the decoding site could be removed without otherwise altering its structure, it would still function properly. The decoding centre is located in the region where the head of the small subunit meets its body, and it includes the most important parts of the site that interacts with mRNA. 
Parts of that site have been mapped out in small subunit crystals that have mRNA analogues bound41, and the rest has been observed at lower resolution in 70S crystals carrying short, natural mRNA sequences42. A double helix forms between the Shine–Dalgarno sequence of appropriately positioned mRNAs and the complementary sequence at the 3' terminus of 16S rRNA, as expected43, 44. The Shine–Dalgarno helix is found in the region between the subunit's head and platform, close to its E site. The P-site portion of bound mRNAs is hard to visualize in difference maps owing to the tendency of the 3' end of 16S rRNA to occupy that region in vacant ribosomes. Nevertheless, it is clear that there is a sharp kink in the backbone of mRNAs between codons in the P site and codons in the A site. On the 3' side of the A site, the molecule passes through a tunnel formed in part by proteins S3, S4 and S5, which may function as a helicase to remove secondary structure from mRNA as it enters the decoding region42. The first step in the mechanism by which the small ribosomal subunit correctly decodes mRNAs seems to be accomplished entirely by 16S rRNA, and the conformation of the complex that forms between mRNA and tRNA in the A site depends on whether or not the interaction between codon and anticodon is cognate41. When a cognate tRNA enters the A site, the bases of A1492 and A1493 change positions and form type II and type I A-minor interactions45, respectively, with the minor groove edge of the first two base pairs of the resulting codon–anticodon helix (Fig. 4). These interactions are not possible if the bases in the second and third positions of the anticodon of the tRNA in the A site do not form Watson–Crick base pairs with the first and second bases of the mRNA codon presented there. G530 responds to the codon–anticodon pairing that occurs with both the second and third bases in the mRNA codon. 
However, the interactions of 16S rRNA with the third base pair, which involve primarily G530, are less sensitive to base-pair geometry than the interactions it makes with the first two pairs. Thus these interactions explain not only why the code is degenerate in the third position, but also why the ribosome is able to discriminate as well as it does between cognate tRNAs and near-cognate tRNAs. They are also likely to be important for the ribosome's capacity to enhance translational fidelity by proofreading.

Figure 4 Fidelity-checking interactions in the A site of the small ribosomal subunit.

Electron microscopic data show that the orientation of aminoacyl-tRNAs on the ribosome changes dramatically between the time they are delivered to the ribosome by EF-Tu and the time when peptide bond formation occurs13, 46, 47. At the time of delivery, when tRNAs are initially selected, the anticodons of tRNAs are in the A site of the decoding centre, but their aminoacylated CCA sequences are nowhere near the peptidyl transferase centre. Accommodation (the reorientation that puts the CCA end of a tRNA into the A site of the peptidyl transferase centre) does not occur unless the tRNA delivered is cognate or near cognate to the mRNA sequence in the A site. The structures of small subunits with stem–loops bound and 70S ribosomes with tRNAs bound correspond to the post-accommodation state, and it is unclear how the geometry of the A-site interaction with tRNA anticodons in the post-accommodation state differs from that of the pre-accommodation state, whose interactions determine initially whether a tRNA is accepted or not. Nevertheless, when a cognate interaction occurs in the A site, by a mechanism that is unknown, a signal is transmitted from the A site to the G domain of EF-Tu, which is bound to the factor-binding centre on the large subunit.
That signal, which may involve the conformational change that occurs in the A site of the small subunit when a cognate codon–anticodon interaction occurs, triggers GTP hydrolysis, factor release from the ribosome, and the other steps of accommodation. These inter-subunit interactions will not be understood until high-resolution crystal structures are obtained of ribosomes complexed with factors and tRNAs trapped in the pre- and post-accommodation states.

The E site

The E site is richer in protein than the A and P sites. In addition to the several contacts that E-site-bound tRNAs make with both 16S and 23S rRNA, their anticodon stems interact extensively with S7 in the small subunit, and their T loops and T stems contact L1 in the large subunit. There are additional interactions involving L33 (ref. 40), and a substantial body of biochemical evidence exists indicating that the E site interacts allosterically with the A site12.

The factor-binding centre

The GTPase domains of EF-Tu, EF-G and all the other G-protein factors involved in protein synthesis interact with the factor-binding centre during protein synthesis. Because there are no high-resolution structures of ribosomes with G-protein factors bound, our understanding of the centre's location and the way it interacts with factors is limited to what can be surmised from lower-resolution electron microscopic images and molecular biological investigations45-50, combined with model building using atomic structures44. The sarcin–ricin loop (stem–loop 95 of 23S rRNA) is one of the centre's critical components48, but the centre also includes several proteins (for example, L7/12, L11, L6, L14; refs 49, 50), some of which are essential for its operation.
The factors that interact with the centre undergo significant conformational changes as they perform their functions, as well as cleaving GTP, and the geometries of the complexes they form with the ribosome change in sympathy, as does the conformation of the ribosome itself51-55.

Other ribosomal sites

The protein synthesis factors that are not G proteins bind to sites on the ribosome outside the factor-binding centre. High-resolution structures have been reported for two such factors bound to the small subunit: initiation factor 1 (IF1)56, and the C-terminal domain of initiation factor 3 (IF3)57. The site to which IF1 binds includes the A site of the small subunit's decoding centre, and its interaction with that site causes conformational changes in the small subunit that interfere with its interaction with the large subunit, as well as with tRNAs. The C-terminal domain of IF3 binds to the solvent side of the platform in small subunit crystals into which it has been soaked, but as inter-subunit contacts in the crystals examined obstruct the IF3-binding site identified by others58, 59, the relevance of this observation is unclear. There are now about 20 structures available for antibiotics bound to ribosomal subunits, and the number is increasing rapidly. These structures affirm the importance of RNA in ribosome function. Most of the antibiotics examined structurally have been found to inhibit ribosome function by binding to sites composed entirely of RNA. Many antibiotics that target the small subunit interfere with protein synthesis by inhibiting one or more of the conformational changes that accompany the normal function of that subunit38, 41, 57. In contrast, the antibiotics examined so far that inhibit the function of the large subunit act in a different way. Their interactions with the ribosome sterically block the access of substrates to the peptidyl transferase centre or the passage of nascent peptides down the exit tunnel (refs 60, 61, and J. L.
Hansen, N. Ban, P. Nissen, P.B.M. and T.A.S., in preparation). Another functionally vital region of the ribosome that consists primarily of RNA is the interface between the two subunits in the 70S ribosome. It includes several sites, called bridges, where components of the two subunits contact each other. Information about these bridges can be gleaned only from structures of 70S ribosomes, and hence the resolution at which they are understood is currently limited. Nevertheless, it is clear that most bridges result from the interaction of 16S rRNA sequences with 23S rRNA sequences40, and so are rich in RNA. It is also evident that the motions of tRNAs through the inter-subunit gap of the 70S ribosome that are essential for protein synthesis must be accompanied by the making and breaking of inter-subunit bridges. These structures play a dynamic role in protein synthesis. The final ribosomal sites to consider are the region on the back side of the large ribosomal subunit that binds the translocon, which is the apparatus responsible for protein secretion, and the site that interacts with the signal recognition particle, which is responsible for the translational arrest that occurs early in the synthesis of secreted proteins so that ribosomes can interact with the translocon before synthesis is completed. The structure of eukaryotic ribosomes bearing nascent polypeptides bound to Sec61 — the protein complex that forms the interface between secreting ribosomes and the rest of the translocon — has been investigated extensively by electron microscopy62. Sec61 is a more-or-less toroidal protein assembly that binds with its pore concentric with the distal end of the polypeptide exit tunnel. The contacts formed involve both elements of 23S/28S rRNA and the proteins that surround the end of the exit tunnel, L19, L23, L24 and L29 (ref. 62). 
Evolution of the ribosome's active sites

It has often been emphasized that no enzyme as complex as the modern ribosome could possibly have emerged from the primordial soup in a single step. The first ribosomes were very likely composed entirely of RNA. The evidence for this hypothesis is that the functional core of the modern ribosome, its decoding site and its peptidyl transferase centre, consists primarily of RNA, and the bulk of its proteins are found on its surface, well removed from its functional centres (Fig. 5). The first ribosome was almost certainly a smaller, simpler structure than its modern descendants, and had a more limited function. The 'proto-ribosome' might have been a one-subunit object that catalysed peptide bond formation in an uncoded, non-processive way, and thus made small oligopeptides of random sequence that were not proteins in the modern sense, and may have had some other purpose63. It probably gained its decoding function, and the small subunit with which that function is associated, later, in parallel with the evolution of the genome and of tRNA. Because the protein factors that facilitate ribosome function are still not absolutely essential, it is likely that they came later. The E site also seems to be a late evolutionary addition; one can easily imagine that the primitive ribosome discharged deacylated tRNAs directly from the P site into solution rather than retaining them bound.

Figure 5 Conservation in the large ribosomal subunit.

The chemical compositions of the ribosome's functional centres correlate crudely with their evolutionary age in this scheme: the more recent the centre, the more protein it contains. The peptidyl transferase centre, the most ancient of them all, has no protein in it whatsoever. The decoding centre, the second to be added to the particle, is predominantly RNA, but protein does make a modest contribution to its function.
No atomic-resolution crystal structures have been established yet for elongation factor/ribosome complexes, but both biochemical data and lower-resolution structural studies indicate not only that the factor-binding site contains several proteins, but also that many of them are critical for its function. Consistent with this pattern, the E site, which is a late addition, is protein rich.

Concluding comment

The importance of RNA in ribosome function has been proven beyond question by the crystal structures published in the past two years. But illuminating as these structures are, they provide only tantalizing hints about how the nucleotides in the ribosome's active centres perform their functions. A fertile area of inquiry has opened up that should challenge biochemists and molecular biologists for years to come. A full understanding of ribosome function is unlikely to emerge without additional help from the structural biology community. Large conformational changes occur in the ribosome during protein synthesis, and in the absence of high-resolution structures for the 70S ribosome in many more states than have been characterized so far, it will be difficult to understand what these changes are and why they occur. Although it will not be easy to prepare the crystals required, the rewards are certain to be enormous.

References

1. Tissieres, A. in Ribosomes (eds Nomura, M., Tissieres, A. & Lengyel, P.) 3-12 (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1974).
2. Crick, F. H. C. The origin of the genetic code. J. Mol. Biol. 38, 367-379 (1968).
3. Woese, C. R. in Ribosomes. Structure, Function, and Genetics (eds Chambliss, G. et al.) 357-376 (University Park Press, Baltimore, 1980).
4. Cech, T., Zaug, A. & Grabowski, P. In vitro splicing of the ribosomal RNA precursor of Tetrahymena: involvement of the guanosine nucleotide in the excision of the intervening sequence. Cell 27, 487-496 (1981).
5.
Guerrier-Takada, C., Gardiner, K., Marsh, T., Pace, N. & Altman, S. The RNA moiety of ribonuclease P is the catalytically active subunit of the enzyme. Cell 35, 849-857 (1983).
6. Noller, H. F. Ribosomal RNA and translation. Annu. Rev. Biochem. 60, 191-227 (1991).
7. Moore, P. B. in The RNA World (eds Gesteland, R. F. & Atkins, J. F.) 119-136 (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1993).
8. Santer, M. & Dahlberg, A. E. in Ribosomal RNA, Structure, Evolution, Processing, and Function in Protein Biosynthesis (eds Zimmermann, R. A. & Dahlberg, A. E.) 3-20 (CRC Press, Boca Raton, 1995).
9. Soll, D. & RajBhandary, U. L. (eds) tRNA. Structure, Biosynthesis, and Function (American Society for Microbiology Press, Washington DC, 1995).
10. Green, R. & Noller, H. F. Ribosomes and translation. Annu. Rev. Biochem. 66, 679-716 (1997).
11. Gavrilova, L. P., Kostiashkina, O. E., Koteliansky, V. E., Rutkevitch, N. M. & Spirin, A. S. Factor-free ("non-enzymatic") and factor-dependent systems of translation of polyuridylic acid by Escherichia coli ribosomes. J. Mol. Biol. 101, 537-552 (1976).
12. Rheinberger, H.-J. et al. in The Ribosome. Structure, Function & Genetics (eds Hill, W. E. et al.) 318-330 (ASM Press, Washington DC, 1990).
13. Agrawal, R. K. et al. Direct visualization of A-, P-, and E-site transfer RNAs in the Escherichia coli ribosome. Science 271, 1000-1002 (1996).
14. Nissen, P., Ban, N., Hansen, J., Moore, P. B. & Steitz, T. A. The structural basis of ribosome activity in peptide bond synthesis. Science 289, 920-930 (2000).
15. Schmeing, T. M. et al. A pre-translocational intermediate in protein synthesis observed in crystals of enzymatically active 50S subunits. Nature Struct. Biol. 9, 225-230 (2002).
16. Milligan, R. A. & Unwin, P. N. T. Location of the exit channel for nascent protein in 80S ribosome.
Nature 319, 693-695 (1986).
17. Yonath, A., Leonard, K. R. & Wittmann, H. G. A tunnel in the large ribosomal subunit revealed by three-dimensional image reconstruction. Science 236, 813-816 (1987).
18. Bernabeu, C. & Lake, J. A. Nascent polypeptide chains emerge from the exit domain of the large ribosomal subunit: immune mapping of the nascent chain. Proc. Natl Acad. Sci. USA 79, 3111-3115 (1982).
19. Bernabeu, C., Tobin, E. M., Fowler, A., Zabin, I. & Lake, J. A. Nascent polypeptide chains exit the ribosome in the same relative position in both eucaryotes and procaryotes. J. Cell Biol. 96, 1471-1474 (1983).
20. Frank, J. et al. A model for protein synthesis based on cryo-electron microscopy of the E. coli ribosome. Nature 376, 441-444 (1995).
21. Stark, H. et al. The 70S Escherichia coli ribosome at 23 Å resolution: fitting the ribosomal RNA. Structure 3, 815-821 (1995).
22. Beckmann, R. et al. Alignment of conduits for the nascent polypeptide chain in the ribosome-Sec61 complex. Science 278, 2123-2128 (1997).
23. Noller, H. F. Structure of ribosomal RNA. Annu. Rev. Biochem. 53, 119-162 (1984).
24. Garrett, R. A. & Rodriguez-Fonseca, C. in Ribosomal RNA. Structure, Evolution, Processing and Function in Protein Biosynthesis (eds Zimmermann, R. A. & Dahlberg, A. E.) 327-355 (CRC Press, Boca Raton, 1996).
25. Koshland, D. E., Caraway, K. W., Dafforn, G. A., Gass, J. D. & Storm, D. R. The importance of orientation factors in enzymatic reactions. Cold Spring Harb. Symp. Quant. Biol. 36, 13-20 (1971).
26. Koshland, D. E. Molecular basis of enzyme catalysis and control. Pure Appl. Chem. 25, 119 (1971).
27. Page, M. I. & Jencks, W. P. Aminolysis of acetylimidazole and rate acceleration caused by intramolecular catalysis. Fed. Proc. 30, 1240 (1971).
28. Nierhaus, K. H., Schulze, H. & Cooperman, B. S.
Molecular mechanisms of the ribosomal peptidyl transferase center. Biochem. Int. 1, 185-192 (1980).
29. Samaha, R. R., Green, R. & Noller, H. F. A base pair between tRNA and 23S rRNA in the peptidyl transferase centre of the ribosome. Nature 377, 309-314 (1995).
30. Kim, D. F. & Green, R. Base-pairing between 23S rRNA and tRNA in the ribosomal A site. Mol. Cell 4, 859-864 (1999).
31. Muth, G. W., Ortoleva-Donnelly, L. & Strobel, S. A. A single adenosine with a neutral pKa in the ribosomal peptidyl transferase center. Science 289, 947-950 (2000).
32. Polacek, N., Gaynor, M., Yassin, A. & Mankin, A. S. Ribosomal peptidyl transferase can withstand mutations at the putative catalytic nucleotide. Nature 411, 498-501 (2001).
33. Thompson, J. et al. Analysis of mutations at residues A2451 and G2447 of 23S rRNA in the peptidyltransferase active site of the 50S ribosomal subunit. Proc. Natl Acad. Sci. USA 98, 9002-9007 (2001).
34. Katunin, V. I., Muth, G. W., Strobel, S., Wintermeyer, W. & Rodnina, M. V. Important contribution to catalysis of peptide bond formation by a single ionizing group within the ribosome. Mol. Cell (in the press).
35. Bayfield, M. A., Dahlberg, A. E., Schulmeister, U., Dorner, S. & Barta, A. A conformational change in the ribosomal peptidyl transferase center upon active/inactive transition. Proc. Natl Acad. Sci. USA 98, 10096-10101 (2001).
36. Muth, G. W., Chen, L., Kosek, A. & Strobel, S. pH-dependent conformational flexibility within the ribosomal peptidyl transferase center. RNA 7, 1403-1415 (2001).
37. Xiong, L., Polacek, N., Sander, P., Boettger, E. G. & Mankin, A. S. pKa of adenine 2451 in the ribosomal peptidyl transferase center remains elusive. RNA 7, 1365-1369 (2001).
38. Carter, A. P. et al.
Functional insights from the structure of the 30S ribosomal subunit and its interactions with antibiotics. Nature 407, 340-348 (2000).
39. Cate, J. H., Yusupov, M. M., Yusupova, G. Z., Earnest, T. N. & Noller, H. F. X-ray crystal structures of 70S ribosome functional complexes. Science 285, 2095-2104 (1999).
40. Yusupov, M. M. et al. Crystal structure of the ribosome at 5.5 Å resolution. Science 292, 883-896 (2001).
41. Ogle, J. M. et al. Recognition of cognate transfer RNA by the 30S ribosomal subunit. Science 292, 897-902 (2001).
42. Yusupova, G. Z., Yusupov, M. M., Cate, J. H. D. & Noller, H. F. The path of messenger RNA through the ribosome. Cell 106, 231-241 (2001).
43. Shine, J. & Dalgarno, L. The 3'-terminal sequence of Escherichia coli 16S ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. Proc. Natl Acad. Sci. USA 71, 1342-1346 (1974).
44. Steitz, J. A. & Jakes, K. How ribosomes select initiator regions in mRNA: base pair formation between the 3' terminus of 16S rRNA and the mRNA during initiation of protein synthesis in E. coli. Proc. Natl Acad. Sci. USA 72, 4734-4738 (1975).
45. Nissen, P., Ippolito, J. A., Ban, N., Moore, P. B. & Steitz, T. A. RNA tertiary interactions in the large ribosomal subunit: the A-minor motif. Proc. Natl Acad. Sci. USA 98, 4899-4903 (2001).
46. Stark, H. et al. Visualization of elongation factor Tu on the Escherichia coli ribosome. Nature 389, 403-406 (1997).
47. Agrawal, R. K. et al. Visualization of tRNA movements on the Escherichia coli 70 S ribosome during the elongation cycle. J. Cell Biol. 150, 447-459 (2000).
48. Wool, I. G., Correll, C. C. & Chan, Y.-L. in The Ribosome. Structure, Function, Antibiotics, and Cellular Interactions (eds Garrett, R. A. et al.) 461-473 (ASM Press, Washington, DC, 2000).
49. Ban, N.
et al. Placement of protein and RNA structures into a 5 Å-resolution map of the 50S ribosomal subunit. Nature 400, 841-847 (1999).
50. Agrawal, R. K., Linde, J., Sengupta, J., Nierhaus, K. H. & Frank, J. Localization of L11 protein on the ribosome and elucidation of its involvement in EF-G-dependent translocation. J. Mol. Biol. 311, 777-787 (2001).
51. Czworkowski, J. & Moore, P. B. The elongation phase of protein synthesis. Prog. Nucleic Acids Res. Mol. Biol. 54, 293-332 (1996).
52. Frank, J. & Agrawal, R. K. A ratchet-like inter-subunit reorganization of the ribosome during translocation. Nature 406, 318-322 (2000).
53. Agrawal, R. K., Heagle, A. B., Penczek, P., Grassucci, R. A. & Frank, J. EF-G-dependent GTP hydrolysis induces translocation accompanied by large conformational changes in the 70S ribosome. Nature Struct. Biol. 6, 643-647 (1999).
54. Stark, H., Rodnina, M. V., Wieden, H.-J., van Heel, M. & Wintermeyer, W. Large-scale movement of elongation factor G and extensive conformational change of the ribosome during translocation. Cell 100, 301-309 (2000).
55. Peske, F., Matassova, N. B., Savelsbergh, A., Rodnina, M. V. & Wintermeyer, W. Conformationally restricted elongation factor G retains GTPase activity but is inactive in translocation on the ribosome. Mol. Cell 6, 501-505 (2000).
56. Carter, A. P. et al. Crystal structure of an initiation factor bound to the 30S ribosomal subunit. Science 291, 498-501 (2001).
57. Pioletti, M. et al. Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. EMBO J. 20, 1829-1839 (2001).
58. McCutcheon, J. P. et al. Location of translational initiation factor IF3 on the small ribosomal subunit. Proc. Natl Acad. Sci. USA 96, 4301-4306 (1999).
59. Dallas, A.
& Noller, H. F. Interaction of translation initiation factor 3 with the 30S ribosomal subunit. Mol. Cell 8, 855-864 (2001).
60. Schluenzen, F. et al. Structural basis for the interaction of antibiotics with the peptidyl transferase centre in eubacteria. Nature 413, 814-821 (2001).
61. Hansen, J. L., Ban, N., Nissen, P., Moore, P. B. & Steitz, T. A. The structures of four macrolide antibiotics bound to the large ribosomal subunit. Mol. Cell (in the press).
62. Beckmann, R. et al. Architecture of the protein-conducting channel associated with the translating 80S ribosome. Cell 107, 361-372 (2001).
63. Ban, N., Nissen, P., Hansen, J., Moore, P. B. & Steitz, T. A. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289, 905-920 (2000).

Acknowledgements. We have benefited from discussions with our colleagues S. A. Strobel, J. L. Hansen, D. J. Klein and T. M. Schmeing, but the opinions expressed here are our responsibility. This work was supported by the Howard Hughes Medical Institute, NIH and the Agouron Institute.

Figure 1 The arrangement of tRNA-binding sites on the large ribosomal subunit. The surface of the H. marismortui large ribosomal subunit that interacts with the small subunit is shown with its RNA portions in blue, and its protein regions in purple. tRNAs are shown in ribbon format bound to the E, P and A sites. The positions they occupy are deduced from the 70S structure of Noller and colleagues39. (Reproduced with permission from ref. 14.)

Figure 2 Interactions of the CCA ends of ribosome-bound tRNA with the large ribosomal subunit. a, Cut-away view of the large ribosomal subunit with tRNAs bound. tRNAs are positioned on the large ribosomal subunit as described in the legend for Fig. 1, and the subunit sliced in half along a plane approximately perpendicular to the Fig.
1 plane of view to reveal the placement of the acceptor stems of ribosome-bound tRNAs in the large subunit's peptide exit tunnel. rRNA is shown in white, and ribosomal protein in yellow. tRNAs are colour coded as follows: E site, brown; P site, purple; and A site, green. b, Interactions of the CCA sequences bound in the P site and the A site with 23S rRNA. The molecule bound in the P site is the deacylated tRNA analogue CCA (purple). The molecule bound in the A site is the peptidyl-tRNA analogue CCA–puromycin–phenylalanine–caproic acid–biotin (green)15. Blue bases are components of the P loop29; brown bases belong to the A loop30. The nucleotides in cyan and red are other components of the peptidyl transferase centre. Bases are numbered according to the sequence of H. marismortui 23S rRNA. The two 23S rRNA bases closest to the newly formed peptide bond are A2486 and U2620, which correspond to A2451 and U2585 in E. coli, respectively. (Reproduced with permission from ref. 15.)

Figure 3 A possible mechanism for the involvement of A2486 (H. marismortui)/A2451 (E. coli) in peptide bond formation. One of the hydrogens belonging to the α-amino group of the aminoacyl-tRNA bound in the A site of the large subunit seems to interact with the N3 of A2451 (E. coli numbering) during the nucleophilic attack of that amino group on the carbonyl carbon of the ester linking the nascent peptide to the tRNA in the P site (top frame). If that interaction were strong enough, it could enhance the rate of peptide bond formation by abstracting a proton from the attacking α-amino group (middle and bottom frames).

Figure 4 Fidelity-checking interactions in the A site of the small ribosomal subunit. This figure shows the interactions between ribosome residues and the base pairs that form when an anticodon stem–loop with a GAA anticodon interacts in the A site with a (cognate) UUU codon.
a, The type I A-minor interaction between A1493 and the AU pair that forms between the U at the first position in the codon and the 3' A of the anticodon. b, The type II A-minor interaction that forms between the AU pair involving the second U in the mRNA codon and A1492, and the similar interaction with the AU pair involving G530, which also hydrogen bonds with A1492. c, The interaction of the ribosome with the so-called wobble-position base pair, which involves the 3' U of the mRNA codon. Note that these interactions monitor only the placement of the mRNA base, not the placement of the anticodon base that is paired with it. (Reproduced from ref. 41 with permission.)

Figure 5 Conservation in the large ribosomal subunit. The large ribosomal subunit of H. marismortui is viewed from its small subunit interface side, looking down into its active site, where an inhibitor (red) is shown bound. RNA atoms are shown as space-filling spheres, and the protein is shown in ribbon format. The blue-coloured RNA is RNA belonging to the core of 23S/28S rRNA that is conserved in all species. The green-coloured RNA represents sequences that are outside the conserved core. (Reproduced with permission from Science.)

11 July 2002
Nature 418, 236-243 (2002); doi:10.1038/418236a

Alternative pre-mRNA splicing and proteome expansion in metazoans

TOM MANIATIS AND BOSILJKA TASIC
Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138, USA

The protein coding sequences of most eukaryotic messenger RNA precursors (pre-mRNAs) are interrupted by non-coding sequences called introns. Pre-mRNA splicing is the process by which introns are removed and the protein coding elements assembled into mature mRNAs. Alternative pre-mRNA splicing selectively joins different protein coding elements to form mRNAs that encode proteins with distinct functions, and is therefore an important source of protein diversity.
The elaboration of this mechanism may have had a significant role in the expansion of metazoan proteomes during evolution.
Metazoans display an extraordinarily broad spectrum of functional and behavioural complexity. Complex organisms probably evolved by increasing the number of components that constitute them (for example, proteins), and/or by elaborating the relationships between components (for example, regulatory networks). The size of the proteome of an organism (the complete set of proteins expressed by the genome during the life of an organism) can be expanded during evolution by increasing the number of genes, by elaborating pre-existing mechanisms that generate protein diversity, and by inventing new mechanisms. For example, the mechanism of somatic-cell DNA rearrangement, which can generate virtually unlimited diversity of antibodies and T-cell receptors, arose during vertebrate evolution. Mechanisms that increase protein diversity in all metazoans include the use of multiple transcription start sites1, alternative pre-mRNA splicing2-5, polyadenylation6, pre-mRNA editing7, and post-translational protein modifications8. Among these mechanisms, alternative pre-mRNA splicing is considered to be the most important source of protein diversity in vertebrates2, 3. Here we review the mechanisms involved in the regulation of alternative pre-mRNA splicing, and discuss the effect of this process on expansion of the proteomes of multicellular organisms.
Exon recognition in constitutively spliced pre-mRNAs
The pre-mRNA splicing reaction is carried out by spliceosomes, multicomponent ribonucleoprotein complexes containing five small nuclear RNAs (snRNAs) and a large number of associated proteins9-11. Spliceosomes recognize 5' and 3' splice sites, which are located at exon–intron boundaries (Fig. 1). The assembly of spliceosomes is a highly dynamic process, culminating in the juxtaposition of 5' and 3' splice sites in the catalytic core of the complex. 
The splicing reaction occurs via a two-step mechanism. In the first step, the 5' end of the intron is joined to an adenine residue in the branchpoint sequence upstream from the 3' splice site to form a branched intermediate called an intron lariat. In the second step, the exons are ligated and the intron lariat is released12.
Figure 1 Exon recognition.
A fundamental problem in pre-mRNA splicing is 'exon recognition', the process by which exons are distinguished from introns, and intron–exon boundaries are accurately defined13, 14. The average size of a human exon is 150 nucleotides, whereas introns average around 3,500 nucleotides15, and can be as large as 500,000 nucleotides16. Thus, the splicing machinery must recognize small exon sequences located within vast stretches of intronic RNA. Moreover, 5' and 3' splice sites are poorly conserved, and introns contain large numbers of cryptic splice sites, which match the loose 5' or 3' splice-site consensus. Cryptic splice sites are normally avoided by the splicing machinery, but can be selected for splicing when normal splice sites are altered by mutation. Identification of the correct splice sites is achieved by virtue of their proximity to exons13, 14. Specific sequence elements in exons known as exonic splicing enhancers (ESEs) interact with SR proteins, a family of conserved serine/arginine-rich splicing factors14, 17. These recruit the splicing machinery to the flanking 5' and 3' splice sites (Fig. 1). Thus, exon sequences are under multiple evolutionary constraints, as they must be conserved not only for protein coding but also for recognition by SR proteins18, 19. The many cryptic splice sites present in introns could also be avoided by the presence of intronic splicing silencers (ISSs) and competition with splice sites flanking the recognized exons20. 
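The scale of the exon-recognition problem described above can be made concrete with a little arithmetic. The sketch below uses only the sizes quoted in the text (150-nucleotide exons, 3,500-nucleotide average introns, 500,000-nucleotide largest introns). The null model, in which a candidate site is any occurrence of one specific dinucleotide in a sequence of uniformly random bases, is an assumption made purely for illustration. Real splice-site consensus sequences are longer than two bases, so these counts are a caricature of the problem and not a measurement of cryptic-site frequency.

```python
# Illustrative arithmetic only. The random-sequence null model is an
# assumption for this sketch and is not taken from the article.

AVG_EXON_NT = 150        # average human exon size quoted in the text
AVG_INTRON_NT = 3_500    # average human intron size quoted in the text
MAX_INTRON_NT = 500_000  # largest intron size quoted in the text


def exon_fraction(exon_nt: int, intron_nt: int) -> float:
    """Fraction of one exon-plus-intron unit that is exonic."""
    return exon_nt / (exon_nt + intron_nt)


def expected_dinucleotide_hits(length_nt: int) -> float:
    """Expected occurrences of one specific dinucleotide in a sequence of
    uniformly random bases (hypothetical model: each base has p = 1/4)."""
    positions = max(length_nt - 1, 0)  # number of adjacent base pairs
    return positions / 16


if __name__ == "__main__":
    print(f"exonic fraction: {exon_fraction(AVG_EXON_NT, AVG_INTRON_NT):.3f}")
    print(f"chance hits, average intron: {expected_dinucleotide_hits(AVG_INTRON_NT):.0f}")
    print(f"chance hits, largest intron: {expected_dinucleotide_hits(MAX_INTRON_NT):.0f}")
```

Even under this crude model the exon is a small fraction of the transcribed unit, which is the point the text makes about proximity to exons and ESEs being needed to pick out the correct sites.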
Once exon recognition is completed, the flanking splice sites must be joined in the correct 5'→3' order to prevent exon skipping. This is probably accomplished, at least in part, through the mechanistic coupling of transcription and splicing21. In this model, splicing factors, which are bound to the carboxy-terminal domain (CTD) of RNA polymerase II, interact with exons as they emerge from the exit pore of the polymerase. This interaction tethers the newly synthesized exon to the CTD until the next exon is synthesized. In large introns, many hours could pass between the synthesis of the 5' and 3' splice sites. Although coupling transcription to splicing can prevent exon skipping in constitutively spliced pre-mRNAs, exon skipping does occur during alternative pre-mRNA splicing. In this case, the presence or absence of regulatory proteins can determine whether an exon is recognized and subsequently included in the mature mRNA. Mutations that interfere with proper exon recognition result in a large number of human genetic diseases22. Approximately 15% of the single base-pair mutations that cause human genetic diseases result in pre-mRNA splicing defects23. Some of these mutations interfere with the function of normal 5' and 3' splice sites, thereby leading to the recognition of nearby pre-existing cryptic splice sites. Others actually create new splice sites that are used instead of the normal ones. Finally, single base-pair mutations within exons can interfere with the binding of SR proteins, leading to exon exclusion from the mature mRNA22, 24, 25. For example, a translationally silent C-to-T mutation that occurs within an ESE of the human survival of motor neuron 2 (SMN2) gene disrupts the binding site of the SR protein SF2/ASF and leads to exon skipping22.
General mechanisms of alternative splicing
Alternative pre-mRNA splicing is the process by which multiple mRNAs can be generated from the same pre-mRNA by the differential joining of 5' and 3' splice sites. 
For example, exons can be extended or shortened, skipped or included, and introns can be removed or retained in the mRNA22. In some cases, exons are included in the mRNA in a mutually exclusive manner. Although there are very few examples in which the mechanisms of alternative splicing are known in detail, a general outline is understood. Regulatory proteins interact with specific sequences within pre-mRNAs and subsequently stimulate or repress exon recognition. These proteins bind directly to 5' or 3' splice sites, or to other pre-mRNA sequences called exonic or intronic splicing enhancers (ESEs or ISEs) and silencers (ESSs or ISSs). Enhancers and silencers stimulate or repress splice-site selection, respectively9, 17, 22, 26-28. A common feature of proteins that regulate splicing is the presence of two functional domains, an RNA-binding domain and a protein–protein interaction domain. The best characterized RNA-binding domains are the RNA-recognition motif (RRM) and K-homology (KH) domains29. The three-dimensional structures of the two domains are distinct, as are the general features of the RNA-binding sites they recognize. Most ESEs are recognized by SR proteins, which contain one or more RRM domains and an arginine/serine-rich (RS) protein–protein interaction domain27, 30. SR proteins are essential, multifunctional splicing factors required at different steps of spliceosome assembly17. They are also thought to mediate cross-intron interactions between splicing factors bound to the 5' and 3' splice sites17 (Fig. 1). Finally, SR proteins are required for cross-exon interactions in both constitutively and alternatively spliced pre-mRNAs14, 17, 27, 30. The best characterized ESSs and ISSs are recognized by members of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. hnRNP proteins are highly abundant RNA-binding proteins that lack an RS domain. 
Several family members contain an arginine/glycine-rich domain that may be involved in both RNA binding and interactions with other proteins. Among the best characterized members of this family are the hnRNP A1 (ref. 31) and hnRNP I (ref. 32) proteins. hnRNP proteins are diffusely distributed throughout the nucleus33, 34, unlike SR proteins, which co-localize with other splicing factors in nuclear speckles27, 30. It is important to note that some proteins containing an RS domain function as splicing repressors, whereas certain hnRNP proteins have been shown to function as splicing activators28. SR proteins were originally defined by a set of common functional characteristics17. Subsequently, other proteins containing RS domains that do not share these characteristics were discovered, and they are referred to as SR-like proteins. For example, two recently identified SR-like proteins have been shown to function as splicing repressors, and were therefore named SR-repressor proteins35. We also note that there are splicing regulators that do not belong to either the SR or the hnRNP protein family. Some of these will be discussed below. Remarkably, the role of regulatory proteins in splice-site selection can be affected by the promoter that generates the pre-mRNA36. Thus transcription of the same pre-mRNA from different promoters can produce distinct mRNAs. This mechanism of alternative splicing could be a consequence of the coupling between transcription and pre-mRNA splicing21. For example, particular SR proteins could be differentially recruited to RNA polymerase complexes assembled on different promoters, and then transferred to cognate splicing enhancers to promote the inclusion of specific exons5.
Well characterized examples of alternative splicing
The mechanism of exon recognition in constitutively spliced pre-mRNAs provides the basis for positive and negative regulation of alternative splicing. 
The organization of regulatory sequences within pre-mRNAs (ESEs, ESSs, ISEs and ISSs) and the relative ratios of different regulatory proteins determine which splice sites are used in the splicing reaction. This, in turn, determines which exons are included in the mRNA4, 5, 28. The best characterized examples of regulated alternative splicing derive from studies of the Drosophila sex-determination pathway37. The key regulatory factors were first identified by genetic analyses38, 39, and then characterized in biochemical experiments. Multiple steps in this pathway are regulated by different mechanisms of alternative pre-mRNA splicing. The examples described below include both splice-site repression and activation. The Drosophila female-specific protein Sex-lethal (SXL) represses male-specific 3' splice sites in transformer (tra) and sxl pre-mRNAs by two distinct mechanisms. As shown in Fig. 2a, exon 2 of tra pre-mRNA is preceded by two alternative 3' splice sites. The proximal and distal 3' splice sites are used in males and females, respectively. Splicing to the male-specific 3' splice site produces an mRNA containing a premature translational stop codon. By contrast, splicing to the female-specific 3' splice site produces an mRNA that encodes functional TRA protein. In males, the splicing factor U2AF binds to the male-specific 3' splice site and initiates spliceosome assembly (Fig. 2a)37. However, in females, this splice site is bound by the female-specific splicing repressor SXL, thus blocking the binding of U2AF. Instead, U2AF binds to the female-specific 3' splice site, and functional tra mRNA is produced (Fig. 2a).
Figure 2 Regulation of alternative pre-mRNA splicing in the Drosophila sex-determination pathway.
SXL also regulates the alternative splicing of its own pre-mRNA, albeit by an entirely different mechanism40. sxl exon 3 is excluded from the mRNA only in females. 
In males, inclusion of exon 3 introduces a premature translational stop codon. Exon 3 inclusion requires the protein SPF45, a second-step splicing factor that binds to the AG dinucleotide of the male-specific 3' splice site (Fig. 2b). In females, SXL binds to a site adjacent to SPF45, and the two proteins interact. This interaction interferes with the activity of SPF45, and thus blocks the second step of the splicing reaction. As a consequence, exon 3 is skipped and exon 2 is spliced to exon 4, thus producing an mRNA encoding functional SXL protein. Thus, SXL blocks the first step of the splicing reaction in tra pre-mRNA and the second step in sxl pre-mRNA. sxl autoregulation is the only known example in which alternative pre-mRNA splicing is regulated at the second step of the splicing reaction. While the previous examples of regulated alternative splicing involve splice-site repression, the female-specific splicing of Drosophila doublesex (dsx) pre-mRNA is the best characterized example of splice-site activation37. The 3' splice site immediately upstream from exon 4 of dsx pre-mRNA is not recognized by the splicing machinery in males, thus leading to the exclusion of this exon (Fig. 2c). The male-specific dsx mRNA encodes a transcriptional repressor of female-specific genes. In females, the regulatory protein TRA promotes the cooperative binding of an SR protein, RBP1, and an SR-like protein, Transformer 2 (TRA2), to individual ESEs within exon 4 (refs 41, 42). This heterotrimeric protein complex recruits the splicing machinery to the upstream 3' splice site, leading to the inclusion of exon 4. The female-specific dsx mRNA encodes a transcriptional repressor of male-specific genes. The basic mechanisms of alternative splicing established in Drosophila have been shown to function in mammals. 
Alternative splice-site selection in mammals is also controlled by differential binding of regulatory proteins to splice sites, or to enhancer or silencer sequences within the pre-mRNA28. The organization of these sequences and the interplay of different regulatory proteins determine the outcome of the splicing reaction. As in Drosophila, regulatory proteins can exert competing influences on the splicing of a pre-mRNA. For example, the binding of hnRNP A1 to an ESS in an exon of HIV tat pre-mRNA represses the inclusion of this exon in the mRNA. The binding of hnRNP A1 prevents the interaction of the SR protein SC35 with a nearby ESE31. Although the ESS and the ESE do not directly overlap, hnRNP A1 bound to the ESS seems to promote cooperative binding of additional hnRNP A1 proteins to adjacent exon sequences, thus spreading into the ESE. However, another SR protein, SF2/ASF, has a higher affinity for the ESE than SC35 and is therefore able to displace hnRNP A1. One of the best characterized mammalian repressors of exon recognition is hnRNP I, also called a polypyrimidine tract-binding protein (PTB)22, 27, 28, 43. This protein can repress exon inclusion by directly interfering with binding of general splicing factors to the pyrimidine tract. However, in most cases, the hnRNP I/PTB-binding sites flank the excluded exon. Two mechanisms for repression by hnRNP I/PTB have been proposed to account for this observation32. In the first mechanism, hnRNP I/PTB proteins bound on both sides of the exon interact with each other in such a way that the intervening exon 'loops out' and is isolated from the splicing machinery. In the second mechanism, hnRNP I/PTB cooperatively spreads across the exon, thus creating a 'zone of silencing'22, 32. 
Regulation of tissue-specific alternative splicing
In addition to the ubiquitous RNA-binding proteins implicated generally in alternative splicing, a number of specific regulatory proteins have been identified, and some of them are expressed only in certain tissues. These proteins regulate alternative splicing of specific sets of pre-mRNAs. Drosophila ELAV (for embryonic-lethal, abnormal visual system) is a pan-neuronal pre-mRNA-binding protein that regulates alternative splicing of at least three different pre-mRNAs44. This protein contains three RRMs, which are required for binding to uridine-rich regulatory sequences in the target pre-mRNAs. For example, ELAV binding to neuroglian pre-mRNA results in alternative inclusion of a neural-specific terminal exon. As yet there is no evidence that the mammalian homologues of Drosophila ELAV are involved in the regulation of pre-mRNA splicing45, 46. The Drosophila Half pint (HFP) protein regulates alternative splicing of a subset of pre-mRNAs in the ovary47. One of these encodes the ovarian tumour protein, which is required for oogenesis, while another encodes a translation initiation factor. HFP is also required for constitutive splicing of the gurken pre-mRNA in oocytes. The mammalian orthologues of HFP are the human PUF60 protein48 and the rat Siah-binding protein49. These proteins interact with the splicing factor U2AF65, which binds to 3' splice sites in all pre-mRNAs (Fig. 1). Although HFP targets specific pre-mRNAs, human PUF60 stimulates splicing nonspecifically in mammalian cell extracts, suggesting that it is a general splicing factor48. The Drosophila P-element transposon is active only in the germline because the splicing of its transposase pre-mRNA is repressed in somatic cells. The P-element somatic inhibitor protein (PSI) binds to the transposase pre-mRNA and inhibits removal of a specific intron in somatic cells, leading to the production of a truncated protein that represses transposition50. 
By contrast, PSI is not expressed in germ cells, and functional transposase is produced. PSI binds to 'pseudo' 5' splice sites in transposase pre-mRNA and recruits U1 snRNP to these sites through interactions with the U1 snRNP 70K protein50. This leads to the formation of an abortive 5' splice-site complex that interferes with splicing at the normal 5' splice site. The mammalian homologue of PSI, called KSRP (for KH-type splicing regulatory protein), regulates neural-specific splicing of the human src pre-mRNA51. The mammalian splicing factor NOVA-1 (for neuro-oncological ventral antigen-1) regulates alternative pre-mRNA splicing in neurons. NOVA-1 binds to specific intronic sequences of target pre-mRNAs, and stimulates the inclusion of specific exons in the corresponding mRNAs45. For example, inclusion of alternatively spliced exons in the pre-mRNAs encoding subunits of the glycine and GABA (γ-aminobutyric acid) receptors requires NOVA-1. At present, there are only a few examples of proteins that regulate tissue-specific alternative splicing, but it is likely that many more regulatory proteins, target pre-mRNAs and regulatory mechanisms will be discovered. Although many tissue-specific regulatory proteins are conserved between flies and mammals, their target pre-mRNAs can be different. This suggests a role for tissue-specific alternative splicing in the generation of species-specific traits and functions.
Inducible alternative splicing
The proteome of a cell can rapidly change in response to extracellular stimuli through complex signal-transduction pathways. Changes in protein composition can be regulated at many different levels, but have been studied primarily at the level of transcription and post-translational protein modification. Recently, several cases of inducible alternative pre-mRNA splicing have been reported. 
An example from the mammalian immune system is the alternative splicing of a pre-mRNA encoding distinct isoforms of CD45, a transmembrane protein tyrosine phosphatase. CD45 pre-mRNA is alternatively spliced in response to T-cell activation, leading to the production of proteins with distinct extracellular domains. Recent studies revealed a role for protein kinase C and Ras in the signalling pathway required for the switch in CD45 alternative splicing52. Although the details of the signalling pathway are not fully understood, several lines of evidence point to SR proteins as the downstream effectors in the pathway. First, the relative levels of several SR proteins change upon T-cell activation53. Second, overexpression of different SR proteins in cultured cells leads to distinct patterns of alternative splicing of CD45 pre-mRNA53. Finally, a T-cell-specific conditional knockout of the gene encoding the SR protein SC35 interferes with the normal pattern of CD45 alternative splicing upon T-cell activation54. Based on these observations, it has been proposed that T-cell activation leads to the production of a new combination of SR proteins that bind to CD45 pre-mRNA and change the pattern of its alternative splicing52-54. Similar mechanisms may control alternative splicing in the brain in response to neuronal activity43. For example, alternative splicing of the rat SR-like protein Tra2-β (an orthologue of Drosophila TRA2) changes in response to small molecule-induced neural activity55. These changes in the types and levels of Tra2-β isoforms could, in turn, control the alternative splicing of other brain pre-mRNAs. A potential target of human Tra2-β is the SMN2 pre-mRNA, because its splicing pattern can be altered by overexpression of this SR-like protein56. The ania-6 pre-mRNA, which encodes a member of a new family of cyclins, provides another example of neuronal activity-dependent alternative pre-mRNA splicing. 
The synthesis of this protein is inducible in the adult brain of rats by cocaine and dopamine agonists, and two distinct ania-6 mRNAs are generated by alternative splicing in response to different neurotransmitters and drugs57. One mRNA encodes a protein that associates with RNA polymerase II and co-localizes with splicing factors in nuclear speckles, while the other mRNA encodes a protein that localizes to the cytoplasm57. Thus, two functionally distinct proteins are produced from the same pre-mRNA in response to neural activity. The only case of inducible alternative splicing in which the regulatory sequences within pre-mRNA have been identified is the neural activity-dependent alternative splicing of the rat Slo pre-mRNA58. This pre-mRNA is alternatively spliced, leading to the production of multiple protein isoforms of a calcium-dependent potassium channel3, 59. These isoforms display distinct electrophysiological properties. A cell-culture model system was used to study the signalling pathway that induces Slo alternative splicing, and to identify cis-acting regulatory sequences required for the inclusion of a particular exon, stress axis-regulated exon (STREX), in Slo mRNA58. Depolarization of pituitary cells represses inclusion of the STREX exon in a process that requires the activity of a Ca2+/calmodulin-dependent protein kinase (CaMK IV). This signalling pathway acts through an ESS in the STREX exon and the 3' splice site upstream of STREX. Thus, it seems that depolarization of the membrane leads to the activation of CaMK IV, which in turn activates unidentified repressor proteins that bind the STREX exon and the 3' splice site, thereby preventing STREX exon inclusion in Slo mRNA58 (Fig. 3). Inclusion of the STREX exon increases the calcium sensitivity of the channel, which can modulate the electrical properties of the cell58. Figure 3 Inducible alternative splicing of rat Slo pre-mRNA. 
Alternative trans-splicing
In the examples of alternative splicing discussed above, exons located within an individual pre-mRNA are differentially joined to generate mature mRNAs (alternative cis-splicing). Trans-splicing can join exons located within separate pre-mRNAs, whereas alternative trans-splicing differentially uses exons within separate pre-mRNAs to produce distinct mRNAs. Trans-splicing was first discovered in trypanosomes and later found in organisms as complex as chordates60. However, this type of trans-splicing, called spliced leader (SL)-addition trans-splicing, involves specialized SL RNAs that provide 5'-terminal non-coding exons for all mRNAs61. SL-addition trans-splicing is not a source of protein diversity, as it is a constitutive process that does not lead to the production of alternatively spliced mRNAs. So far, there is no evidence for SL-addition trans-splicing in vertebrates. The potential for mammalian trans-splicing was first demonstrated in vitro14, and then shown to occur in vivo between separate viral pre-mRNAs, and between viral and cellular RNAs in infected cells62, 63. Alternative trans-splicing has also been shown to result in exon duplications in several mammalian cellular RNAs64-67. However, mRNAs containing duplicated exons have not been shown to encode functional proteins, and in one case, exon duplication is not conserved in closely related organisms68. Although several examples of mammalian interchromosomal trans-splicing have been reported, none of these studies definitively proves the existence of this phenomenon69-72. There are also cases of intergenic trans-splicing between pre-mRNAs encoded by closely linked genes in mammals73-76. For example, several hybrid mRNAs are produced from a cluster of human cytochrome P450 3A genes76 (Fig. 4a). 
Three of the P450 genes (here labelled as 2, 3 and 4) are transcribed from one DNA strand, whereas one (1) is transcribed from the opposite DNA strand. Hybrid mRNAs containing exon 1 of gene 1 joined to various exons of genes 2 and 4 were detected (Fig. 4a). Because gene 1 is transcribed from one DNA strand and genes 2 and 4 from the other, the hybrid mRNAs must be generated by trans-splicing. However, the hybrid intergenic mRNAs are produced at levels that are orders of magnitude lower than those of intragenic mRNAs, and the existence of the endogenous proteins encoded by these hybrid mRNAs has not been demonstrated76.
Figure 4 Alternative trans-splicing in mammals and flies.
It is important to note that in every example of mammalian trans-splicing so far reported the pre-mRNAs also engage in cis-splicing. Therefore, it is possible that the observed trans-splicing represents 'splicing noise' resulting from low levels of splice-site pairing between separate pre-mRNAs. Thus, the functional significance of mammalian alternative trans-splicing remains to be established. The only known example in which alternative trans-splicing is required for the production of an essential protein occurs in the Drosophila modifier of mdg4 (mod(mdg4)) locus77, 78. MOD(MDG4) protein isoforms, which function in the establishment or maintenance of chromatin structure, are encoded by an unusually complex genetic locus (Fig. 4b). Twenty-six alternatively spliced mRNAs encoded by this locus have been identified, and they all contain four common exons located at the 5' end of the locus78 (shown in red in Fig. 4b). Distinct mRNAs are generated by alternative splicing of the fourth common exon to individual downstream 'variable' exons (shown in blue or yellow in Fig. 4b). Strikingly, a number of variable exons are transcribed from the opposite DNA strand (blue exons in Fig. 
4b), strongly suggesting that they are joined to the fourth common exon by trans-splicing. A central question in the regulation of alternative trans-splicing is how splice sites located on separate pre-mRNAs are recognized by the splicing machinery and correctly joined. This is not an entirely new question, as splice sites at the ends of large introns are essentially configured in trans. In this case, the correct joining of splice sites is probably achieved by coupling transcription and splicing through interactions between the splicing machinery and the CTD of RNA polymerase II (ref. 21). In this mechanism, the splicing factors connect the 5' splice site and the CTD as the long intron is being transcribed. The proximity of the eventually synthesized 3' splice site to the CTD would ensure an interaction between splicing complexes assembled on the 5' and 3' splice sites. This type of coupling cannot occur in trans-splicing because different RNA polymerase complexes transcribe separate trans-splicing precursors. Therefore, there must be another mechanism for bringing together the 5' and 3' splice sites. This might be achieved by the localized transcription of pre-mRNAs in the nucleus: if transcription were restricted to 'gene expression factories'21, only those pre-mRNAs transcribed in the same factory would engage in trans-splicing. This could explain the close linkage of trans-spliced exons in the mod(mdg4) locus, and the previously mentioned cases of intergenic trans-splicing. Nuclear compartmentalization could also prevent inappropriate splicing to pre-mRNAs encoded by other genes. Finally, it is possible that trans-splicing precursors interact through specific base pairing, or through interactions between proteins bound to each of the precursors77, 78. 
Comparative genomics and alternative pre-mRNA splicing
Alternative pre-mRNA splicing is an important source of protein diversity that may have contributed to the increase in the phenotypic complexity of metazoans during evolution2, 3. This proposal was based, in part, on the unexpectedly small difference in gene number in different organisms from yeast to humans. For example, flies have only about twice as many genes as yeast79. In addition, worms have more genes than flies (19,000 and 14,000, respectively), but flies are clearly more complex in their development, morphology and behaviour. Remarkably, humans have only about 35,000 genes80, 81. Although this number is still being debated82, even the highest estimate for humans is only around threefold greater than that for worms. It is unlikely that this difference alone could explain the obvious differences in functional and behavioural complexity between invertebrates and vertebrates. The increased organismic complexity of vertebrates could be a consequence of the elaboration of mechanisms that increase proteome size and the evolution of more extensive networks of gene regulation83. A measure of the relative contribution of alternative splicing to proteome size could be obtained by dividing the total number of distinct full-length complementary DNAs (cDNAs) resulting from alternative splicing by the number of genes of an organism. Unfortunately, relatively few full-length cDNA sequences are available84-86, and the annotation of several genomes, including the human genome, is still imprecise. In the absence of this information, partial cDNA sequences and expressed sequence tags (ESTs) have been used to detect alternative splicing events87, 88. Efforts are also being made to use DNA microarrays to identify alternatively spliced mRNAs on a large scale89-91. 
These approaches have provided preliminary estimates of the percentage of genes with alternatively spliced forms (%GASF) and the average number of alternatively spliced forms per gene (ASF/G). The product of these two numbers provides a quantitative estimate of the contribution of alternative splicing to proteome size. Estimates of %GASF in humans range from 35 to 59% (refs 87, 88). However, it is generally agreed that this is an underestimate, because %GASF depends on the number of ESTs examined92. It is important to note that most of the detected alternative splicing events (about 80%) lead to changes in the amino acid sequence of the encoded proteins88. In a recent report, random sets of 650 non-redundant cDNAs were compared with 100,000 ESTs in different organisms ranging from worms and flies to humans92. The surprising outcome was that the estimate for %GASF is approximately the same in all organisms tested92. Similar values have also been obtained for ASF/G in different organisms; however, the estimates were adversely affected if the analysis included 'outlier' genes that generate extraordinarily large numbers of alternatively spliced forms (ref. 92, and J. Valcarcel and P. Bork, personal communication). For example, the Drosophila axon guidance receptor gene, Dscam, contains 95 alternatively spliced exons organized into four clusters, and it has the potential to generate over 38,000 different protein isoforms — nearly three times more proteins than the number of genes in Drosophila93. Remarkable but less extensive examples in mammals are the previously mentioned Slo gene, and three neurexin genes, which encode proteins that may act as synaptic-cell-surface receptors or cell-adhesion molecules94. The Slo gene and the three neurexin genes have the potential to generate over 500 and 2,000 alternatively spliced forms, respectively3, 16. 
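The estimate described above can be written as a short calculation. This is a minimal sketch, assuming that ASF/G is averaged over the genes that have alternatively spliced forms, so that the product is the number of alternatively spliced forms per gene averaged over the whole genome. The function names are ours, and the ASF/G value in the example is a hypothetical placeholder, because the text quotes no figure for it. The %GASF range (35 to 59%), the Dscam isoform count (38,000) and the fly gene number (14,000) are the values quoted in the text.

```python
def splicing_contribution(pct_gasf: float, asf_per_gene: float) -> float:
    """Product of %GASF (given as a percentage) and ASF/G: alternatively
    spliced forms per gene, averaged over all genes in the genome."""
    if not 0 <= pct_gasf <= 100:
        raise ValueError("pct_gasf must be a percentage between 0 and 100")
    return (pct_gasf / 100.0) * asf_per_gene


def isoforms_to_gene_number(isoforms: int, genes_in_genome: int) -> float:
    """Ratio of one gene's potential isoforms to the organism's gene number."""
    return isoforms / genes_in_genome


if __name__ == "__main__":
    HYPOTHETICAL_ASF_PER_GENE = 3.0  # placeholder value, not from the article
    for pct in (35.0, 59.0):  # range of human %GASF estimates in the text
        value = splicing_contribution(pct, HYPOTHETICAL_ASF_PER_GENE)
        print(f"%GASF = {pct}: {value:.2f} alternatively spliced forms per gene")
    # Dscam: 38,000 potential isoforms against about 14,000 fly genes
    print(f"Dscam ratio: {isoforms_to_gene_number(38_000, 14_000):.2f}")
```

The Dscam ratio works out at about 2.7, which matches the text's "nearly three times". A single gene of this kind can dominate a genome-wide average, which is why the 'outlier' genes matter for ASF/G.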
The conclusion of this analysis is that the relative contribution of alternative splicing to the size of the proteome does not seem to be different in evolutionarily divergent metazoans if 'outlier' genes are not included in the analysis (ref. 92). However, it is possible that differences will be observed once the 'outlier' genes are considered. To account for their contribution to the overall estimate of ASF/G, the organization of all genes and the sequence of all full-length cDNAs are required. Thus, the relationship between alternative splicing and proteome size of different metazoans remains to be established.

Perspectives

At least two forces affect the evolution of complex organisms. On the one hand there is an extraordinary pressure to conserve the basic components of cellular machines among organisms that differ widely in complexity (refs 80, 81). For example, the cellular machines required for different steps of gene expression, such as transcription (ref. 95), splicing (ref. 11) and mRNA export (ref. 96), are so highly conserved between yeast and humans that many components are interchangeable. On the other hand, both the proteome size and the complexity of regulatory networks increase as more complex organisms evolve. Proteome size is determined by the number of genes in an organism, and by mechanisms that expand the coding capacity of the genome, including alternative pre-mRNA splicing. However, as discussed above, the relative contribution of alternative splicing to proteome expansion is not well understood. The organization of regulatory networks and their effect on organismic complexity are even more difficult to determine. Moreover, it is possible that a relatively small increase in the size of the proteome or a change in a key component of the regulatory networks could lead to dramatic changes in organismic complexity. If this were the case, the proteome size would not accurately correlate with organismic complexity.
Rather, insights into the origins of organismic complexity would require an understanding of regulatory networks (ref. 83).

References

1. Quelle, D. E., Zindy, F., Ashmun, R. A. & Sherr, C. J. Alternative reading frames of the INK4a tumor suppressor gene encode two unrelated proteins capable of inducing cell cycle arrest. Cell 83, 993-1000 (1995).
2. Black, D. L. Protein diversity from alternative splicing: a challenge for bioinformatics and postgenome biology. Cell 103, 367-370 (2000).
3. Graveley, B. R. Alternative splicing: increasing diversity in the proteomic world. Trends Genet. 17, 100-107 (2001).
4. Goldstrohm, A. C., Greenleaf, A. L. & Garcia-Blanco, M. A. Co-transcriptional splicing of pre-messenger RNAs: considerations for the mechanism of alternative splicing. Gene 277, 31-47 (2001).
5. Caceres, J. F. & Kornblihtt, A. R. Alternative splicing: multiple control mechanisms and involvement in human disease. Trends Genet. 18, 186-193 (2002).
6. Gautheret, D., Poirot, O., Lopez, F., Audic, S. & Claverie, J. M. Alternate polyadenylation in human mRNAs: a large-scale analysis by EST clustering. Genome Res. 8, 524-530 (1998).
7. Keegan, L. P., Gallo, A. & O'Connell, M. A. The many roles of an RNA editor. Nature Rev. Genet. 2, 869-878 (2001).
8. Banks, R. E. et al. Proteomics: new perspectives, new biomedical opportunities. Lancet 356, 1749-1756 (2000).
9. Reed, R. Mechanisms of fidelity in pre-mRNA splicing. Curr. Opin. Cell Biol. 12, 340-345 (2000).
10. Stevens, S. W. et al. Composition and functional characterization of the yeast spliceosomal penta-snRNP. Mol. Cell 9, 31-44 (2002).
11. Zhou, Z., Licklider, L., Gygi, S. & Reed, R. The proteome of functional human spliceosomes. Nature (submitted).
12. Staley, J. P. & Guthrie, C. Mechanical devices of the spliceosome: motors, clocks, springs, and things. Cell 92, 315-326 (1998).
13. Berget, S. M. Exon recognition in vertebrate splicing. J. Biol. Chem. 270, 2411-2414 (1995).
14. Reed, R. Initial splice-site recognition and pairing during pre-mRNA splicing. Curr. Opin. Genet. Dev. 6, 215-220 (1996).
15. Deutsch, M. & Long, M. Intron-exon structures of eukaryotic model organisms. Nucleic Acids Res. 27, 3219-3228 (1999).
16. Rowen, L. et al. Analysis of the human neurexin genes: alternative splicing and the generation of protein diversity. Genomics 79, 587-597 (2002).
17. Graveley, B. R. Sorting out the complexity of SR protein functions. RNA 6, 1197-1211 (2000).
18. Schaal, T. D. & Maniatis, T. Multiple distinct splicing enhancers in the protein-coding sequences of a constitutively spliced pre-mRNA. Mol. Cell. Biol. 19, 261-273 (1999).
19. Mayeda, A., Screaton, G. R., Chandler, S. D., Fu, X. D. & Krainer, A. R. Substrate specificities of SR proteins in constitutive splicing are determined by their RNA recognition motifs and composite pre-mRNA exonic elements. Mol. Cell. Biol. 19, 1853-1863 (1999).
20. Sun, H. & Chasin, L. A. Multiple splicing defects in an intronic false exon. Mol. Cell. Biol. 20, 6414-6425 (2000).
21. Maniatis, T. & Reed, R. An extensive network of coupling among gene expression machines. Nature 416, 499-506 (2002).
22. Cartegni, L., Chew, S. L. & Krainer, A. R. Listening to silence and understanding nonsense: exonic mutations that affect splicing. Nature Rev. Genet. 3, 285-298 (2002).
23. Krawczak, M., Reiss, J. & Cooper, D. N. The mutational spectrum of single base-pair substitutions in mRNA splice junctions of human genes: causes and consequences. Hum. Genet. 90, 41-54 (1992).
24. Valentine, C. R. The association of nonsense codons with exon skipping. Mutat. Res. 411, 87-117 (1998).
25. Liu, H. X., Cartegni, L., Zhang, M. Q. & Krainer, A. R. A mechanism for exon skipping caused by nonsense or missense mutations in BRCA1 and other genes. Nature Genet. 27, 55-58 (2001).
26. Blencowe, B. J. Exonic splicing enhancers: mechanism of action, diversity and role in human genetic diseases. Trends Biochem. Sci. 25, 106-110 (2000).
27. Hastings, M. L. & Krainer, A. R. Pre-mRNA splicing in the new millennium. Curr. Opin. Cell Biol. 13, 302-309 (2001).
28. Smith, C. W. & Valcarcel, J. Alternative pre-mRNA splicing: the logic of combinatorial control. Trends Biochem. Sci. 25, 381-388 (2000).
29. Perez-Canadillas, J. M. & Varani, G. Recent advances in RNA-protein recognition. Curr. Opin. Struct. Biol. 11, 53-58 (2001).
30. Graveley, B. R., Hertel, K. J. & Maniatis, T. SR proteins are 'locators' of the RNA splicing machinery. Curr. Biol. 9, R6-R7 (1999).
31. Zhu, J., Mayeda, A. & Krainer, A. R. Exon identity established through differential antagonism between exonic splicing silencer-bound hnRNP A1 and enhancer-bound SR proteins. Mol. Cell 8, 1351-1361 (2001).
32. Wagner, E. J. & Garcia-Blanco, M. A. Polypyrimidine tract binding protein antagonizes exon definition. Mol. Cell. Biol. 21, 3281-3288 (2001).
33. Dreyfuss, G., Matunis, M. J., Pinol-Roma, S. & Burd, C. G. hnRNP proteins and the biogenesis of mRNA. Annu. Rev. Biochem. 62, 289-321 (1993).
34. Krecic, A. M. & Swanson, M. S. hnRNP complexes: composition, structure, and function. Curr. Opin. Cell Biol. 11, 363-371 (1999).
35. Cowper, A. E., Caceres, J. F., Mayeda, A. & Screaton, G. R. Serine-arginine (SR) protein-like factors that antagonize authentic SR proteins and regulate alternative splicing. J. Biol. Chem. 276, 48908-48914 (2001).
36. Cramer, P. et al. Coupling of transcription with alternative splicing: RNA pol II promoters modulate SF2/ASF and 9G8 effects on an exonic splicing enhancer. Mol. Cell 4, 251-258 (1999).
37. Schutt, C. & Nothiger, R. Structure, function and evolution of sex-determining systems in Dipteran insects. Development 127, 667-677 (2000).
38. Baker, B. S. Sex in flies: the splice of life. Nature 340, 521-524 (1989).
39. Cline, T. W. & Meyer, B. J. Vive la différence: males vs females in flies vs worms. Annu. Rev. Genet. 30, 637-702 (1996).
40. Lallena, M. J., Chalmers, K. J., Llamazares, S., Lamond, A. I. & Valcarcel, J. Splicing regulation at the second catalytic step by Sex-lethal involves 3' splice site recognition by SPF45. Cell 109, 285-296 (2002).
41. Lynch, K. W. & Maniatis, T. Assembly of specific SR protein complexes on distinct regulatory elements of the Drosophila doublesex splicing enhancer. Genes Dev. 10, 2089-2101 (1996).
42. Hertel, K. J. & Maniatis, T. The function of multisite splicing enhancers. Mol. Cell 1, 449-455 (1998).
43. Grabowski, P. J. & Black, D. L. Alternative RNA splicing in the nervous system. Prog. Neurobiol. 65, 289-308 (2001).
44. Lisbin, M. J., Qiu, J. & White, K. The neuron-specific RNA-binding protein ELAV regulates neuroglian alternative splicing in neurons and binds directly to its pre-mRNA. Genes Dev. 15, 2546-2561 (2001).
45. Dredge, B. K., Polydorides, A. D. & Darnell, R. B. The splice of life: alternative splicing and neurological disease. Nature Rev. Neurosci. 2, 43-50 (2001).
46. Toba, G., Qui, J., Koushika, S. P. & White, K. Ectopic expression of Drosophila ELAV and human HuD in Drosophila wing disc cells reveals functional distinctions and similarities. J. Cell Sci. 115, 2413-2421 (2002).
47. Van Buskirk, C. & Schupbach, T. half pint regulates alternative splice site selection in Drosophila. Dev. Cell 2, 343-353 (2002).
48. Page-McCaw, P. S., Amonlirdviman, K. & Sharp, P. A. PUF60: a novel U2AF65-related splicing activity. RNA 5, 1548-1560 (1999).
49. Poleev, A., Hartmann, A. & Stamm, S. A trans-acting factor, isolated by the three-hybrid system, that influences alternative splicing of the amyloid precursor protein minigene. Eur. J. Biochem. 267, 4002-4010 (2000).
50. Labourier, E., Adams, M. D. & Rio, D. C. Modulation of P-element pre-mRNA splicing by a direct interaction between PSI and U1 snRNP 70K protein. Mol. Cell 8, 363-373 (2001).
51. Min, H., Turck, C. W., Nikolic, J. M. & Black, D. L. A new regulatory protein, KSRP, mediates exon inclusion through an intronic splicing enhancer. Genes Dev. 11, 1023-1036 (1997).
52. Lynch, K. W. & Weiss, A. A model system for activation-induced alternative splicing of CD45 pre-mRNA in T cells implicates protein kinase C and Ras. Mol. Cell. Biol. 20, 70-80 (2000).
53. ten Dam, G. B. et al. Regulation of alternative splicing of CD45 by antagonistic effects of SR protein splicing factors. J. Immunol. 164, 5287-5295 (2000).
54. Wang, H. Y., Xu, X., Ding, J. H., Bermingham, J. R. Jr & Fu, X. D. SC35 plays a role in T cell development and alternative splicing of CD45. Mol. Cell 7, 331-342 (2001).
55. Daoud, R., Da Penha Berzaghi, M., Siedler, F., Hubener, M. & Stamm, S. Activity-dependent regulation of alternative splicing patterns in the rat brain. Eur. J. Neurosci. 11, 788-802 (1999).
56. Hofmann, Y., Lorson, C. L., Stamm, S., Androphy, E. J. & Wirth, B. Htra2-β1 stimulates an exonic splicing enhancer and can restore full-length SMN expression to survival motor neuron 2 (SMN2). Proc. Natl Acad. Sci. USA 97, 9618-9623 (2000).
57. Berke, J. D. et al. Dopamine and glutamate induce distinct striatal splice forms of Ania-6, an RNA polymerase II-associated cyclin. Neuron 32, 277-287 (2001).
58. Xie, J. & Black, D. L. A CaMK IV responsive RNA element mediates depolarization-induced alternative splicing of ion channels. Nature 410, 936-939 (2001).
59. Black, D. L. Splicing in the inner ear: a familiar tune, but what are the instruments? Neuron 20, 165-168 (1998).
60. Nilsen, T. W. Evolutionary origin of SL-addition trans-splicing: still an enigma. Trends Genet. 17, 678-680 (2001).
61. Blumenthal, T. Trans-splicing and polycistronic transcription in Caenorhabditis elegans. Trends Genet. 11, 132-136 (1995).
62. Eul, J., Graessmann, M. & Graessmann, A. Experimental evidence for RNA trans-splicing in mammalian cells. EMBO J. 14, 3226-3235 (1995).
63. Caudevilla, C. et al. Heterologous HIV-nef mRNA trans-splicing: a new principle how mammalian cells generate hybrid mRNA and protein molecules. FEBS Lett. 507, 269-279 (2001).
64. Caudevilla, C. et al. Natural trans-splicing in carnitine octanoyltransferase pre-mRNAs in rat liver. Proc. Natl Acad. Sci. USA 95, 12185-12190 (1998).
65. Takahara, T., Kanazu, S. I., Yanagisawa, S. & Akanuma, H. Heterogeneous Sp1 mRNAs in human HepG2 cells include a product of homotypic trans-splicing. J. Biol. Chem. 275, 38067-38072 (2000).
66. Frantz, S. A. et al. Exon repetition in mRNA. Proc. Natl Acad. Sci. USA 96, 5400-5405 (1999).
67. Akopian, A. N. et al. Trans-splicing of a voltage-gated sodium channel is regulated by nerve growth factor. FEBS Lett. 445, 177-182 (1999).
68. Caudevilla, C. et al. Localization of an exonic splicing enhancer responsible for mammalian natural trans-splicing. Nucleic Acids Res. 29, 3108-3115 (2001).
69. Vellard, M. et al. C-myb proto-oncogene: evidence for intermolecular recombination of coding sequences. Oncogene 6, 505-514 (1991).
70. Sullivan, P. M., Petrusz, P., Szpirer, C. & Joseph, D. R. Alternative processing of androgen-binding protein RNA transcripts in fetal rat liver. Identification of a transcript formed by trans splicing. J. Biol. Chem. 266, 143-154 (1991).
71. Li, B. L. et al. Human acyl-CoA:cholesterol acyltransferase-1 (ACAT-1) gene organization and evidence that the 4.3-kilobase ACAT-1 mRNA is produced from two different chromosomes. J. Biol. Chem. 274, 11060-11071 (1999).
72. Hirayama, T., Sugino, H. & Yagi, T. Somatic mutations of synaptic cadherin (CNR family) transcripts in the nervous system. Genes Cells 6, 151-164 (2001).
73. Shimizu, A. & Honjo, T. Synthesis and regulation of trans-mRNA encoding the immunoglobulin epsilon heavy chain. FASEB J. 7, 149-154 (1993).
74. Fujieda, S., Lin, Y. Q., Saxon, A. & Zhang, K. Multiple types of chimeric germ-line Ig heavy chain transcripts in human B cells: evidence for trans-splicing of human Ig RNA. J. Immunol. 157, 3450-3459 (1996).
75. Chatterjee, T. K. & Fisher, R. A. Novel alternative splicing and nuclear localization of human RGS12 gene products. J. Biol. Chem. 275, 29660-29671 (2000).
76. Finta, C. & Zaphiropoulos, P. G. Intergenic mRNA molecules resulting from trans-splicing. J. Biol. Chem. 277, 5882-5890 (2002).
77. Labrador, M. et al. Protein encoding by both DNA strands. Nature 409, 1000 (2001).
78. Dorn, R., Reuter, G. & Loewendorf, A. Transgene analysis proves mRNA trans-splicing at the complex mod(mdg4) locus in Drosophila. Proc. Natl Acad. Sci. USA 98, 9724-9729 (2001).
79. Rubin, G. M. et al. Comparative genomics of the eukaryotes. Science 287, 2204-2215 (2000).
80. Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature 409, 860-921 (2001).
81. Venter, J. C. et al. The sequence of the human genome. Science 291, 1304-1351 (2001).
82. Daly, M. Estimating the human gene count. Cell 109, 283-284 (2002).
83. Davidson, E. H. Genomic Regulatory Systems (Academic, New York, 2001).
84. Suzuki, Y., Yamashita, R., Nakai, K. & Sugano, S. DBTSS: DataBase of human transcriptional start sites and full-length cDNAs. Nucleic Acids Res. 30, 328-331 (2002).
85. Bono, H., Kasukawa, T., Furuno, M., Hayashizaki, Y. & Okazaki, Y. FANTOM DB: database of functional annotation of RIKEN mouse cDNA clones. Nucleic Acids Res. 30, 116-118 (2002).
86. Kristiansen, T. Z. & Pandey, A. Resources for full-length cDNAs. Trends Biochem. Sci. 27, 266-267 (2002).
87. Sorek, R. & Amitai, M. Piecing together the significance of splicing. Nature Biotechnol. 19, 196 (2001).
88. Modrek, B. & Lee, C. A genomic view of alternative splicing. Nature Genet. 30, 13-19 (2002).
89. Hu, G. K. et al. Predicting splice variant from DNA chip expression data. Genome Res. 11, 1237-1245 (2001).
90. Shoemaker, D. D. et al. Experimental annotation of the human genome using microarray technology. Nature 409, 922-927 (2001).
91. Yeakly, J. M. et al. Profiling alternative splicing on fiber optic arrays. Nature Biotechnol. 20, 1-6 (2002).
92. Brett, D., Pospisil, H., Valcarcel, J., Reich, J. & Bork, P. Alternative splicing and genome complexity. Nature Genet. 30, 29-30 (2002).
93. Schmucker, D. et al. Drosophila Dscam is an axon guidance receptor exhibiting extraordinary molecular diversity. Cell 101, 671-684 (2000).
94. Missler, M. & Sudhof, T. C. Neurexins: three genes and 1001 products. Trends Genet. 14, 20-26 (1998).
95. Ptashne, M. & Gann, A. Genes & Signals (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 2002).
96. Reed, R. & Hurt, E. A conserved mRNA export machinery coupled to pre-mRNA splicing. Cell 108, 523-531 (2002).

Acknowledgements. We thank P. Cramer, B. Graveley, A. Krainer, C. Nabholz and R. Reed for their comments on the manuscript, and R. Hellmiss for the illustrations.

Figure 1 Exon recognition. The correct 5' (GU) and 3' (AG) splice sites are recognized by the splicing machinery on the basis of their proximity to exons. The exons contain exonic splicing enhancers (ESEs) that are binding sites for SR proteins. When bound to an ESE, the SR proteins recruit U1 snRNP to the downstream 5' splice site, and the splicing factor U2AF (65 and 35 kDa subunits) to the pyrimidine tract (YYYY) and the AG dinucleotide of the upstream 3' splice site, respectively. In turn, U2AF recruits U2 snRNP to the branchpoint sequence (A). Thus, the bound SR proteins recruit splicing factors to form a 'cross-exon' recognition complex. SR proteins also function in 'cross-intron' recognition by facilitating the interactions between U1 snRNP bound to the upstream 5' splice site and U2 snRNP bound to the branchpoint sequence.

Figure 2 Regulation of alternative pre-mRNA splicing in the Drosophila sex-determination pathway. a, Alternative selection of 3' splice sites preceding exon 2 of tra pre-mRNA is regulated by the SXL protein. In males, the splicing factor U2AF binds to the proximal 3' splice site, leading to an mRNA containing a premature translational stop codon (UAG).
In females, SXL binds to the proximal 3' splice site, thus preventing the binding of U2AF. Instead, U2AF binds to the distal 3' splice site, leading to an mRNA that encodes functional TRA protein. In all panels, the exons are indicated by coloured rectangles, while introns are shown as pale grey lines. b, Alternative inclusion of exon 3 of sxl pre-mRNA is regulated by SXL protein. In both males and females, the first step of the splicing reaction results in lariat formation at the branchpoint sequence upstream from the 3' splice site preceding exon 3. Subsequently, the second-step splicing factor SPF45 binds to the AG dinucleotide of this splice site. In males, SPF45 promotes the second step of the splicing reaction, leading to the inclusion of exon 3. In females, SXL binds to a sequence upstream of the AG dinucleotide, interacts with SPF45 and inhibits its activity. This prevents the second step of the splicing reaction, leading to the exclusion of exon 3 and splicing of exon 2 to exon 4. Seven constitutively spliced exons are not shown. c, Alternative splicing of dsx pre-mRNA is regulated by the assembly of heterotrimeric protein complexes on female-specific ESEs. The first three exons are constitutively spliced in both sexes. In males, the 3' splice site preceding exon 4 is not recognized by the splicing machinery, resulting in the exclusion of this exon, and splicing of exon 3 to exon 5. In females, the female-specific TRA protein promotes the binding of the SR protein RBP1, and the SR-like protein TRA2 to six copies of an ESE (indicated by green rectangles). These splicing enhancer complexes then recruit the splicing machinery to the 3' splice site preceding exon 4, leading to its inclusion in the mRNA. In females, polyadenylation (pA) occurs downstream of exon 4, whereas in males it occurs downstream of exon 6. 'S' designates the splicing machinery.

Figure 3 Inducible alternative splicing of rat Slo pre-mRNA.
The fraction of Slo mRNAs containing an exon called STREX (stress axis-regulated exon, shown in yellow) is regulated by neuronal activity. Depolarization of the plasma membrane increases the intracellular Ca2+ concentration, leading to the activation of CaMK IV. This, in turn, is thought to result in the binding of repressor proteins to silencer sequences, one located upstream and the other within STREX. These silencers are shown as red rectangles with hypothetical repressors bound to them. In the presence of the repressors, the STREX exon is excluded from Slo mRNA. The alternatively spliced mRNAs encode two protein isoforms (shown in blue). The channel encoded by mRNA containing the STREX exon has an additional (yellow) domain, and is more sensitive to intracellular Ca2+ concentration. For clarity, only the portion of Slo pre-mRNA containing the STREX exon is shown.

Figure 4 Alternative trans-splicing in mammals and flies. a, Four cytochrome P450 3A genes are arranged in a cluster that spans 200 kilobases of genomic DNA. For simplicity, the genes have been designated 1–4 (they were designated CYP3A43, CYP3A4, CYP3A7, CYP3A5, respectively, in the original reference; ref. 76). Gene 1 is transcribed from one DNA strand, whereas genes 2–4 are transcribed from the opposite strand. The direction of transcription on each strand is indicated by the wavy arrows. The small coloured boxes are exons, and the pale grey lines are introns. Hybrid intergenic mRNAs are produced by trans-splicing between exon 1 of gene 1 and various exons in genes 2 and 4 to generate the mRNAs labelled A–E. b, The Drosophila mod(mdg4) locus spans 28 kilobases, and encodes 26 alternatively spliced mRNAs. Each mRNA is generated by splicing of four common exons (1–4, shown in red) to one of the exons located downstream (shown in yellow or blue). The yellow exons are transcribed from the same DNA strand as the common exons, while the blue exons are transcribed from the opposite DNA strand.
The 5' splice site downstream from common exon 4 is indicated by the four slanted lines. Similarly, the 3' splice site upstream of each alternatively spliced exon is indicated by a single slanted line. An example of trans-splicing within this locus is shown at the bottom of the figure. Two trans-splicing precursors, containing exons 1–4 or 5 and 6, are transcribed from opposite DNA strands. Exons 4 and 5 are then joined by trans-splicing.

11 July 2002
Nature 418, 244 - 251 (2002); doi:10.1038/418244a

RNA interference

GREGORY J. HANNON

Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA (e-mail: hannon@cshl.org)

A conserved biological response to double-stranded RNA, known variously as RNA interference (RNAi) or post-transcriptional gene silencing, mediates resistance to both endogenous parasitic and exogenous pathogenic nucleic acids, and regulates the expression of protein-coding genes. RNAi has been cultivated as a means to manipulate gene expression experimentally and to probe gene function on a whole-genome scale.

The phenomenon of RNAi was first discovered in the nematode worm Caenorhabditis elegans as a response to double-stranded RNA (dsRNA), which resulted in sequence-specific gene silencing (ref. 1). Following on from the studies of Guo and Kemphues, who had found that sense RNA was as effective as antisense RNA for suppressing gene expression in worms (ref. 2), Fire, Mello and colleagues (ref. 1) were attempting to use antisense RNA as an approach to inhibit gene expression. Their breakthrough was to test the synergy of sense and antisense RNAs, and they duly found that the dsRNA mixture was at least tenfold more potent as a silencing trigger than were sense or antisense RNAs alone (ref. 1). Silencing by dsRNAs had a number of remarkable properties: RNAi could be provoked by injection of dsRNA into the C. elegans gonad or by introduction of dsRNA through feeding either of dsRNA itself or of bacteria engineered to express it (ref. 3).
Furthermore, exposure of a parental animal to only a few molecules of dsRNA per cell triggered gene silencing throughout the treated animal (systemic silencing) and in its F1 (first generation) progeny (Fig. 1).

Figure 1 Double-stranded RNA can be introduced experimentally to silence target genes of interest.

From this discovery emerged the notion that a number of previously characterized, homology-dependent gene-silencing mechanisms might share a common biological root. Several years previously, Richard Jorgensen had been engineering transgenic petunias with the goal of altering pigmentation. But introducing exogenous transgenes did not deepen flower colour as expected. Instead, flowers showed variegated pigmentation, with some lacking pigment altogether (refs 4, 5, and reviewed in ref. 6). This indicated that not only were the transgenes themselves inactive, but also that the added DNA sequences somehow affected expression of the endogenous loci. This phenomenon, called co-suppression, can be produced by highly expressed, single-copy transgenes (refs 7, 8) or by transgenes, expressed at a more modest level, that integrate into the genome in complex, multicopy arrays (ref. 9). In parallel, several laboratories found that plants responded to RNA viruses by targeting viral RNAs for destruction (refs 10–13). Notably, silencing of endogenous genes could also be triggered by inclusion of homologous sequences in a virus replicon.

What is clear in retrospect is that both complex transgene arrays and replicating RNA viruses generate dsRNA. In plant systems, dsRNAs that are introduced from exogenous sources or that are transcribed from engineered inverted repeats are potent inducers of gene silencing (reviewed in ref. 14). But co-suppression phenomena are not restricted to plants: similar outcomes have been noted in unicellular organisms, such as Neurospora, and in metazoans, such as Drosophila, C. elegans and mammals (refs 15–18).
In a few cases, silencing has been correlated with integration of transgenes as complex arrays that can produce dsRNA directly, although silencing can also be triggered by the presence of single-copy or dispersed elements (ref. 18). What remains a mystery is how, and indeed whether, such elements produce the dsRNA silencing trigger that has become a hallmark of RNAi. It has been proposed that endogenous RNA-directed RNA polymerases (RdRPs) may recognize 'aberrant transcripts' derived from highly expressed loci and convert these into dsRNA (ref. 19). Indeed, homologues of these enzymes have proven essential for silencing in C. elegans, fungi and plants, and this is discussed below.

Genetic and biochemical studies have now confirmed that RNAi, co-suppression and virus-induced gene silencing share mechanistic similarities, and that the biological pathways underlying dsRNA-induced gene silencing exist in many, if not most, eukaryotic organisms (Fig. 1). What are the mechanisms by which dsRNAs induce silencing of homologous sequences, either exogenous or endogenous? What are the biological functions of these processes? And how are they related in evolutionarily divergent fungi, plants and animals?

Silencing machinery operates at multiple levels

In C. elegans, initial observations were consistent with dsRNA-induced silencing operating at the post-transcriptional level. Exposure to dsRNAs resulted in loss of corresponding messenger RNAs (mRNAs), and promoter and intronic sequences were largely ineffective as silencing triggers (ref. 1). A post-transcriptional mode was also consistent with data from plant systems in which exposure to dsRNA (ref. 20), for example in the form of an RNA virus, triggered depletion of mRNA sequences without an apparent effect on the rate of transcription (ref. 21). Indeed, viral transcripts themselves were targeted, despite the fact that these were synthesized cytoplasmically by transcription of RNA genomes (ref. 10).
These studies led to the notion that RNAi induced degradation of homologous mRNAs, and this hypothesis has been validated by biochemical analysis.

But the RNAi machinery affects gene expression through additional mechanisms. In plants, exposure to dsRNA induces genomic methylation of sequences homologous to the silencing trigger (ref. 22). If the trigger shares sequence with a promoter, the targeted gene can become transcriptionally silenced (ref. 23). Recent studies have suggested that the RNAi machinery may also affect gene expression at the level of chromatin structure in Drosophila, C. elegans and fungi (refs 18, 24–26, and R. Martienssen, T. Volpe, I. Hall and S. Grewal, unpublished data). Finally, in C. elegans, endogenously encoded inducers of the RNAi machinery (for example, lin-4) operate at the level of protein synthesis (ref. 27). Although translational control by dsRNA has not been established definitively in other systems, the conservation of let-7 and related RNAs (ref. 28) suggests that this regulatory mode may be a further common mechanism through which RNAi pathways control the expression of cellular genes.

Mechanism of post-transcriptional gene silencing

Our present understanding of the mechanisms underlying dsRNA-induced gene silencing is derived from genetic studies in C. elegans and plants and from biochemical studies of Drosophila extracts. In the latter case, Carthew and colleagues laid the foundations by showing that injection of dsRNA into Drosophila embryos induced sequence-specific silencing at the post-transcriptional level (ref. 29). Sharp and colleagues then tested the possibility that Drosophila embryo extracts, previously used to study translational regulation, might be competent for RNAi (ref. 30). Incubation of dsRNA in these cell-free lysates reduced their ability to synthesize luciferase from a synthetic mRNA.
This correlated with destabilization of the mRNA and suggested that dsRNA might bring about silencing by triggering the assembly of a nuclease complex that targets homologous RNAs for degradation. This effector nuclease, now known as RISC (RNA-induced silencing complex), was isolated from extracts of Drosophila S2 cells in which RNAi had been triggered by treatment with dsRNA in vivo (ref. 31).

A key question was how this complex might identify cognate substrates. Fire and Mello had originally proposed that some derivative of the dsRNA would guide the identification of substrates for RNAi, and the first clue in the hunt for such 'guide RNAs' came from the study of silencing in plants. Hamilton and Baulcombe (ref. 32) sought antisense RNAs that were homologous to genes being targeted by co-suppression. They found a 25-nucleotide RNA that appeared only in plant lines containing a suppressed transgene, and found that similar species appeared during virus-induced gene silencing. Similar small RNAs were produced from dsRNAs in Drosophila embryo extracts (ref. 33), and partial purification of the RISC complex showed that these small RNAs co-fractionated with nuclease activity (ref. 31).

These findings forged a link between transgene co-suppression in plants and RNAi in animals. In addition, a model for RNAi and related silencing phenomena began to emerge (Fig. 2). According to this model, initiation of silencing occurs upon recognition of dsRNA by a machinery that converts the silencing trigger to 21–25-nucleotide RNAs. These small interfering RNAs (siRNAs) are a signature of this family of silencing pathways and, by joining an effector complex, RISC, they guide that complex to homologous substrates.

Figure 2 Dicer and RISC (RNA-induced silencing complex).

This convergence of observations from diverse experimental systems suggested that a conserved biochemical mechanism would lie at the core of homology-dependent gene-silencing responses.
However, the varied biology of dsRNA-induced silencing (for example, the heritable and systemic nature of silencing in C. elegans compared to apparently cell-autonomous, non-heritable silencing in Drosophila and mammals) suggested that this core machinery probably adapted to meet specific biological needs in different organisms.

The initiation step

The model outlined in Fig. 2 implies that the dsRNA silencing trigger is cleaved to produce siRNAs. Support for this emerged first from studies of Drosophila embryo extracts, which contained an activity capable of processing long dsRNA substrates into 22-nucleotide fragments33. Analysis of these RNAs showed that they were double stranded and contained 5'-phosphorylated termini33, 34. The quest for the enzyme that initiates RNAi led to the RNase III ribonuclease family, which displays specificity for dsRNAs and generates such termini. RNase III enzymes can be divided into three classes based upon domain structure: bacterial RNase III contains a single catalytic domain and a dsRNA-binding domain; Drosha family nucleases contain dual catalytic domains35; and a third family also contains dual catalytic domains and additional helicase and PAZ motifs36. Members of this third class of RNases were found to process dsRNA into siRNAs and were therefore proposed to initiate RNAi36. Members of this family, now named the Dicer enzymes, are evolutionarily conserved, and proteins from Drosophila, Arabidopsis, the insect Spodoptera frugiperda, tobacco, C. elegans, mammals and Neurospora have all been shown to recognize and process dsRNA into siRNAs of a characteristic size for the relevant species (refs 36, 37, and A. M. Denli and G.J.H., unpublished data). Genetic evidence has also emerged from C. elegans and Arabidopsis that is consistent with Dicer acting in the RNAi pathway: Dicer is required for RNAi in the C.
elegans germline37–39, and a hypomorphic allele of Carpel Factory can intensify the phenotypes of weak Argonaute-1 alleles in Arabidopsis (C. Kidner and R. Martienssen, personal communication). Recently, the structure of an RNase III catalytic domain has led to a model for the generation of 22-nucleotide RNAs by Dicer cleavage40 (Fig. 2). It is thought that bacterial RNase III functions as a dimeric enzyme and, in the structural model, antiparallel RNase III domains produce two compound catalytic centres, each of which is formed by contributions from both monomers. The sequences of Dicer and Drosha RNase III domains reveal deviations from the consensus in both enzymes. Introduction of these alterations into bacterial RNase III permitted a genetic test for domain function: defects were noted upon introduction of residues that form part of the catalytic centre from the second RNase III domain of Dicer family members. Antiparallel alignment of Dicer's RNase III motifs on a dsRNA substrate could produce four compound active sites, but the central two of these would be inactive. In this way, cleavage would occur at 22-base intervals, and subtle alterations in Dicer structure could alter the spacing of these catalytic centres and explain the species-specific variation in siRNA length (A. Denli and G.J.H., unpublished results).

The effector step

In the Drosophila system, RNAi is enforced by RISC, a protein–RNA effector nuclease complex that recognizes and destroys target mRNAs. The first subunit of RISC to be identified was the siRNA, which presumably identifies substrates through Watson–Crick base-pairing31. Zamore and colleagues have recently shown that RISC is formed in embryo extracts as a precursor complex of 250K; this becomes activated upon addition of ATP to form a 100K complex that can cleave substrate mRNAs41. Cleavage is apparently endonucleolytic, and occurs only in the region homologous to the siRNA.
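The two steps described so far, cleavage of the trigger at fixed intervals and identification of substrates by Watson–Crick base-pairing, can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not a description of the enzymology: the function names (`dice`, `find_target_sites`), the example sequences and the requirement for perfect complementarity over the full guide are all hypothetical simplifications, and the 22-base interval is simply the figure quoted above for Drosophila.

```python
# Toy model of the initiation and effector steps; all names and sequences
# below are hypothetical and chosen only for illustration.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def dice(sense_strand: str, interval: int = 22) -> list[str]:
    """Cut the sense strand of a dsRNA trigger at fixed intervals.

    Any trailing sequence shorter than one interval is discarded.
    """
    last_start = len(sense_strand) - interval
    return [sense_strand[i:i + interval] for i in range(0, last_start + 1, interval)]


def find_target_sites(mrna: str, guide: str) -> list[int]:
    """Return 0-based mRNA positions that pair perfectly with the guide strand."""
    site = reverse_complement(guide)  # the mRNA sequence the guide can pair with
    hits = []
    start = mrna.find(site)
    while start != -1:
        hits.append(start)
        start = mrna.find(site, start + 1)
    return hits


if __name__ == "__main__":
    trigger = "AUGGCUAGCUUAGGCAUCGAUCCGUAAGCUAGGCUUACGAUCGGAUCCAUG"
    mrna = "GGCACC" + trigger + "UUUAAA"
    for fragment in dice(trigger):
        guide = reverse_complement(fragment)  # antisense strand acts as the guide
        print(guide, find_target_sites(mrna, guide))
```

Changing `interval` is a crude stand-in for the species-specific variation in siRNA length mentioned above; the sketch says nothing about how the spacing of catalytic centres is actually set.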
siRNAs are double-stranded duplexes with two-nucleotide 3' overhangs and 5'-phosphate termini33, 34, and this configuration is functionally important for incorporation into RISC complexes34, 41. However, single-stranded siRNAs should be most effective at seeking homologous targets, and one intriguing correlation with the transition of RISC zymogens to active enzymes is siRNA unwinding41. My laboratory has purified RISC from Drosophila S2 cells as a 500K ribonucleoprotein with slightly different characteristics31, 42. In embryo extracts, RISC* (the 100K active RISC species) cleaves its substrates endonucleolytically41. Intermediate cleavage products are never observed in even the most highly purified RISC preparations from S2 cells, suggesting the presence of an exonuclease in this enzyme complex. Therefore, the complex formed in vivo probably contains additional factors that account for observed differences in size and activity. Alternatively, RISC purified from S2 cells may become activated — perhaps changing size and subunit composition — upon incubation with ATP. RISC from S2 cells co-purifies with AGO2, a member of the Argonaute gene family42. Argonaute proteins were first identified in Arabidopsis mutants that produced altered leaf morphology43, and form a large, evolutionarily conserved gene family with representatives in most eukaryotic genomes, with the possible exception of Saccharomyces cerevisiae (reviewed in ref. 44). These proteins are characterized by the presence of two homology regions, the PAZ domain and the Piwi domain, the latter being unique to this group of proteins. The PAZ domain also appears in Dicer proteins, and may be important in the assembly of silencing complexes36. Argonaute proteins were linked to RNAi by genetic studies in C. elegans, whose genome contains >20 related genes. The rde-1 gene was isolated by Mello and colleagues25 from a mutant worm that was unable to sustain RNAi in germline or soma. 
Using genetic methods, Grishok and colleagues45 found a requirement for RDE-1 and RDE-4 for initiation of silencing in a parental animal; however, neither function was required for systemic silencing in F1 progeny. In contrast, MUT-7 (ref. 46) and RDE-2 were both dispensable in the parent, but were required in the progeny. Rationalizing these results with the simple model proposed above is difficult. Indeed, RDE-4 is a small dsRNA-binding protein, and both RDE-1 and RDE-4 can interact with C. elegans Dicer (H. Tabara et al., unpublished data). Perhaps RDE-4 initially recognizes dsRNA and delivers it to the Dicer enzyme. This would be consistent with the observation that siRNA levels are greatly reduced in worms that lack RDE-4 function, but are abundant in worms that lack RDE-1 (ref. 47). Similarly, in Neurospora, mutations in the Argonaute family member qde-2 eliminate quelling (transgene co-suppression), but do not alter accumulation of siRNAs48. Thus RDE-1, and perhaps other Argonaute proteins as well, might shuttle siRNAs to appropriate effector complexes (RISCs). Consistent with this notion, we have detected transient interactions in S2 cell extracts between Dicer and Argonaute family members (ref. 42, and A. Caudy, unpublished data). This model has implications for signal amplification and systemic silencing.

Amplification and spreading of silencing

One of the most provocative aspects of RNAi in C. elegans is its ability to spread throughout the organism, even when triggered by minute quantities of dsRNA (ref. 1). Similar systemic silencing phenomena have been observed in plants, in which silencing could pervade a plant or even be transferred to a naive grafted scion49. Accounting for these phenomena requires firstly a system to pass a signal from cell to cell, and secondly a strategy for amplifying the signal. Recently, a phenomenon termed 'transitive RNAi' has provided some useful clues.
Transitive RNAi refers to the movement of the silencing signal along a particular gene (Fig. 3). For example, in C. elegans, targeting the 3' portion of a transcript results in suppression of that mRNA and in the production of siRNAs homologous to the targeted region. In addition, siRNAs complementary to regions of the transcript upstream from the area targeted directly by the silencing trigger also appear and accumulate50. If these siRNAs are complementary to other RNAs, those are also targeted (hence, 'transitive' RNAi).

Figure 3 Transitive RNAi.

In both plants and C. elegans, dsRNA-induced silencing requires proteins similar in sequence to a tomato RNA-directed RNA polymerase (RdRP)51, which could be involved in amplifying the RNAi signal. However, only the tomato enzyme has been shown to possess polymerase activity, and biochemical studies will be required to establish definitively the role these proteins play in RNAi. In Arabidopsis, SDE1/SGS2 is required for transgene silencing, but not for virally induced gene silencing (VIGS)19, 52. This suggests that SDE1/SGS2 may act as an RdRP, as viral replicases could substitute for this function in VIGS. In Neurospora, QDE-1 is required for efficient quelling53. EGO-1 is essential for RNAi in the germline of C. elegans54, and another RdRP homologue, RRF-1/RDE-9, is required for silencing in the soma50 (D. Conte and C. Mello, unpublished data). These genetic studies have led to a model for transitive RNAi in which siRNAs might prime the synthesis of additional dsRNA by RdRPs. RdRP activity has been reported recently from Drosophila embryo extracts55, although transitive RNAi has yet to be observed in flies. While numerous experiments suggest that an RdRP is not required for RNAi in Drosophila extracts, the possibility remains that such an enzyme might act, for example, in triggering RNAi by the production of dsRNA from dispersed, multicopy transgenes.
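The priming model for transitive RNAi can be made concrete with a short Python sketch. It assumes, purely for illustration, that an siRNA annealed to the mRNA is extended by an RdRP all the way to the 5' end of the transcript, and that the resulting dsRNA is then cut into 22-nucleotide secondary siRNAs; the function names, the window length and the perfect-match rule for other transcripts are hypothetical choices, not established properties of the pathway.

```python
# Toy model of siRNA-primed spreading; names and parameters are hypothetical.

def secondary_sirna_windows(target_start: int, sirna_len: int = 22) -> list[tuple[int, int]]:
    """Return (start, end) windows on the mRNA upstream of the primed site.

    target_start is the 0-based position where the directly targeted region
    begins; windows are tiled from that position back towards the 5' end.
    """
    windows = []
    end = target_start
    while end - sirna_len >= 0:
        windows.append((end - sirna_len, end))
        end -= sirna_len
    return windows


def transitive_targets(primed_mrna: str, target_start: int,
                       other_rnas: dict[str, str], sirna_len: int = 22) -> set[str]:
    """Return names of other RNAs sharing sequence with any secondary siRNA window."""
    silenced = set()
    for start, end in secondary_sirna_windows(target_start, sirna_len):
        window = primed_mrna[start:end]  # sense sequence covered by a secondary siRNA
        for name, sequence in other_rnas.items():
            if window in sequence:
                silenced.add(name)
    return silenced


if __name__ == "__main__":
    shared_5p = "AUGGCUAGCUUAGGCAUCGAUCCGUAAGCUAGGCUUACGAUCGG"  # 44 nt, two windows
    gene_a = shared_5p + "CCAUGGAUUCCGAUAGGCAUUAACGGAUCC"       # unique 3' portion
    others = {
        "gene_b": "GGG" + shared_5p + "CCC",   # shares the upstream region
        "gene_c": "GGGCCCAAAUUUGGGCCCAAAUUUGGGCCC",
    }
    # The trigger targets only the unique 3' portion of gene_a (position 44 onward).
    print(transitive_targets(gene_a, 44, others))
```

Because the windows are tiled only towards the 5' end, the sketch captures upstream spreading and nothing else.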
The fact that RDE-1 and RDE-4 are required only for initiation of RNAi in parental C. elegans adds an additional layer of complexity to the model. Perhaps exogenous dsRNAs are recognized initially in a manner that is distinct from recognition of secondary dsRNA, which may be produced by RdRPs. For example, the proposed function of RDE-4 in delivering dsRNA to Dicer could be substituted for secondary dsRNAs by another hypothetical protein. Alternatively, Dicer could exist in a stable complex with an RdRP, making dsRNA delivery unnecessary. The requirement for RRF-1/RDE-9 throughout the C. elegans soma (and the similar requirement for SDE1/SGS2 in plants) also suggests that most RNAi in these systems is driven by secondary siRNAs produced through the action of RdRPs. However, other possibilities also exist. Indeed, in plants, transitive RNAi travels in both 3'→5' and 5'→3' directions56, which is inconsistent with the simple notion of siRNAs priming dsRNA synthesis. Instead, one can imagine that genomic loci may serve as a reservoir for silencing. In some systems, it is known that exposure to dsRNA can produce alterations in chromatin structure, which could lead to the production of 'aberrant' mRNAs that are substrates for conversion to dsRNA by RdRPs. This model would permit bidirectional spread, as such an expansion of altered chromatin structure is an established phenomenon. Moreover, a similar model could explain co-suppression that is occasionally triggered by single-copy, dispersed transgenes. Finally, this model would be consistent with transitive effects that have been observed for both transcriptional and post-transcriptional silencing in Drosophila, which operate in the absence of any homology in the transcribed RNA, and thus differ from 'transitive RNAi' in C. elegans (refs 18, 24).
But support for a genome-based amplification model remains elusive, as does the nature of the 'aberrant' RNAs that trigger siRNA formation and an explanation for how chromatin modifications could induce their production. Although these models suggest mechanisms for cell-autonomous amplification of the silencing signal, the character of the signal that transmits systemic silencing in plants and animals is unknown. Two candidates are siRNAs themselves or long dsRNAs, perhaps formed via RdRP-dependent amplification. Note that, in plants, two types of transmission must be considered. The first is short-range, cell-to-cell transmission. Plant cells are intimately connected through cytoplasmic bridges known as plasmodesmata. Movement of RNA and proteins via these cell–cell junctions is well known, and it is likely that either long dsRNA or siRNAs could be passed through these connections. But the silencing signal must also be passed over a longer range through the plant vasculature57. In this regard, studies of a viral silencing inhibitor have provided evidence against siRNAs being critical for systemic silencing in plants. Hc-Pro suppresses silencing and also interferes with the production of siRNAs from dsRNA triggers58. Expression of Hc-Pro does not interfere with transgene methylation, which results in transcriptional gene silencing (TGS) if present in the promoter and which may contribute to post-transcriptional gene silencing (PTGS) if present in the transcribed sequence. Hc-Pro expression in a silenced rootstock relieves silencing and inhibits siRNA production, but a systemic signal can still be passed from this rootstock to an engrafted scion lacking Hc-Pro expression. Recently, Hunter and colleagues identified a protein in C. elegans that is required for systemic silencing59. The sid-1 gene encodes a transmembrane protein that may act as a channel for import of the silencing signal.
Expression of sid-1 is largely lacking from neuronal cells, perhaps explaining initial observations that C. elegans neurons were resistant to systemic RNAi. SID-1 homologues are absent from Drosophila, consistent with a lack of systemic transmission of silencing in flies, but are present in mammals, raising the possibility that some aspects of RNAi may act non-cell autonomously in mammals.

Other components of the RNAi machinery

A combination of genetics and biochemistry has led to much progress towards understanding the mechanism of PTGS, but many questions remain. In Drosophila embryo extracts, pre-RISC becomes activated upon unwinding of siRNAs in an ATP-dependent process. A number of different helicases have been identified in searches for RNAi-deficient mutants (for example, QDE-3, MUT-6 and MUT-14), and any of these might be candidates for a RISC activator60–62. Additionally, the identities of RISC-associated nucleases that cleave targeted mRNAs remain elusive. Studies of RISC formed in embryo extracts suggest an endonuclease that cleaves the siRNA–mRNA hybrid near the middle of the duplex, while RISC formed in vivo may have additional exonuclease activities. The MUT-7 protein, which is essential for RNAi in the C. elegans germ line, has nuclease homology, but a Drosophila relative of this protein has not yet been found in RISC (ref. 46, and S. Hammond, unpublished data). The efficiency of RNAi suggests an active mechanism for searching the transcriptome for homologous substrates. Most Drosophila RISC might be associated with the ribosome31, and recent studies have extended this observation to trypanosomes (E. Ullu, unpublished data). Finally, relationships between the RNAi machinery and other aspects of RNA metabolism in the cell must be explored.
For example, genetic evidence63 suggests a link between RNAi and nonsense-mediated decay, raising the possibility that the RNAi machinery may be important in destruction of improperly processed mRNAs or in the general regulation of mRNA stability.

RNAi and the genome

In plants, dsRNA induces genomic methylation at sites of sequence homology (ref. 22, reviewed in ref. 64). Methylation is asymmetric and is not restricted to CpG or CpXpG sequences. If methylation occurs in the coding sequence, it has no apparent effect on the transcription of the locus, although silencing still occurs at the post-transcriptional level. Methylation of the promoter sequence induces TGS (ref. 23), which, unlike PTGS, is stable and heritable21. Thus, dsRNA can clearly trigger alterations at the genomic level, but the degree to which these alterations are relevant to PTGS remains uncertain. Recent studies have begun to generalize the notion of an intimate connection between the RNAi machinery and the genome, and to draw mechanistic links between PTGS and TGS. For example, in C. elegans, mut-7 and rde-2 mutations de-repress transgenes that are silenced at the level of transcription by a polycomb-dependent mechanism25. Polycomb-group proteins function by organizing chromatin into 'open' or 'closed' conformations, creating stable and heritable patterns of gene expression. Recently, Goldstein and colleagues found that the polycomb proteins MES-3, MES-4 and MES-6 are required for RNAi, at least under some experimental conditions26. Mutant worms were deficient in the RNAi response if high levels of dsRNA were injected, but were not deficient in the presence of limiting dsRNA. Of course, the effects of these mutants could be indirect, altering the expression of other elements or regulators of the RNAi pathway. However, links between altered chromatin structures and dsRNA-induced gene silencing have also emerged from plant and Drosophila systems.
In particular, alterations of either methyltransferases (MET1) or chromatin remodelling complexes (for example, DDM1) can affect both the degree and persistence of silencing in Arabidopsis21, 65. Conversely, mutations in genes required for PTGS (for example, AGO1 and SGS2) decrease both co-suppression and transgene methylation66. Furthermore, mutation of piwi, a relative of the RISC component Argonaute-2, compromises co-suppression of dispersed transgenes in Drosophila at both the post-transcriptional and transcriptional levels24. Thus, one of the most fascinating and least-explored responses to dsRNA involves a possible recognition of genomic DNA by derivatives of the silencing trigger, possibly siRNAs. One model suggests that a variant, nuclear RISC carries a chromatin remodelling complex rather than a ribonuclease to its cognate target. Indeed, Martienssen, Grewal and colleagues have recently noted a requirement for relatives of Dicer and RISC components in the silencing of centromeric repeats in Schizosaccharomyces pombe (T. Volpe, C. Kidner, I. Hall, S. Grewal and R. Martienssen, personal communication). It seems therefore that a principal biological function of the RNAi machinery may be to form heterochromatic domains in the nucleus that are critical for genome organization and stability.

Biological functions of RNAi

Because target identification depends upon Watson–Crick base-pairing interactions, the RNAi machinery can be both flexible and exquisitely specific. Thus, this regulatory paradigm may have been adapted and adopted for numerous cellular functions. For example, in plants, RNAi forms the basis of VIGS, suggesting an important role in pathogen resistance. An elegant proof of this hypothesis comes from the genetic links between virulence and RNAi pathways (refs 52, 67, and reviewed in ref. 68).
Many plant viruses encode suppressors of PTGS that are essential for pathogenesis, and these virulence determinants can be masked by host mutations in silencing pathways. RNAi has also been linked to the control of endogenous parasitic nucleic acids. In C. elegans, some RNAi-deficient strains are also 'mutators' owing to increased mobility of endogenous transposons25, 46. In many systems, transposons are silenced by their packaging into heterochromatin (reviewed in ref. 64). Therefore, it is tempting to speculate that RNAi may stabilize the genome by sequestering repetitive sequences such as mobile genetic elements, preventing transposition and making repetitive elements unavailable for recombination events that would lead to chromosomal translocations. However, it remains to be determined whether RNAi regulates transposons through effects at the genomic level or by post-transcriptionally targeting mRNAs (for example, those encoding transposases) that are required for transposition. A role for RNAi pathways in the normal regulation of endogenous protein-coding genes was originally suggested through the analysis of plants and animals containing dysfunctional RNAi components. Mutations in the Argonaute-1 gene of Arabidopsis, for example, cause pleiotropic developmental abnormalities that are consistent with alterations in stem-cell fate determination43. A hypomorphic mutation in Carpel Factory, an Arabidopsis Dicer homologue, causes defects in leaf development and overproliferation of floral meristems69. Mutations in Argonaute family members in Drosophila also impact normal development. In particular, mutations in Argonaute-1 have drastic effects on neuronal development70, and piwi mutants have defects in both germline stem-cell proliferation and maintenance71. This should not be interpreted as a demonstration that PTGS pathways regulate endogenous gene expression per se.
In fact, separation-of-function ago1 mutants have recently been isolated that preferentially affect PTGS72 without affecting development. Mutations in Zwille, another Argonaute family member, also alter stem-cell maintenance73, and this occurs without perceptible impact on dsRNA-mediated silencing72. Thus, components of the RNAi machinery, and related gene products, may function in related but separable pathways of gene regulation. A possible mechanism underlying the regulation of endogenous genes by the RNAi machinery emerged from the study of C. elegans containing mutations in their single Dicer gene, DCR-1. Unlike most other RNAi-deficient worm mutants, dcr-1 animals were neither normal nor fertile: the mutation induced a number of phenotypic alterations in addition to its effect on RNAi37-39, 74. Intriguingly, Dicer mutants showed alterations in developmental timing similar to those observed in let-7 and lin-4 mutants. The lin-4 gene was originally identified as a mutant that affects larval transitions75, and let-7 was subsequently isolated as a similar heterochronic mutant28. These loci encode small RNAs, which are synthesized as 70-nucleotide precursors and post-transcriptionally processed to a 21-nucleotide mature form. Genetic and biochemical studies have indicated that these RNAs are processed by Dicer37-39, 74. The small temporal RNAs (stRNAs) encoded by let-7 and lin-4 are negative regulators of specific protein-coding genes, as might be expected if stRNAs trigger RNAi. However, stRNAs do not trigger mRNA degradation, but regulate expression at the translational level76, 77. This raised the possibility that stRNAs and RNAi might be linked only by the processing enzyme Dicer. 
However, Mello and colleagues demonstrated a requirement for Argonaute family proteins (that is, Alg-1 and Alg-2) in both stRNA biogenesis and stRNA-mediated suppression39, which led to a model in which the effector complexes containing siRNAs and stRNAs are closely related, but regulate expression by distinct mechanisms (Fig. 4). Neither LIN-4 nor LET-7 forms a perfect duplex with its cognate target78. Thus, in one possible model an analogous RISC complex is formed containing either siRNAs or stRNAs. In the former case, cleavage is dependent upon perfect complementarity, while in the latter, cleavage does not occur, but the complex blocks ribosomal elongation. Alternatively, siRNAs and stRNAs may be discriminated and enter related but distinct complexes that target substrates for degradation or translational regulation, respectively. Consistent with this latter model is the observation that siRNAs or exogenously supplied hairpin RNAs that contain single mismatches with their substrates fail to repress, rather than simply shifting their regulatory mode to translational inhibition34, 79, 80.

Figure 4 Small interfering RNAs versus small temporal RNAs.

In this scenario, RISC may be viewed as a flexible platform upon which different regulatory modules may be superimposed (Fig. 5). The core complex would be responsible for receiving the small RNA from Dicer and using this as a guide to identify its homologous substrate. Depending upon the signal (for example, its structure and localization), different effector functions could join the core: in RNAi, nucleases would be incorporated into RISC, whereas in stRNA-mediated regulation, translational repressors would join the complex. Transcriptional silencing could be accomplished by the inclusion of chromatin remodelling factors, and one could imagine other adaptations might exist.

Figure 5 A model for the mechanism of RNAi.
Whether or not RISC is a flexible regulator becomes particularly important in light of recent findings that let-7 and lin-4 are archetypes of a large class of endogenously encoded small RNAs. Over 100 of these microRNAs or miRNAs have now been identified in Drosophila, C. elegans and mammals81–84, and although their functions are unknown, their prevalence hints that RNAi-related mechanisms may have pervasive roles in controlling gene expression. In this regard, a number of miRNAs from Drosophila are partially complementary to two sequences, the K box and the Brd box, that mediate posttranscriptional regulation of numerous mRNAs85.

RNAi and genomics

RNAi has evolved into a powerful tool for probing gene function. In C. elegans, testing the functions of individual genes by RNAi has now extended to analysis of nearly all of the worm's predicted 19,000 genes (J. Ahringer, unpublished data). Similar strategies are being pursued in other organisms, including plants (D. Baulcombe and P. Waterhouse, personal communication). Although it seemed for some time that deploying RNAi in mammalian systems would not be feasible, the first hint that the technology might work came when RNAi was demonstrated in early mouse embryos86, 87. But this appeared to be of limited utility, as mammalian somatic cells, but not some embryonic cells, exhibit nonspecific responses to dsRNA, which would obscure sequence-specific silencing. One of these is the RNA-dependent protein kinase (PKR) pathway, which responds to dsRNA by phosphorylating EIF-2 and nonspecifically arresting translation88. Tuschl and colleagues then showed that siRNAs themselves could be used to induce effective silencing in many mammalian cells79. These small RNAs, which are chemically synthesized mimics of Dicer products, are presumably incorporated into RISC and target cognate substrates for degradation.
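As a rough illustration of what a synthetic mimic of a Dicer product looks like, the following Python sketch derives the two strands of an siRNA duplex from a chosen site in a target mRNA. The 21-nucleotide strand length and two-nucleotide 3' overhangs are assumptions taken from the size range and end structure described earlier in this review; the function name `design_sirna`, the example sequence and the choice to derive the overhangs from the target sequence itself are hypothetical, and no rules for choosing an effective site are implied.

```python
# Hypothetical helper for laying out an siRNA duplex; illustration only.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def design_sirna(mrna: str, site_start: int, length: int = 21,
                 overhang: int = 2) -> tuple[str, str]:
    """Return (sense, antisense) strands, both written 5' to 3'.

    The sense strand copies mrna[site_start : site_start + length]; its last
    `overhang` bases are left unpaired. The antisense strand is shifted by
    `overhang` bases so that it also ends in an unpaired 3' overhang.
    """
    if site_start < overhang or site_start + length > len(mrna):
        raise ValueError("site does not leave room for both strands")
    sense = mrna[site_start:site_start + length]
    antisense = reverse_complement(
        mrna[site_start - overhang:site_start + length - overhang]
    )
    return sense, antisense


if __name__ == "__main__":
    target = "GGAUGCUAGCUAGGCUAACGUAGCUAGG"
    sense, antisense = design_sirna(target, site_start=2)
    paired = len(sense) - 2
    print("sense     5'-" + sense + "-3'")
    print("antisense 5'-" + antisense + "-3'")
    # The first 19 bases of each strand form the base-paired core of the duplex.
    print(reverse_complement(antisense[:paired]) == sense[:paired])
```

The antisense strand produced here is fully complementary to the target over its whole length, which matches the requirement for perfect complementarity discussed above for cleavage.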
The siRNAs are too small to induce nonspecific dsRNA responses such as PKR (ref. 89). One drawback that siRNAs have is that their effects are transient, as mammals apparently lack the mechanisms that amplify silencing in worms and plants. In several systems, including plants, Drosophila, C. elegans and trypanosomes, RNAi has been made stable and heritable by enforced expression of the silencing trigger, usually as an inverted repeat sequence forming a hairpin structure in vivo90–95. We have reported mammalian cell lines in which genes are stably suppressed by RNAi through the expression of a 500-base-pair dsRNA96. However, this approach was limited to cell types that lacked generic responses to dsRNA such as the PKR pathway. Recently, we and others have shown that short hairpin RNAs (shRNAs) modelled on miRNAs can be used to manipulate gene expression experimentally80, 97, 98. These may be expressed in vivo from RNA polymerase III (Pol III) promoters to induce stable suppression in mammalian cells. The availability of stable triggers of RNAi builds upon the utility of siRNAs in several ways. Induced phenotypes can now be observed over long time spans. Stably engineered cells can be assayed either in vitro or in vivo, perhaps testing the angiogenic or metastatic potential of tumour cells in xenograft models. RNAi may potentially be used to create hypomorphic alleles rapidly in transgenic mice. If inducible Pol III promoters were used99, 100, this could permit a powerful approach akin to the use of tissue-specific Gal4-drivers in Drosophila. Finally, shRNAs could be combined with existing high-efficiency gene delivery vehicles to create bona fide RNAi-based therapeutics. In this regard, we have successfully delivered shRNAs from replication-deficient retroviruses, and foresee numerous applications for ex vivo manipulation of stem cells based upon this paradigm.
For example, a patient's own bone marrow stem cells could be engineered to resist HIV infection by targeting either the HIV RNA itself or receptors necessary for HIV infection (for example, CCR5). Furthermore, we see no conceptual barrier to incorporating this strategy for targeted suppression into adenovirus or herpesvirus-based delivery vehicles. Ultimately, the exquisite specificity of RNAi may make it possible to silence a disease-causing mutant allele specifically, such as an activated oncogene, without affecting the normal allele.

Perspective

Over the past few years, the way in which cells respond to dsRNA by silencing homologous genes has revealed a new regulatory paradigm in biology. This response can be triggered in many different ways, ranging from experimental introduction of synthetic silencing triggers to the transcription of endogenous RNAs that regulate gene expression. We are only beginning to appreciate the mechanistic complexity of this process and its biological ramifications. RNAi has already begun to revolutionize experimental biology in organisms ranging from unicellular protozoans to mammals. RNAi has been applied on the whole-genome scale in C. elegans and this goal is being pursued in plant systems. My laboratory, as part of the larger cancer genomics effort, has undertaken to target, individually, every gene in the human genome using expressed shRNAs. This will permit large-scale loss-of-function genetic screens and rapid tests for genetic interactions to be performed for the first time in mammalian cells. Such approaches hold tremendous promise for unleashing the dormant potential of sequenced genomes.

References

1. Fire, A. et al. Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature 391, 806-811 (1998).
2. Guo, S. & Kemphues, K. J. par-1, a gene required for establishing polarity in C.
elegans embryos, encodes a putative Ser/Thr kinase that is asymmetrically distributed. Cell 81, 611-620 (1995).
3. Timmons, L. & Fire, A. Specific interference by ingested dsRNA. Nature 395, 854 (1998).
4. van der Krol, A. R., Mur, L. A., de Lange, P., Mol, J. N. & Stuitje, A. R. Inhibition of flower pigmentation by antisense CHS genes: promoter and minimal sequence requirements for the antisense effect. Plant Mol. Biol. 14, 457-466 (1990).
5. Napoli, C. A., Lemieux, C. & Jorgensen, R. Introduction of a chimeric chalcone synthetase gene in Petunia results in reversible cosuppression of homologous genes in trans. Plant Cell 2, 279-289 (1990).
6. Jorgensen, R. Altered gene expression in plants due to trans interactions between homologous genes. Trends Biotechnol. 8, 340-344 (1990).
7. Jorgensen, R. A., Cluster, P. D., English, J., Que, Q. & Napoli, C. A. Chalcone synthase cosuppression phenotypes in petunia flowers: comparison of sense vs. antisense constructs and single-copy vs. complex T-DNA sequences. Plant Mol. Biol. 31, 957-973 (1996).
8. Elmayan, T. & Vaucheret, H. Single copies of a strongly expressed 35S-driven transgene undergo post-transcriptional silencing. Plant J. 9, 787-797 (1996).
9. Que, Q., Wang, H. Y., English, J. & Jorgensen, R. The frequency and degree of cosuppression by sense chalcone synthetase transgenes are dependent on promoter strength and are reduced by premature nonsense codons in the transgene coding sequence. Plant Cell 9, 1357-1368 (1997).
10. Ruiz, M. T., Voinnet, O. & Baulcombe, D. C. Initiation and maintenance of virus-induced gene silencing. Plant Cell 10, 937-946 (1998).
11. Angell, S. M. & Baulcombe, D. C. Consistent gene silencing in transgenic plants expressing a replicating potato virus X RNA. EMBO J. 16, 3675-3684 (1997).
12. Dougherty, W. G. et al.
RNA-mediated virus resistance in transgenic plants: exploitation of a cellular pathway possibly involved in RNA degradation. Mol. Plant Microbe Interact. 7, 544-552 (1994).
13. Kumagai, M. H. et al. Cytoplasmic inhibition of carotenoid biosynthesis with virus-derived RNA. Proc. Natl Acad. Sci. USA 92, 1679-1683 (1995).
14. Bernstein, E., Denli, A. M. & Hannon, G. J. The rest is silence. RNA 7, 1509-1521 (2001).
15. Romano, N. & Macino, G. Quelling: transient inactivation of gene expression in Neurospora crassa by transformation with homologous sequences. Mol. Microbiol. 6, 3343-3353 (1992).
16. Fire, A., Albertson, D., Harrison, S. W. & Moerman, D. G. Production of antisense RNA leads to effective and specific inhibition of gene expression in C. elegans muscle. Development 113, 503-514 (1991).
17. Dernburg, A. F., Zalevsky, J., Colaiacovo, M. P. & Villeneuve, A. M. Transgene-mediated cosuppression in the C. elegans germ line. Genes Dev. 14, 1578-1583 (2000).
18. Pal-Bhadra, M., Bhadra, U. & Birchler, J. A. Cosuppression in Drosophila: gene silencing of Alcohol dehydrogenase by white-Adh transgenes is Polycomb dependent. Cell 90, 479-490 (1997).
19. Dalmay, T., Hamilton, A., Rudd, S., Angell, S. & Baulcombe, D. C. An RNA-dependent RNA polymerase gene in Arabidopsis is required for posttranscriptional gene silencing mediated by a transgene but not by a virus. Cell 101, 543-553 (2000).
20. de Carvalho, F. et al. Suppression of beta-1,3-glucanase transgene expression in homozygous plants. EMBO J. 11, 2595-2602 (1992).
21. Jones, L., Ratcliff, F. & Baulcombe, D. C. RNA-directed transcriptional gene silencing in plants can be inherited independently of the RNA trigger and requires Met1 for maintenance. Curr. Biol. 11, 747-757 (2001).
22. Wassenegger, M., Heimes, S., Riedel, L. & Sanger, H. L.
RNA-directed de novo methylation of genomic sequences in plants. Cell 76, 567-576 (1994).
23. Mette, M. F., Aufsatz, W., van der Winden, J., Matzke, M. A. & Matzke, A. J. Transcriptional silencing and promoter methylation triggered by double-stranded RNA. EMBO J. 19, 5194-5201 (2000).
24. Pal-Bhadra, M., Bhadra, U. & Birchler, J. A. RNAi related mechanisms affect both transcriptional and posttranscriptional transgene silencing in Drosophila. Mol. Cell 9, 315-327 (2002).
25. Tabara, H. et al. The rde-1 gene, RNA interference, and transposon silencing in C. elegans. Cell 99, 123-132 (1999).
26. Dudley, N. R., Labbe, J. C. & Goldstein, B. Using RNA interference to identify genes required for RNA interference. Proc. Natl Acad. Sci. USA 99, 4191-4196 (2002).
27. Wightman, B., Ha, I. & Ruvkun, G. Posttranscriptional regulation of the heterochronic gene lin-14 by lin-4 mediates temporal pattern formation in C. elegans. Cell 75, 855-862 (1993).
28. Reinhart, B. J. et al. The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature 403, 901-906 (2000).
29. Kennerdell, J. R. & Carthew, R. W. Use of dsRNA-mediated genetic interference to demonstrate that frizzled and frizzled 2 act in the wingless pathway. Cell 95, 1017-1026 (1998).
30. Tuschl, T., Zamore, P. D., Lehmann, R., Bartel, D. P. & Sharp, P. A. Targeted mRNA degradation by double-stranded RNA in vitro. Genes Dev. 13, 3191-3197 (1999).
31. Hammond, S. M., Bernstein, E., Beach, D. & Hannon, G. J. An RNA-directed nuclease mediates post-transcriptional gene silencing in Drosophila cells. Nature 404, 293-296 (2000).
32. Hamilton, A. J. & Baulcombe, D. C. A species of small antisense RNA in posttranscriptional gene silencing in plants. Science 286, 950-952 (1999).
33. Zamore, P. D., Tuschl, T., Sharp, P. A. & Bartel, D. P. RNAi: double-stranded RNA directs the ATP-dependent cleavage of mRNA at 21 to 23 nucleotide intervals. Cell 101, 25-33 (2000).
34. Elbashir, S. M., Martinez, J., Patkaniowska, A., Lendeckel, W. & Tuschl, T. Functional anatomy of siRNAs for mediating efficient RNAi in Drosophila melanogaster embryo lysate. EMBO J. 20, 6877-6888 (2001).
35. Filippov, V., Solovyev, V., Filippova, M. & Gill, S. S. A novel type of RNase III family proteins in eukaryotes. Gene 245, 213-221 (2000).
36. Bernstein, E., Caudy, A. A., Hammond, S. M. & Hannon, G. J. Role for a bidentate ribonuclease in the initiation step of RNA interference. Nature 409, 363-366 (2001).
37. Ketting, R. F. et al. Dicer functions in RNA interference and in synthesis of small RNA involved in developmental timing in C. elegans. Genes Dev. 15, 2654-2659 (2001).
38. Knight, S. W. & Bass, B. L. A role for the RNase III enzyme DCR-1 in RNA interference and germ line development in Caenorhabditis elegans. Science 293, 2269-2271 (2001).
39. Grishok, A. et al. Genes and mechanisms related to RNA interference regulate expression of the small temporal RNAs that control C. elegans developmental timing. Cell 106, 23-34 (2001).
40. Blaszczyk, J. et al. Crystallographic and modeling studies of RNase III suggest a mechanism for double-stranded RNA cleavage. Structure (Camb.) 9, 1225-1236 (2001).
41. Nykanen, A., Haley, B. & Zamore, P. D. ATP requirements and small interfering RNA structure in the RNA interference pathway. Cell 107, 309-321 (2001).
42. Hammond, S. M., Boettcher, S., Caudy, A. A., Kobayashi, R. & Hannon, G. J. Argonaute2, a link between genetic and biochemical analyses of RNAi. Science 293, 1146-1150 (2001).
43. Bohmert, K. et al.
AGO1 defines a novel locus of Arabidopsis controlling leaf development. EMBO J. 17, 170-180 (1998).
44. Hammond, S. M., Caudy, A. A. & Hannon, G. J. Post-transcriptional gene silencing by double-stranded RNA. Nature Rev. Genet. 2, 110-119 (2001).
45. Grishok, A., Tabara, H. & Mello, C. C. Genetic requirements for inheritance of RNAi in C. elegans. Science 287, 2494-2497 (2000).
46. Ketting, R. F., Haverkamp, T. H., van Luenen, H. G. & Plasterk, R. H. mut-7 of C. elegans, required for transposon silencing and RNA interference, is a homolog of Werner syndrome helicase and RNaseD. Cell 99, 133-141 (1999).
47. Parrish, S. & Fire, A. Distinct roles for RDE-1 and RDE-4 during RNA interference in Caenorhabditis elegans. RNA 7, 1397-1402 (2001).
48. Catalanotto, C., Azzalin, G., Macino, G. & Cogoni, C. Involvement of small RNAs and role of the qde genes in the gene silencing pathway in Neurospora. Genes Dev. 16, 790-795 (2002).
49. Palauqui, J. C., Elmayan, T., Pollien, J. M. & Vaucheret, H. Systemic acquired silencing: transgene-specific post-transcriptional silencing is transmitted by grafting from silenced stocks to non-silenced scions. EMBO J. 16, 4738-4745 (1997).
50. Sijen, T. et al. On the role of RNA amplification in dsRNA-triggered gene silencing. Cell 107, 465-476 (2001).
51. Schiebel, W. et al. Isolation of an RNA-directed RNA polymerase-specific cDNA clone from tomato. Plant Cell 10, 2087-2101 (1998).
52. Mourrain, P. et al. Arabidopsis SGS2 and SGS3 genes are required for posttranscriptional gene silencing and natural virus resistance. Cell 101, 533-542 (2000).
53. Cogoni, C. & Macino, G. Gene silencing in Neurospora crassa requires a protein homologous to RNA-dependent RNA polymerase. Nature 399, 166-169 (1999).
54. Smardon, A.
et al. EGO-1 is related to RNA-directed RNA polymerase and functions in germline development and RNA interference in C. elegans. Curr. Biol. 10, 169-178 (2000).
55. Lipardi, C., Wei, Q. & Paterson, B. M. RNAi as random degradative PCR: siRNA primers convert mRNA into dsRNAs that are degraded to generate new siRNAs. Cell 107, 297-307 (2001).
56. Fabian, E., Jones, L. & Baulcombe, D. C. Spreading of RNA targeting and DNA methylation in RNA silencing requires transcription of the target gene and a putative RNA dependent RNA polymerase. Plant Cell 14, 857-867 (2002).
57. Voinnet, O., Vain, P., Angell, S. & Baulcombe, D. C. Systemic spread of sequence-specific transgene RNA degradation in plants is initiated by localized introduction of ectopic promoterless DNA. Cell 95, 177-187 (1998).
58. Mallory, A. C. et al. HC-Pro suppression of transgene silencing eliminates the small RNAs but not transgene methylation or the mobile signal. Plant Cell 13, 571-583 (2001).
59. Winston, W. M., Molodowitch, C. & Hunter, C. P. Systemic RNAi in C. elegans requires the putative transmembrane protein SID-1. Science 295, 2456-2459 (2002).
60. Cogoni, C. & Macino, G. Posttranscriptional gene silencing in Neurospora by a RecQ DNA helicase. Science 286, 2342-2344 (1999).
61. Wu-Scharf, D., Jeong, B., Zhang, C. & Cerutti, H. Transgene and transposon silencing in Chlamydomonas reinhardtii by a DEAH-box RNA helicase. Science 290, 1159-1162 (2000).
62. Tijsterman, M., Ketting, R. F., Okihara, K. L., Sijen, T. & Plasterk, R. H. RNA helicase MUT-14-dependent gene silencing triggered in C. elegans by short antisense RNAs. Science 295, 694-697 (2002).
63. Domeier, M. E. et al. A link between RNA interference and nonsense-mediated decay in Caenorhabditis elegans. Science 289, 1928-1931 (2000).
64. Martienssen, R. A. & Colot, V. DNA methylation and epigenetic inheritance in plants and filamentous fungi. Science 293, 1070-1074 (2001).
65. Furner, I. J., Sheikh, M. A. & Collett, C. E. Gene silencing and homology-dependent gene silencing in Arabidopsis: genetic modifiers and DNA methylation. Genetics 149, 651-662 (1998).
66. Fagard, M., Boutet, S., Morel, J. B., Bellini, C. & Vaucheret, H. AGO1, QDE-2, and RDE-1 are related proteins required for post-transcriptional gene silencing in plants, quelling in fungi, and RNA interference in animals. Proc. Natl Acad. Sci. USA 97, 11650-11654 (2000).
67. Voinnet, O., Lederer, C. & Baulcombe, D. C. A viral movement protein prevents spread of the gene silencing signal in Nicotiana benthamiana. Cell 103, 157-167 (2000).
68. Baulcombe, D. Viruses and gene silencing in plants. Arch. Virol. Suppl. 15, 189-201 (1999).
69. Jacobsen, S. E., Running, M. P. & Meyerowitz, E. M. Disruption of an RNA helicase/RNAse III gene in Arabidopsis causes unregulated cell division in floral meristems. Development 126, 5231-5243 (1999).
70. Kataoka, Y., Takeichi, M. & Uemura, T. Developmental roles and molecular characterization of a Drosophila homologue of Arabidopsis Argonaute1, the founder of a novel gene superfamily. Genes Cells 6, 313-325 (2001).
71. Cox, D. N. et al. A novel class of evolutionarily conserved genes defined by piwi are essential for stem cell self-renewal. Genes Dev. 12, 3715-3727 (1998).
72. Morel, J. B. et al. Fertile hypomorphic ARGONAUTE (ago1) mutants impaired in post-transcriptional gene silencing and virus resistance. Plant Cell 14, 629-639 (2002).
73. Moussian, B., Schoof, H., Haecker, A., Jurgens, G. & Laux, T. Role of the ZWILLE gene in the regulation of central shoot meristem cell fate during Arabidopsis embryogenesis. EMBO J.
17, 1799-1809 (1998).
74. Hutvagner, G. et al. A cellular function for the RNA-interference enzyme Dicer in the maturation of the let-7 small temporal RNA. Science 293, 834-838 (2001).
75. Lee, R. C., Feinbaum, R. L. & Ambros, V. The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75, 843-854 (1993).
76. Olsen, P. H. & Ambros, V. The lin-4 regulatory RNA controls developmental timing in Caenorhabditis elegans by blocking LIN-14 protein synthesis after the initiation of translation. Dev. Biol. 216, 671-680 (1999).
77. Slack, F. J. et al. The lin-41 RBCC gene acts in the C. elegans heterochronic pathway between the let-7 regulatory RNA and the LIN-29 transcription factor. Mol. Cell 5, 659-669 (2000).
78. Ha, I., Wightman, B. & Ruvkun, G. A bulged lin-4/lin-14 RNA duplex is sufficient for Caenorhabditis elegans lin-14 temporal gradient formation. Genes Dev. 10, 3041-3050 (1996).
79. Elbashir, S. M. et al. Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells. Nature 411, 494-498 (2001).
80. Paddison, P. J., Caudy, A. A., Bernstein, E., Hannon, G. J. & Conklin, D. S. Short hairpin RNAs (shRNAs) induce sequence-specific silencing in mammalian cells. Genes Dev. 16, 948-958 (2002).
81. Lagos-Quintana, M., Rauhut, R., Lendeckel, W. & Tuschl, T. Identification of novel genes coding for small expressed RNAs. Science 294, 853-858 (2001).
82. Lau, N. C., Lim, L. P., Weinstein, E. G. & Bartel, D. P. An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science 294, 858-862 (2001).
83. Lee, R. C. & Ambros, V. An extensive class of small RNAs in Caenorhabditis elegans. Science 294, 862-864 (2001).
84. Mourelatos, Z. et al.
miRNPs: a novel class of ribonucleoproteins containing numerous microRNAs. Genes Dev. 16, 720-728 (2002).
85. Lai, E. C. Micro RNAs are complementary to 3' UTR sequence motifs that mediate negative post-transcriptional regulation. Nature Genet. 30, 363-364 (2002).
86. Wianny, F. & Zernicka-Goetz, M. Specific interference with gene function by double-stranded RNA in early mouse development. Nature Cell Biol. 2, 70-75 (2000).
87. Svoboda, P., Stein, P., Hayashi, H. & Schultz, R. M. Selective reduction of dormant maternal mRNAs in mouse oocytes by RNA interference. Development 127, 4147-4156 (2000).
88. Gil, J. & Esteban, M. Induction of apoptosis by the dsRNA-dependent protein kinase (PKR): mechanism of action. Apoptosis 5, 107-114 (2000).
89. Clarke, P. A. & Mathews, M. B. Interactions between the double-stranded RNA binding motif and RNA: definition of the binding site for the interferon-induced protein kinase DAI (PKR) on adenovirus VA RNA. RNA 1, 7-20 (1995).
90. Smith, N. A. et al. Total silencing by intron-spliced hairpin RNAs. Nature 407, 319-320 (2000).
91. Tavernarakis, N., Wang, S. L., Dorovkov, M., Ryazanov, A. & Driscoll, M. Heritable and inducible genetic interference by double-stranded RNA encoded by transgenes. Nature Genet. 24, 180-183 (2000).
92. Kennerdell, J. R. & Carthew, R. W. Heritable gene silencing in Drosophila using double-stranded RNA. Nature Biotechnol. 18, 896-898 (2000).
93. LaCount, D. J., Bruse, S., Hill, K. L. & Donelson, J. E. Double-stranded RNA interference in Trypanosoma brucei using head-to-head promoters. Mol. Biochem. Parasitol. 111, 67-76 (2000).
94. Shi, H. et al. Genetic interference in Trypanosoma brucei by heritable and inducible double-stranded RNA. RNA 6, 1069-1076 (2000).
95. Wang, Z., Morris, J. C., Drew, M. E. & Englund, P. T. Inhibition of Trypanosoma brucei gene expression by RNA interference using an integratable vector with opposing T7 promoters. J. Biol. Chem. 275, 40174-40179 (2000).
96. Paddison, P. J., Caudy, A. A. & Hannon, G. J. Stable suppression of gene expression by RNAi in mammalian cells. Proc. Natl Acad. Sci. USA 99, 1443-1448 (2002).
97. Brummelkamp, T. R., Bernards, R. & Agami, R. A system for stable expression of short interfering RNAs in mammalian cells. Science 21, 21 (2002).
98. Sui, G. et al. A DNA vector-based RNAi technology to suppress gene expression in mammalian cells. Proc. Natl Acad. Sci. USA 99, 5515-5520 (2002).
99. Meissner, W., Rothfels, H., Schafer, B. & Seifart, K. Development of an inducible pol III transcription system essentially requiring a mutated form of the TATA-binding protein. Nucleic Acids Res. 29, 1672-1682 (2001).
100. Ohkawa, J. & Taira, K. Control of the functional activity of an antisense RNA by a tetracycline-responsive derivative of the human U6 snRNA promoter. Hum. Gene Ther. 11, 577-585 (2000).

Acknowledgements. I thank members of the Hannon laboratory for critical reading of the manuscript; J. Duffy for help in preparation of the figures; D. Baulcombe, M. Tijsterman, R. Carthew and S. Prasanth for providing the images for Fig. 1; fellow investigators who granted permission to discuss unpublished observations; and C. Mello and C. Sherr for providing motive and opportunity, respectively, for our early work on RNAi. G.J.H. is a Rita Allen Foundation scholar and is supported by an Innovator Award from the U.S. Army Breast Cancer Research Program. This work was supported in part by a grant from the NIH.

Figure 1 Double-stranded RNA can be introduced experimentally to silence target genes of interest.
In plants, silencing can be triggered, for example, by engineered RNA viruses or by inverted repeat transgenes. In worms, silencing can be triggered by injection or feeding of dsRNA. In both of these systems, silencing is systemic and spreads throughout the organism. a, A silencing signal moves from the veins into leaf tissue. Green is green fluorescent protein (GFP) fluorescence and red is chlorophyll fluorescence that is seen upon silencing of the GFP transgene. b, C. elegans engineered to express GFP in nuclei. Animals on the right have been treated with a control dsRNA, whereas those on the left have been exposed to GFP dsRNA. Some neuronal nuclei remain fluorescent, correlating with low expression of a protein required for systemic RNAi59. c, HeLa cells treated with an ORC6 siRNA and stained for tubulin (green) and DNA (red). Depletion of ORC6 results in accumulation of multinucleated cells. Stable silencing can also be induced by expression of dsRNA as hairpins or snap-back RNAs. d, Adult Drosophila express a hairpin homologous to the white gene (left), which results in unpigmented eyes compared with wild type (right).

Figure 2 Dicer and RISC (RNA-induced silencing complex). a, RNAi is initiated by the Dicer enzyme (two Dicer molecules with five domains each are shown), which processes double-stranded RNA into 22-nucleotide small interfering RNAs36. Based upon the known mechanisms for the RNase III family of enzymes, Dicer is thought to work as a dimeric enzyme. Cleavage into precisely sized fragments is determined by the fact that one of the active sites in each Dicer protein is defective (indicated by an asterisk), shifting the periodicity of cleavage from 9–11 nucleotides for bacterial RNase III to 22 nucleotides for Dicer family members40. The siRNAs are incorporated into a multicomponent nuclease, RISC (green).
Recent reports suggest that RISC must be activated from a latent form, containing a double-stranded siRNA, to an active form, RISC*, by unwinding of siRNAs41. RISC* then uses the unwound siRNA as a guide to substrate selection31. b, Diagrammatic representation of Dicer binding and cleaving dsRNA (for clarity, not all the Dicer domains are shown, and the two separate Dicer molecules are coloured differently). Deviations from the consensus RNase III active site in the second RNase III domain inactivate the central catalytic sites, resulting in cleavage at 22-nucleotide intervals.

Figure 3 Transitive RNAi. In transitive RNAi in C. elegans, silencing can travel in a 3' to 5' direction on a specific mRNA target50. The simplest demonstration comes from the creation of fusion transcripts. Consider a fragment of green fluorescent protein (GFP) fused 3' to a segment of UNC-22 (left). Targeting GFP abolishes fluorescence but also creates an unexpected, uncoordinated phenotype. This occurs because of the production of double-stranded RNA and consequently small interfering RNAs homologous to the endogenous UNC-22 gene. In a case in which GFP is fused 5' to the UNC-22 fragment (right), GFP dsRNA still ablates fluorescence but does not produce an uncoordinated phenotype.

Figure 4 Small interfering RNAs versus small temporal RNAs. Double-stranded siRNAs of length 21–23 nucleotides are produced by Dicer from dsRNA silencing triggers. Characteristic of RNase III products, these have two-nucleotide 3' overhangs and 5'-phosphorylated termini. To trigger target degradation with maximum efficiency, siRNAs must have perfect complementarity to their mRNA target (with the exception of the two terminal nucleotides, which contribute only marginally to recognition). stRNAs, such as lin-4 and let-7, are transcribed from the genome as hairpin precursors. These are also processed by Dicer, but in this case, only one strand accumulates.
Notably, neither lin-4 nor let-7 shows perfect complementarity to its targets. In addition, stRNAs regulate targets at the level of translation rather than RNA degradation. It remains unclear whether the difference in regulatory mode results from a difference in substrate recognition or from incorporation of siRNAs and stRNAs into distinct regulatory complexes.

Figure 5 A model for the mechanism of RNAi. Silencing triggers in the form of double-stranded RNA may be presented in the cell as synthetic RNAs or replicating viruses, or may be transcribed from nuclear genes. These are recognized and processed into small interfering RNAs by Dicer. The duplex siRNAs are passed to RISC (RNA-induced silencing complex), and the complex becomes activated by unwinding of the duplex. Activated RISC complexes can regulate gene expression at many levels. Almost certainly, such complexes act by promoting RNA degradation and translational inhibition. However, similar complexes probably also target chromatin remodelling. Amplification of the silencing signal in plants may be accomplished by siRNAs priming RNA-directed RNA polymerase (RdRP)-dependent synthesis of new dsRNA. This could be accomplished by RISC-mediated delivery of an RdRP or by incorporation of the siRNA into a distinct, RdRP-containing complex.

11 July 2002
Nature 418, 252-258 (2002); doi:10.1038/418252a

Emerging clinical applications of RNA

BRUCE A. SULLENGER AND ELI GILBOA
Department of Surgery, Duke University Medical Center, Durham, North Carolina 27710, USA (e-mail: b.sullenger@cgct.duke.edu)

RNA is a versatile biological macromolecule that is crucial in mobilizing and interpreting our genetic information. It is not surprising then that researchers have sought to exploit the inherent properties of RNAs so as to interfere with or repair dysfunctional nucleic acids or proteins and to stimulate the production of therapeutic gene products in a variety of pathological situations.
The first generation of the resulting RNA therapeutics are now being evaluated in clinical trials, raising significant interest in this emerging area of medical research.

The concept of using RNA molecules as therapeutic agents is relatively new, but has received increasing attention during the past decade. Much of this interest stems from a variety of basic scientific discoveries that underscore the seminal role of RNA molecules in the utilization of genetic instructions in all living systems and the versatility of these molecules in nature. RNA molecules can adopt a wide variety of conformations and perform a range of cellular functions. Certain RNAs fold to form catalytic centres, whereas others have structures that allow them to make specific RNA–RNA, RNA–DNA or RNA–protein interactions. Such realizations have led translational researchers to attempt to exploit various facets of RNA biology and chemistry to combat human disease. Significant progress has been made towards this goal and the first RNA-based therapeutics are now being evaluated in clinical trials for the treatment of disorders ranging from cancer to infectious diseases. The therapeutic RNAs that have so far received the most attention can be grouped into four categories: gene inhibitors, gene amenders, protein inhibitors and immunostimulatory RNAs. Here we review the development of these various RNA-based approaches to therapy and provide an update on the progress towards moving this emerging class of molecular therapeutics into and through clinical evaluation. Finally, we will discuss the developmental difficulties that various RNA therapeutics still face and consider potential solutions to those problems.

RNA-mediated inhibition of gene expression

Regulation of gene expression by an RNA that is complementary to a target messenger RNA (mRNA) was first recognized as a naturally occurring process in prokaryotes1.
These complementary RNAs, termed antisense RNAs, specifically recognize their target transcripts by forming base pairs with them in a sequence-dependent manner. The formation of this RNA duplex is believed to lead to the degradation of the target RNA or the inhibition of its translation. The ability to inhibit specific genes after gene transfer of antisense expression cassettes was first demonstrated almost two decades ago in bacteria by Pestka et al.2 and Coleman et al.3 and in eukaryotic cells by Izant and Weintraub4. Following these initial studies, numerous reports appeared that described the potential utility of antisense RNA for the inhibition of a wide array of genes in mammalian cells5. However, these studies also indicated that the efficacy of antisense-mediated gene inhibition was usually dependent on the presence of a considerable excess of antisense RNA to target RNA in the cell5. Therefore methods were sought to express large quantities of antisense RNAs in transduced cells6 or to generate antisense RNAs that could destroy multiple target RNAs and thus reduce the need for an excess of the inhibitory RNA. The discovery that certain RNAs can perform catalysis7, 8 has led to the development of a class of therapeutic RNAs called trans-cleaving ribozymes. Such ribozymes bind substrate RNAs through base-pairing interactions, cleave the bound target RNA, release the cleavage products and are recycled so that they can repeat this process multiple times in vitro (Fig. 1a). The observation that such ribozymes can be repeatedly targeted to cleave virtually any pathogenic transcript in vitro9, 10 led to much speculation about their potential therapeutic value in vivo11, 12. Much progress has been made towards assessing the potential utility of trans-cleaving ribozymes, with the hammerhead and hairpin ribozymes13 being the main focus of this translational effort (see review in this issue by Doudna and Cech, pages 222–228).
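
The sequence-dependent base pairing by which antisense RNAs and the guide arms of trans-cleaving ribozymes recognize their targets can be illustrated with a short Python sketch. It is not drawn from any of the studies cited here: the function names `antisense_rna` and `find_binding_sites` and the example sequences are hypothetical, and the sketch assumes perfect Watson-Crick pairing only (G-U wobble pairs, RNA secondary structure and the ribozyme catalytic core are ignored). It therefore shows target recognition, not cleavage or inhibitory efficacy.

```python
# Watson-Crick pairing rules for RNA. Assumption of this sketch:
# G-U wobble pairs and any structural effects are ignored.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(target):
    """Return the antisense strand (written 5'->3') of a target RNA (written 5'->3').

    Raises KeyError if the sequence contains anything other than A, U, G or C.
    """
    return "".join(RNA_COMPLEMENT[base] for base in reversed(target.upper()))


def find_binding_sites(mrna, guide):
    """Return 0-based start positions in `mrna` that pair perfectly with `guide`."""
    site = antisense_rna(guide)  # stretch of mRNA complementary to the guide
    mrna = mrna.upper()
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)  # allow overlapping sites
    return positions


if __name__ == "__main__":
    # Hypothetical ten-nucleotide transcript and six-nucleotide guide.
    print(antisense_rna("AUGGCC"))                       # GGCCAU
    print(find_binding_sites("GGAUGGCCAA", "GGCCAU"))    # [2]
```

In this toy example the guide pairs with a single site beginning at position 2 of the transcript; a guide with no perfectly complementary stretch returns an empty list.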
Figure 1 Applications of trans-cleaving ribozymes for gene inhibition.

Several phase I and II clinical trials have been initiated using trans-cleaving ribozymes in a small number of patients with infectious diseases or cancer. In these studies the ribozymes have been delivered to the patients either by gene therapy methods or by direct injection of a synthetic ribozyme. The gene therapy-based trials have focused upon developing ribozyme-based treatments for individuals infected with the human immunodeficiency virus (HIV). Three separate groups have used retroviral vectors to introduce expression cassettes for anti-HIV ribozymes into CD4+ lymphocytes or CD34+ haematopoietic precursors ex vivo that have been taken from the infected patient or from an identical twin14-16 (Fig. 1b). The transduced cells are then infused into the patient and the engraftment and survival of the ribozyme-containing cells are monitored. Initial results from these studies suggest that transfer of ribozyme-encoding genes to HIV-infected individuals is well tolerated and transduced cells can persist in the patient15. Moreover, preliminary reports suggest that anti-HIV ribozyme-containing cells may possess a transient survival advantage in the patient compared with cells transduced with a control vector15. Unfortunately, such studies also indicate that gene transfer into long-term progenitors has not been accomplished because transduced cells decrease to below detection by one year after infusion (J. J. Rossi, personal communication). Larger clinical trials must now be performed to evaluate the efficacy of anti-HIV ribozymes. Critical factors that will influence the success of such trials will be the development of gene-transfer systems that can efficiently transduce pluripotent haematopoietic stem cells and the generation of improved ribozyme expression cassettes that can increase the survival advantage of transduced cells.
In this regard, encouraging preclinical studies suggest that co-localization of ribozymes with their viral target RNAs inside cells may enhance ribozyme activity in vivo17, 18, and use of combinations of inhibitory ribozymes and decoy RNAs may yield more potent inhibitors of HIV against which it will be difficult for the virus to develop resistance. Three different nuclease-resistant synthetic ribozymes are being evaluated in clinical trials12. Each of these trials uses a hammerhead ribozyme derivative that contains chemical modifications that greatly increase the ribozyme's stability in biological fluids19. Moreover, methods have been developed that enable large-scale synthesis of this new class of therapeutic agent under good manufacturing practice protocols20. All three of these synthetic ribozymes target RNAs whose expression is associated with the induction or progression of cancer and all three have shown promising results in preclinical cell and animal experiments21, 22. In 1998, the first of these ribozymes entered a phase I trial targeting flt-1 mRNA, which encodes the high-affinity receptor for the angiogenic protein vascular endothelial growth factor (VEGF). Results from this and two subsequent phase I trials show that daily intravenous or subcutaneous delivery of this compound is well tolerated and that plasma levels could be maintained for prolonged periods after subcutaneous delivery12. Currently its therapeutic efficacy is being evaluated in phase II trials for breast and colorectal cancer. The other two synthetic ribozymes to enter clinical trials target mRNA of human epidermal growth factor receptor type 2, which is overexpressed in many breast cancers, and hepatitis C virus (HCV) RNA, which is associated with liver cirrhosis and hepatocellular carcinoma. Results from these initial efficacy studies will provide the first significant insight into the long-term utility of trans-cleaving ribozymes as therapeutic agents. 
The critical factors that will most likely determine the success of these synthetic ribozyme efficacy trials are the ability to deliver ribozymes efficiently into the appropriate cells in vivo, and the level and duration of target-gene inhibition that is required to alter disease pathophysiology and slow disease progression. For ribozymes to benefit individuals with chronic disorders such as cancer and HCV (or HIV) infection, long-term, high-level inhibition of target transcripts will probably be required. This may be difficult to achieve in practice, especially when targeting highly expressed viral RNAs. For example, we observed that a hammerhead ribozyme was able to inhibit the replication of a murine retrovirus by up to 90% by co-localizing the ribozyme with its viral target inside cells, but we found that higher levels of inhibition were difficult to achieve even when a large excess of the ribozyme was present in the cells17. Thus, the utility of trans-cleaving ribozymes may be limited to conditions where a modest reduction in pathogenic gene expression will result in therapeutic efficacy or to combination therapies where the ribozyme can act in concert with other therapeutics (for example, antiviral or chemotherapeutic agents) that largely control the pathogen. Finally, two additional RNA-based strategies for gene inhibition in mammalian cells have recently been described. First, in many eukaryotic cells, expression of an mRNA can be inhibited by a double-stranded RNA that corresponds to the sequence of the targeted transcript. This process, known as RNA interference (RNAi; see review in this issue by Hannon, pages 244–251), has been shown to function in mammalian cells after introduction of short (~21 nucleotides), synthetic duplex RNAs23 or expression of similar-length transcripts by RNA polymerase III (Pol III) promoters24-27.
Using this Pol III expression approach, Lee and colleagues26 have shown that RNAi can inhibit HIV gene expression by up to 4 logs in co-transfection experiments. These and other results have raised much interest about the potential utility of the RNAi approach for therapeutic applications. Second, in contrast to trans-cleaving ribozymes and RNAi, which target pathogenic RNA for destruction, mobile group II introns can be targeted to insert into and inactivate pathogenic DNA. The obvious advantage of targeting DNA rather than RNA is that such disruption would be a one-time event that would effectively knock out 100% of the RNAs issuing from the disrupted gene. Lambowitz and colleagues have shown28 that a mobile group II intron can be retargeted to insert specifically into HIV pro-viral DNA or the HIV co-receptor gene CCR5. Gene disruption was efficient in bacteria, where the HIV pol gene could be disrupted in over 60% of the cells. Group II mobilization into extra-chromosomal copies of the HIV pol and CCR5 genes in mammalian cells has been shown to be possible28, but mobilization into genomic DNA has yet to be reported. Translational researchers must now begin to determine whether these two gene-inhibition strategies can be utilized for therapeutic applications. Information gleaned from the ongoing clinical trials evaluating trans-cleaving ribozymes will undoubtedly facilitate the clinical development of this next generation of RNA-based gene inhibitors.

RNA-mediated repair of genetic instructions

Genetic instructions are usually revised as they are converted from DNA to RNA to protein in human cells. Most of this revision occurs at the RNA level when splicing removes intron sequences from precursor transcripts and ligates together flanking exon sequences to generate mature RNAs. Interestingly, many of the molecules involved in revising RNA messages are RNAs themselves.
Several recent studies exploit this facet of RNA biology and describe the development of an intriguing new class of therapeutic RNAs that can perform trans-splicing to repair clinically relevant mutant transcripts. The concept of RNA repair has received much attention as a novel approach to gene therapy. Compared with more traditional strategies, RNA repair should engender the regulated expression of the corrected gene product and simultaneously reduce the expression of the mutant gene product. This property makes RNA repair an attractive strategy to attempt in the treatment of genetic disorders associated with mutant genes that are highly regulated or that encode deleterious mutant proteins. The initial studies focused on RNA repair used a trans-splicing version of a group I ribozyme to repair mutant lacZ transcripts in bacteria29 and mammalian cells30. These studies showed that the ribozyme was able to repair the mutant RNA by recognizing the target transcript by base pairing with it, cleaving off mutant sequences and ligating a wild-type sequence onto the cleavage product (Fig. 2a). More recently, trans-splicing ribozymes have been generated that can amend mutant transcripts associated with myotonic dystrophy31 and many cancers (mutant p53 tumour-suppressor transcripts)32 in mammalian cell lines, and with sickle cell anaemia in erythrocyte precursors isolated from patients with sickle cell disease33.

Figure 2 Trans-splicing-mediated repair of mutant transcripts.

A second, related approach to RNA repair uses spliceosomes to revise mutant transcripts by trans-splicing34. In this case the spliceosome performs the trans-splicing reaction upon a pre-trans-splicing RNA molecule (termed a PTM) and a target RNA (Fig. 2b). An expression cassette for the PTM is delivered to the cell, whereas the spliceosome and target RNA are supplied by the cell. Puttaraju et al.
first demonstrated that spliceosome-mediated RNA trans-splicing (SMaRT) could be used to reprogram human chorionic gonadotropin β-polypeptide mRNAs and to repair mutant lacZ transcripts in cell culture35. Subsequently, encouraging results have shown that the SMaRT approach can repair a clinically relevant fraction of a mutant cystic fibrosis transmembrane conductance regulator (CFTR) transcript (CFTR ΔF508) in human cystic fibrosis airway-epithelial cells grown in culture or in animal xenografts36. Such repair resulted in partial restoration of Cl- transport in the CFTR-deficient cells to 12–15% of the level observed in wild-type CFTR-containing cells. A critical question requiring further study for both ribozyme- and spliceosome-mediated repair of mutant RNAs is the specificity of RNA revision. The specificity of spliceosomal trans-splicing seems to be low, at least in certain instances, with the spliceosome trans-splicing the PTM sequence onto many unintended target RNAs in mammalian cells37. Initial studies suggested that the specificity of ribozyme-mediated repair may also be low30; but more recent investigations have described modifications to the ribozyme to address the issue of target specificity directly32, 38-40. Definitive studies are now warranted to address this question directly and to facilitate the development of more specific trans-splicing agents if needed. Even though some issues concerning RNA repair remain unresolved, the preclinical studies performed so far on therapeutic applications of trans-splicing are encouraging. They demonstrate that trans-splicing can amend mutant transcripts associated with a variety of human diseases and repair target RNAs with efficiencies that would be expected to be clinically beneficial, at least for treating many recessive disorders.

RNA as a protein antagonist

Many small RNAs can fold into three-dimensional structures that allow them to bind target proteins with high affinity and specificity.
Several RNA viruses such as HIV use this property of RNA to recruit viral and host proteins to perform essential functions in viral replication. For example, HIV uses small structured RNA elements termed the transactivation response region (TAR) and Rev response element (RRE) to recruit the viral regulatory proteins Tat and Rev to control viral gene expression. The use of small structured RNAs to directly bind and inhibit the activity of a pathogenic protein was first explored using the HIV TAR sequence (Fig. 3a). Expression of TAR 'decoy' RNAs in CD4+ T cells was shown to competitively inhibit Tat binding to the viral TAR RNA and render cells highly resistant to HIV replication (Fig. 3b)41. Subsequent studies demonstrated that RRE also could act as a decoy RNA to block Rev activity and inhibit HIV replication42.

Figure 3 RNA ligand-mediated inhibition of protein function.

These observations suggested that short RNA decoys might be useful therapeutic agents for inhibiting HIV replication in vivo and have led to a phase I gene therapy-based clinical trial designed to test safety and feasibility. In this trial, retroviral vectors were used to introduce expression cassettes for RRE decoys into haematopoietic progenitors that had been isolated from, and re-infused into, HIV-infected paediatric patients43 (Fig. 1b). This initial study showed that gene transfer and expression of RRE decoys was well tolerated, but as with other HIV gene-therapy trials, the results emphasize the need for improved gene-transfer techniques to transduce pluripotent haematopoietic progenitors43. The observation that TAR and RRE decoy RNAs could be used to competitively inhibit viral protein function and replication suggested that other small structured RNA molecules might be able to bind pathogenic target proteins and inhibit their activity.
Concurrent with this observation, work from two groups suggested that iterative in vitro selection methods could be used to isolate high-affinity RNA ligands from large pools of randomized RNA sequences (vast RNA shape libraries) that could bind to proteins and small molecules44, 45. The resulting RNA ligands were termed aptamers by Ellington and Szostak45 and the selection process was named SELEX (systematic evolution of ligands by exponential enrichment) by Tuerk and Gold44 (Fig. 3c). SELEX has now been used to identify RNA aptamers that can bind to and inhibit the activity of a wide variety of proteins (for reviews, see refs 46, 47). The affinities of aptamers for their targets are similar to the affinities achieved by monoclonal antibodies for their antigens (Kd values typically in the low nanomolar to high picomolar range)46. In contrast to antibodies, however, aptamers can be chemically synthesized to produce large quantities of these compounds for in vivo experimentation and clinical trials. Moreover, as with trans-cleaving ribozymes, synthetic aptamers can be modified to have greatly enhanced plasma stability and circulating half-lives and seem to exhibit low toxicity and immunogenicity in vivo47-49. So far only a few aptamers have been evaluated in animal models of disease (reviewed in refs 47, 50) and two of these have received the most attention. A DNA aptamer to thrombin functions as a potent anticoagulant that is able to maintain the patency of an extracorporeal circuit in sheep and replace heparin in a canine cardiopulmonary bypass model51-53. A 2'-fluoro-modified RNA aptamer to VEGF-165 is able to inhibit neovascularization in a rat corneal-pocket angiogenesis assay and block glomerular endothelial cell proliferation and apoptosis in rats with mesangioproliferative nephritis49, 54.
This VEGF aptamer is also the first therapeutic aptamer to be administered to humans and is currently being evaluated in a phase II/III clinical trial in patients with age-related macular degeneration. In this instance, the synthetic aptamer is being injected directly into the vitreous humour of the eye to assess treatment safety and efficacy. Factors that are likely to influence the success of the trial include the retention time of the aptamer in the vitreous humour and the relative importance of VEGF-165 in the progression of age-related macular degeneration in these patients. The observation that the VEGF aptamer remains active and can be recovered from the vitreous humour of rhesus monkeys 28 days after injection55 is encouraging with regard to prospects for long-term ocular retention of the compound. Because most existing drugs are protein antagonists, the ultimate success of aptamers as therapeutic agents will probably depend on how well they compete with other classes of therapeutic compounds. In particular, the use of monoclonal antibodies has garnered increasing interest from both physicians and the pharmaceutical industry. Monoclonal antibodies are being generated and tested against many of the same proteins and for similar indications as most of the aptamers that are in pre-clinical development. Moreover, several antibodies have already progressed to market and are being used for the treatment of a variety of disorders. Therefore, in this competitive landscape, properties of aptamers that distinguish them from monoclonal antibodies (and other protein inhibitors) will have to be exploited for aptamers to penetrate the therapeutic market and ultimately fulfil their potential and become broadly useful pharmaceutical agents.
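
The iterative selection (SELEX) described earlier in this section can be illustrated with a small numerical sketch. It is not the procedure of refs 44, 45, only a toy model of why repeated rounds of binding and amplification enrich rare high-affinity ligands. It assumes simple 1:1 equilibrium binding, a fixed free target concentration, lossless recovery of the bound fraction and unbiased amplification. The species names, Kd values and starting abundances are hypothetical.

```python
def bound_fraction(kd_nM, target_nM):
    """Equilibrium fraction of a ligand that is bound, for simple 1:1 binding.

    Assumes the free target concentration is fixed at target_nM.
    """
    return target_nM / (target_nM + kd_nM)


def selex_round(pool, kds, target_nM):
    """One idealized cycle: keep the bound fraction, then amplify.

    pool maps species name -> fraction of the pool (fractions sum to 1).
    Amplification is modelled as renormalizing the retained material to 1.
    """
    retained = {
        name: frac * bound_fraction(kds[name], target_nM)
        for name, frac in pool.items()
    }
    total = sum(retained.values())
    return {name: amount / total for name, amount in retained.items()}


def run_selex(pool, kds, target_nM, rounds):
    """Return the pool composition before selection and after each round."""
    history = [dict(pool)]
    for _ in range(rounds):
        pool = selex_round(pool, kds, target_nM)
        history.append(pool)
    return history


# Hypothetical library: a tight binder present at one part per million.
kds = {"tight": 1.0, "medium": 100.0, "weak": 10000.0}  # Kd in nM (illustrative)
start = {"tight": 1e-6, "medium": 1e-3, "weak": 1.0 - 1e-6 - 1e-3}

history = run_selex(start, kds, target_nM=1.0, rounds=6)
for round_number, composition in enumerate(history):
    print(round_number, {name: f"{frac:.3g}" for name, frac in composition.items()})
```

In this toy model the share of each species is multiplied every round by its bound fraction, so the advantage of the tightest binder compounds geometrically. Real selections are noisier: recovery is incomplete, amplification can be biased, and the stringency is usually changed between rounds.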
Immunotherapy using mRNA-transfected dendritic cells

Specific active immunotherapy of cancer (stimulating the patient's immune system to recognize and eliminate tumour cells) is emerging as a promising modality for treating cancer recurrence and low-volume metastatic disease. There is considerable evidence that the cytotoxic T lymphocyte (CTL) arm of the immune response is crucial in controlling tumour growth. CTLs recognize short (8–10 amino acids) antigenic peptides in association with major histocompatibility complex (MHC) molecules displayed on the cell surface. The peptides are generated in the cytoplasm via the proteolytic action of the multi-unit proteolytic complex, the proteasome. They are shuttled from the cytoplasm to the endoplasmic reticulum, where they associate with nascent MHC molecules, and are then transported to the cell surface for recognition by CTLs56. Naive CTLs generated in the thymus undergo an activation process to acquire the ability to kill their targets or secrete so-called 'effector' cytokines such as interferon-γ. Bone marrow-derived dendritic cells (DCs) displaying the appropriate MHC–peptide complex on the cell surface are the primary cell type capable of activating naive CTLs57. Once activated, the CTL can recognize and kill any somatic cell presenting the MHC–peptide complex. Thus, to stimulate a CTL response against the tumour, the tumour antigens have to reach the cytoplasm of a DC. In the cancer patient the capture of tumour antigens by DCs and stimulation of tumour-specific CTLs is presumed to be inefficient and a limiting factor in stimulating protective immunity. One approach to stimulate effective CTL responses in cancer patients would be to reconstruct this process in vitro, that is, to isolate DCs from the patient, load them with tumour antigens in such a manner that the antigens reach the cytoplasm, and inject the antigen-loaded cells back into the patient.
This approach has been accomplished by incubating DCs with peptides and proteins, or by transfecting the cells with DNA constructs. Transfecting DCs with mRNA encoding (tumour) antigens is yet another way to load DCs with antigens58. mRNA can be isolated directly from tumour cells or synthesized in vitro from complementary DNA templates. DCs transfected with mRNA encoding specific antigens or with total tumour-derived RNA elicited potent CTL responses and tumour immunity in mice58-64, and DCs generated from healthy volunteers or from cancer patients transfected with tumour RNA stimulated CTL responses in culture62, 64-74. Initial studies have used cationic lipids to facilitate the uptake of RNA by DCs58, 63. Remarkably, incubation of DCs with RNA alone was also sufficient to sensitize them to stimulate CTLs70-72, 74. This is despite the low efficiency of RNA transfer (measured by gene expression), and no doubt reflects the sensitivity of the immune system to recognize minute amounts of antigens not detectable by conventional means. Recently an improved method for DC transfection with RNA was developed using electroporation, which rivals the best transfection methods for nucleic acids66, 69. Just how efficient is the mRNA transfection protocol? Using functional end points such as CTL priming or induction of tumour immunity in mice, studies comparing several loading techniques (including loading DCs with peptides and proteins, transfection with cDNA plasmids or transduction with vaccinia vectors) found that mRNA loading was invariably superior65, 66, 68, 69. Additionally, it is often not fully appreciated that the generation of mRNA encoding specific antigens is a simple process. Given the sequence, a cDNA template can be generated from the cell by reverse transcription followed by amplification using the polymerase chain reaction and transcribed into RNA in a matter of a few hours.
This can be compared to the task of generating the corresponding protein or identifying the antigenic peptides. Use of mRNA-encoded antigens for cancer vaccination offers another potentially useful benefit. Only a handful of tumour antigens have been discovered so far, mostly from melanoma patients, and there is growing evidence that many of these antigens are not well suited for vaccination75, 76. The alternative option is to vaccinate with tumour-derived antigenic mixtures, an approach which animal studies suggest is surprisingly effective. In reality, however, it is not possible to obtain sufficient tumour tissue from most cancer patients to generate the amount of antigens thought necessary for an effective vaccination protocol. Consequently, a significant proportion (perhaps most) of cancer patients would not benefit from current vaccination strategies, regardless of how effective they might be. Use of mRNA as a source of antigen offers a solution that has been shown to work62, 71. Biologically active RNA can be amplified from microscopic amounts of tumour tissue (for instance, from frozen sections or from needle biopsies) to provide a virtually inexhaustible amount of antigen from practically every patient. Perhaps no less important is that use of mRNA-encoded antigens can address the concern associated with the emergence of treatment-resistant tumour variants that have lost the antigens targeted by vaccination, a common occurrence in cancer therapy. Using amplification protocols, sufficient RNA can be readily generated from a microscopic amount of a newly emerging antigen-loss tumour variant, before it is too late. In summary, the preclinical experience suggests that cancer vaccination with tumour RNA-transfected DCs may constitute a highly effective and broadly applicable treatment for patients with recurring cancer. Whether such observations can be translated to the clinic remains to be seen, but hints from initial clinical trials are not discouraging.
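
The statement that amplification can turn a microscopic sample into a practically unlimited supply of antigen rests on exponential growth. As a back-of-the-envelope sketch (not the protocol used in refs 62, 71), assume that each amplification cycle multiplies the template by a constant factor of (1 + efficiency), with an efficiency of 1.0 meaning perfect doubling. The function names, the starting copy number and the target yield below are illustrative assumptions only.

```python
import math


def copies_after(cycles, start_copies, efficiency=1.0):
    """Template copies after a number of cycles of idealized exponential amplification."""
    return start_copies * (1.0 + efficiency) ** cycles


def cycles_needed(start_copies, target_copies, efficiency=1.0):
    """Smallest whole number of cycles that reaches target_copies.

    efficiency is the fractional gain per cycle (0 < efficiency <= 1).
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    if start_copies <= 0 or target_copies <= 0:
        raise ValueError("copy numbers must be positive")
    if target_copies <= start_copies:
        return 0
    fold = target_copies / start_copies
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


# Hypothetical example: 1,000 starting templates, a billion-fold increase wanted.
start, target = 1_000, 1e12
for eff in (1.0, 0.8):
    n = cycles_needed(start, target, eff)
    print(f"efficiency {eff:.1f}: {n} cycles -> {copies_after(n, start, eff):.2e} copies")
```

Because the yield grows geometrically, a modest number of additional cycles compensates for a very small starting sample. In practice reagents become limiting and the reaction plateaus, so this model describes only the early, exponential phase.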
The primary drawback of DC-based vaccination is that it is a customized form of cell therapy; DCs (and tumour RNA) are generated from each patient and require in vitro manipulation of the cells before re-injection into the patient (Fig. 4). This adds cost and complexity to the treatment and remains a challenge owing to the yet unproven nature of this new paradigm in human therapy. The expectation, which we believe is well founded, is that improved outcome to otherwise intractable diseases will offset the added cost and complexity associated with this form of therapy. A phase I clinical trial in patients with metastatic prostate cancer vaccinated with prostate-specific antigen (PSA) RNA-transfected DCs (PSA is a common tumour-associated antigen in patients with prostate cancer) was recently concluded77. The DC therapy seems to be safe and, despite the advanced state of the disease, all patients have responded immunologically to the vaccine, that is, they have all exhibited PSA-specific T-cell responses. Surprisingly, six out of seven available patients exhibited a (very) modest clinically related response, a small but statistically significant impact on the blood PSA levels. This modest effect is unlikely to translate to clinical benefit to the patients, but suggests that the approach of vaccination with RNA-loaded DCs deserves further consideration. A second trial in renal cancer patients vaccinated with tumour RNA-transfected DCs was recently completed and is being evaluated.

Figure 4 Treatment of cancer patients with tumour RNA-transfected dendritic cells (DCs).

Perspectives

In the past five years, a number of clinical trials have been initiated to begin to evaluate the safety and efficacy of a variety of innovative RNA-based therapeutic strategies. RNA therapeutics that can inhibit gene expression, block protein function or induce potent immune (CTL) responses have all entered clinical trials.
Moreover, the RNA therapeutic developmental pipeline is burgeoning and the next generation of RNA-based therapies is quickly making its way through pre-clinical studies. The ultimate clinical utility of any of these treatment modalities is obviously still unclear. Nevertheless, the impressive diversity of therapeutic RNAs that are being developed and the breadth of clinical indications that one can foresee treating with this new class of therapeutic agents are remarkable. Just as nature has evolved elegant strategies that utilize RNA molecules to perform a variety of essential biological functions for the initiation and maintenance of life, we are confident that clinical researchers will develop innovative strategies that harness the utility of RNA molecules for the protection and improvement of human life.

References

1. Green, P. J., Pines, O. & Inouye, M. The role of antisense RNA in gene regulation. Annu. Rev. Biochem. 55, 569-597 (1986).
2. Pestka, S., Daugherty, B. L., Jung, V., Hotta, K. & Pestka, R. K. Anti-mRNA: specific inhibition of translation of single mRNA molecules. Proc. Natl Acad. Sci. USA 81, 7525-7528 (1984).
3. Coleman, J., Green, P. J. & Inouye, M. The use of RNAs complementary to specific mRNAs to regulate the expression of individual bacterial genes. Cell 37, 429-436 (1984).
4. Izant, J. G. & Weintraub, H. Constitutive and conditional suppression of exogenous and endogenous genes by anti-sense RNA. Science 229, 345-352 (1985).
5. van der Krol, A. R., Mol, J. N. & Stuitje, A. R. Modulation of eukaryotic gene expression by complementary RNA or DNA sequences. Biotechniques 6, 958-976 (1988).
6. Sullenger, B. A., Lee, T. C., Smith, C. A., Ungers, G. E. & Gilboa, E. Expression of chimeric tRNA-driven antisense transcripts renders NIH 3T3 cells highly resistant to Moloney murine leukemia virus replication. Mol. Cell. Biol. 10, 6512-6523 (1990).
7.
Kruger, K. et al. Self-splicing RNA: autoexcision and autocyclization of the ribosomal RNA intervening sequence of Tetrahymena. Cell 31, 147-157 (1982).
8. Guerrier-Takada, C., Gardiner, K., Marsh, T., Pace, N. & Altman, S. The RNA moiety of ribonuclease P is the catalytic subunit of the enzyme. Cell 35, 849-857 (1983).
9. Uhlenbeck, O. C. A small catalytic oligoribonucleotide. Nature 328, 596-600 (1987).
10. Haseloff, J. & Gerlach, W. L. Simple RNA enzymes with new and highly specific endoribonuclease activities. Nature 334, 585-591 (1988).
11. Cech, T. R. Ribozymes and their medical implications. J. Am. Med. Assoc. 260, 3030-3034 (1988).
12. Usman, N. & Blatt, L. M. Nuclease-resistant synthetic ribozymes: developing a new class of therapeutics. J. Clin. Invest. 106, 1197-1202 (2000).
13. Symons, R. H. Small catalytic RNAs. Annu. Rev. Biochem. 61, 641-671 (1992).
14. Bauer, G. et al. Inhibition of human immunodeficiency virus-1 (HIV-1) replication after transduction of granulocyte colony-stimulating factor-mobilized CD34+ cells from HIV-1-infected donors using retroviral vectors containing anti-HIV-1 genes. Blood 89, 2259-2267 (1997).
15. Wong-Staal, F., Poeschla, E. M. & Looney, D. J. A controlled, Phase 1 clinical trial to evaluate the safety and effects in HIV-1 infected humans of autologous lymphocytes transduced with a ribozyme that cleaves HIV-1 RNA. Hum. Gene Ther. 9, 2407-2425 (1998).
16. Amado, R. G. et al. A phase I trial of autologous CD34+ hematopoietic progenitor cells transduced with an anti-HIV ribozyme. Hum. Gene Ther. 10, 2255-2270 (1999).
17. Sullenger, B. A. & Cech, T. R. Tethering ribozymes to a retroviral packaging signal for destruction of viral RNA. Science 262, 1566-1569 (1993).
18. Lee, N. S., Bertrand, E. & Rossi, J.
mRNA localization signals can enhance the intracellular effectiveness of hammerhead ribozymes. RNA 5, 1200-1209 (1999).
19. Beigelman, L. et al. Chemical modification of hammerhead ribozymes. Catalytic activity and nuclease resistance. J. Biol. Chem. 270, 25702-25708 (1995).
20. Wincott, F. et al. Synthesis, deprotection, analysis and purification of RNA and ribozymes. Nucleic Acids Res. 23, 2677-2684 (1995).
21. Pavco, P. A. et al. Antitumor and antimetastatic activity of ribozymes targeting the messenger RNA of vascular endothelial growth factor receptors. Clin. Cancer Res. 6, 2094-2103 (2000).
22. Macejak, D. G. et al. Inhibition of hepatitis C virus (HCV)-RNA-dependent translation and replication of a chimeric HCV poliovirus using synthetic stabilized ribozymes. Hepatology 31, 769-776 (2000).
23. Elbashir, S. M. et al. Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells. Nature 411, 494-498 (2001).
24. Sui, G. et al. A DNA vector-based RNAi technology to suppress gene expression in mammalian cells. Proc. Natl Acad. Sci. USA 99, 5515-5520 (2002).
25. Miyagishi, M. & Taira, K. U6 promoter-driven siRNA with four uridine 3' overhangs efficiently suppress targeted gene expression in mammalian cells. Nature Biotechnol. 20, 497-500 (2002).
26. Lee, N. S. et al. Expression of small interfering RNAs targeted against HIV-1 rev transcripts in human cells. Nature Biotechnol. 20, 500-505 (2002).
27. Paul, C. P. et al. Effective expression of small interfering RNA in human cells. Nature Biotechnol. 20, 505-508 (2002).
28. Guo, H. et al. Group II introns designed to insert into therapeutically relevant DNA target sites in human cells. Science 289, 452-457 (2000).
29. Sullenger, B. A. & Cech, T. R.
Ribozyme-mediated repair of defective mRNA by targeted trans-splicing. Nature 371, 619-622 (1994).
30. Jones, J. T., Lee, S. W. & Sullenger, B. A. Tagging ribozyme reaction sites to follow trans-splicing in mammalian cells. Nature Med. 2, 643-648 (1996).
31. Phylactou, L. A., Darrah, C. & Wood, M. J. Ribozyme-mediated trans-splicing of a trinucleotide repeat. Nature Genet. 18, 378-381 (1998).
32. Watanabe, T. & Sullenger, B. A. Induction of wild-type p53 activity in human cancer cells by ribozymes that repair mutant p53 transcripts. Proc. Natl Acad. Sci. USA 97, 8490-8494 (2000).
33. Lan, N., Howrey, R. P., Lee, S. W., Smith, C. A. & Sullenger, B. A. Ribozyme-mediated repair of sickle beta-globin mRNAs in erythrocyte precursors. Science 280, 1593-1596 (1998).
34. Puttaraju, M., Jamison, S. F., Mansfield, S. G., Garcia-Blanco, M. A. & Mitchell, L. G. Spliceosome-mediated RNA trans-splicing as a tool for gene therapy. Nature Biotechnol. 17, 246-252 (1999).
35. Puttaraju, M., DiPasquale, J., Baker, C. C., Mitchell, L. G. & Garcia-Blanco, M. A. Messenger RNA repair and restoration of protein function by spliceosome-mediated RNA trans-splicing. Mol. Ther. 4, 105-114 (2001).
36. Liu, X. et al. Partial correction of endogenous ΔF508 CFTR in human cystic fibrosis airway epithelia by spliceosome-mediated RNA trans-splicing. Nature Biotechnol. 20, 47-52 (2002).
37. Kikumori, T., Cote, G. J. & Gagel, R. F. Promiscuity of pre-mRNA spliceosome-mediated trans splicing: a problem for gene therapy? Hum. Gene Ther. 12, 1429-1441 (2001).
38. Kohler, U., Ayre, B. G., Goodman, H. M. & Haseloff, J. Trans-splicing ribozymes for targeted gene delivery. J. Mol. Biol. 285, 1935-1950 (1999).
39. Ayre, B. G., Kohler, U., Goodman, H. M. & Haseloff, J.
Design of highly specific cytotoxins by using trans-splicing ribozymes. Proc. Natl Acad. Sci. USA 96, 3507-3512 (1999).
40. Zarrinkar, P. P. & Sullenger, B. A. Optimizing the substrate specificity of a group I intron ribozyme. Biochemistry 38, 3426-3432 (1999).
41. Sullenger, B. A., Gallardo, H. F., Ungers, G. E. & Gilboa, E. Overexpression of TAR sequences renders cells resistant to human immunodeficiency virus replication. Cell 63, 601-608 (1990).
42. Lee, T. C., Sullenger, B. A., Gallardo, H. F., Ungers, G. E. & Gilboa, E. Overexpression of RRE-derived sequences inhibits HIV-1 replication in CEM cells. New Biol. 4, 66-74 (1992).
43. Kohn, D. B. et al. A clinical trial of retroviral-mediated transfer of a rev-responsive element decoy gene into CD34+ cells from the bone marrow of human immunodeficiency virus-1-infected children. Blood 94, 368-371 (1999).
44. Tuerk, C. & Gold, L. Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 249, 505-510 (1990).
45. Ellington, A. D. & Szostak, J. W. In vitro selection of RNA molecules that bind specific ligands. Nature 346, 818-822 (1990).
46. Gold, L., Polisky, B., Uhlenbeck, O. & Yarus, M. Diversity of oligonucleotide functions. Annu. Rev. Biochem. 64, 763-797 (1995).
47. White, R. R., Sullenger, B. A. & Rusconi, C. P. Developing aptamers into therapeutics. J. Clin. Invest. 106, 929-934 (2000).
48. Jellinek, D. et al. Potent 2'-amino-2'-deoxypyrimidine RNA inhibitors of basic fibroblast growth factor. Biochemistry 34, 11363-11372 (1995).
49. Ruckman, J. et al. 2'-Fluoropyrimidine RNA-based aptamers to the 165-amino acid form of vascular endothelial growth factor (VEGF165).
Inhibition of receptor binding and VEGF-induced vascular permeability through interactions requiring the exon 7-encoded domain. J. Biol. Chem. 273, 20556-20567 (1998).
50. Hicke, B. J. & Stephens, A. W. Escort aptamers: a delivery service for diagnosis and therapy. J. Clin. Invest. 106, 923-928 (2000).
51. Bock, L. C., Griffin, L. C., Latham, J. A., Vermaas, E. H. & Toole, J. J. Selection of single-stranded DNA molecules that bind and inhibit human thrombin. Nature 355, 564-566 (1992).
52. Griffin, L. C., Tidmarsh, G. F., Bock, L. C., Toole, J. J. & Leung, L. L. In vivo anticoagulant properties of a novel nucleotide-based thrombin inhibitor and demonstration of regional anticoagulation in extracorporeal circuits. Blood 81, 3271-3276 (1993).
53. DeAnda, A. Jr et al. Pilot study of the efficacy of a thrombin inhibitor for use during cardiopulmonary bypass. Ann. Thoracic Surg. 58, 344-350 (1994).
54. Ostendorf, T. et al. Specific antagonism of PDGF prevents renal scarring in experimental glomerulonephritis. J. Am. Soc. Nephrol. 12, 909-918 (2001).
55. Drolet, D. W. et al. Pharmacokinetics and safety of an anti-vascular endothelial growth factor aptamer (NX1838) following injection into the vitreous humor of rhesus monkeys. Pharm. Res. 17, 1503-1510 (2000).
56. Yewdell, J. W., Norbury, C. C. & Bennink, J. R. Mechanisms of exogenous antigen presentation by MHC class I molecules in vitro and in vivo: implications for generating CD8+ T cell responses to infectious agents, tumors, transplants, and vaccines. Adv. Immunol. 73, 1-77 (1999).
57. Banchereau, J. & Steinman, R. M. Dendritic cells and the control of immunity. Nature 392, 245-252 (1998).
58. Boczkowski, D., Nair, S. K., Snyder, D. & Gilboa, E. Dendritic cells pulsed with RNA are potent antigen-presenting cells in vitro and in vivo. J. Exp. Med. 184, 465-472 (1996).
59. Koido, S.
et al. Induction of antitumor immunity by vaccination of dendritic cells transfected with MUC1 RNA. J. Immunol. 165, 5713-5719 (2000).
60. Granstein, R. D., Ding, W. & Ozawa, H. Induction of anti-tumor immunity with epidermal cells pulsed with tumor-derived RNA or intradermal administration of RNA. J. Invest. Dermatol. 114, 632-636 (2000).
61. Zhang, W. et al. Enhanced therapeutic efficacy of tumor RNA-pulsed dendritic cells after genetic modification with lymphotactin. Hum. Gene Ther. 10, 1151-1161 (1999).
62. Boczkowski, D., Nair, S. K., Nam, J. H., Lyerly, H. K. & Gilboa, E. Induction of tumor immunity and cytotoxic T lymphocyte responses using dendritic cells transfected with messenger RNA amplified from tumor cells. Cancer Res. 60, 1028-1034 (2000).
63. Ashley, D. M. et al. Bone marrow-generated dendritic cells pulsed with tumor extracts or tumor RNA induce antitumor immunity against central nervous system tumors. J. Exp. Med. 186, 1177-1182 (1997).
64. Nair, S. K. et al. Induction of cytotoxic T cell responses and tumor immunity against unrelated tumors using telomerase reverse transcriptase RNA transfected dendritic cells. Nature Med. 6, 1011-1017 (2000).
65. Weissman, D. et al. HIV gag mRNA transfection of dendritic cells (DC) delivers encoded antigen to MHC class I and II molecules, causes DC maturation, and induces a potent human in vitro primary immune response. J. Immunol. 165, 4710-4717 (2000).
66. Van Tendeloo, V. et al. Highly efficient gene delivery by mRNA electroporation in human hematopoietic cells: superiority to lipofection and passive pulsing of mRNA and to electroporation of plasmid cDNA for tumor antigen loading of dendritic cells. Blood 98, 49-56 (2001).
67. Su, Z., Peluso, M. V., Raffegerst, S. H., Schendel, D. J. & Roskrow, M. A.
The generation of LMP2a-specific cytotoxic T lymphocytes for the treatment of patients with Epstein-Barr virus-positive Hodgkin disease. Eur. J. Immunol. 31, 947-958 (2001).
68. Strobel, I. et al. Human dendritic cells transfected with either RNA or DNA encoding influenza matrix protein M1 differ in their ability to stimulate cytotoxic T lymphocytes. Gene Ther. 7, 2028-2035 (2000).
69. Saeboe-Larssen, S., Fossberg, E. & Gaudernack, G. mRNA-based electrotransfection of human dendritic cells and induction of cytotoxic T lymphocyte responses against the telomerase catalytic subunit (hTERT). J. Immunol. Meth. 259, 191-203 (2002).
70. Heiser, A. et al. Human dendritic cells transfected with renal tumor RNA stimulate polyclonal T-cell responses against antigens expressed by primary and metastatic tumors. Cancer Res. 61, 3388-3393 (2001).
71. Heiser, A. et al. Induction of polyclonal prostate cancer-specific CTL using dendritic cells transfected with amplified tumor RNA. J. Immunol. 166, 2953-2960 (2001).
72. Heiser, A. et al. Human dendritic cells transfected with RNA encoding prostate-specific antigen stimulate prostate-specific CTL responses in vitro. J. Immunol. 164, 5508-5514 (2000).
73. Thornburg, C., Boczkowski, D., Gilboa, E. & Nair, S. K. Induction of cytotoxic T lymphocytes with dendritic cells transfected with human papillomavirus E6 and E7 RNA: implications for cervical cancer immunotherapy. J. Immunother. 23, 412-418 (2000).
74. Nair, S. K. et al. Induction of primary carcinoembryonic antigen (CEA)-specific cytotoxic T lymphocytes in vitro using human dendritic cells transfected with RNA. Nature Biotechnol. 16, 364-369 (1998).
75. Srivastava, P. K. Do human cancers express shared protective antigens? Or the necessity of remembrance of things past. Semin. Immunol. 8, 295-302 (1996).
76. Gilboa, E.
The makings of a tumor rejection antigen. Immunity 11, 263-270 (1999).
77. Heiser, A. et al. Autologous dendritic cells transfected with prostate-specific antigen RNA stimulate CTL responses against metastatic prostate tumors. J. Clin. Invest. 109, 409-417 (2002).

Figure 1 Applications of trans-cleaving ribozymes for gene inhibition. a, Trans-cleaving ribozymes can bind pathogenic target RNAs through base-pairing interactions, cleave the target, release the reaction products and repeat this process with multiple turnover. b, Clinical trials of HIV gene therapy with trans-cleaving ribozymes and decoy RNAs. Haematopoietic cells that express CD4 or CD34 on their surface are isolated from the patient and transduced with retroviral vectors containing an HIV inhibitory gene (ribozyme or decoy; green virus) or a control gene (red virus). Cells transduced with the vectors are re-introduced into the patient and the relative engraftment and survival of cells containing the HIV inhibitory gene (green) and control vector (red) are determined.

Figure 2 Trans-splicing-mediated repair of mutant transcripts. a, Ribozyme-mediated repair. Trans-splicing ribozymes recognize mutant RNAs upstream of a mutation site (Xm). The mutant RNA is cleaved and an exon with a wild-type sequence (Xwt) is ligated onto the cleavage product to generate a corrected transcript. b, Spliceosome-mediated RNA trans-splicing repair of a mutant transcript. A target RNA containing a mutation in its third exon (Xm) normally undergoes cis-splicing to generate an mRNA that encodes a defective gene product. A pre-trans-splicing RNA molecule (PTM, red) with a wild-type exon 3 (Xwt) can impede the cis-splicing process and engender trans-splicing to yield an mRNA with a correct sequence.

Figure 3 RNA ligand-mediated inhibition of protein function. a, Trans-activation response region (TAR) decoy-mediated inhibition of HIV.
During HIV replication, the viral Tat protein (blue) binds the viral TAR RNA (vTAR; green) and trans-activates viral gene expression and replication (top). TAR decoy RNAs (red) can compete for Tat binding and competitively inhibit the Tat–vTAR interaction and stop viral trans-activation and replication (bottom). b, Expression of TAR decoys can render cells resistant to HIV. TAR decoy- and control vector-containing CD4+ T cells were challenged with HIV-1 and viral spread through the cultures was monitored by immunofluorescent staining of cells (yellow cells) at various days following infection. Adapted with permission from ref. 41. c, Isolation of RNA aptamers that bind target proteins using the SELEX (systematic evolution of ligands by exponential enrichment) process. After several rounds of selection, the RNAs remaining in the selected pool are cloned and sequenced to identify the high-affinity RNA aptamers for the target protein of interest.

Figure 4 Treatment of cancer patients with tumour RNA-transfected dendritic cells (DCs). In an outpatient setting, a tumour sample is removed from the patient and used to generate, and if necessary amplify, tumour RNA. Blood cells are obtained from the patient by leukapheresis and immature DCs are generated by culturing monocytes for 5–6 days in the presence of cytokines. Immature DCs are transfected with RNA and cultured another 24 hours in the presence of additional cytokines to mature. The antigen-loaded DCs can be cryopreserved for subsequent infusions into the patient.

21 September 2000
Nature 407, 327-339 (2000) © Macmillan Publishers Ltd.

Structure of the 30S ribosomal subunit

BRIAN T. WIMBERLY*†, DITLEV E. BRODERSEN*†, WILLIAM M. CLEMONS JR*†‡, ROBERT J. MORGAN-WARREN*†, ANDREW P. CARTER*†, CLEMENS VONRHEIN§, THOMAS HARTSCH & V.
RAMAKRISHNAN†

† MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK
‡ Department of Biochemistry, University of Utah School of Medicine, Salt Lake City, Utah 84132, USA
§ Global Phasing Ltd., Sheraton House, Castle Park, Cambridge CB3 0AX, UK
Göttingen Genomics Laboratory, Institut für Mikrobiologie und Genetik, Georg-August-Universität Göttingen, Grisebachstr. 8, D-37077 Göttingen, Germany
* These authors contributed equally to this work

Correspondence and requests for materials should be addressed to V.R. (e-mail: ramak@mrc-lmb.cam.ac.uk). Coordinates have been deposited in the Protein Data Bank, accession number 1FJF. Coordinates of individual components will be made available on http://alf1.mrc-mb.cam.ac.uk/∼ramak/30S.

Genetic information encoded in messenger RNA is translated into protein by the ribosome, which is a large nucleoprotein complex comprising two subunits, denoted 30S and 50S in bacteria. Here we report the crystal structure of the 30S subunit from Thermus thermophilus, refined to 3 Å resolution. The final atomic model rationalizes over four decades of biochemical data on the ribosome, and provides a wealth of information about RNA and protein structure, protein–RNA interactions and ribosome assembly. It is also a structural basis for analysis of the functions of the 30S subunit, such as decoding, and for understanding the action of antibiotics. The structure will facilitate the interpretation in molecular terms of lower resolution structural data on several functional states of the ribosome from electron microscopy and crystallography.

Protein synthesis is a complex, multistep process that requires, in addition to the ribosome, several extrinsic GTP-hydrolysing protein factors during each of the main stages of initiation, elongation and termination.
The 30S ribosomal subunit has a crucial role in decoding mRNA by monitoring base pairing between the codon on mRNA and the anticodon on transfer RNA; the 50S subunit catalyses peptide-bond formation. Despite several decades of work, the molecular details of the process are poorly understood, and the elucidation of the mechanism of translation is one of the fundamental problems in molecular biology today. A recent collection of articles summarizes the field1. An important contribution was made by Yonath and co-workers2, who showed that structures as large as the 50S ribosomal subunit would form crystals that diffract beyond 3 Å resolution. Originally it was not clear that phase information from such a large asymmetric unit could be obtained to high resolution, but the development of bright, tunable synchrotron radiation sources, large and accurate area detectors, vastly improved crystallographic computing, and the advent of cryocrystallography have all contributed to making structural studies of the ribosome more tractable. In our work, the use of anomalous scattering from the LIII edges of lanthanides and osmium has also played a critical role in obtaining phases3. The 30S ribosomal subunit (hereafter referred to as 30S) from Thermus thermophilus was originally crystallized by the Puschino group in 2-methyl-2,4-pentanediol (MPD)4, and by Yonath and co-workers5 in a mixture of ethyl-butanol and ethanol. The MPD crystal form originally diffracted to about 9–12 Å resolution6, 7. The diffraction limit of these crystals did not improve beyond 7 Å resolution for almost a decade, but more recently both Yonath et al.8, 9 and we3 obtained crystals of the MPD form that exhibit significantly improved diffraction. However, unlike the crystals obtained by the Yonath group9, our crystals do not require soaking in tungsten clusters or heat treatment to obtain high-resolution diffraction. Last year, we described the structure of the 30S at 5.5 Å resolution3. 
We placed all seven proteins whose structures were known at the time, inferred the structure of protein S20 to be a three-helix bundle, traced the fold of an entire domain of 16S RNA, and identified a long RNA helix at the interface that contains the decoding site of the 30S. Proteins S5 and S7 were also placed in electron density maps of the 30S obtained by Yonath et al.9. We have now solved and refined the structure of the 30S at 3 Å resolution. The structure contains all of the ordered regions of 16S RNA and 20 associated proteins, constituting over 99% of the total 16S RNA and 95% of the ribosomal proteins, with the missing parts being exclusively at the termini of RNA or polypeptide chains. Here we describe the overall structural organization of the 30S subunit, and in an accompanying paper10 we describe functional insights gleaned from the structure and the interaction of antibiotics bound to the 30S. A more detailed analysis of the structure will be presented elsewhere.

Results

Crystallographic statistics are presented in Table 1. Experimentally phased maps clearly showed main-chain density for RNA and protein, individual bases (of sufficient quality to distinguish purines from pyrimidines) and large well-ordered side chains of proteins (Fig. 1). The structure was initially built from these experimental maps and was rebuilt after a round of refinement. The current model consists of nucleotides 5–1,511 of Thermus thermophilus 16S RNA (corresponding to 5–1,534 of Escherichia coli 16S RNA)11 and all of the ordered regions of the associated 20 proteins. These proteins correspond to E. coli proteins S2–S20 and a small 26-residue peptide, Thx (ref. 12). Thermus does not contain S21, and in our work S1 was removed from the 30S before crystallization. The model has been refined against 3.05 Å native data, resulting in an R/Rfree of 0.208/0.252 with good geometry.
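As an illustration of how the R and Rfree values quoted above are defined, the following Python sketch computes the conventional crystallographic residual, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, separately for working and test reflections. It is not the refinement procedure used in this work: the function names, the toy amplitudes and the assumption that the calculated amplitudes are already on the same scale as the observed ones are all hypothetical simplifications.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual: sum(| |Fo| - |Fc| |) / sum(|Fo|).

    Assumes (hypothetically) that f_calc is already scaled to f_obs.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by a test-set flag and return (R_work, R_free).

    Reflections flagged True are treated as the held-out test set,
    which is what R_free is computed on.
    """
    work = [(o, c) for o, c, f in zip(f_obs, f_calc, free_flags) if not f]
    free = [(o, c) for o, c, f in zip(f_obs, f_calc, free_flags) if f]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free


# Toy amplitudes, invented purely for illustration (not data from the paper).
toy_obs = [100.0, 50.0, 80.0, 20.0]
toy_calc = [90.0, 55.0, 80.0, 25.0]
toy_flags = [False, False, False, True]
toy_r_work, toy_r_free = r_work_and_free(toy_obs, toy_calc, toy_flags)
```

Because the test reflections are excluded from refinement, Rfree is normally somewhat higher than R, as in the 0.208/0.252 pair reported here.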
For the proteins, 95.7% of the residues were in the core or allowed regions of the Ramachandran plot, 2.4% in the generously allowed region and 1.9% in the disallowed region.

Figure 1 Electron density maps of the 30S.

Overview of the 30S

The secondary structure diagram for 16S RNA is shown in Fig. 2a, along with the definitions for the standard helix numbering H1–H45 (ref. 13). The sequence numbering for E. coli is used throughout; the main difference is that Thermus has a shorter H6 and H10, and insertions in H9 and H33a. Insertions in Thermus relative to E. coli are indicated in the coordinates by a letter code following the practice for transfer RNA.

Figure 2 Overview of the 30S structure.

The overall shape of the 30S is very similar to the model derived from negatively stained electron microscopy samples and to more recent cryo-electron microscopy reconstructions14, and appears to be closer to the 50S-bound form than the different free 30S form15. The shape is largely determined by the RNA component; none of the gross morphological features is all protein. In the canonical 'front' view from the 50S, the tertiary fold of 16S RNA (Fig. 2b) shows the head with a beak pointing leftwards, the body with the shoulder at top left and the spur at lower left, and the platform at top right. Individual secondary structure domains (Fig. 2a) make up each of these morphological features (Fig. 2b), consistent with proposals made in previous modelling studies13, 16, 17. The 5' domain makes up the bulk of the body; the central domain most of the platform, and the 3' major domain constitutes the bulk of the head. The 3' minor domain is the only significant exception to this rule, as it is part of the body at the subunit interface.
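The letter-code convention for insertions mentioned in this overview maps onto the insertion-code column of the fixed-width PDB coordinate format, so that an inserted nucleotide keeps the E. coli number of its neighbour plus a letter. The Python sketch below shows one way to read that field from an ATOM record. The column positions follow the standard PDB layout, but the function names and the example record (residue 190A in chain A) are hypothetical and are not taken from the deposited 1FJF coordinates.

```python
def parse_atom_record(line):
    """Extract residue identity and coordinates from one PDB ATOM/HETATM line.

    Uses the fixed columns of the PDB format (0-based slices):
    atom name 12-15, residue name 17-19, chain 21,
    residue number 22-25, insertion code 26, x/y/z 30-53.
    Returns None for any other record type.
    """
    if not line.startswith(("ATOM", "HETATM")):
        return None
    return {
        "atom": line[12:16].strip(),
        "residue": line[17:20].strip(),
        "chain": line[21],
        "number": int(line[22:26]),
        "insertion": line[26].strip(),  # empty string when there is no code
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }


def residue_label(record):
    """Build a label such as 'A:190A' (chain, number, insertion letter)."""
    return f"{record['chain']}:{record['number']}{record['insertion']}"


# A made-up record, assembled column by column so the widths are explicit.
EXAMPLE_LINE = (
    "ATOM  {:5d} {:<4s}{:1s}{:>3s} {:1s}{:4d}{:1s}   "
    "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"
).format(1, " P", "", "G", "A", 190, "A", 10.0, 20.0, 30.0, 1.0, 50.0)

example_record = parse_atom_record(EXAMPLE_LINE)
```

Sorting residues on the pair (number, insertion) rather than on the number alone keeps inserted nucleotides in sequence order while preserving the E. coli numbering used throughout the text.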
The four domains of the 16S RNA secondary structure radiate from a central point in the neck region of the subunit, and are especially tightly associated in this area, which is functionally the most important region of the 30S ribosomal subunit. The distribution of proteins and RNA in the 30S is asymmetric, as was predicted from neutron scattering18. The proteins are concentrated in the top, sides and back of the 30S (Fig. 2c, d). None of the proteins binds entirely inside an RNA domain, although S20 binds between two domains (the 3' minor domain and 5' domain). The 50S interface is largely free of protein, with the exception of S12 which lies near the decoding site at the top of the long H44 that runs down the interface. Other proteins lie at the periphery of the subunit interface, allowing them to make contact with the 50S subunit. A movie of the structure is available in the Supplementary Information (page 340).

The structure of the RNA. The secondary structure of 16S RNA contains over 50 regular double helices connected by irregular single-stranded loops (Fig. 2a). In the crystal structure, many of these formally single-stranded loop regions are in fact only slightly irregular double-stranded extensions of neighbouring regular helices. Thus, most of 16S RNA may be described as helical or approximately helical, and it is useful to consider the RNA structure as a three-dimensional arrangement of helical elements. Interactions between helical elements include vertical co-axial stacking of helices neighbouring in sequence, and horizontal packing of helices, usually between their minor grooves. Co-axial stacking of helices is very common: the helices of 16S RNA are organized into 13 groups of co-axially stacked helices and 23 unstacked helices, for a total of 36 helical elements. The packing of these helical elements largely determines the overall fold of each of the four domains of 16S RNA.
Short single-stranded RNA segments make idiosyncratic long-range interactions to stabilize the packing of helical elements. Proteins also help stabilize the RNA tertiary structure by binding to two or more RNA helical elements, as described below.

Helix packing interactions. There are three types of helix–helix packing in the structure, all of which use the wide and shallow minor groove as an interaction surface (Fig. 3). The most common packing mode is docking of the minor grooves of two helices, as shown for the interaction of H6 with H8 (Fig. 3a). Usually one or both helices are distorted from canonical A-form helical geometry in order to create a larger and more complementary interaction surface. Such helical distortions are caused by the base-pair geometries of noncanonical base pairs and sometimes by a bulged-out base. Noncanonical pairs involving adenines, especially sheared G·A, A·A, and reverse-Hoogsteen U·A and C·A pairs, are particularly common and are often adjacent to create a cross-strand adenine stack motif. This motif widens the minor groove to make it more accessible, and it also pushes the adenine bases into the minor groove to facilitate hydrogen bonding to sugar and base functional groups from the packing partner helix. Helix–helix docking results in an extensive and intimate interaction stabilized by a dense network of hydrogen bonds from adenine base nitrogens and 2' OH moieties to the 2' OH, guanine NH2 and pyrimidine O2 atoms from the partner helix (Fig. 3a, with adenines in red). Some of the distorted adenine-rich structures used to mediate this mode of helix–helix packing are recurrent structural motifs (for example, the common S-turn motif19). This minor-groove packing mode is not limited to helix–helix interactions: a similar mode is often seen between single-stranded adenines and a regular double helix, particularly the docking of the last three nucleotides of GNRA hairpin loops against the minor groove of a helix.
Occasionally a single unpaired adenine base packs against the minor groove of a regular helix.

Figure 3 Different modes of interhelical packing in 16S RNA.

A second and less common form of helix packing involves the insertion of a ridge of phosphates into the minor groove of another helix. This packing mode is stabilized by hydrogen bonds between the ridge of phosphate oxygens and a layer of 2' OH and guanine base NH2 groups, as is shown for the interaction between the phosphate backbone of H7 with the minor groove of H21 (Fig. 3b). These guanine NH2 groups (red in Fig. 3b) are often made more accessible by the geometry of G·U wobble pairs, which places this moiety further into the minor groove compared with Watson–Crick base pairs. This phosphate-ridge packing mode creates fewer hydrogen bonds and buries less surface area than the minor-groove mode described above, and it may be less stable.

The third and least common mode of helix packing uses an unpaired purine base to mediate the perpendicular packing of one helix against the minor groove of another helix. Three examples of this mode are present in 16S RNA: H27 against the minor groove of a helical stack made up of H1 and H28 (Fig. 3c); H34 against the H35–H34–H38 stack; and H44 against H28. Like the phosphate-ridge mode, this end-on packing may be less stable than the minor-groove packing. Significantly, all of these end-on packing interactions involve functionally important helices, and in one case (H27 against H1/H28) it appears likely that the packing interaction may change as a result of a conformational switch in H27 (ref. 20).

Overview of the domains of 16S RNA

Stacking and packing of the helical elements of 16S RNA generates three compact domains (5' domain, central domain and 3' major domain) and one extended domain (3' minor domain).
Packing interactions between the domains create the functionally important 50S and transfer RNA/messenger RNA binding sites. Here we provide an overview of the structure of each domain; details will be published elsewhere.

The 5' domain. The 5' domain is the RNA component of the body. It contains 19 double helices packed as a wedge-shaped mass of RNA that tapers to a single layer of double helices near the top (Fig. 4). Like the other domains, it is rather longer along the subunit interface than in the perpendicular direction. The 5' domain can be divided into three subdomains, roughly corresponding to the upper, lower and middle thirds of the secondary structure of the domain (Fig. 4b–d). These subdomains make up the top and left-hand, the middle and the lower right-hand sides of the body, respectively, in the view from 50S. The spur at the bottom of the 30S is formed by H6, which is known to vary in length across species21.

Figure 4 Structure of the 5' domain of 16S RNA.

A particularly striking feature is the H16–H17 co-axial stack, which is almost 120 Å long and forms the left-hand border of the body (Fig. 4b), with H16 reaching out to the head. There are also two examples of sharp bends in helices. H18 is sharply bent to accommodate the functionally important 530 pseudoknot (Fig. 4b), which packs against the central pseudoknot at the H18–H1 interface. H11 contains two sharp bends that allow its conserved terminal hairpin loop to pack against H7 (Fig. 4d). Both bends are stabilized by short-range minor-groove to minor-groove packing contacts. Finally, there is an unusual packing interaction between the highly conserved UACG and GAAA tetraloops at the ends of H8 and H14 near the subunit interface.

The central domain. The central domain is the RNA component of the platform.
Its fold based on our previous 5.5 Å model3, and the high-resolution structures of parts of it22, 23 are in excellent agreement with our current structure. It contains nine helical elements folded into a W-shape in the 50S view (Fig. 5). Two long single-stranded segments of RNA, the 570 and 820 loops, are also important structural elements. The central domain is dominated by the long stack of H21–H22–H23, which forms the outer arms of the W. At one end, H21 wraps around the back of the 5' domain; at the other, H22, H23 and the roughly parallel H24 form the bulk of the platform. The tip of the platform consists of H23B and H24A, whose conserved and functionally important hairpin loops (the 690 and 790 loops) are tightly packed. This arrangement requires sharp bends between H23 and H23B, and between H24 and H24A. The H23–H23B bend is stabilized by short-range minor groove/minor groove packing interactions. The H24–H24A bend is more unusual in that the bend is towards the major groove, which places a ridge of H24A phosphates in the major groove of H24. This major-groove bend is stabilized partly by short-range base–base and base–backbone interactions in the major groove of the bend, and partly by long-range interactions between the bent H24/H24A minor groove and the minor groove of H23.

Figure 5 Structure of the central domain of 16S RNA.

The 3' major domain. The 3' major domain is the RNA component of the head. The left-hand side of the head tapers to a beak made of RNA on the 50S side and protein on the solvent side (Fig. 6a). The 3' major domain consists of 15 helical elements, which can be roughly divided into three subdomains (Figs 6b–d). The upper subdomain is an extended structure in the part of the head farthest from the 50S subunit, and makes relatively few packing contacts with other RNA.
The lower and middle subdomains are more globular and are more intimately packed together, and make up the front-right and front-left portions of the head, respectively. The middle subdomain includes the RNA portion of the beak.

Figure 6 Structure of the 3' major and 3' minor domains of 16S RNA.

In contrast to the extensive stacking of neighbouring helices seen in the 5' domain and the central domain, most of the helices in the 3' major domain do not stack on a neighbouring helix. An exception is the H35–H36–H38–H39 stack that dominates the upper subdomain (Fig. 6b) and stretches from the top to the bottom of the head. Significantly, the functionally important helices H31 and H34 are quite irregular and make only rather weak packing interactions with other RNA helices (Fig. 6a, c, d).

The 3' minor domain. The 3' minor domain consists of just two helices at the subunit interface (Fig. 6e). H44 is the longest single helix in the subunit, and stretches from the bottom of the head to the bottom of the body. It projects prominently from the body for interaction with the 50S subunit. H45 is roughly perpendicular to H44, with its conserved GGAA hairpin loop packed against H44 and available for interaction with the large subunit.

The 30S proteins and their interaction with 16S RNA

The structures of the proteins S2–S20 and Thx are summarized in Table 2 and shown with surrounding RNA and proteins in Figs 7–9. The structures of proteins solved in isolation (see refs in Table 2) were very useful in initial interpretation of the map3. In general, previous biochemical data on hydroxyl-radical footprinting24 and ultraviolet-induced crosslinking (summarized in ref. 25 and http://www.mpimg-berlindahlem.mpg.de/ag_ribo/ag_brimacombe/drc) agree well with the structure, and were useful as a guide to interpreting the fold at lower resolution.
Finally, the structure is in good agreement with the neutron map26, with S13, S14, S16, S19 and S20 being outliers. Of these, only S20 is located in a completely different region of the 30S from that predicted by the neutron map.

Figure 7 Proteins from the central and 5' domains.

Figure 9 Proteins from the head.

The proteins generally consist of one or more globular domains. The same types of folds are frequently found in different proteins, such as the β-barrel in S12 and S17, and helices packed against a β-sheet as observed in S3, S10, S6 and S11. It is interesting that these domains often interact with RNA in very different ways. The β-sheet of S11 packs flat against the minor groove of RNA, but in S6 the edge of the sheet interacts with RNA. In S3, one of the α/β-domains makes no contact with RNA at all.

In addition to a globular domain, nearly all the proteins contain long extensions. These can be helical, such as the α-hairpin in S2 or the carboxy-terminal helices in S13; long hairpins or loops protruding from S10 or S17; or extended carboxy- or amino-terminal tails seen in proteins S4, S9, S11, S12, S13 and S19. These extensions make intimate contact with RNA, and were generally not visible in isolated structures because they were disordered in the absence of RNA. The extensions reach far into the surrounding RNA, and allow single proteins to contact several RNA elements, which is probably important for the stabilization of the RNA tertiary fold. The extensions are particularly well suited for this task because they are narrow, allowing close approach of different RNA elements, and they have the basic residues required to neutralize the charge repulsion of the RNA backbone. An extension of this principle is seen in Thx.
This small peptide fits into a cavity between multiple RNA elements in the top of the head, and its positive charge stabilizes the organization of these elements. Proteins are often found bound to junctions between helices. S4 binds to the 5-way junction between H3, H4, H16, H17, H18 in the 5' domain, and S7 binds tightly to the junction of H29, H30, H41, H42 in the 3' major domain. Both proteins are important in early stages of 30S assembly of the body and head respectively27. Similarly, proteins S8, S15 and S17 bind to the three-way junction formed by H20, H21 and H22, and are important for the assembly of the central domain23, 28. Thus, protein binding to helical junctions is important for initiating the correct tertiary fold of RNA. Many of the proteins also contact RNA elements in different domains, helping to organize the overall structure of the 30S. For example, S17 contacts H7 and H11 in the 5' domain, and H21 in the central domain, while S20 mediates the contact between H44 in the 3' minor domain with the 5' domain at the base of the body. Much of the contact between the head and body is mediated by proteins, such as S2 and S5. In general the structure rationalizes biochemical studies on 30S assembly28, 29, although there are also discrepancies. For example, it is unlikely that the binding of S20 in the lower part of the 30S could influence the binding of S13 in the head. The implications of the structure for ribosome assembly will be published elsewhere. In addition to the intimate contacts with RNA, a number of protein–protein interactions are seen. S3, S10 and S14 form a tight cluster held together by hydrophobic interactions. They are wedged into a V-shaped gap between two RNA domains in the head, thus helping to stabilize its structure. 
In contrast, other protein clusters such as S4–S5–S8 are held together by predominantly electrostatic and hydrogen-bonding interactions, and the S4–S5 interface may undergo some rearrangement during 30S function, as discussed in the accompanying paper10. The globular domain of S12 on the interface side is connected by a long extension which snakes through the body of the RNA, and then folds into a short helix which makes contacts with S8 and S17 on the back side. A similar (although less extensive) interaction is observed between the C-terminal tail of S9 and the proteins S14 and S10 on the other side of the head. The tethering together of proteins that are on opposite sides of an RNA region may help hold the RNA structure together.

The proteins contain a number of special features that are probably required for the stability of Thermus 30S at high temperatures. An obvious example is the extra C-terminal helix in S17, which increases the RNA contacts made by this protein. Other examples are the Zn-binding motifs found in S14 and the N terminus of S4. The Zn-binding cysteine residues are not conserved, but the surrounding residues are, indicating that metal-ion coordination is used to give extra stability to a domain that is held together by hydrophobic contacts in mesophiles such as E. coli.

Conclusions

This high-resolution structure of the 30S ribosomal subunit is significant for several reasons. It will allow the rationalization in structural terms of four decades of biochemical efforts to elucidate the mechanism of protein synthesis. As a first step, functional insights from the structure and a description of 30S interactions with antibiotics are presented in the accompanying paper10. The structure will facilitate the design of testable models for various aspects of ribosome function, and it will also be a basis for the interpretation of lower resolution electron-microscopic or X-ray crystallographic maps of ribosomes in different functional states.
The structure of the 30S also greatly increases our current database of RNA structure and protein–RNA interactions, and may provide general rules and improve the accuracy of prediction methods for RNA tertiary structure.

Methods

Crystallization of the 30S subunit

We obtained crystals by a straightforward optimization of Trakhanov et al.4 with respect to pH, and concentrations of Mg2+ ions and MPD. The final conditions were 250 mM KCl, 75 mM NH4Cl, 25 mM MgCl2 and 6 mM 2-mercaptoethanol in 0.1 M potassium cacodylate or 0.1 M MES (2-N-morpholinoethanesulphonic acid) at pH 6.5 with 13–17% MPD as the precipitant. We noticed initially that the 30S crystals completely lacked ribosomal protein S1 (J. L. C. May and V.R., unpublished results), so we removed S1 selectively from the 30S before crystallization, which improved both size and reproducibility. The crystals grew to the maximum size in about 6 weeks at 4 °C. The largest crystals, which were required for high-resolution data collection, grew to a size of 80–100 × 80–100 × 200–300 µm. The activity of redissolved crystals in poly(U)-directed protein synthesis was comparable to that of freshly isolated 30S subunits (data not shown).

Data collection

Crystals were transferred to 26% MPD by vapour diffusion in two steps over a period of six days. All solutions (except for those containing osmium hexammine or osmium pentammine) also contained 1 mM cobalt hexammine in the cryoprotectant. Crystals were flash-cooled by plunging into liquid nitrogen, and data collection was done in a cryostream at 90–100 K. A large fraction of crystals was screened at beamlines 9.6 or 14.1 at the SRS at Daresbury Laboratories, using two short exposures at least 40 degrees apart. These crystals were then analysed for diffraction limits, cell dimensions and mosaic spread. Only crystals of similar cell dimensions and with reasonable mosaic spread were used for data collection.
Potential derivatives were screened on beamlines X25 at the NSLS at Brookhaven National Laboratory and BM-14 at the ESRF (Grenoble). Data to about 4.5 Å were obtained from X25. High-resolution data were collected at SBC 19ID at the APS in Argonne National Laboratory, and ID14-4 at the ESRF. In all cases, derivative data were collected at the peak of the fluorescence at the LIII edge to maximize anomalous differences. At X25 and SBC 19ID, the kappa goniostat was used to rotate precisely about a mirror plane so that small anomalous differences could be measured accurately despite radiation decay and the use of multiple crystals. Each crystal typically yielded 3–10 degrees of data. Data were integrated and scaled using HKL-2000 (ref. 30).
Structure determination
Previously determined phases at 5.5 Å (ref. 3) were used to locate heavy atom sites using anomalous difference Fourier maps. Initially, these sites were used for phasing to 3.35 Å using the program SOLVE31, followed by density modification with SOLOMON32, using the procedure implemented in SHARP33. Optimization of the various parameters in the procedure was required to obtain interpretable maps. The RNA and some of the proteins were built using the SOLVE maps. The sequence of Thermus thermophilus 16S RNA11 and both previously available and unpublished (Göttingen Thermus Genome Sequencing Project) sequences for the proteins were used to build the structure. Improved maps were obtained by calculating experimental phases using SHARP33 followed by density modification and phase extension to 3.05 Å with SOLOMON32 and DM34. The improved maps allowed us to build all the remaining ordered parts of the structure. The model was built using O (ref. 35), and refined using the program CNS36. Maximum likelihood refinement was used, initially with both amplitudes and experimental phase probability distributions to 3.35 Å, and subsequently with amplitudes to 3.05 Å.
Details of the experimental protocols will be published elsewhere.
Received 14 July 2000; accepted 10 August 2000
References
1. Garrett, R. A. et al. (eds) The Ribosome. Structure, Function, Antibiotics and Cellular Interactions (ASM, Washington DC, 2000).
2. von Böhlen, K. et al. Characterization and preliminary attempts for derivatization of crystals of large ribosomal subunits from Haloarcula marismortui diffracting to 3 Å resolution. J. Mol. Biol. 222, 11-15 (1991).
3. Clemons, W. M. et al. Structure of a bacterial 30S ribosomal subunit at 5.5 Å resolution. Nature 400, 833-840 (1999).
4. Trakhanov, S. D. Crystallization of 70S ribosomes and 30S ribosomal subunits from Thermus thermophilus. FEBS Lett. 220, 319-322 (1987).
5. Glotz, C. et al. Three-dimensional crystals of ribosomes and their subunits from eu- and archaebacteria. Biochem. Int. 15, 953-960 (1987).
6. Yonath, A. et al. Characterization of crystals of small ribosomal subunits. J. Mol. Biol. 203, 831-834 (1988).
7. Yusupov, M. M., Tischenko, S. V., Trakhanov, S. D., Ryazantsev, S. N. & Garber, M. B. A new crystalline form of 30S ribosomal subunits from Thermus thermophilus. FEBS Lett. 238, 113-115 (1988).
8. Yonath, A. Crystallographic studies on the ribosome, a large macromolecular assembly exhibiting severe nonisomorphism, extreme beam sensitivity and no internal symmetry. Acta Crystallogr. A 54, 945-955 (1998).
9. Tocilj, A. et al. The small ribosomal subunit from Thermus thermophilus at 4.5 Å resolution: pattern fittings and the identification of a functional site. Proc. Natl Acad. Sci. USA 96, 14252-14257 (1999).
10. Carter, A. P. et al. Functional insights from the structure of the 30S ribosomal subunit and its interaction with antibiotics. Nature 407, 340-348 (2000).
11. Hartmann, R. K. & Erdmann, V. A. Thermus thermophilus 16S rRNA is transcribed from an isolated transcription unit. J. Bacteriol. 171, 2933-2941 (1989).
12. Choli, T., Franceschi, F., Yonath, A. & Wittmann-Liebold, B. Isolation and characterization of a new ribosomal protein from the thermophilic eubacteria, Thermus thermophilus, T. aquaticus and T. flavus. Biol. Chem. Hoppe Seyler 374, 377-383 (1993).
13. Mueller, F. & Brimacombe, R. A new model for the three-dimensional folding of Escherichia coli 16S ribosomal RNA. I. Fitting the RNA to a 3D electron microscopic map at 20 Å. J. Mol. Biol. 271, 524-544 (1997).
14. Gabashvili, I. S., Agrawal, R. K., Grassucci, R. & Frank, J. Structure and structural variations of the Escherichia coli 30S ribosomal subunit as revealed by three-dimensional cryo-electron microscopy. J. Mol. Biol. 286, 1285-1291 (1999).
15. Gabashvili, I. S. et al. Solution structure of the E. coli 70S ribosome at 11.5 Å resolution. Cell 100, 537-549 (2000).
16. Stern, S., Weiser, B. & Noller, H. F. Model for the three-dimensional folding of 16S ribosomal RNA. J. Mol. Biol. 204, 447-481 (1988).
17. Malhotra, A. & Harvey, S. C. A quantitative model of the Escherichia coli 16S RNA in the 30S ribosomal subunit. J. Mol. Biol. 240, 308-340 (1994).
18. Ramakrishnan, V. Distribution of protein and RNA in the 30S ribosomal subunit. Science 231, 1562-1564 (1986).
19. Wimberly, B., Varani, G. & Tinoco, I. The conformation of loop E of eukaryotic 5S ribosomal RNA. Biochemistry 32, 1078-1087 (1993).
20. Lodmell, J. S. & Dahlberg, A. E. A conformational switch in Escherichia coli 16S ribosomal RNA during decoding of messenger RNA. Science 277, 1262-1267 (1997).
21. Gutell, R. R. in Ribosomal RNA Structure, Evolution, Processing and Function in Protein Biosynthesis (eds Dahlberg, A. E. & Zimmermann, R. A.) 111-128 (CRC, Boca Raton, 1996).
22. Nikulin, A. et al. Crystal structure of the S15-rRNA complex. Nature Struct. Biol. 7, 273-277 (2000).
23. Agalarov, S. C., Sridhar Prasad, G., Funke, P. M., Stout, C. D. & Williamson, J. R. Structure of the S15,S6,S18-rRNA complex: assembly of the 30S ribosome central domain. Science 288, 107-113 (2000).
24. Powers, T. & Noller, H. F. Hydroxyl radical footprinting of ribosomal proteins on 16S rRNA. RNA 1, 194-209 (1995).
25. Mueller, F. & Brimacombe, R. A new model for the three-dimensional folding of Escherichia coli 16S ribosomal RNA. II. The RNA-protein interaction data. J. Mol. Biol. 271, 545-565 (1997).
26. Capel, M. S. et al. A complete mapping of the proteins in the small ribosomal subunit of Escherichia coli. Science 238, 1403-1406 (1987).
27. Nowotny, V. & Nierhaus, K. H. Assembly of the 30S subunit from Escherichia coli ribosomes occurs via two assembly domains which are initiated by S4 and S7. Biochemistry 27, 7051-7055 (1988).
28. Stern, S., Powers, T., Changchien, L.-M. & Noller, H. F. RNA-protein interactions in 30S ribosomal subunits: folding and function of 16S rRNA. Science 244, 783-790 (1989).
29. Held, W. A., Ballou, B., Mizushima, S. & Nomura, M. Assembly mapping of 30S ribosomal proteins from Escherichia coli. Further studies. J. Biol. Chem. 249, 3103-3111 (1974).
30. Otwinowski, Z. & Minor, W. in Methods in Enzymology (eds Carter, C. W. & Sweet, R. M.) 307-325 (Academic, New York, 1997).
31. Terwilliger, T. & Berendzen, J. Automated MAD and MIR structure determination. Acta Crystallogr. D 55, 849-861 (1999).
32. Abrahams, J. P. Bias reduction in phase refinement by modified interference functions: introducing the gamma correction. Acta Crystallogr. D 53, 371-376 (1997).
33. de la Fortelle, E. & Bricogne, G. in Methods in Enzymology (eds Carter, C. W. & Sweet, R. M.) 472-493 (Academic, New York, 1997).
34. Cowtan, K. & Main, P. Miscellaneous algorithms for density modification. Acta Crystallogr. D 54, 487-493 (1998).
35. Jones, T. A. & Kjeldgaard, M. Electron-density map interpretation. Methods Enzymol. 277B, 173-207 (1997).
36. Brünger, A. T. et al. Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr. D 54, 905-921 (1998).
37. Carson, M. Ribbons 2.0. J. Appl. Cryst. 24, 958-961 (1991).
38. Davies, C., Gerstner, R. B., Draper, D. E., Ramakrishnan, V. & White, S. W. The crystal structure of ribosomal protein S4 reveals a two-domain molecule with an extensive RNA-binding surface: one domain shows structural homology to the ETS DNA-binding motif. EMBO J. 17, 4545-4558 (1998).
39. Markus, M. A., Gerstner, R. B., Draper, D. E. & Torchia, D. A. The solution structure of ribosomal protein S4 delta41 reveals two subdomains and a positively charged surface that may interact with RNA. EMBO J. 17, 4559-4571 (1998).
40. Ramakrishnan, V. & White, S. W. Structure of ribosomal protein S5 reveals sites of interaction with 16S RNA. Nature 358, 768-771 (1992).
41. Lindahl, M. et al. Crystal structure of the ribosomal protein S6 from Thermus thermophilus. EMBO J. 13, 1249-1254 (1994).
42. Wimberly, B. T., White, S. W. & Ramakrishnan, V. The structure of ribosomal protein S7 at 1.9 Å resolution reveals a beta-hairpin motif that binds double-stranded nucleic acids. Structure 5, 1187-1198 (1997).
43. Hosaka, H. et al. Ribosomal protein S7: a new RNA-binding motif with structural similarities to a DNA architectural factor. Structure 5, 1199-1208 (1997).
44. Davies, C., Ramakrishnan, V. & White, S. W. Structural evidence for specific S8-RNA and S8-protein interactions within the 30S ribosomal subunit: ribosomal protein S8 from Bacillus stearothermophilus at 1.9 Å resolution. Structure 4, 1093-1104 (1996).
45. Nevskaya, N. et al. Crystal structure of ribosomal protein S8 from Thermus thermophilus reveals a high degree of structural conservation of a specific RNA binding site. J. Mol. Biol. 279, 233-244 (1998).
46. Berglund, H., Rak, A., Serganov, A., Garber, M. & Härd, T. Solution structure of the ribosomal RNA binding protein S15 from Thermus thermophilus. Nature Struct. Biol. 4, 20-23 (1997).
47. Clemons, W. M., Davies, C., White, S. W. & Ramakrishnan, V. Conformational variability of the N-terminal helix in the structure of ribosomal protein S15. Structure 6, 429-438 (1998).
48. Allard, P. Another piece of the ribosome: Solution structure of S16 and its location in the 30S subunit. Structure 8, 875-882 (2000).
49. Golden, B. L., Hoffman, D. W., Ramakrishnan, V. & White, S. W. Ribosomal protein S17: characterization of the three-dimensional structure by 1H- and 15N-NMR. Biochemistry 32, 12812-12820 (1993).
50. Helgstrand, M. et al. Solution structure of the ribosomal protein S19 from Thermus thermophilus. J. Mol. Biol. 292, 1071-1081 (2000).
Acknowledgements. This work was supported by the Medical Research Council (UK) and a US National Institutes of Health grant to V.R. and S. W. White. Beamlines at Argonne and Brookhaven were supported by the US Department of Energy. D.E.B. was supported by an EMBO long-term postdoctoral fellowship, and W.M.C. by an NIH predoctoral fellowship. We thank B. S. Brunschwig and M. H.
Chou for gifts of osmium hexammine and osmium bipyridine; T. Terwilliger for help with phasing using SOLVE; T. A. Jones for providing us a version of O with RNA tools; and our colleagues at the LMB for their advice and encouragement. We are indebted to A. Joachimiak, S. L. Ginell, R. Ravelli, S. McSweeney, G. Leonard, A. Thompson, H. Lewis, L. Berman, M. Papiz, S. Girdwood and M. MacDonald for help and advice on synchrotron beamlines.
Figure 1 Electron density maps of the 30S. The experimental maps shown here were obtained by phasing with SHARP followed by density modification using SOLOMON and DM (see text); refined maps represent σA-weighted (2mFo − DFc) maps using the final model. a, b, Experimental and refined maps respectively, of a loop of RNA. c, d, Experimental and refined maps of a strand of protein showing amino-acid side chains. These and other figures were made using RIBBONS37.
Figure 2 Overview of the 30S structure. a, Secondary structure diagram of 16S RNA (modified with permission from http://www.rna.icmb.utexas.edu/CSl/2STR/Schematics/e.coli16s.27.5.5.schem.ps; see also ref. 21), showing the definition of the various helical elements used throughout the text. The numbering and diagram correspond to the E. coli sequence. Red, 5' domain; green, central domain; orange, 3' major domain; cyan, 3' minor domain. b, Stereo view of the tertiary structure of 16S RNA from our refined model, showing the 50S or 'front' view, with the same colouring for the domains. H, head; Be, beak; N, neck; P, platform; Sh, shoulder; Sp, spur; Bo, body. c, d, Front (50S) and back sides of the 30S. Grey, RNA; blue, proteins.
Figure 3 Different modes of interhelical packing in 16S RNA. a, The common minor-groove to minor-groove packing mode is often stabilized by a layer of adenosines (red), which mediate most of the hydrogen bonds between two helices (magenta and yellow). b, The phosphate ridge to minor groove mode.
Usually this mode is stabilized by hydrogen bonds between guanine N2 groups (red) and phosphate oxygens. c, The rare end-on mode of packing uses an unpaired purine (leftmost yellow base) to mediate packing of two helices at right angles to each other.
Figure 4 Structure of the 5' domain of 16S RNA. a, Stereo view of the entire 5' domain, with an inset on the right showing its location in the 30S subunit. The upper (b), middle (c) and lower (d) subdomains are shown separately next to corresponding parts of the secondary structure diagrams. The colours in the secondary structure diagrams match those in the structure in this and Figs 5 and 6.
Figure 5 Structure of the central domain of 16S RNA. a, Stereo view of the domain with secondary structure diagram and inset showing its location in the 30S. b, Secondary structure diagram for the central domain. c, Central portion of the domain, rotated relative to the other domains for clarity.
Figure 6 Structure of the 3' major and 3' minor domains of 16S RNA. a, Stereo view of the 3' major domain with inset showing its location in the 30S. b–d, The upper, middle and lower parts of the 3' major domain, with corresponding secondary structure diagrams. e, Stereo view of the 3' minor domain, with secondary structure diagram and inset showing its location in the 30S.
Figure 7 Proteins from the central and 5' domains. The colouring for the various RNA elements in this and subsequent figures is the same as in Figs 3–6. a, The S6–S18–S11 complex; b, S8; c, the S15–S17–S8 complex; d, S16; e, S20. In Figs 7–9 the stereo views have been rotated relative to the insets for clarity.
Figure 8 Proteins near the functional centre of the 30S. a, S4, with the Zn ion shown as a sphere; b, S5; c, S12.
Figure 9 Proteins from the head. a, S7; b, S9; c, the S10–S14 complex, with the Zn ion in S14 shown as a sphere; d, S3 on top of S10 and S14; e, the S13–S19 complex; f, S2; g, Thx.
17 September 1998
Nature 395, 260-263 (1998)
RNA-catalysed nucleotide synthesis
PETER J. UNRAU1 AND DAVID P. BARTEL1
Whitehead Institute for Biomedical Research, and Department of Biology, MIT, 9 Cambridge Center, Cambridge, Massachusetts 02142, USA
Correspondence and requests for materials should be addressed to D.P.B. The nine active sequences have been deposited in GenBank under accession numbers AF051883–51891.
The 'RNA world' hypothesis proposes that early life developed by making use of RNA molecules, rather than proteins, to catalyse the synthesis of important biological molecules1. It is thought, however, that the nucleotides constituting RNA were scarce on early Earth1-4. RNA-based life must therefore have acquired the ability to synthesize RNA nucleotides from simpler and more readily available precursors, such as sugars and bases. Plausible prebiotic synthesis routes have been proposed for sugars5, sugar phosphates6 and the four RNA bases7-11, but the coupling of these molecules into nucleotides, specifically pyrimidine nucleotides, poses a challenge to the RNA world hypothesis1-3. Here we report the application of in vitro selection to isolate RNA molecules that catalyse the synthesis of a pyrimidine nucleotide at their 3' terminus. The finding that RNA can catalyse this type of reaction, which is modelled after pyrimidine synthesis in contemporary metabolism, supports the idea of an RNA world that included nucleotide synthesis and other metabolic pathways mediated by ribozymes.
In modern metabolism, pyrimidine nucleotides are synthesized from activated ribose (pRpp) and pyrimidine bases (such as uracil or orotate). For example, uracil phosphoribosyltransferase (UPRT) catalyses the reaction shown in Fig. 1. The chemistry of this reaction, nucleophilic attack on carbon after release of pyrophosphate, is central to the biosynthesis of nucleotides and amino acids (histidine and tryptophan), yet absent from known ribozyme reactions.
The reaction differs from known RNA-catalysed reactions in other key respects: it occurs by an SN1 mechanism (involving the stabilization of an oxocarbocation at the reaction centre, C1' of ribose)12,13, and uracil is significantly smaller than the smallest ribozyme substrates.
Figure 1 UPRT-catalysed synthesis of uridine 5'-phosphate from pRpp and uracil.
To explore the ability of RNA to promote nucleotide synthesis, we performed an iterative selection in vitro to isolate from random sequences ribozymes that synthesized a pyrimidine nucleotide at their 3' terminus (Fig. 2). The initial pool of sequences contained >1.5 × 10¹⁵ different RNA molecules, each with 228 random positions. To begin each selective round, pRpp was attached to the 3' end of pool RNA. RNA-pRpp conjugates were incubated with a uracil analogue, 4-thiouracil (4SUra), to allow those sequences capable of glycosidic bond formation the opportunity to link their tethered ribose to 4SUra. RNAs attached to the newly synthesized nucleotide, 4-thiouridine (4SU), were then enriched, amplified, and subjected to another round of selection–amplification. 4SU was chosen as the desired product because the thione at position 4 interacts strongly with a number of thiophilic reagents14, facilitating the efficient enrichment of RNAs with a single 4SU nucleotide. In other respects, sulphur substitution of the 4-keto oxygen of uracil has little effect; tautomeric patterns of the N1 and N3 protons are comparable, whereas the sulphur substitution lowers the pKa of uracil by less than one unit to 8.4 (refs 15, 16).
After four rounds of selection–amplification, ribozyme activity was readily detected (Fig. 3). Pool activity increased another 50,000-fold in response to seven additional selective rounds that included shorter incubation times, lower 4SUra concentration and mutagenic amplification (Fig. 3).
Figure 2 In vitro selection scheme.
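The pool statistics quoted above can be put in perspective with a little arithmetic. The sketch below is illustrative only: it takes the figures given in this paper (228 random positions, more than 1.5 × 10^15 distinct sequences, and the 16.4 nmol of RNA reported in Methods for the first round) and assumes a four-letter alphabet with every random position free to vary. The variable names are ours and are not part of the original analysis.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

# Figures taken from the text; names are illustrative.
random_positions = 228          # three segments of 76 random nucleotides
pool_diversity = 1.5e15         # lower bound on distinct sequences in the pool
rna_first_round_mol = 16.4e-9   # 16.4 nmol of RNA used in round 1

# Size of the full sequence space, 4**228, expressed as a power of ten.
log10_sequence_space = random_positions * math.log10(4)

# Fraction of sequence space represented by the pool, also as a power of ten.
log10_fraction_sampled = math.log10(pool_diversity) - log10_sequence_space

# Average number of copies of each distinct sequence in the first round.
molecules_first_round = rna_first_round_mol * AVOGADRO
copies_per_sequence = molecules_first_round / pool_diversity

print(f"sequence space      ~ 10^{log10_sequence_space:.1f}")
print(f"fraction sampled    ~ 10^{log10_fraction_sampled:.1f}")
print(f"copies per sequence ~ {copies_per_sequence:.1f}")
```

Under these assumptions the sequence space is roughly 10^137, so the pool samples only a vanishing fraction of it, and the first-round RNA corresponds to between six and seven copies of each sequence, in line with the 'six copies, on average' quoted in Methods.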
Figure 3 Increased ribozyme activity with successive rounds of selection.
Ribozymes that had undergone 11 rounds of selection were cloned, and 35 random clones were sequenced. A family of 25 related sequences generated by the mutation of a single ancestral sequence dominated the round-11 isolates and was designated family A (Fig. 4a). The remaining isolates represented two other families, family B (eight isolates) and family C (two isolates). Restriction analysis of PCR DNA from each round of selection17 indicated that these were the only three families of nucleotide-synthesizing ribozymes to emerge to detectable levels (4% of the population) during the entire course of the selection and evolution (data not shown).
Figure 4 Analysis of ribozyme sequences and their catalytic proficiencies.
Synthesis of authentic tethered 4SU was confirmed for representatives of each ribozyme family by nearest-neighbour analysis of end-labelled product (Fig. 5). The ribozyme-synthesized nucleotide precisely comigrated with the 4-thiouridine 3'-phosphate (4SUp) standard in all six separation systems tested: two polyacrylamide gel systems and four thin-layer chromatography (TLC) systems. An analysis utilizing a two-dimensional TLC system known to resolve the many different modified bases of ribosomal RNA18 is shown (Fig. 5).
Figure 5 Two-dimensional TLC analysis18 of ribozyme-synthesized 4SU after labelling and diges.
Isolates from each family promoted nucleotide formation with apparent second-order rate constants (kcat/Km values) up to 10⁷ times greater than our upper bound on the uncatalysed reaction rate (Fig. 4). In attempts to detect the uncatalysed reactions, a radiolabelled pRpp-derivatized oligonucleotide was incubated with 4SUra and the reaction mixture was resolved on APM gels.
4SU synthesis was not detected, even though these assays would have readily detected a reaction as slow as 6 × 10⁻⁷ M⁻¹ min⁻¹. The rates of a family A isolate fit well to a Michaelis–Menten curve with an apparent Km of 28 ± 4 mM and a kcat of 0.13 ± 0.012 min⁻¹ (means ± s.e.m.; Fig. 4b), although it should be noted that solubility constraints prevented rate measurements with 4SUra concentrations above 14 mM. Representatives from the other two sequence families had comparable kcat/Km values but did not begin to display saturable behaviour at soluble concentrations of 4SUra (Fig. 4b), suggesting poorer binding to 4SUra but faster catalysis on encountering 4SUra.
All three ribozymes had high specificity for the 4SUra substrate. No thio-containing product was detected on APM gels after body-labelled ribozyme-pRpp was incubated with any of six other thio-substituted pyrimidine bases (2-thiouracil, 2,4-thiouracil, 2-thiocytosine, 2-thiopyrimidine, 2-thiopyridine and 5-carboxy-2-thiouracil; limits of detection, 4 × 10⁻³ to 1 × 10⁻⁴ M⁻¹ min⁻¹, depending on whether the product migrated to an area of the gel with high or low background). The family A isolate reacted very slowly with radiolabelled uracil, with a rate (2 × 10⁻⁴ M⁻¹ min⁻¹) comparable to that of the pool with 4SUra after four rounds of selection.
Protein enzymes that synthesize pyrimidine nucleotides are thought to catalyse an SN1 reaction by stabilizing an oxocarbocation at the C1' carbon of the reaction centre12,13. One challenge for these enzymes is to avoid the hydrolysis of pRpp to ribose 5'-phosphate and pyrophosphate, which occurs if water rather than the pyrimidine base is proximal to the highly reactive carbocation. The enzyme that synthesizes orotidine (6-carboxyuridine) avoids hydrolysing pRpp by excluding water from the active site and by promoting carbocation formation only after a conformational change induced by binding of orotate (6-carboxyuracil)12,13.
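Returning briefly to the kinetic parameters reported above for the family A isolate, the following minimal sketch shows how the apparent second-order rate constant and the rate enhancement over the uncatalysed bound follow from them. It assumes simple Michaelis–Menten behaviour and uses only the central values quoted in the text (Km 28 mM, kcat 0.13 min^-1, detection limit 6 × 10^-7 M^-1 min^-1); the function and variable names are illustrative, and the quoted standard errors are ignored.

```python
def michaelis_menten_rate(substrate_molar, kcat_per_min, km_molar):
    """Observed first-order rate (min^-1) at a given substrate concentration."""
    return kcat_per_min * substrate_molar / (km_molar + substrate_molar)

# Central values quoted in the text for the family A isolate.
kcat_per_min = 0.13            # min^-1
km_molar = 28e-3               # 28 mM
uncatalysed_upper_bound = 6e-7 # M^-1 min^-1, slowest reaction the assay could detect

# Apparent second-order rate constant, kcat/Km (M^-1 min^-1).
catalytic_efficiency = kcat_per_min / km_molar

# Lower bound on the rate enhancement relative to the uncatalysed reaction.
rate_enhancement = catalytic_efficiency / uncatalysed_upper_bound

# Rate at 14 mM 4SUra, the highest concentration that solubility allowed.
rate_at_solubility_limit = michaelis_menten_rate(14e-3, kcat_per_min, km_molar)

print(f"kcat/Km          = {catalytic_efficiency:.2f} M^-1 min^-1")
print(f"rate enhancement >= {rate_enhancement:.2e}")
print(f"rate at 14 mM    = {rate_at_solubility_limit:.4f} min^-1")
```

With these inputs kcat/Km is about 4.6 M^-1 min^-1, and the enhancement comes out a little under 10^7, consistent with the 'up to 10⁷' statement. Because 14 mM is half the apparent Km, the fastest rate that could actually be measured is only one third of kcat, which is why the fitted Km and kcat rest on an extrapolation.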
As with the metabolic enzymes, our ribozymes were selected to avoid pRpp hydrolysis; any catalyst that hydrolysed its pRpp moiety before 4SUra addition (for example, during the T4 RNA ligase incubation needed to attach the pRpp) would have been inactive for 4SU synthesis. Thus, we examined the degree to which the ribozymes promoted the hydrolysis of the tethered pRpp. Ribozymes from the three families promoted the hydrolysis of their pRpp moiety at rates 12–23 times faster than uncatalysed hydrolysis (Fig. 4a). Nevertheless their kcat values for 4SU formation were 60 times faster than their rates of catalysed hydrolysis. Either RNA does have the ability to exploit highly reactive reaction intermediates or these ribozymes have employed a new strategy for promoting glycosidic bond formation, that is, by stabilizing a transition state with more SN2 character. As with many ribozymes and the metabolic enzymes that synthesize nucleotides, all three ribozyme families required divalent cations for activity. For each round of selection, Mg2+ (25 mM), Ca2+ (1 mM) and Mn2+ (0.5 mM) had been provided. However, Ca2+ was dispensable for all three families. Although the three families were active in the presence of Mn2+ as the only divalent cation, all preferred Mg2+ over Mn2+ as the major divalent cation. The family A isolate did not require Mn2+, with only a twofold decrease in activity observed in the absence of Mn2+. The family B and C ribozymes required Mn2+, with the activity in the presence of 25 mM Mg2+ reaching a plateau at 1 mM Mn2+. The family B ribozyme did not require Mn2+ for stimulating pRpp hydrolysis, suggesting that for this family Mn2+ has a role in the binding or proper orientation of the 4SUra, consistent with the thiophilic nature of Mn2+ compared with Mg2+ and Ca2+ (ref. 19). Ribozymes of an RNA world would have needed to promote numerous reactions involving small organic molecules20. 
An important question in this regard is whether RNA can perform covalent chemistry with substrates smaller than purine nucleosides. Ribonucleosides as small as adenosine and 2-aminopurine (Mr 267.2) are substrates for self-splicing intron derivatives21,22. 4SUra (Mr 128.1) is half the size of these nucleoside substrates and within the size range of the smallest aptamer targets, valine and arginine23 (Mr 117.2 and 174.2, respectively). The findings that a catalytic RNA can specifically recognize and utilize such a substrate and that RNA can efficiently promote the chemistry of glycosidic bond formation support the prospects of ribozyme-based metabolic pathways in the RNA world. Another important step would be to generate catalytic sequences capable of using not one but two small-molecule substrates. With regard to this goal it is encouraging that after optimization by evolution and engineering in vitro, a ribozyme motif initially selected on the basis of a reaction using an attached nucleoside triphosphate was able to promote a reaction using free nucleoside triphosphates24. Similarly, our new ribozymes offer a basis for developing catalysts that synthesize 4SU from two small molecules, 4SUra and pRpp.
Methods
Pool construction and amplification. The double-stranded DNA pool with sequence TTCTAATACGACTCACTATAGGACCGAGAAGTTACCC-N76-CCTTGG-N76-GGCACC-N76-ACGCACATCGCAGCAAC (italics, T7 promoter; -N76-, 76-nucleotide random-sequence segment) was constructed starting with three synthetic single-stranded pools, as described previously17,25. The phosphoramidite ratio for random-sequence DNA synthesis was normalized to account for differing coupling rates (0.26:0.25:0.29:0.20 dA:dC:dG:dT molar ratio). Sequencing 2,030 random-sequence positions from arbitrary clones verified that the nucleotide composition was nearly equal (529:519:482:500 A:C:G:T).
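The statement that the composition was 'nearly equal' can be checked directly from the counts given above. The sketch below converts the counts to fractions and, as an additional check that is not part of the original analysis, computes a Pearson chi-square statistic against a uniform 25% expectation. The 5% critical value for three degrees of freedom (7.815) is a standard tabulated figure; all names are illustrative.

```python
# Base counts from sequencing 2,030 random-sequence positions (from the text).
counts = {"A": 529, "C": 519, "G": 482, "T": 500}

total = sum(counts.values())
fractions = {base: n / total for base, n in counts.items()}

# Pearson chi-square against equal representation of the four bases.
# This test is our addition, offered only as a sanity check.
expected = total / 4
chi_square = sum((n - expected) ** 2 / expected for n in counts.values())

CHI2_CRITICAL_DF3_P05 = 7.815  # tabulated 5% critical value, 3 degrees of freedom

for base, frac in fractions.items():
    print(f"{base}: {frac:.3f}")
print(f"chi-square = {chi_square:.2f} (critical value {CHI2_CRITICAL_DF3_P05})")
```

The statistic falls well below the critical value, so on this simple test the observed counts are compatible with equal base frequencies, which is what the normalized phosphoramidite ratio was intended to achieve.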
The DNA pool was transcribed in vitro and 16.4 nmol RNA (six copies, on average, of each pool sequence) were used for the first round of selection.
Tethering pRpp and p4SU to RNA. pRpp was linked to RNA by exploiting the finding that specificity for the donor substrate of T4 RNA ligase can be bypassed by preadenylylation26 (Fig. 2a). Adenylylated pRpp (AppRpp) was synthesized by reacting 300 mM adenosine 5'-phosphorimidazolide27 with 600 mM pRpp in 200 mM MgCl2 for 2 h at 50 °C. After ligation (2–4 µM gel-purified RNA, 65 µM HPLC-purified AppRpp, 50 mM HEPES pH 8.3, 10 mM MgCl2, 3.3 mM dithiothreitol (DTT), 10 µg/ml BSA, 8.3% v/v glycerol, 1 U µl⁻¹ enzyme from Pharmacia, for 4 h at room temperature) and extraction with phenol–chloroform, RNA was recovered by precipitation with ethanol. The efficacy of ligation was verified by using matrix-assisted mass spectrometry of a 9-nt ligation product before and after treatment with alkaline phosphatase, which removed the 1' pyrophosphate. At each round, the efficiency of ligation to the pool RNA was determined by the gel mobility of 3'-terminal fragments cleaved by a DNA enzyme28 targeted to the 3' constant segment. RNAs extended with 4SU (for use as synthetic standards) were generated, using App4SU instead of AppRpp.
Ribozyme selection. 4SUra was synthesized29 and further purified by reverse-phase HPLC. For each round of selection the RNA-pRpp pool was incubated with 4SUra (Fig. 2b) using the concentrations and incubation times outlined in Fig. 3 (0.3 µM pool RNA, 50 mM Tris-HCl pH 7.5, 150 mM KCl, 25 mM MgCl2, 1 mM CaCl2, 0.5 mM MnCl2 at 23 °C). Ribozyme reactions were stopped by the addition of one volume of gel loading buffer (90% formamide, 50 mM EDTA). RNA with a 3'-terminal 4SU was resolved from other species on denaturing APM gels14,30 (8 M urea/5% polyacrylamide gels, cast with 80 µM N-acryloylaminophenylmercuric acetate). During each round of selection, the ribozyme reaction was split in half.
One half contained radiolabelled pool RNA that facilitated the detection and purification of emergent ribozymes; the other half contained unlabelled pool and a trace amount of radiolabelled synthetic standard (an RNA pool with a 3'-terminal 4SU but with primer-binding sequences incompatible with reverse transcription and PCR) that was used to locate and monitor the recovery of reacted RNA. The gel fragment containing RNA-4SU was excised (Fig. 2c), and RNA was eluted in 300 mM NaCl, 20 mM DTT, then precipitated with ethanol. After round 1, RNA containing 4SU was further purified on a second mercury gel and then biotinylated (Fig. 2d) by resuspending precipitated RNA in 25 mM potassium phosphate pH 8.4, 3 mM iodoacetyl-LC-biotin (Pierce), 50% v/v dimethylformamide. After 3 h at room temperature in the dark, the reaction was quenched with DTT and diluted 10-fold; RNA was then twice precipitated with ethanol to remove excess biotin. About 75% of material end-labelled with 4SU was biotinylated, as judged by a streptavidin gel-shift assay. Capture of biotinylated RNA, reverse transcription, PCR amplification and transcription in vitro were as described previously25.
Ribozyme assays. Unless stated otherwise, ribozyme isolates were assayed under conditions similar to those of the selection (0.5 µM 32P-labelled ribozyme RNA, 50 mM N,N-(bis-2-hydroxyethyl)-2-aminoethanesulphonic acid (BES) pH 7.5, 150 mM KCl, 25 mM MgCl2, 0.5 mM MnCl2 at 23 °C). The rate of 4SU synthesis (kobs) at a given 4SUra concentration was determined by the best fit of $k$ and $\alpha$ to the equation $F = \alpha\,(1 - e^{-kt})$, where $F$ is the fraction reacted (ascertained by phosphorimaging of the APM gel), $k = k_{\mathrm{obs}} + k_{\mathrm{hyd}}$, $\alpha = F_{\mathrm{max}}\,k_{\mathrm{obs}}/(k_{\mathrm{obs}} + k_{\mathrm{hyd}})$, $k_{\mathrm{hyd}}$ is the rate of pRpp hydrolysis, and $t$ is time. Fmax, the maximal active fraction (typically 0.15–0.20), was determined by using time courses with 4SUra in excess of 4 mM.
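The time-course model just described can be written out as a short sketch. It assumes the two competing first-order processes stated in the text (4SU synthesis at kobs and pRpp hydrolysis at khyd) acting on the active fraction Fmax; the amplitude term is called `alpha` in the code. The numerical values below are hypothetical and chosen only to exercise the functions; they are not data from this work.

```python
import math

def fraction_reacted(t_min, k_obs, k_hyd, f_max):
    """Fraction of RNA converted to RNA-4SU after t_min minutes.

    Synthesis (k_obs) and hydrolysis (k_hyd) deplete the same RNA-pRpp,
    so the approach to the plateau is governed by k = k_obs + k_hyd.
    """
    k = k_obs + k_hyd
    alpha = f_max * k_obs / k  # plateau reached at long times
    return alpha * (1.0 - math.exp(-k * t_min))

def rates_from_fit(k, alpha, f_max):
    """Recover (k_obs, k_hyd) from fitted k and alpha, given f_max."""
    k_obs = alpha * k / f_max  # rearranged from alpha = f_max * k_obs / k
    return k_obs, k - k_obs

# Hypothetical parameters, for illustration only (rates in min^-1).
k_obs_true, k_hyd_true, f_max = 0.04, 0.002, 0.18

k_fit = k_obs_true + k_hyd_true
alpha_fit = f_max * k_obs_true / k_fit
k_obs_back, k_hyd_back = rates_from_fit(k_fit, alpha_fit, f_max)

for t in (0, 10, 30, 120):
    print(t, round(fraction_reacted(t, k_obs_true, k_hyd_true, f_max), 4))
```

One consequence of this form is that the plateau depends on the ratio of kobs to khyd, so a change in the asymptotic fraction reacted with 4SUra concentration carries information about khyd even when hydrolysis is slow compared with synthesis.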
Factors contributing to the low Fmax values included: incomplete pRpp ligation (lowering Fmax by 10–30%), 3' heterogeneity from untemplated residues added during transcription (lowering Fmax by 40–50%), and ribozyme misfolding. The khyd values were determined by varying 4SUra concentration and observing differences in the asymptotic fraction reacted. The khyd values were confirmed independently by monitoring the inactivation of RNA-pRpp in the absence of 4SUra.
The rate of uridine synthesis was determined by using isolate a15 and a random-sequence control RNA, both activated with pRpp. Each RNA (7 µM) was incubated with 0.5 mCi [5,6-3H]uracil (NEN) in standard buffer conditions. At 0, 2, 4 and 8 h, 160 µl aliquots were quenched by the addition of EDTA and unlabelled uracil. RNA was filtered (Centricon spin filters; Amicon), purified on a polyacrylamide gel and precipitated with ethanol. Uracil counts associated with the RNA were determined by scintillation (Formula-989 fluid, Packard) and corrected for RNA recovery. A similar approach with [2-14C]uracil indicated a comparable rate (within twofold). Additional analysis of RNA recovered from the scintillation fluid (precipitation with NaCl and six volumes of ethanol, and resuspension with unlabelled uridine) confirmed that the incorporated counts were from 3'-terminal uridine synthesis; RNA was base-hydrolysed and nucleotides were separated by reverse-phase HPLC and scintillation-counted.
The uncatalysed reaction rates were examined with a 9-nt RNA-pRpp conjugate that was 32P-labelled at its 5' terminus. After incubation with 6.4 mM 4SUra for 4 days in the buffer used for the selection, no RNA-p4SU product was observed on APM gels. A single gel would have readily detected uncatalysed RNA-p4SU formation with a rate of 2 × 10⁻⁶ M⁻¹ min⁻¹ (correcting for the RNA-pRpp lost to hydrolysis), whereas a serial APM gel analysis lowered detection limits to 6 × 10⁻⁷ M⁻¹ min⁻¹.
The RNA-pRpp hydrolysed during the 4-day time course indicated an uncatalysed $k_{\mathrm{hyd}}$ of 9 × 10⁻⁵ min⁻¹.

Received 13 March 1998; accepted 20 July 1998

References
1. Joyce, G. F. & Orgel, L. E. in The RNA World (eds Gesteland, R. F. & Atkins, J. F.) 1–25 (Cold Spring Harbor Lab., Cold Spring Harbor, NY, 1993).
2. Fuller, W. D., Sanchez, R. A. & Orgel, L. E. Studies in prebiotic synthesis VI. Synthesis of purine nucleosides. J. Mol. Biol. 67, 25–33 (1972).
3. Fuller, W. D., Sanchez, R. A. & Orgel, L. E. Studies in prebiotic synthesis VII. Solid-state synthesis of purine nucleosides. J. Mol. Evol. 1, 249–257 (1972).
4. Larralde, R., Robertson, M. P. & Miller, S. L. Rates of decomposition of ribose and other sugars: implications for chemical evolution. Proc. Natl Acad. Sci. USA 92, 8158–8160 (1995).
5. Mizuno, T. & Weiss, A. H. Synthesis and utilization of formose sugars. Adv. Carbohyd. Chem. Biochem. 29, 173–227 (1974).
6. Pitsch, S., Eschenmoser, A., Gedulin, B., Hui, S. & Arrhenius, G. Mineral induced formation of sugar phosphates. Origin Life Evol. Biosphere 25, 297–334 (1995).
7. Oro, J. Mechanism of synthesis of adenine from hydrogen cyanide under possible primitive earth conditions. Nature 191, 1193–1194 (1961).
8. Sanchez, R. A., Ferris, J. P. & Orgel, L. E. Studies in prebiotic synthesis. II. Synthesis of purine precursors and amino acids from aqueous hydrogen cyanide. J. Mol. Biol. 30, 223–253 (1967).
9. Ferris, J. P., Sanchez, R. A. & Orgel, L. E. Studies in prebiotic synthesis. 3. Synthesis of pyrimidines from cyanoacetylene and cyanate. J. Mol. Biol. 33, 693–704 (1968).
10. Stoks, P. G. & Schwartz, A. W. Uracil in carbonaceous meteorites. Nature 282, 709–710 (1979).
11. Robertson, M. P. & Miller, S. L. An efficient prebiotic synthesis of cytosine and uracil.
Nature 375, 772–774 (1995).
12. Bhatia, M. B., Vinitsky, A. & Grubmeyer, C. Kinetic mechanism of orotate phosphoribosyltransferase from Salmonella typhimurium. Biochemistry 29, 10480–10487 (1990).
13. Tao, W., Grubmeyer, C. & Blanchard, J. S. Transition state structure of Salmonella typhimurium orotate phosphoribosyltransferase. Biochemistry 35, 14–21 (1996).
14. Igloi, G. L. Interaction of tRNAs and of phosphorothioate-substituted nucleic acids with an organomercurial. Probing the chemical environment of thiolated residues by affinity electrophoresis. Biochemistry 27, 3842–3849 (1988).
15. Wierzchowski, K. L., Litonska, E. & Shugar, D. Infrared and ultraviolet studies on the tautomeric equilibria in aqueous medium between monoanionic species of uracil, thymine, 5-fluorouracil, and other 2,4-diketopyrimidines. J. Am. Chem. Soc. 87, 4621–4629 (1965).
16. Psoda, A., Kazimierczuk, Z. & Shugar, D. Structure and tautomerism of the neutral and monoanionic forms of 4-thiouracil derivatives. J. Am. Chem. Soc. 96, 6832–6839 (1974).
17. Bartel, D. P. & Szostak, J. W. Isolation of new ribozymes from a large pool of random sequences. Science 261, 1411–1418 (1993).
18. Gray, M. W. The presence of 2'-O-methylpseudouridine in the 18S + 26S ribosomal ribonucleates of wheat embryo. Biochemistry 13, 5453–5463 (1974).
19. Jaffe, E. K. & Cohn, M. Diastereomers of the nucleoside phosphorothioates as probes of the structure of the metal nucleotide substrates and of the nucleotide binding site of yeast hexokinase. J. Biol. Chem. 254, 10839–10845 (1979).
20. Benner, S. A., Ellington, A. D. & Tauer, A. Modern metabolism as a palimpsest of the RNA world. Proc. Natl Acad. Sci. USA 86, 7054–7058 (1989).
21. Michel, F., Hanna, M., Green, R., Bartel, D. P. & Szostak, J. W. The guanosine binding site of the Tetrahymena ribozyme. Nature 342, 391–395 (1989).
22. Been, M. D. & Perrotta, A. T. Group I intron self-splicing with adenosine: evidence for a single nucleoside-binding site. Science 252, 434–437 (1991).
23. Gold, L., Polisky, B., Uhlenbeck, O. & Yarus, M. Diversity of oligonucleotide functions. Annu. Rev. Biochem. 64, 763–797 (1995).
24. Ekland, E. H. & Bartel, D. P. RNA-catalysed RNA polymerization using nucleoside triphosphates. Nature 382, 373–376 (1996).
25. Ekland, E. H. & Bartel, D. P. The secondary structure and sequence optimization of an RNA ligase ribozyme. Nucleic Acids Res. 23, 3231–3238 (1995).
26. England, T. E., Gumport, R. I. & Uhlenbeck, O. C. Dinucleoside pyrophosphates are substrates for T4-induced RNA ligase. Proc. Natl Acad. Sci. USA 74, 4839–4842 (1977).
27. Lohrmann, R. & Orgel, L. E. Preferential formation of (2'-5')-linked internucleotide bonds in nonenzymatic reactions. Tetrahedron 34, 853–855 (1978).
28. Santoro, S. W. & Joyce, G. F. A general purpose RNA-cleaving DNA enzyme. Proc. Natl Acad. Sci. USA 94, 4262–4266 (1997).
29. Mizuno, Y., Ikehara, M. & Watanabe, K. A. Potential antimetabolites. I. Selective thiation of uracil and 1,2,4-triazine-3,5(2H,4H)-dione (6-azauracil). Chem. Pharmac. Bull. 10, 647–652 (1962).
30. Wecker, M., Smith, D. & Gold, L. In vitro selection of a novel catalytic RNA: characterization of a sulfur alkylation reaction and interaction with a small peptide. RNA 2, 982–994 (1996).

Acknowledgements. We thank J. Stubbe, P. Zamore and members of the lab for helpful comments on the manuscript, and G. Joyce for providing the sequence of the RNA-cleaving DNA enzyme²⁸ before publication. This work was supported by an MRC (Canada) postdoctoral fellowship to P.J.U. and a grant from the Searle Scholars Program/The Chicago Community Trust to D.P.B.

Figure 1 UPRT-catalysed synthesis of uridine 5'-phosphate from pRpp and uracil.
Figure 2 In vitro selection scheme. Symbols: PPi, pyrophosphate; B, biotin. a, Tethering pool RNA to pRpp by using T4 RNA ligase and adenylylated pRpp (AppRpp). b, 4SU synthesis promoted by active sequences within the initial pool. c, Enrichment of reacted sequences by using APM gels. One lane (left) contained radiolabelled pool RNA. The other lane (right) contained radiolabelled synthetic standard to mark the location of the reacted RNA. After the first round of selection, catalysts were enriched by serial purification on two APM gels. d, Further enrichment of 4SU-containing sequences by derivatization with iodoacetyl-LC-biotin (Pierce) and capture with streptavidin magnetic beads. e, Amplification of enriched RNA by reverse transcription, PCR and transcription. The amplified RNA was then subjected to another round of selection in vitro.

Figure 3 Increased ribozyme activity with successive rounds of selection. The upper bound for the uncatalysed rate is also plotted (triangle). Rounds 4 to 6 included error-prone PCR amplification, as described¹⁷. From rounds 7 to 10 the stringency of the selection was increased exponentially by decreasing both 4SUra concentration (from 8 mM to 40 µM) and incubation time (from 18 h to 7.5 min). Reaction rates were calculated by dividing the observed initial rate by the 4SUra incubation concentration.

Figure 4 Analysis of ribozyme sequences and their catalytic proficiencies. a, Three families of ribozymes with rates of 4SU production and pRpp hydrolysis. Sequence analysis is summarized by family trees, with each branch terminus representing one of the 35 sequenced isolates. Similarity to the family consensus sequence varied from 94 to 86%, as indicated by the horizontal length of the branches. Family A has two dominant subfamilies, which presumably emerged through preferential expansion of superior early lineages.
Because the family C consensus sequence could not be fully determined from only two isolates, similarity to the consensus is reported as a range (dashed lines). b, Rate of nucleotide formation ($k_{\mathrm{obs}}$) as a function of 4SUra concentration for isolates a15 (circles), b01 (squares) and c05 (diamonds). The rates of isolates b01 and c05 fit well to linear functions indicating $k_{\mathrm{cat}}/K_{\mathrm{m}}$ values of 1.29 ± 0.03 and 0.67 ± 0.02 M⁻¹ min⁻¹, respectively. The line shown for isolate a15 is the nonlinear least-squares fit of a Michaelis–Menten curve to the data and suggests an apparent $K_{\mathrm{m}}$ of 28 ± 4 mM and a $k_{\mathrm{cat}}$ of 0.13 ± 0.012 min⁻¹ (mean ± s.e.m.), although these values must be viewed with caution because solubility constraints prevented the examination of 4SUra concentrations above 14 mM. For example, we cannot discount the possibility that 4SUra was beginning to occupy an inhibitory site rather than the catalytic site. The linear behaviour of the other two isolates suggests that 4SUra within this concentration range does not aggregate or affect metal-ion availability.

Figure 5 Two-dimensional TLC analysis¹⁸ of ribozyme-synthesized 4SU after labelling and digestion with nuclease. Ribozyme product RNA was extended by one nucleotide by using α-³²P-cordycepin triphosphate (3'-deoxyATP) and poly(A) polymerase, then purified on an APM gel. Ribonuclease T2 digestion reduced all the labelled material into nucleoside 3'-phosphates, with the labelled phosphate residing on the ribozyme-synthesized nucleotide. Digests were spotted on 10 × 10 cm cellulose TLC plates (Baker flex) presoaked in 1:10 saturated (NH4)2SO4:H2O. The first axis was run in 80% v/v ethanol, and after air-drying for 5 min, the second axis of 40:1 saturated (NH4)2SO4:isopropanol was run. The reference panel shows the migration of 4SUp, Up, Ap, Cp and Gp. Unlabelled or body-labelled 4SU-containing RNA was included as carrier in the digestion (top and bottom panels, respectively).
Carrier RNA was generated by transcription, with 4SUTP replacing UTP.

January 2000 Volume 7 Number 1 pp 28–33

Ribozyme-catalyzed tRNA aminoacylation

Nick Lee¹, Yoshitaka Bessho¹, Kenneth Wei¹, Jack W. Szostak² & Hiroaki Suga¹

1. Department of Chemistry, State University of New York at Buffalo, Buffalo, New York 14260-3000, USA.
2. Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts 02114, USA.
Correspondence should be addressed to H. Suga. e-mail: hsuga@acsu.buffalo.edu

The RNA world hypothesis implies that coded protein synthesis evolved from a set of ribozyme-catalyzed acyl-transfer reactions, including those of aminoacyl-tRNA synthetase ribozymes. We report here that a bifunctional ribozyme generated by directed in vitro evolution can specifically recognize an activated glutaminyl ester and aminoacylate a targeted tRNA, via a covalent aminoacyl-ribozyme intermediate. The ribozyme consists of two distinct catalytic domains; one domain recognizes the glutamine substrate and self-aminoacylates its own 5'-hydroxyl group, and the other recognizes the tRNA and transfers the aminoacyl group to the 3'-end. The interaction of these domains results in a unique pseudoknotted structure, and the ribozyme requires a change in conformation to perform the sequential aminoacylation reactions. Our result supports the idea that aminoacyl-tRNA synthetase ribozymes could have played a key role in the evolution of the genetic code and RNA-directed translation.

According to the RNA world hypothesis, the evolution of coded protein synthesis was a critical step in the transition from the RNA world to modern biological systems¹⁻³. Several lines of experimental evidence support the postulate that ribosomal RNA (rRNA) participates directly in protein synthesis, and it has been argued that primitive versions of the translation apparatus were fully RNA-based⁴,⁵.
In nature, the aminoacylation of tRNA is now catalyzed solely by protein enzymes, the aminoacyl-tRNA synthetases (aaRSs), whose amino acid specificities determine the genetic code⁶. Therefore, a completely RNA-based protein synthesis system would require a set of RNA molecules capable of synthesizing aminoacyl-tRNAs in place of the protein aaRSs. The accuracy of aminoacyl-tRNA synthesis relies on the specificity of the synthetases for both amino acid and tRNA¹,⁷⁻⁹. Ribozyme synthetases should be capable of similar specificity. There are now many examples of specific amino acid recognition by RNA aptamers¹⁰⁻¹². Furthermore, a number of aminoacyl-transferase ribozymes have been isolated by directed in vitro evolution, including self-aminoacylating RNAs¹³,¹⁴, 3'- to 2'- or 5'-acyl-transferases¹⁵,¹⁶, and amide and peptide synthases¹⁷,¹⁸. However, no ribozyme has been reported with the ability to specifically aminoacylate a distinct substrate RNA, such as a tRNA. Here we report the in vitro evolution of such a ribozyme.

Aminoacyl-transfer from donor substrate to tRNA

In previous work¹⁶,¹⁹,²⁰ we isolated and characterized an acyl-transferase ribozyme by selecting for enhanced transfer of an N-biotinyl-L-methionyl (Biotin-L-Met) group from the 3' end of a donor hexanucleotide, 5'-pCAACCA-3', to the 5'-hydroxyl group of the ribozyme. Further experiments allowed us to delete non-essential regions of the original sequence, leading to a smaller version of the acyl-transferase ribozyme (82 nt), referred to as ATRib (Fig. 1a), that retains essentially full catalytic activity. The secondary structure of ATRib consists of four stems (P1–P4) and three hairpin-loops (L2–L4) arranged in a cloverleaf structure. Since ATRib was selected to self-modify its own 5'-hydroxyl group, it acts as a single-turnover ribozyme. However, since the acyl transfer reaction is energetically neutral, the ribozyme-catalyzed reaction should be readily reversible.
This led us to test whether another oligoribonucleotide 'acyl-acceptor', containing a sequence that would allow it to interact with the internal guide sequence (IGS) on the ribozyme, could accept the acyl group from acylated ribozyme (Fig. 1b). This 'ping-pong' process would potentially allow the ribozyme to act as a multiple-turnover catalyst. We demonstrated this concept using tRNAfMet with an ATRib having a 5'-UGGUU-3' IGS (Fig. 2a). The deletion of G82 of ATRib (Fig. 1a) was essential for the ribozyme to exhibit catalytic activity for aminoacylation of tRNA. Under single-turnover conditions (lanes 1–7), ~30% of tRNAfMet was acylated within 30 min. Under optimized multiple-turnover conditions, 30–40% conversion to acyl-tRNAfMet was observed in 120 min (Fig. 2a, lane 8), corresponding to 1.5–2 turnovers. Because tRNA acylation results in only a small change in molecular weight, visualization of the new product by gel electrophoresis required the use of N-biotinyl-L-methionine as a substrate and addition of streptavidin to shift the mobility by binding to Biotin-L-Met-tRNAfMet (compare lane 7 with lane 9). In the absence of either ribozyme or donor oligonucleotide, no acyl-tRNAfMet was detected (lanes 10 and 11), indicating that aminoacylation of tRNAfMet was catalyzed by the ribozyme. The ribozyme-catalyzed tRNA charging reaction described above involves tRNA recognition by base-pairing to five complementary bases of the IGS. The five bases of the tRNA that are recognized include the CCA-3' sequence common to the 3' ends of all tRNAs, the so-called discriminator base (position 73, A73), and A72. On the basis of this proposed base-pairing, the ribozyme should be able to discriminate between some but not all tRNAs. The degree of ribozyme-catalyzed aminoacylation of tRNA is indeed directly related to the sequence complementarity (Fig. 2b). In contrast, the aminoacyl group of the substrate is not a critical determinant of binding.
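The turnover figure quoted for the multiple-turnover reaction follows from simple arithmetic, sketched below. The concentrations are the ones listed in the Methods and the Fig. 2 legend for multiple-turnover conditions (tRNA at five times the ribozyme concentration); only their ratio matters here. The function name is a hypothetical helper introduced for this illustration.

```python
def turnovers(fraction_converted, substrate_conc, catalyst_conc):
    """Product formed per catalyst molecule, with both concentrations in the same unit."""
    return fraction_converted * substrate_conc / catalyst_conc

# Multiple-turnover conditions: tRNA at 1 and ATRib at 0.2 (same concentration unit).
TRNA = 1.0
ATRIB = 0.2

low = turnovers(0.30, TRNA, ATRIB)   # 30% conversion
high = turnovers(0.40, TRNA, ATRIB)  # 40% conversion
print(f"{low:.1f} to {high:.1f} turnovers")
```

A value above 1 means that each ribozyme molecule has, on average, been acylated by the donor and discharged onto tRNA more than once, which is what distinguishes this reaction from the original single-turnover self-modification.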
The ribozyme could transfer phenylalanyl and leucyl groups to tRNAPhe with almost the same efficiency as the methionyl group (Fig. 2c).

Evolution of an amino acid recognition domain

Specificity of modern biological tRNA synthetases resides largely in the amino acid activation reaction. Addition of an amino acid-specific activation or charging domain to ATRib could therefore potentially generate a ribozyme with the fundamental properties of a true tRNA synthetase. In the scheme illustrated in Fig. 3a, the ribozyme reacts with an activated amino acid substrate to form a covalent acyl-intermediate that subsequently reacts with an acceptor tRNA. In principle this could be a multiple-turnover process. In designing a selection procedure to implement this process, we wished to ensure that the ribozyme would recognize the activated amino acid substrate primarily through interactions with the amino acid, and not the activating group. We therefore avoided aminoacyl-adenylates as substrates, and focused instead on cyanomethyl ester (CME), a simple leaving group with no hydrogen bond donors or acceptors to interact with the ribozyme. Background acylation of pool RNA is negligible at 5 mM amino acid-CME and physiological pH¹⁹. We chose L-glutaminyl-CME as our initial substrate, because its amide side chain should be a suitable recognition element for RNA (Fig. 3b). Selection was done with N-biotinyl-L-glutaminyl-CME (Biotin-L-Gln-CME), in which the biotin serves as a tag that allows the rare active RNAs to be isolated from the RNA population with streptavidin. The design of the RNA pool was driven by the need to place the random sequence segment in a region accessible for the 5'-hydroxyl acylation without interfering with formation of the catalytic core of ATRib. On the basis of our structural studies of ATRib, the RNA pool was designed to have a 70 nt random sequence flanked by the 80 nt ATRib sequence at the 5'-end and a 20 nt constant sequence at the 3'-end (Fig. 3a).
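A short calculation shows how thinly a pool of this design samples sequence space, which is the point made later when the authors describe their ribozyme as drawn from a very sparse sample. The 70-nt random length comes from the text and the pool size of about 10^15 molecules from the Fig. 3 legend; the assumption that all four nucleotides are equally likely at every random position is made here only for the sketch.

```python
RANDOM_POSITIONS = 70   # length of the random segment (from the text)
POOL_MOLECULES = 1e15   # approximate pool size (from the Fig. 3 legend)

# Number of distinct sequences possible in the random segment,
# assuming four equally likely nucleotides per position.
sequence_space = 4 ** RANDOM_POSITIONS

# Upper bound on the fraction of sequence space present in the pool
# (reached only if every molecule in the pool is unique).
coverage = POOL_MOLECULES / sequence_space

print(f"possible sequences: {sequence_space:.2e}")
print(f"fraction sampled:   {coverage:.1e}")
```

Under these assumptions the pool covers far less than one part in 10^27 of the possible 70-mers, so any catalyst recovered is best read as a first hit rather than an optimum.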
The selection was designed in three phases (Fig. 3c), beginning with a simple selection for self-acylation of the ribozyme (phase I), followed by a more specific selection for acylation on the 5'-hydroxyl (phase II), and ending with selection for retention of the original oligonucleotide-ribozyme acyl-transfer reaction (phase III). Our goal was an RNA pool that is 'ambidextrously' active in self-aminoacylation of its own 5'-hydroxyl group using both substrates, and which would therefore be likely to contain ribozymes capable of transferring the glutaminyl group first to itself and then to a tRNA. We carried out 12 rounds of the above selection (9, 1, and 2 rounds, respectively). The course of each phase of the selection was monitored by a streptavidin-dependent mobility gel-shift assay (Fig. 3d–f). The 12th-round RNA pool exhibited almost equal activity toward both Biotin-L-Gln-CME and Biotin-L-Phe-3'-ACCAAC-5' (Fig. 3f). A total of 27 clones from pool 12 RNA were sequenced and aligned (Fig. 4a). We identified three sequence classes (classes I, II and III) and 14 unique sequences. Two clones from each of the classes, and all unique clones, were individually tested for activity with both substrates. Class I ribozymes exhibited activity with both substrates, whereas those in classes II and III exhibited preferential activity for the acyl-hexanucleotide rather than the glutamine substrate. We also identified five ambidextrous clones from the unique sequences. The remaining clones showed activity solely or predominantly towards one substrate. We selected the AD02 ribozyme in class I for further studies. The AD02 ribozyme exhibits self-aminoacylating activity with both Biotin-L-Gln-CME and Biotin-L-Phe-3'-ACCAAC-5' substrates (Fig. 4b, lanes 1 and 6, and 2 and 7, respectively). To explore the amino acid specificity of the ribozyme, we tested self-aminoacylation with three non-cognate Biotin-L-aminoacyl-CME substrates, phenylalanine, leucine, and valine (Fig.
4b, lanes 3–5, and lanes 8–10). Self-aminoacylating activity with all of these substrates is considerably reduced relative to the cognate substrate, indicating that the new catalytic domain has specificity for the glutamine side chain. We therefore refer to it as the glutamine-recognition (QR) domain.

Secondary structure of the Gln-recognition domain

A secondary structure model of the QR domain predicted by the Zuker algorithm²¹ (Fig. 5a) consists of three major stems referred to as P5, P6 (a and b) and P7. The AD02 ribozyme consists of the ATRib and the QR domains, connected by a single-stranded stretch of five or six contiguous As. The 9 nt P6b loop (L6b) contains the sequence 5'-UAACCA-3', which is complementary to the IGS of the ATRib domain. To test the proposed secondary structure, we segmented these RNA domains at the poly-A stretch, and tested whether the independent QR domain, referred to as QRtrans, could aminoacylate the 5'-hydroxyl group of ATRib (Fig. 5b). Aminoacylation of ATRib did indeed take place (lanes 1 and 2), although it occurred at a four-fold reduced rate compared to the intact ribozyme. This result suggests that the poly-A stretch acts simply as a linker to hold the two domains together. Control experiments in the absence of QRtrans, or using 5'-triphosphate-ATRib, did not yield aminoacyl-ATRib (data not shown). We then constructed two ATRib mutants and three QRtrans mutants to test the proposed base-pairing between the IGS and L6b. All mutants that disrupt or reduce the interaction diminished yields of the aminoacyl-ATRib (lanes 3–8, 11–12, and 13–16). However, compensatory mutations that restored the L6b–IGS pairing fully recovered the activity (lanes 9 and 10). These results show that base-pairing between the IGS of ATRib and L6b of the QR domain is essential for QR-mediated aminoacylation of ATRib, suggesting that the ribozyme has a pseudoknotted secondary structure in the self-aminoacylation step.
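The complementarity argument can be checked mechanically. The sketch below assumes the 5'-UGGUU-3' IGS quoted for the tRNAfMet experiments and considers Watson–Crick pairs only (G·U wobble pairs are ignored); the helper names are hypothetical and introduced just for this illustration.

```python
# Watson-Crick partners for RNA bases.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the strand (written 5'->3') that pairs antiparallel with `rna`."""
    return "".join(PAIR[base] for base in reversed(rna))

def can_pair_with_igs(igs, target):
    """True if `target` contains a stretch fully complementary to the IGS."""
    return reverse_complement(igs) in target

IGS = "UGGUU"          # assumed IGS, 5'->3'
L6B_MOTIF = "UAACCA"   # sequence in loop L6b of the QR domain
DONOR = "CAACCA"       # acyl-donor hexanucleotide
TRNA_3_END = "AACCA"   # A72, A73 (discriminator) and CCA-3'

for name, seq in [("L6b", L6B_MOTIF), ("donor", DONOR), ("tRNA 3' end", TRNA_3_END)]:
    print(name, can_pair_with_igs(IGS, seq))
```

All three partners carry the same AACCA stretch, so they compete for one guide sequence. That shared site is why the L6b–IGS pairing helps self-acylation yet has to be released before tRNA can bind.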
Ribozyme-catalyzed aminoacylation of tRNA

Next, we wished to see if the AD02 ribozyme was capable of aminoacylating tRNA using N-biotinyl-L-glutaminyl-CME as a substrate. The Michaelis–Menten parameters for glutaminyl-CME in the self-acylating reaction were determined to be $k_{\mathrm{cat}}$ = 1.95 × 10⁻³ min⁻¹ and $K_{\mathrm{m}}$ = 158 µM. For subsequent work, we set the concentration of the glutamine substrate at 5 mM. However, even when the ribozyme was pre-incubated with the glutamine substrate for 3 h (during which time it becomes largely self-acylated) and then mixed with a 10-fold excess of tRNAfMet, the yield of aminoacylated tRNA was barely above background. One possible explanation for the observed low activity is that the ribozyme, once aminoacylated at its 5'-hydroxyl group, is incapable of binding tRNA due to cooperative interactions of the QR domain with the 5'-glutaminyl group and the IGS. Attempts to use mutations to decrease the strength of the QR domain interaction with the IGS unfortunately resulted in loss of self-acylating activity. We therefore used heat-induced denaturation to make the IGS open for binding to tRNA. After four rounds of thermocycling, ~4% of the total input tRNAfMet was aminoacylated (Fig. 5c, lanes 1–5), corresponding to 0.4 ribozyme equivalents. Appearance of the product band on the gel was dependent on the presence of streptavidin (lane 6), consistent with the product being the expected biotinyl-L-glutaminyl-tRNAfMet. A product of identical mobility was generated by ATRib-catalyzed phenylalanyl transfer from the hexanucleotide to tRNAfMet (lane 11). A control experiment that was performed with glutaminyl-CME as substrate and with ATRib in place of the AD02 ribozyme (lane 7) yielded only the background level of aminoacyl-tRNA, showing that the QR domain of the ribozyme is essential for aminoacylation activity with glutaminyl-CME as substrate.
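To see why 5 mM is a reasonable working concentration for the glutamine substrate, the Michaelis–Menten parameters reported for self-acylation can be plugged into the standard rate law. This is a sketch under one explicit assumption: the Km is read as 158 µM (the unit prefix is damaged in this copy of the text), so the saturation figure below changes if the intended unit differs.

```python
def michaelis_menten_rate(k_cat, k_m, substrate):
    """First-order rate constant per ribozyme: k_cat * [S] / (K_m + [S])."""
    return k_cat * substrate / (k_m + substrate)

K_CAT = 1.95e-3        # min^-1, from the text
K_M_UM = 158.0         # micromolar; ASSUMED unit (prefix is damaged in the source)
SUBSTRATE_UM = 5000.0  # 5 mM working concentration, expressed in micromolar

rate = michaelis_menten_rate(K_CAT, K_M_UM, SUBSTRATE_UM)
saturation = rate / K_CAT  # fraction of the maximal rate reached at 5 mM

print(f"rate at 5 mM: {rate:.2e} min^-1")
print(f"fraction of k_cat: {saturation:.3f}")
```

Under that assumption the substrate is about 30-fold above Km, so self-acylation runs close to its maximal rate and the low tRNA yield cannot be blamed on substrate limitation.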
Similarly, using Biotin-L-Phe-CME as substrate, which is inactive for self-acylation of AD02, yielded only a background level of aminoacylated tRNA (lane 12). Addition of a competitor for the IGS, non-radiolabeled pCAACCA, strongly inhibited the aminoacylation of tRNA (lanes 8–10), consistent with the necessity of the ATRib domain in catalysis.

Conclusion

We have evolved a ribozyme that contains two distinct catalytic domains with different activities. These domains act sequentially to transfer an aminoacyl group first to the ribozyme itself, and then to tRNA, thus acting as an aminoacyl-tRNA synthetase. Two novel strategies were employed in the course of generating this ribozyme. First, a ribozyme that had been previously isolated as a standard self-modifying ribozyme was shown to act as a true multiple-turnover catalyst simply by providing distinct donor and acceptor substrate oligonucleotides. The approach demonstrated here may be widely applicable for readily reversible reactions. Second, we generated a complex ribozyme with two active sites having distinct activities by performing two sequential stages of directed evolution. This may be a useful strategy for evolving ribozymes that catalyze sequential transformations on a substrate. A conformational change must occur between the two sequential reactions catalyzed by the bifunctional ribozyme, since both reactions require base-pairing to the IGS. This conformational rearrangement is reminiscent of the changes that must occur between the first and second reactions of the self-splicing group I and group II introns²²⁻²⁶, as well as the more complex rearrangements that occur within the ribosome and the spliceosome²⁷⁻³¹. In our selected ribozyme, the necessary conformational change appears to be rate limiting overall. This raises an intriguing and challenging problem for future experiments to evolve ribozymes capable of switching between two conformations.
Although the activity of our aminoacyl tRNA synthetase-like ribozyme is modest, it is likely a sub-optimal catalyst because it was selected from a population that represented a very sparse sample of sequence space. Further directed evolution should yield ribozymes with greater catalytic activity and possibly higher amino acid and tRNA specificities. Since the in vitro evolution of aaRS-like ribozymes can be performed with any desired amino acid and tRNA, this strategy is a powerful method for the generation of useful catalysts for the synthesis of non-natural aminoacyl-tRNAs³²,³³. The ribozyme described here executes two of the key functions of an aminoacyl tRNA synthetase: specific amino acid recognition and charging of tRNA. Our results, coupled with previous demonstrations of ribozyme-catalyzed amide and peptide bond formation, strongly support the idea that a translation system could have evolved in the RNA world from an initial set of simple ribozymes involved in acyl-transfer functions.

Methods

Aminoacyl-transfer reaction. Reactions were performed under the following conditions: 2 µM ATRib (single-turnover) or 0.2 µM (multiple-turnover), 1 µM 5'-[³²P]-labeled tRNA (Sigma), and 10 µM Biotin-L-aminoacyl-3'-ACCAAC-5' in a buffer containing 25 mM HEPES (pH 8.0), 100 mM KCl, 50 mM MgCl2 at 25 °C. The ribozyme was preincubated with buffer in the absence of MgCl2, heated at 95 °C for 1–2 min, and cooled to 25 °C. MgCl2 was then added, followed by a 5 min equilibration. The reaction was initiated by the addition of pre-mixed tRNA and donor hexanucleotide. At each time point, a 2 µl aliquot was removed from the 20 µl reaction mixture and quenched with 4 µl MEUS buffer (25 mM MOPS, 5 mM EDTA, 8 M urea, 10 µM streptavidin, pH 6.5). The resulting solution was analyzed by 6% polyacrylamide gel electrophoresis (PAGE) in a cold room to keep the gel temperature below 20 °C.

Pool construction.
The oligonucleotides, DNA template corresponding to the sequence of ATRib, the 5'-primer containing T7 promoter sequence (5'-GGTAACACGCATATGTAATACGACTCACTATAGGAACAACTTGCAGCTTTC-3'), the random pool DNA (5'-GTGATCGTCCAACGGCCTC-N70-ACCAAAAACAAAAAGCATAACC-3'), and the 3'-primer (5'-GTGATCGTCCAACGGCCTC-3') were synthesized on an automated DNA synthesizer. After Taq polymerase-extension of these synthetic DNA templates, the full-length product was amplified by eight cycles of large-scale PCR in the presence of the 5'- and 3'-primers. Four equivalents of the pool DNA were transcribed by T7 RNA polymerase in the presence of [α-³²P]-UTP, and purified by PAGE. The resulting pool RNA was treated with calf intestinal phosphatase to remove the 5'-triphosphate.

Selection. The folded 1 µM pool RNA (10 µM for the first round of the selection) was incubated with 5 mM Biotin-L-Gln-CME in EKM buffer (50 mM EPPS, 500 mM KCl, 100 mM MgCl2, pH 7.5) at 25 °C for 3 h. This RNA was ethanol-precipitated twice, and the pellet was dissolved in EKE buffer (50 mM EPPS, 500 mM KCl, 5 mM EDTA, pH 7.0) and applied to a column containing 0.25 ml (1 ml for the first round) of streptavidin-agarose gel. The slurry mixture was gently suspended for 30 min at 25 °C, and the resulting resin was washed with 10 resin volumes of the EKE buffer, 4 M urea, followed by 5 resin volumes of water. Bound RNA was eluted with 10 mM biotin by heating at 95 °C for 10 min. The collected RNA was ethanol-precipitated and dissolved in 10 µl of water. The remaining procedures were carried out as described¹⁶.

Construction of QRtrans and its mutants. The PCR-DNA template of AD02 ribozyme was passed through G-50 Sephadex columns (Boehringer Mannheim) to remove remaining primers from the PCR reaction. The purified DNA was used in the PCR reaction for generating the wild type QRtrans UAACCA DNA using the corresponding internal 5'-primer containing T7 promoter sequence and the 3'-primer.
Two mutant DNAs, QRtrans UAGCCA and QRtrans ACGCCA, were generated by the PCR reaction with the purified DNA using the internal 5'-primer and the 3'-primers containing the corresponding mutations.

Aminoacylation of tRNA. The folded 2 µM AD02 ribozyme was equilibrated with MgCl2 for 5 min, and the reaction initiated by the addition of pre-mixed 5 mM Biotin-L-Gln-CME and 20 µM 5'-[³²P]-tRNAfMet. After incubating for 120 min at 25 °C, the reaction was subjected to PCR at 80 °C for 30 sec and 25 °C for 60 min. At each time point, a 2 µl aliquot was removed from the 20 µl reaction mixture and ethanol-precipitated twice, and the pellet was dissolved in 4 µl MEUS buffer. The remaining procedures were the same as those in the acyl-transfer reaction.

Received 3 September 1999; Accepted 2 November 1999.

REFERENCES
1. Schimmel, P., Giegé, R., Moras, D. & Yokoyama, S. Proc. Natl. Acad. Sci. USA 90, 8763–8768 (1993).
2. Yarus, M. Science 240, 1751–1758 (1988).
3. Hager, A.J., Pollard, J.D. & Szostak, J.W. Chem. Biol. 3, 717–725 (1996).
4. Piccirilli, J.A., McConnell, T.S., Zaug, A.J., Noller, H.F. & Cech, T.R. Science 256, 1420–1424 (1992).
5. Noller, H.F., Hoffarth, V. & Zimniak, L. Science 256, 1416–1419 (1992).
6. Schimmel, P. Biochemistry 28, 2747–2759 (1989).
7. Giegé, R., Sissler, M. & Florentz, C. Nucleic Acids Res. 26, 5017–5035 (1998).
8. McClain, W.H. J. Mol. Biol. 234, 257–280 (1993).
9. Nureki, O. et al. Science 280, 578–582 (1998).
10. Famulok, M. J. Am. Chem. Soc. 116, 1698–1706 (1994).
11. Majerfeld, I. & Yarus, M. Nature Struct. Biol. 1, 287–292 (1994).
12. Yang, Y., Kochoyan, M., Burgstaller, P., Westhof, E. & Famulok, M. Science 272, 1343–1347 (1996).
13. Illangasekare, M., Sanchez, G., Nickles, T. & Yarus, M. Science 267, 643–647 (1995).
14.
Illangasekare, M., Kovalchuke, O. & Yarus, M. J. Mol. Biol. 274, 519–529 (1997).
15. Jenne, A. & Famulok, M. Chem. Biol. 5, 23–34 (1998).
16. Lohse, P.A. & Szostak, J.W. Nature 381, 442–444 (1996).
17. Wiegand, T.W., Janssen, R.C. & Eaton, B.E. Chem. Biol. 4, 675–683 (1997).
18. Zhang, B. & Cech, T.R. Nature 390, 96–100 (1997).
19. Suga, H., Lohse, P.A. & Szostak, J.W. J. Am. Chem. Soc. 120, 1151–1156 (1998).
20. Suga, H., Cowan, J.A. & Szostak, J.W. Biochemistry 37, 10118–10125 (1998).
21. Mathews, D.H., Sabina, J., Zuker, M. & Turner, D.H. J. Mol. Biol. 288, 911–940 (1999).
22. Burke, J.M. et al. Cell 45, 167–176 (1986).
23. Golden, B.L. & Cech, T.R. Biochemistry 35, 3754–3763 (1996).
24. Chanfreau, G. & Jacquier, A. EMBO J. 15, 3466–3476 (1996).
25. Costa, M., Deme, E., Jacquier, A. & Michel, F. J. Mol. Biol. 267, 520–536 (1997).
26. Chu, V.T., Liu, Q., Podar, M., Perlman, P.S. & Pyle, A.M. RNA 4, 1186–1202 (1998).
27. Lodmell, J.S. & Dahlberg, A.E. Science 277, 1262–1267 (1997).
28. Agrawal, R.K. & Frank, J. Curr. Opin. Struct. Biol. 9, 215–221 (1999).
29. Konarska, M.M. & Sharp, P.A. Cell 49, 763–774 (1987).
30. Fortner, D.M., Troy, R.G. & Brow, D.A. Genes Dev. 8, 221–233 (1994).
31. Kambach, C., Walke, S. & Nagai, K. Curr. Opin. Struct. Biol. 9, 222–230 (1999).
32. Noren, C.J., Anthony-Cahill, S.J., Griffith, M.C. & Schultz, P.G. Science 244, 182–188 (1989).
33. Arslan, T., Mamaev, S.V., Mamaev, N.V. & Hecht, S.M. J. Am. Chem. Soc. 119, 10877–10887 (1997).

Figure 1: Secondary structure of acyl-transferase ribozyme (ATRib) and schematic representation of an ATRib-catalyzed aminoacylation of acceptor RNA. a, Secondary structure. The ATRib self-aminoacylates the 5'-hydroxyl group in the presence of the acyl-donor hexanucleotide. Bold letters in the ribozyme denote the internal guide sequence (IGS), which is complementary to the sequence of the donor. b, Schematic representation of the reaction. The abbreviation 'aa' stands for amino acid. The G82 of the wild type ATRib is deleted for efficient aminoacylation on tRNA.

Figure 2: Aminoacylation of tRNA catalyzed by ATRib. a, Methionylation of tRNAfMet catalyzed by ATRib, in which the IGS is 5'-UGGUU-3'. Single turnover conditions (2 µM ATRib, 1 µM tRNAfMet, and 10 µM donor) were used in lanes 1–7, and the multiple turnover conditions (0.2 µM ATRib, 1 µM tRNAfMet, and 10 µM donor) were used in lane 8. The abbreviation 'SAv' denotes streptavidin. b, Methionylation of cognate and non-cognate tRNAs catalyzed by ATRib. Sequence of the acyl-acceptor region of each tRNA is shown, and the bold letters denote the complementary sequence to that in the IGS of ATRib. Reactions were carried out under the multiple-turnover conditions (0.2 µM ATRib, 1 µM tRNAfMet, and 10 µM donor). c, Aminoacylation of tRNAPhe in the presence of various aminoacylhexanucleotides. Reactions were performed under the multiple-turnover conditions, with 0.5 µM ATRib, 2 µM tRNAPhe, and 20 µM donor for Biotin-l-Met- and Biotin-l-Phe-hexanucleotide, or 10 µM donor for Biotin-l-Leu-hexanucleotide.

Figure 3: In vitro evolution of aaRS-like ribozymes. a, Schematic representation of aminoacylation of tRNA catalyzed by an aaRS-like ribozyme. The 70 nt random region is shown in the green box. The 3'-end 20 nt constant region is indicated by a solid line. b, Structure of amino acid cyanomethyl ester (CME) substrates used in this study.
The synthesis of substrates was performed by methods described elsewhere (ref. 19) with minor modifications. c, Flowchart of the in vitro evolution of a new catalytic domain for glutamine in ATRib. Procedures in each phase of the selection are listed. The initial RNA pool used in the experiment contained approximately 10¹⁵ different molecules. d, Phase I. SAv-dependent mobility-gel-shift assay was performed in each round of selected RNA (lanes 1–9). Control experiments for pool 9 RNA were no streptavidin (lane 10), no substrate (lane 11), and nonphosphatased (that is, 5'-triphosphate) pool 9 RNA. e, Phase II. The gel-shift assay shows the activity of pool 9 RNA with 5'-OH (lane 1) and 5'-triphosphate (lane 2), pool 10 (lanes 3 and 4) isolated from the negative-positive selection, and pool 10 (lanes 5 and 6) isolated from the negative selection. f, Phase III. Reactions were done in the presence of ~1 µM pool RNA, 5 µM Biotin-l-Phe-hexanucleotide or 5 mM Biotin-L-Gln-CME for 3 h at 25 °C. The gel-shift assay shows the activity of pool 10 RNA with 5'-OH and 5'-triphosphate (lanes 1–4), pool 11 (lanes 5–8), and pool 12 (lanes 9–12).

Figure 4: Sequence alignment of active clones in pool 12 DNA and amino acid specificity of clone AD02. a, The ATRib domain is not shown except for the IGS. In the class I ambidextrous ribozyme, the proposed base-pair interactions between IGS and L6b are highlighted in green rectangular boxes (see Fig. 5a), and paired regions are in colored boxes. The regions in the same colored boxes indicate the paired regions.

Figure 5: Structure and function of the ribozyme. a, Bases in green rectangular boxes are proposed to form base pairs in the QR-dependent self-acylation step. b, Reactions were carried out in the presence of 2 µM QR domain, 2 µM 5'-OH-ATRib, and 5 mM Biotin-l-Gln-CME at 25 °C. Mutations or deletions in the ATRib and QRtrans are indicated with letters in red with nucleotide numbers.
Watson-Crick and G:U wobble base pair interactions are shown by solid lines and diamonds, respectively. c, The yields of Biotin-l-Gln-tRNAfMet were 0.1 % (lane 1), 2.6 % (lane 2), 3.2 % (lane 3), 3.6 % (lane 4), and 4.0 % (lane 5). Reactions were carried out in the presence of 2 µM ribozyme, 5 mM Biotin-L-Gln-CME, and 20 µM tRNAfMet. Thermocycling was performed as described in the Methods. Negative controls were performed in the absence of streptavidin (lane 6) or by replacing AD02 with 2 µM ATRib (lane 7). Competitive inhibition was carried out in the presence of 20 µM 5'-CAACCA-3' (lanes 8–10). A positive control was performed by the aminoacylation of tRNAfMet catalyzed by ATRib using 5 µM Biotin-l-Phe-3'-ACCAAC-5' (lane 11). 5 mM Biotin-l-Phe-CME was used instead of Gln (lane 12).

RNA (2001), 7:395-404 Cambridge University Press Copyright © 2001 RNA Society
Research Article

The effect of cytidine on the structure and function of an RNA ligase ribozyme

JEFF ROGERS and GERALD F. JOYCE
Departments of Chemistry and Molecular Biology and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, California 92037, USA

Abstract
A cytidine-free ribozyme with RNA ligase activity was obtained by in vitro evolution, starting from a pool of random-sequence RNAs that contained only guanosine, adenosine, and uridine. This ribozyme contains 74 nt and catalyzes formation of a 3',5'-phosphodiester linkage with a catalytic rate of 0.016 min⁻¹. The RNA adopts a simple secondary structure based on a three-way junction motif, with ligation occurring at the end of a stem region located several nucleotides away from the junction. Cytidine was introduced to the cytidine-free ribozyme in a combinatorial fashion and additional rounds of in vitro evolution were carried out to allow the molecule to adapt to this added component.
The resulting cytidine-containing ribozyme formed a 3',5' linkage with a catalytic rate of 0.32 min⁻¹. The improved rate of the cytidine-containing ribozyme was the result of 12 mutations, including seven added cytidines, that remodeled the internal bulge loops located adjacent to the three-way junction and stabilized the peripheral stem regions.

(Received November 10 2000) (Revised December 4 2000) (Accepted December 6 2000)

Key Words: in vitro evolution; ribozyme; RNA ligase.

Correspondence: Reprint requests to: Gerald F. Joyce, Departments of Chemistry and Molecular Biology and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, USA; e-mail: gjoyce@scripps.edu.

04 September 1997 Nature 389, 54-57 (1997) © Macmillan Publishers Ltd.

RNA-catalysed carbon–carbon bond formation

THEODORE M. TARASOW, SANDRA L. TARASOW & BRUCE E. EATON
NeXstar Pharmaceuticals, Inc., 2860 Wilderness Place, Boulder, Colorado 80301, USA

The 'RNA world' hypothesis (refs 1–3), which assumes that the chemical processes that led to the appearance of life were carried out by RNA molecules, has stimulated interest in catalytic reactions involving oligonucleotides such as catalytic RNA (ribozymes) (ref. 4). Naturally occurring ribozymes have, for example, been shown to efficiently catalyse the formation and cleavage of nucleic-acid phosphodiester bonds (refs 4–8), and this narrow range of RNA-catalysed reactions has subsequently been expanded by in vitro selection methods to include ester (ref. 9) and amide (refs 10, 24) bond formation, SN2 reactions and porphyrin metallations (refs 11, 12). Carbon–carbon bond formation and the creation of asymmetric centres are both of great importance biochemically, but have not yet been accomplished by RNA catalysis.
A widely used reaction that creates two new carbon–carbon bonds and up to four stereo-centres is the Diels–Alder cycloaddition, which occurs between a 1,3-butadiene and an alkene. Here we report the successful application of in vitro selection to isolate pyridine-modified RNA molecules that catalyse a Diels–Alder cycloaddition. We find that the RNA molecules accelerate the reaction rate by a factor of up to 800 relative to the uncatalysed reaction.

The Diels–Alder reaction investigated in this study is shown in Fig. 1, which also shows the modified reactants. The in vitro selection (SELEX) for RNA molecules with Diels–Alder activity (DAase activity) was carried out with a library of 10¹⁴ unique sequences. The RNA molecules were constructed of a contiguous 100-nucleotide randomized region, flanked by constant sequence segments to allow for amplification and other enzymatic processes. The RNA was modified by substituting 5-pyridylmethylcarboxamid-UTP (ref. 13; compound 2 in Fig. 1) for UTP in the transcription reaction. Pyridyl-modified uridine 2 was chosen to augment the hydrogen-bonding groups of native RNA, to furnish additional hydrophobic and dipolar interactions, and to provide metal coordination sites unlike any contained in unmodified RNA. Previous attempts at isolating RNA DAases were unsuccessful using unmodified oligonucleotide libraries (ref. 14). To allow for selection based on the ability to perform the desired chemistry, RNAs were coupled to an acyclic diene through a long polyethylene glycol (PEG; average relative molecular mass (Mr) 2,000) linker. The flexible PEG linker was used to provide the diene with the opportunity to access the RNA surface and mimic a diene substrate free in solution.

Figure 1 The Diels–Alder reaction between the acyclic diene conjugated to the RNA through a long PEG linker and the maleimide dienophile 1 (BMCC).

Rounds of in vitro selection for DAase activity were conducted by incubating the RNA–diene construct with the maleimide dienophile (compound 1 in Fig. 1; BMCC) in the presence of transition metals that could form pyridyl–Lewis acid complexes and/or coordinate to the RNA to enhance tertiary structure. The maleimide was attached to biotin so that RNA DAases could be partitioned away from unreacted RNA using streptavidin binding and denaturing polyacrylamide electrophoresis. Following 12 rounds of in vitro selection, the amount of Diels–Alder reaction observed in the presence of the RNA increased from 2.4% in 8 h to 3.3% in 3 min. Subsequently, the library was cloned and bidirectionally sequenced. From the 46 sequences obtained, eight non-clonally derived families were identified. One family comprised 59% of the library population while the other seven unique sequences ranged in representation from one to six of the 46 isolates. The random region sequences for representative isolates are shown in Fig. 2a. Computer-assisted local sequence alignment of representatives from each clonal family identified one consensus sequence, UUCUAACGCG, in five of the eight non-clonally derived sequences analysed (Fig. 2a). Beyond this limited amount of homology, there are no obvious sequence or structural similarities between the eight non-clonally derived families.

Figure 2 a, Random region (100N) sequences of 11 isolates obtained from the DAase in vitro selection.

Eight of the isolates shown in Fig. 2a were analysed for their DAase activity. Isolates were kinetically evaluated under solution-saturating concentrations of BMCC (500 µM). All other conditions were identical to those used during the selection. The results are summarized in Fig.
2b and indicate that each of the isolates tested was capable of facilitating the Diels–Alder reaction, establishing that there were at least eight unique sequences present in the original library that could promote this cycloaddition reaction.

Isolate 22 (DA-22) was chosen for more detailed characterization. Control experiments demonstrated that reaction product formation required both substrates (diene and dienophile). DAase activity was completely dependent on the presence of the pyridyl-modified uridine, as RNA transcribed using the native nucleotide triphosphate was inactive. This is not surprising as the unmodified RNA may have folded into considerably different three-dimensional structures. In addition, the absence of the pyridine groups could have eliminated unique metal-binding sites that effect Lewis acid catalysis.

The insolubility of the dienophile substrate, BMCC, prevented complete Michaelis–Menten kinetic analysis of isolate 22. Nevertheless, kinetic analysis was performed on isolate 22 up to a maximum concentration of 500 µM BMCC (Fig. 3). The data are linear over the range of dienophile concentrations used, with a slope equal to kcat/Km (3.95 ± 0.05 M⁻¹ s⁻¹), where kcat is the catalytic rate constant and Km is the apparent dissociation constant of the RNA–substrate complex. Comparing the apparent second-order rate constant to the uncatalysed rate constant kuncat of 5.42 × 10⁻³ M⁻¹ s⁻¹ indicates that isolate 22 achieved an 800-fold rate acceleration of the Diels–Alder cycloaddition. The uncatalysed rate was measured using an identical construct composed of random pyridine-modified RNA.

Figure 3 The observed rate constant (kobs) increases as a function of BMCC concentration for isolate 22.

Lineweaver–Burk analysis of the data from 30 to 500 µM BMCC gave estimates of kcat (0.011 ± 0.002 s⁻¹) and Km (2.3 ± 0.5 mM).
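
The rate constants quoted for isolate 22 can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the unit interpretation of the Figure 3 fit (kobs in min⁻¹, BMCC in µM) is an assumption rather than something the text states. Under that assumption the fitted slope reproduces the quoted kcat/Km, and the two available estimates of kcat/Km give accelerations of roughly 730 and 880, which bracket the reported value of about 800.

```python
# Values quoted in the text for isolate 22 (DA-22).
KCAT_OVER_KM = 3.95   # M^-1 s^-1, slope of k_obs versus [BMCC]
K_UNCAT = 5.42e-3     # M^-1 s^-1, uncatalysed second-order rate constant
KCAT = 0.011          # s^-1, Lineweaver-Burk estimate
KM = 2.3e-3           # M, Lineweaver-Burk estimate (2.3 mM)
FIG3_SLOPE = 0.0002373  # slope of the linear fit in the Figure 3 legend


def rate_acceleration(second_order_cat, second_order_uncat):
    """Ratio of catalysed to uncatalysed second-order rate constants."""
    return second_order_cat / second_order_uncat


def effective_molarity(kcat, k_uncat):
    """kcat (s^-1) divided by k_uncat (M^-1 s^-1); the result is in M."""
    return kcat / k_uncat


def slope_to_per_molar_per_second(slope):
    """Convert a slope to M^-1 s^-1.

    ASSUMPTION (not stated in the text): the Figure 3 fit has k_obs in
    min^-1 and [BMCC] in micromolar.
    """
    return slope * 1e6 / 60.0


if __name__ == "__main__":
    print(rate_acceleration(KCAT_OVER_KM, K_UNCAT))   # from the linear fit
    print(rate_acceleration(KCAT / KM, K_UNCAT))      # from Lineweaver-Burk
    print(effective_molarity(KCAT, K_UNCAT))          # in M
    print(slope_to_per_molar_per_second(FIG3_SLOPE))  # compare with 3.95
```

The ratio kcat/kuncat has units of concentration, which is why it is read as an effective molarity.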
Comparison of kcat to the second-order rate constant for the uncatalysed reaction, kuncat, indicates that isolate 22 has an effective molarity of the order of 2 M (ref. 15). Despite a Km 10-fold higher than the concentration of BMCC (100 µM) used in the in vitro selection, the sequences from Family 1 (see Fig. 2a) seem to have been selected for their ability to increase the reaction rate of bound reactants (kcat) and in effect compensated for a high Km. These results clearly indicate that saturating concentrations of substrate are not necessary for the successful selection of RNA catalysts.

Predictably, the free product of the Diels–Alder reaction acts as a reasonably good inhibitor for the RNA-catalysed reaction (Fig. 4). Again, the insolubility of BMCC precluded determination of an absolute inhibition constant (Ki) for the product, but data collected at 500 µM BMCC and varying amounts of the product (compound 3 in Fig. 4) were used to determine an apparent Ki of 32.5 µM (ref. 16). Inhibition by the product is consistent with the RNA-mediated Diels–Alder reaction occurring in a binding pocket which resembles the product or at least can readily undergo conformational changes to bind the product of the catalysed reaction.

Figure 4 Inhibition of the DAase activity of isolate 22 by the free cycloaddition product 3 with an apparent Ki shown.

The metal dependence of isolate 22 indicates an absolute dependence on Cu2+. No DAase activity was observed in the absence of Cu2+ even in the presence of the other divalent metals used in the in vitro selection. Magnesium, calcium and copper restored DAase activity to 71% of that achieved by the complete metal mixture. The additional 29% incremental improvement appears to be a nonspecific metal–RNA interaction that can be accomplished using any of the divalent metals found in the reaction buffer.
Titration experiments indicated maximum activity at 10–20 µM Cu2+, a level similar to that used in the in vitro selection (10 µM). The absolute and specific metal dependence on copper for DAase activity suggests that copper Lewis acid sites are formed, as other metal ions present in the reaction buffer could have adopted similar coordination geometries (refs 17, 18) resulting in similar RNA structures. The selection of copper in this capacity is consistent with known Lewis-acid-catalysed Diels–Alder reactions in water (refs 19, 20). Indeed, although divalent metal ions have been shown to play an important role in the activity of other oligonucleotide catalysts, such exclusive dependence on Cu2+ has not been observed.

We used mass spectrometry to identify the DAase reaction product, in order to firmly establish RNA DAase activity (refs 21–23). Following reaction of isolate 22 with BMCC, the RNA was isolated by gel filtration and digested with ribonuclease I. The sample was then purified by high-performance liquid chromatography (HPLC) and subjected to electrospray-ionization tandem quadrupole mass spectrometry. From the resulting data, the ion corresponding to the Diels–Alder product was identified (measured Mr, 630.6; calculated Mr, 630.8). Moreover, many additional fragment ions were observed which further substantiate formation of the Diels–Alder product. No ions were observed for BMCC reacting with any functional groups on the oligonucleotide, consistent with only the formation of the Diels–Alder cycloadduct.

The methodology used to create these RNA DAases provides a straightforward approach to generating novel catalysts on demand that do not require templating of either substrate. The scope of RNA-catalysed reactions now includes carbon–carbon bond-forming reactions.
Although no examples of RNA-catalysed carbon–carbon bond formation were previously known, there appear to be many solutions to [4 + 2] cycloaddition catalysis, as eight unique sequences were found that enhance the rate of the Diels–Alder reaction. These results suggest that similar strategies could be used to identify RNA molecules that catalyse other Diels–Alder reactions, including those with typically unreactive substrates, inverse electron demand Diels–Alder cycloadditions, and hetero Diels–Alder reactions. RNA catalysis of other types of cycloaddition reactions such as dipolar cycloadditions, particularly those benefiting from Lewis acid catalysis, is also a possibility. The ability to expand greatly the functional diversity of RNA through modified bases, to augment accessible chemistries through the use of transition metals, and, perhaps in the future, to include co-factor-assisted transformations, has significant implications for the range of reactions amenable to RNA catalysis.

Methods

Incubation conditions. The RNA–PEG–diene construct was prepared by ligating a PEG-diene-modified DNA 10-mer to the 5' end of the RNA using T4 DNA ligase. All RNA incubations were conducted under the following conditions except as noted: 50 mM HEPES, pH 7.0, 500 nM pyridyl methyl-modified RNA, 200 mM NaCl, 200 mM KCl, 1 mM CaCl2, 1 mM MgCl2, 10 µM each of aluminium lactate, Ga2(SO4)2, MnCl2, FeCl2, CoCl2, NiCl2, CuCl2 and ZnCl2, 10% ethanol and 2% dimethyl sulphoxide. The concentration of dienophile 1 (BMCC) varied in the isolate characterization experiments, but was held constant at 100 µM throughout the SELEX. Incubations were terminated by the addition of β-mercaptoethanol to a final concentration of 5 mM and/or passing the solution over two successive Nap columns (Pharmacia) to remove excess BMCC.

Reaction assay and partitioning. The extent of reaction was determined, and reacted and unreacted RNA molecules were partitioned, using a streptavidin (SA)-dependent gel shift.
The shifted and unshifted bands were visualized by autoradiography and phosphorimaging, the latter being used for quantification. For partitioning, shifted bands were excised, the RNA–SA complex extracted, desalted and subjected to reverse transcription and PCR amplification according to standard procedures.

Kinetic analyses. All data were obtained at 500 nM RNA and the indicated amounts of BMCC. kobs values were determined by fitting the fraction of unreacted RNA to the equation for first-order kinetics. The uncatalysed second-order rate of Diels–Alder reaction was measured using random pyridine-modified RNA (kuncat = 5.42 × 10⁻³ M⁻¹ s⁻¹).

Product inhibition. Apparent Ki values for the free cycloaddition product 3 were determined at 500 µM BMCC by fitting the observed first-order rate constants to the following equation for tight-binding inhibition (ref. 16):

$$k_{obs} = \frac{k_{obs0}}{2E}\left(E - I - K_i + \sqrt{(K_i + E - I)^2 + 4K_iI}\right)$$

where kobs is the measured rate constant in the presence of 3, kobs0 is the observed rate constant in the absence of 3, E represents the fractional concentration of functional active sites, I is the concentration of 3, and Ki is the apparent inhibition constant.

Correspondence and requests for materials should be addressed to B.E.E. (email: beaton@nexstar.com).

Received 23 April 1997; accepted 22 July 1997

References
1. Joyce, G. F. Ribozymes: Building the RNA world. Curr. Biol. 6, 965-967 (1996).
2. Joyce, G. F. The rise and fall of the RNA world. New Biologist 3, 399-407 (1991).
3. Joyce, G. F. Some biochemical thoughts on the RNA world. Chem. Biol. 3, 405-407 (1996).
4. Cech, T. R. The chemistry of self-splicing RNA and RNA enzymes. Science 236, 1532-1539 (1987).
5. Long, D. M. & Uhlenbeck, O. C. Self-cleaving catalytic RNA. FASEB J. 7, 25-30 (1993).
6. Bartel, D. P. & Szostak, J. W. Isolation of new ribozymes from a large pool of random sequences. Science 261, 1411-1418 (1993).
7. Beaudry, A. A. & Joyce, G.
F. Directed evolution of an RNA enzyme. Science 257, 635-641 (1992).
8. Kumar, P. K. R. & Ellington, A. D. Artificial evolution and natural ribozymes. FASEB J. 9, 1183-1195 (1995).
9. Illangasekare, M., Sanchez, G., Nickles, T. & Yarus, M. Aminoacyl-RNA synthesis catalyzed by an RNA. Science 267, 643-647 (1995).
10. Lohse, P. A. & Szostak, J. W. Ribozyme-catalyzed amino-acid transfer reactions. Nature 381, 442-444 (1996).
11. Li, Y. & Sen, D. A catalytic DNA for porphyrin metallation. Nature Struct. Biol. 3, 743-747 (1996).
12. Conn, M. M., Prudent, J. R. & Schultz, P. G. Porphyrin metallation catalyzed by a small RNA molecule. J. Am. Chem. Soc. 118, 7012-7013 (1996).
13. Dewey, T. M., Zyzniewski, C. & Eaton, B. E. The RNA world: functional diversity in a nucleoside by carboxyamidation of uridine. Nucleosides & Nucleotides 15, 1611-1617 (1996).
14. Morris, K. N. et al. Enrichment for RNA molecules that bind a Diels-Alder transition state analog. Proc. Natl Acad. Sci. USA 91, 13028-13032 (1994).
15. Jencks, W. P. in Catalysis in Chemistry and Enzymology 644-712 (Dover Publications, New York, 1987).
16. Williams, J. W. & Morrison, J. F. The kinetics of reversible tight-binding inhibition. Methods Enzymol. 63, 437-467 (1979).
17. Kazakov, S. A. in Bioorganic Chemistry: Nucleic Acids (ed. Hecht, S. M.) 244 (Oxford Univ. Press, New York, 1996).
18. Cotton, F. A. & Wilkinson, G. Advanced Inorganic Chemistry (Wiley, New York, 1988).
19. Otto, S., Bertoncin, F. & Engberts, J. B. F. N. Lewis acid catalysis of a Diels-Alder reaction in water. J. Am. Chem. Soc. 118, 7702-7707 (1996).
20. Otto, S. & Engberts, J. B. F. N. Lewis acid catalysis of a Diels-Alder reaction in water. Tetrahedr. Lett. 36, 2645-2648 (1995).
21. Ni, J., Pomerantz, S. C., Rozenski, J., Zhang, Y. & McCloskey, J. M. Interpretation of oligonucleotide mass spectra for determination of sequence using electrospray ionization and tandem mass spectrometry. Anal.
Chem. 68, 1989-1999 (1996).
22. Pomerantz, S. C., McCloskey, J. A., Tarasow, T. M. & Eaton, B. E. Deconvolution of combinatorial oligonucleotide libraries by electrospray ionization tandem mass spectrometry. J. Am. Chem. Soc. 119, 3861-3867 (1997).
23. Tarasow, T., Tinnermeier, D. & Zyzniewski, C. Characterization of oligodeoxyribonucleotide-polyethylene glycol conjugates by electrospray mass spectrometry. Bioconjugate Chem. 8, 89-93 (1997).
24. Wiegand, T. W., Janssen, R. C. & Eaton, B. E. Selection of RNA amide synthases. Chem. Biol. (in the press).

Acknowledgements. We thank L. Gold for support, guidance and vision; the scientific community at NeXstar, especially members of the Medicinal Chemistry group, for helpful discussions and ideas; S. Wayland for the synthesis of compound 2; and T. Wiegand and D. Nieuwlandt for inspiring dialogue and technical assistance. We also thank S. C. Pomerantz, P. F. Crain and J. A. McCloskey for ESI–MS/MS analysis.

Figure 1 The Diels–Alder reaction between the acyclic diene conjugated to the RNA through a long PEG linker and the maleimide dienophile 1 (BMCC). The RNA library was prepared with a 100-nucleotide random region (100N). Biotin represents the portion of BMCC not shown in the cycloadduct. Also shown is the structure of the pyridyl methyl-modified UTP (2) substituted for native UTP during transcription; OPPP represents the triphosphate moiety.

Figure 2 a, Random region (100N) sequences of 11 isolates obtained from the DAase in vitro selection. Isolates are identified by the number to the left of the sequences. Members of clonal families are labelled (in parentheses) with the total number of family members and the family population as a percentage of the total number of sequences. The computer-identified consensus sequence is in bold and underlined for each of the isolates from families 1, 2 and 4 and the two orphans. b, Percentage of individual RNA isolates reacted as a function of time.
Isolates are represented by the symbols listed to the right and were incubated with 500 µM BMCC for the times indicated.

Figure 3 The observed rate constant (kobs) increases as a function of BMCC concentration for isolate 22. A linear fit of the data yields the line shown (y = 0.001545 + 0.0002373x; R = 0.9998) with the ratio kcat/Km determined from the slope.

Figure 4 Inhibition of the DAase activity of isolate 22 by the free cycloaddition product 3 with an apparent Ki shown. Observed rate constants were measured at increasing concentrations of 3 in the presence of 500 µM BMCC (1) and fitted to the equation described in Methods.

Mechanism of Ribosomal Peptide Bond Formation

The mechanism of peptide bond synthesis constitutes a fundamental and long-debated question in molecular biology. For many years, ribosomologists championed a protein-based mechanism, similar to the charge relay system that has been proposed for peptide hydrolysis by serine proteinases [references in (1)]. As it has become apparent that the peptidyl transferase center is composed mainly of RNA, however, two likely mechanisms for catalysis have emerged that are compatible with the available biochemical data: divalent metal ion catalysis (1), or acid-base catalysis mediated by a cytosine (N3) or adenosine (N1, N3). The environment of the catalytic nucleotide would create the unusually high pKa (where Ka is the acid dissociation constant) necessary for it to behave analogously to the histidine of the serine proteinases. This pKa shift has been shown for a catalytic cytosine in the active site of the hepatitis delta virus ribozyme (2). In support of the acid-base catalysis hypothesis, Muth et al. (3) have demonstrated that the highly conserved nucleotide A2451 (Escherichia coli numbering) in the active site has a pKa shifted to a value of around 7. In addition, in a 2.4 Å x-ray map of the large ribosomal subunit, Nissen et al.
(4) show that the N3 of this nucleotide is in close contact with a transition state analog of peptide bond formation, CCdA-p-Puro. These results were combined into a model in which A2451 functions both in general acid-base catalysis and in transition state stabilization (3, 4).

Although this evidence is seductive, a definitive assignment of A2451 as the catalytic nucleotide must be treated with caution, for three reasons. (i) The structure observed with the transition state analog CCdA-p-Puro might not be an active one, because small analogs of P site tRNAs require different reaction conditions (5). Furthermore, the dA substituted for the terminal A in the analog should interact differently with the active site and has indeed been shown to be inactive as a P-site substrate (6). (ii) A2451 was the main cross-link site of a P-site bound Phe-tRNA whose amino group was acylated by a benzophenone derivative (7). Significantly, the cross-linked tRNA was still active in peptide bond formation (8). (iii) A chloramphenicol-resistant mutant exists that harbors an A2451-to-U transversion (9).

Taken together, these data argue against A2451 being the sole catalytic nucleotide, which in turn leaves open the question whether the catalysis of peptide bond formation is mediated by catalytic nucleotides, divalent metal ions, or both. The answer to this question will come from biochemical experiments, coupled with high-resolution x-ray analysis of active ribosomal 50S subunits containing maps of the relevant metal ions and water molecules. We eagerly await the results.

Andrea Barta
Silke Dorner
Norbert Polacek
Institute of Medical Biochemistry
University of Vienna
Dr. Bohrgasse 9/3
A-1030, Vienna
Austria

REFERENCES
1. A. Barta, I. Halama, in Ribosomal RNA and Group I Introns, R. Green, R. Schroeder, Eds. (Chapman & Hall, New York, 1996) pp. 35-54.
2. S. Nakano, D. M. Chadalovada, P. C. Bevilacqua, Science 287, 1493 (2000).
3. G. W. Muth, L. Ortoleva-Donnelly, S. A.
Strobel, Science 289, 947 (2000).
4. P. Nissen, J. Hansen, N. Ban, P. B. Moore, T. A. Steitz, Science 289, 920 (2000).
5. M. Welch, J. Chastang, M. Yarus, Biochemistry 34, 385 (1995).
6. K. Quiggle, G. Kumar, T. W. Ott, E. K. Ryu, S. Chládek, Biochemistry 20, 3480 (1981).
7. G. Steiner, E. Kuechler, A. Barta, EMBO J. 7, 3949 (1988).
8. A. Barta and E. Kuechler, FEBS Lett. 163, 319 (1983).
9. S. E. Kearsey and I. W. Craig, Nature 290, 607 (1981).

24 August 2000; accepted 22 November 2000

The generation of a model for the molecular structure of the large ribosomal subunit, from x-ray crystallographic studies at 2.4 Å resolution, is one of the most exciting biological advances in recent years (1). Of particular interest is the identification of the active site for peptidyl transfer based on the determination of the structure of a complex with a transition state analog (2). A key feature of this active site is a completely conserved adenosine that is proposed to act as a general base catalyst. The elegant study of Muth et al. (3), which revealed that this adenosine has a markedly shifted pKa near 7.6, supports the importance of this base. Based on these results, Nissen et al. (2) have proposed a mechanism that involves the deprotonation of the amino group of aminoacyl-tRNA (aa-tRNA), leading to the nucleophilic attack of the deprotonated amine on the carbonyl group of the ester linkage holding the growing polypeptide chain.

Such a mechanism, however, has three significant difficulties: (i) The amino group would be expected to be largely protonated (that is, present as a primary ammonium group, NH3+) at neutral pH. The pKa value for a terminal amino group is expected to be approximately 8.0. (ii) Even with its shifted pKa, the adenosine is not nearly a strong enough base to deprotonate an amino group, NH2, that is expected to have a pKa near 35.
Based on the difference of more than 25 pKa units, the rate of deprotonation of the amino group would be much too slow to support catalysis. (iii) The amino group, NH2, would presumably be a sufficiently strong nucleophile to attack the ester linkage without the need for assistance from a base.

We propose an alternative mechanism that avoids these difficulties (Fig. 1). In this mechanism, the aa-tRNA is present initially in its protonated form, which resolves difficulty (i). The ribosomal base (adenosine 2451) then removes a proton from this ammonium group to generate the free amino group. The pKa values for the base and the ammonium group are expected to be nearly matched, so deprotonation should be quite feasible thermodynamically; this resolves difficulty (ii). The reaction products are the aa-tRNA with a free amino group and the protonated adenosine 2451 base. The amino group is a strong enough nucleophile to attack the ester linkage of the peptidyl-tRNA, which resolves difficulty (iii). This reaction generates a tetrahedral intermediate that collapses to release the tRNA previously linked to the polypeptide chain. The protonated adenosine 2451 can act as a general acid to facilitate this reaction, as suggested by Nissen et al. (2). This generates the product with the new peptide bond formed, but in an N-protonated form. The pKa of this product is expected to be very low, so that it would readily give up a proton to generate the final product.

Fig. 1. Alternative mechanism for peptide bond formation within the ribosome.

The mechanism proposed by Nissen et al. (2) and the alternative proposed here allow distinct experimental predictions. Most important, if a nucleotide with a dramatically altered pKa is indeed required to deprotonate an amino group, mutation of this base or its surroundings should abolish the ability of the ribosome to catalyze peptide bond formation completely.
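
The pKa argument in difficulties (i) and (ii) can be made concrete with the Henderson–Hasselbalch relation and the rule that the equilibrium constant for proton transfer between two sites is set by the difference of their pKa values. The sketch below is only an illustration: the function names are hypothetical, pH 7.0 is an assumed stand-in for "neutral pH", and the pKa values (8.0 for the ammonium group, 35 for the neutral amino group, 7.6 for adenosine 2451) are the ones quoted in the text.

```python
def fraction_protonated(pka, ph):
    """Henderson-Hasselbalch: fraction of a group present in its acid form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def log10_proton_transfer_k(pka_donor, pka_acceptor_conjugate_acid):
    """log10 of the equilibrium constant for donor-H + base -> donor + base-H.

    A negative value means the transfer is unfavourable.
    """
    return pka_acceptor_conjugate_acid - pka_donor


if __name__ == "__main__":
    NEUTRAL_PH = 7.0   # ASSUMPTION: numerical value used for "neutral pH"
    PKA_A2451 = 7.6    # shifted pKa of the adenosine, as quoted
    PKA_AMMONIUM = 8.0 # terminal amino group in its NH3+ form, as quoted
    PKA_AMINE = 35.0   # neutral NH2 group acting as an acid, as quoted

    # Difficulty (i): most of the amino group is protonated at neutral pH.
    print(fraction_protonated(PKA_AMMONIUM, NEUTRAL_PH))

    # Difficulty (ii): deprotonating NH2 is uphill by more than 25 pKa units.
    print(log10_proton_transfer_k(PKA_AMINE, PKA_A2451))

    # Alternative mechanism: deprotonating NH3+ is nearly thermoneutral.
    print(log10_proton_transfer_k(PKA_AMMONIUM, PKA_A2451))
```

With these inputs about 91% of the amino group is protonated at pH 7, removing a proton from NH2 is disfavoured by roughly 27 orders of magnitude, and removing one from NH3+ is disfavoured by less than one order of magnitude.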
In contrast, in the alternative mechanism, the role of the base is to deprotonate an ammonium group with a pKa value expected to be near 8. Loss of this base would be expected to reduce the rate of this step by a relatively small (but functionally important) factor of between 5 and 100. Further experimental studies should provide additional data that should help distinguish between these two mechanisms, as well as other possible ones. Jeremy M. Berg Jon R. Lorsch Department of Biophysics and Biophysical Chemistry School of Medicine Johns Hopkins University 725 North Wolfe Street Baltimore, MD 21205, USA E-mail: jberg@jhmi.edu REFERENCES 1. N. Ban, P. Nissen, J. Hansen, P. B. Moore, T. A. Steitz, Science 289, 905 (2000) [Abstract/Full Text] . 2. P. Nissen, J. Hansen, N. Ban, P. B. Moore, T. A. Steitz, Science 289, 920 (2000) [Abstract/Full Text] . 3. G. W. Muth, L. Ortoleva-Donnelly, S. A. Strobel, Science 289, 947 (2000) [Abstract/Full Text] . 26 September 2000; accepted 22 November 2000 Response: Although the mechanistic proposals advanced in our studies (1, 2) are hypotheses that need to be examined critically, we do not believe that the points raised by Barta et al. refute them. In our crystal structures, both the substrate analog bound to the A-site and the CCdA-p-Puro intermediate analog hydrogen bond with residues identified as crucial for A-site and P-site binding, and the puromycin moieties of both occupy the same position. Hence, there is no compelling reason to believe that the CCdA-p-Puro complex is inactive; indeed, the opposite seems likely, because the one A in 23S rRNA that has an anomalously high pKa is positioned so that it could catalyze peptide bond formation if CCdA-p-Puro were properly oriented. There is no reason to assume that we are studying an inactive 50S subunit conformation. 
We have found small analogs of P-site tRNAs that do not require alcohol to react (2, 3), and preliminary crystallographic experiments suggest that they do indeed react in our crystals to yield products whose binding is consistent with our published complexes. The suggestion that the absence of a 2' OH from A76 of the CCdA-p-Puro might cause it to bind abnormally is interesting, but seems unlikely. Our model suggests that a 2' OH in that position could hydrogen bond with C2104 without much change in the position of the inhibitor. Findings reviewed by Chládek and Sprinzl (4) do not support the conclusion that the 2' OH at A76 is essential for P-site binding. Finally, when the CCdA in the P site is bonded to the puromycin of the A site, as it is in this case, the binding effect of the missing 2' OH should be reduced. Barta et al., citing earlier work (5, 6), also point out that ribosomes carrying Phe-tRNA, whose amino group is cross-linked to the A that we propose is critical for catalysis, remain active. This argument has two important deficiencies. First, it hinges on assumptions about the rate of the chemical step in peptide bond formation. Although the overall rate of ribosomal peptide elongation is reasonably well known, its chemical step is not rate limiting, and there is no convenient way to measure its rate in the ribosome. The assay done to demonstrate the activity of the cross-linked ribosomes required 15 min (5), enough time for a ribosome to make a polypeptide 1.8 × 10^4 amino acids long. A rate reduction of as much as 10^2 or 10^3 due to modifications of the A in question would not have been detected. Second, in 1983, Barta and Kuechler (5), using 3-(4'-benzoylphenyl)propionyl-Phe-tRNA (BP-Phe-tRNA), did indeed obtain cross-links that targeted 23S RNA exclusively when bound to what they believed was the P site, and they also showed that the cross-linked ribosomes were active in peptide bond formation (5).
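The arithmetic behind the 15-minute assay argument above can be checked with a short sketch. It assumes, purely for illustration, that a rate reduction scales the whole elongation rate; the function names are hypothetical, and the only inputs are the 15 min and 1.8 × 10^4 residues quoted in the text.

```python
# Back-of-the-envelope check of the 15-minute assay argument.
# Assumption (illustrative only): the slowdown factor applies to the
# overall elongation rate implied by the figures quoted in the text.

ASSAY_MINUTES = 15          # assay duration quoted in the text
RESIDUES_IN_ASSAY = 1.8e4   # polypeptide length quoted in the text


def implied_rate_per_second(residues, minutes):
    """Elongation rate (residues per second) implied by the quoted figures."""
    return residues / (minutes * 60)


def residues_made(rate_per_second, minutes, slowdown=1.0):
    """Residues polymerized in `minutes` if the rate is reduced `slowdown`-fold."""
    return rate_per_second / slowdown * minutes * 60


rate = implied_rate_per_second(RESIDUES_IN_ASSAY, ASSAY_MINUTES)

if __name__ == "__main__":
    print(f"implied elongation rate: {rate:.1f} residues per second")
    for slowdown in (1.0, 1e2, 1e3):
        made = residues_made(rate, ASSAY_MINUTES, slowdown)
        print(f"{slowdown:>6.0f}-fold slower: {made:.0f} residues in {ASSAY_MINUTES} min")
```

Under this assumption, even the slowest case leaves time for more than one peptide bond to form within the assay window, which is the point of the argument.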
A year later, the cross-linked residues were identified as U2584 and U2585 (7). Four years after that, however, Steiner et al. (6), using optimized protocols, reported that BP-Phe-tRNA bound to the P-site cross-links to A2451 and C2452, and that the U2584 and U2585 cross-links reported earlier represent A-site binding. The peptidyl transferase activity assay of the cross-linked ribosomes was not repeated as far as we know, and hence it is not known whether ribosomes cross-linked on A2451 possess peptidyl transferase activity. The activity of U2584/U2585 cross-link products, although surprising, does not conflict with our model of catalysis. Indeed, it effectively rules out the possibility that these uridine residues have any other than secondary roles in the activity, which is exactly what we have proposed. Finally, Barta et al. cite a chloramphenicol-resistant mutant that contains an A2451-to-U transversion (8). This also is not a crucial test of the role of A2486 (2451), however. As already noted, interpretation of these in vivo experiments requires knowing the rate of the chemical step compared with the overall rate of peptide bond synthesis. Moreover, as reported in (2), we have made all possible substitutions of A2451 in a plasmid encoding 23S rRNA, and find that when expression from plasmids is induced in E. coli, they show a dominant lethal phenotype. Concerning the alternative model, metal ion catalysis, our electron density maps contain abundant information about the positions of metal ions and water molecules in the large ribosomal subunit and, although our analysis is still incomplete, we see no evidence for a metal ion at the catalytic site. Thus, we conclude that none of the work cited by Barta et al. casts doubt on the proposal that A2486 (2451) plays a catalytic role, although future experiments may of course do so. The issue raised by Berg and Lorsch is one that we have also considered. Which α-amino group proton, if any, is the base A2486 (E.
coli 2451) removing? Although the pKa of an α-amino group is about 8 in solution, its pKa might be higher or lower when aminoacylated-tRNA is bound to the ribosome. In the case of elongation factor Tu binding of aminoacyl-tRNA, the co-crystal structure establishes that the amino group is in the NH2 form in this complex (9). The influence of the large ribosomal subunit on the pKa of this α-amino group is unknown, however, and thus the possibility should be considered that A2486 removes a proton from the NH3+ state of the α-amino group, as Berg and Lorsch suggest. Although it is not likely that the contribution of A2486 to the rate of catalysis using this latter mechanism would be greater than about 10, it is not obvious that the expected contribution to the rate of catalysis of other possible mechanisms (transition state stabilization or removal of a proton from the NH2 state) would necessarily be much greater. One reason we have favored the mechanism proposed in (1) is that the protein synthesis reaction is the reverse of the acylation reaction of serine proteases, and thus their mechanisms may be chemically related. In the acylation step of chymotrypsin, for example, His57 is proposed to assist in the removal of a proton from Ser195 as it attacks the carbonyl carbon of the peptide bond being cleaved, forming a tetrahedral carbon intermediate (10). In the breakdown of this intermediate, a protonated His57 is proposed to deliver a proton to the substrate peptide NH, which leaves to form the NH2 product. The third proton needed to form the NH3+ is presumed to come from water. If A2486 functions in peptide synthesis analogously to His57, one might anticipate that the reverse of the serine protease acylation reaction and peptide synthesis should follow similar mechanistic pathways. The pKa of the α-NH2 is indeed high, but this pKa becomes greatly reduced as the nitrogen attacks the carbonyl carbon.
Thus, removal by A2486 would get progressively easier as the C-N bond is being formed. In any event, it appears that this proton would be removed from the NH2 either by water or by A2486. Furthermore, Fahnestock et al. (11) showed many years ago that ribosomes will catalyze the formation of ester bonds between the formylmethionyl moiety of formylmethionyl tRNA and a puromycin derivative in which the α-amino group is replaced by a hydroxyl group. Although the pKa of the hydroxyl group is much higher than that of the NH3+ it replaces, the pH/rate profiles for the puromycin and the hydroxypuromycin versions of the reaction are nevertheless the same (11). The second role for a protonated A2486 proposed in (1) is transition state or intermediate stabilization through interaction with the oxyanion of the tetrahedral carbon intermediate. Indeed, this is the interaction observed in the crystal structure of the large subunit complex with CCdA-p-Puro. We can, of course, provide no quantitative estimate of the contribution made by A2486 to general base catalysis (whichever proton is removed), general acid catalysis, or transition state stabilization. Perhaps the largest contribution to the catalysis of peptide bond formation is provided by the positioning of the two substrates by the ribosome in an optimal orientation for attack of the α-amino group of aminoacyl-tRNA on the carbonyl carbon of peptidyl tRNA. Only further mutagenic, kinetic, and structural experiments can address these issues.

Poul Nissen Jeffrey Hansen Greg W. Muth Nenad Ban Department of Molecular Biophysics and Biochemistry Yale University New Haven, CT 06520-8114, USA Peter B. Moore Scott A. Strobel Thomas A. Steitz* Department of Molecular Biophysics and Biochemistry and Department of Chemistry Yale University * Also Howard Hughes Medical Institute

REFERENCES 1. P. Nissen, N. Ban, J. Hansen, P. B. Moore, T. A. Steitz, Science 289, 920 (2000). 2. G. W. Muth, L. Ortoleva-Donnelly, S. A.
Strobel, Science 289, 947 (2000). 3. S. A. Strobel, unpublished data. 4. S. Chládek and M. Sprinzl, Angew. Chem. Int. Ed. Engl. 24, 371 (1985). 5. A. Barta and E. Kuechler, FEBS Lett. 163, 319 (1983). 6. G. Steiner, E. Kuechler, A. Barta, EMBO J. 7, 3949 (1988). 7. A. Barta, G. Steiner, J. Brosius, H. F. Noller, E. Kuechler, Proc. Natl. Acad. Sci. U.S.A. 81, 3607 (1984). 8. S. E. Kearsey and I. W. Craig, Nature 290, 607 (1981). 9. P. Nissen, et al., Science 270, 1464 (1995). 10. D. M. Blow, J. J. Birktoft, B. S. Hartley, Nature 221, 331 (1969). 11. S. Fahnestock, H. Neumann, V. Shashona, A. Rich, Biochemistry 9, 2477 (1970). 6 November 2000; accepted 22 November 2000

One Sequence, Two Ribozymes: Implications for the Emergence of New Ribozyme Folds Erik A. Schultes, David P. Bartel*

We describe a single RNA sequence that can assume either of two ribozyme folds and catalyze the two respective reactions. The two ribozyme folds share no evolutionary history and are completely different, with no base pairs (and probably no hydrogen bonds) in common. Minor variants of this sequence are highly active for one or the other reaction, and can be accessed from prototype ribozymes through a series of neutral mutations. Thus, in the course of evolution, new RNA folds could arise from preexisting folds, without the need to carry inactive intermediate sequences. This raises the possibility that biological RNAs having no structural or functional similarity might share a common ancestry.
Furthermore, functional and structural divergence might, in some cases, precede rather than follow gene duplication.

Whitehead Institute for Biomedical Research and Department of Biology, Massachusetts Institute of Technology, 9 Cambridge Center, Cambridge, MA 02142, USA. * To whom correspondence should be addressed. E-mail: dbartel@wi.mit.edu

Published online: 4 February 2002, DOI:10.1038/nsb758 March 2002 Volume 9 Number 3 pp 225 - 230

A pre-translocational intermediate in protein synthesis observed in crystals of enzymatically active 50S subunits

T. Martin Schmeing1, Amy C. Seila2, Jeffrey L. Hansen1, Betty Freeborn3, Juliane K. Soukup1, 4, Stephen A. Scaringe5, Scott A. Strobel1, 3, Peter B. Moore1, 3 & Thomas A. Steitz1, 3, 6

1. Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520-8114, USA. 2. Department of Genetics, Yale University, New Haven, Connecticut 06520-8114, USA. 3. Department of Chemistry, Yale University, New Haven, Connecticut 06520-8114, USA. 4. Present address: Department of Chemistry, Creighton University, Omaha, Nebraska 68178, USA. 5. Dharmacon Research, Inc., Lafayette, Colorado 80026, USA. 6. Howard Hughes Medical Institute, New Haven, Connecticut 06520-8114, USA. Correspondence should be addressed to T A Steitz. e-mail: eatherton@csb.yale.edu

The large ribosomal subunit catalyzes peptide bond formation during protein synthesis. Its peptidyl transferase activity has often been studied using a 'fragment assay' that depends on high concentrations of methanol or ethanol. Here we describe a version of this assay that does not require alcohol and use it to show, both crystallographically and biochemically, that crystals of the large ribosomal subunits from Haloarcula marismortui are enzymatically active. Addition of these crystals to solutions containing substrates results in formation of products, which ceases when crystals are removed.
When substrates are diffused into large subunit crystals, the subsequent structure shows that products have formed. The CC-puromycin-peptide product is found bound to the A-site and the deacylated CCA is bound to the P-site, with its 3' OH near N3 of A2486 (Escherichia coli A2451). Thus, this structure represents a state that occurs after peptide bond formation but before the hybrid state of protein synthesis.

The key chemical event in protein synthesis is peptide bond formation, which occurs at the peptidyl transferase center in the large ribosomal subunit1. We previously determined the structure of the large ribosomal subunit from the archaeal halophile Haloarcula marismortui2. On the basis of structures of that subunit in complex with a putative transition state analog and a substrate analog, we proposed that its 23S rRNA component is responsible for catalyzing peptide bond formation. The 23S rRNA facilitates this formation by first positioning the 3' ends of peptidyl and aminoacyl-tRNA via their binding to the A- and P-loops and, second, by acid-base catalysis involving A2486 (E. coli A2451)3. Experiments to evaluate the role of A2486 (A2451) in catalysis are ongoing4-7.

Large ribosomal subunits will catalyze the reaction of puromycin with CCA–fMet, which is an analog of the 3' end of fMet-tRNA, to form fMet–puromycin8. The substrates of this 'fragment reaction', which has long been used to assay the peptidyl transferase activity of large ribosomal subunits, have molecular weights low enough that they can diffuse into crystals. Unfortunately, the fragment reaction requires high concentrations of alcohol, which is decidedly nonphysiological. However, the puromycin reaction, from which the fragment assay is derived, is not intrinsically alcohol-requiring9; alcohol is required only to enhance the affinity of low molecular weight P-site substrates for the ribosome8.
We have devised somewhat larger substrates that will participate in a version of the fragment reaction that does not depend on alcohol. When these substrates are diffused into large subunit crystals, the structure of the resulting complex reveals that they have been converted to products. Further, biochemical analysis indicates that crystals of H. marismortui 50S subunits suspended in a solution of these substrates catalyze the conversion of substrates to products. These results establish that crystal structures of the large ribosomal subunit from H. marismortui in complex with substrate, intermediate and product analogs are biologically relevant and allow us to view the structure of a pretranslocation state of protein synthesis.

An alcohol-free fragment reaction

The version of the fragment reaction described here utilizes two small synthetic substrates (Fig. 1) and can be used to assay the peptidyl transferase activity of isolated 50S ribosomal subunits in the absence of 30S subunits, full length tRNAs, mRNA, soluble factors and organic co-solvent. The P-site substrate consists of the RNA sequence CCA attached via an ester linkage to a phenylalanine whose α-amino group is covalently linked to biotin via a caproic acid moiety (CCA-pcb). This modification mimics the growing peptide chain at the 3' end of a P-site-bound tRNA. The A-site substrate is C-puromycin (C-pmn), which binds more tightly to the A-site than puromycin alone10. C-pmn was 5' 32P-radiolabeled (Fig. 2, lane 1) and reacted with 10 µM P-site substrate in the presence of 6.3 µM 50S ribosomal subunits from Escherichia coli, 200 mM NH4Cl, 40 mM MgCl2, 50 mM 3-(N-morpholino)propanesulfonic acid (MOPS), pH 7.1, and 33% methanol at 0 °C. A new product with slower gel mobility was produced that corresponded to C-pmn-pcb (Fig. 2, lane 2). The same product is also produced in the absence of methanol at 37 °C (Fig. 2, lane 3).
Addition of streptavidin to the reaction before loading it on the gel resulted in a strong reduction of its gel mobility, indicating that the biotin of the P-site substrate-peptide chain mimic is present on the product (Fig. 2, lane 4). The dipeptide product was purified, and its molecular weight was confirmed by mass spectrometry. Under conditions of limiting C-pmn, the reaction proceeds with a single exponential decay at a rate of 0.1 min⁻¹, in which 95% of the limiting substrate is converted into product. This rate is dramatically dependent on the reaction pH, as reported for the minimal fragment reaction, which uses smaller substrates and requires alcohol11. Under saturating conditions at pH 8.3, the reaction proceeds at a rate of 3.8 min⁻¹, and addition of methanol has no effect on its rate. Chloramphenicol inhibits peptide bond synthesis by binding directly to the active site of the 50S ribosomal subunit8, 12, 13. In the presence of 1 mM chloramphenicol with limiting CCA-pcb and excess C-pmn, the rate of reaction decreases 35-fold. The sensitivity of this reaction to chloramphenicol confirms that it is dependent upon the peptidyl transferase center. This assay does not require organic co-solvent, presumably because of the peptide-like structure of CCA-pcb and interactions of C-pmn with the ribosome that are beyond those possible for puromycin alone.

Crystallographic analysis of fragment product formation

The structure of the complex formed when H. marismortui large subunit crystals were soaked in solutions containing similar fragment substrates showed the presence of products bound to the peptidyl transferase center. Crystals were soaked for three hours at 4 °C in a solution containing 1 mM CC-pmn and 1 mM CCA-pcb. The crystals were then frozen, and X-ray diffraction data were measured to 3.1 Å resolution.
A difference electron density map calculated using the observed structure factor amplitudes clearly shows electron density for the newly formed peptide linkage between the hydroxy-methyltyrosine in the A-site substrate and the phenylalanine transferred from the P-site substrate (Fig. 3; Table 1). The peptide moiety is directed towards the exit tunnel, where there is much weaker density that might arise from the caproic acid linker and biotin that are attached to the Phe. This peptidyl product, CC-puromycin-phenylalanine-caproic acid-biotin (CC-pmn-pcb), is present at high occupancy (>0.7) and is located in the A-site (Fig. 4). CC-pmn-pcb binds to the A-loop in the manner reported earlier for CC-pmn bound to the A-site3, with the analogs of tRNA nucleotide C75 base-paired to G2588 (G2553) and A76 making an A-minor interaction. The corresponding portions of the substrate and product superimpose with a root mean square (r.m.s.) deviation of 0.61 Å for the RNA bases and 0.81 Å for the entire RNA moiety. The peptide portion of this product is in an extended conformation, with the methyl Tyr and Phe side chains extending in opposite directions. The φ and ψ angles of these residues lie within the allowed region for β-strands in a Ramachandran plot. These residues do not make any specific contacts with the ribosome, which might be expected because strong interactions between the exit tunnel and the nascent peptide chain could inhibit translation. A significant movement of U2620 (U2585) from its position in the unliganded subunit is seen again; its base is repositioned so that a hydrogen bond can form between the O4 of U2620 (U2585) and the 2' OH of dimethyl-A76, which is 3.0 Å away. The other product of the reaction, the deacylated CCA, is visible in the P-site but at a lower occupancy. The 3' OH to which the peptide had been attached is near (5 Å) the N3 of A2486 (A2451), whose position remains unchanged.
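The r.m.s. deviations quoted above compare corresponding atoms of the substrate and product models once both sit in a common frame. A minimal sketch of that calculation follows; it is not the CNS routine the authors used, the function name is hypothetical, and the coordinates are made-up values included only so the example runs.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z)
    coordinates. No superposition is performed here: the two models are
    assumed to be in the same frame already."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates (in Å) for three corresponding atoms.
substrate_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
product_atoms = [(0.0, 0.0, 0.5), (1.5, 0.0, 0.5), (1.5, 1.5, 0.5)]

if __name__ == "__main__":
    print(f"r.m.s. deviation: {rmsd(substrate_atoms, product_atoms):.2f} Å")
```

Because every atom in the toy example is displaced by the same 0.5 Å, the deviation equals that shift; real models give a value that averages unequal displacements.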
Biochemical analysis of crystal activity

The appearance of peptide synthesis products in the structure of the complex formed after crystals of the 50S subunit are soaked in substrates prompted us to assess the enzymatic activity of these crystals, following a biochemical approach pioneered in studies of ribonuclease crystals by Doscher and Richards14, which established that crystals of enzymes can be catalytically active. Due to the large aqueous channels in crystals of macromolecules, substrates can diffuse in and react, and then products can diffuse out, allowing crystallized enzymes to catalyze reactions. In order to carry out the analogous experiments on the large ribosomal subunit, crystals of H. marismortui 50S were grown as described15, harvested and washed extensively. These crystals were incubated with 30 µM 5' 32P-radiolabeled C-pmn and 45 µM CCA-pcb in one of the buffers used to stabilize the crystals at 22 °C. The reaction was also performed on 50S ribosomes from a noncrystalline stock solution in the same buffer. Under these multiple turnover conditions, the rate of the solution reaction was 0.34 min⁻¹; the reaction in the crystals was reduced less than four-fold (0.089 min⁻¹) (Fig. 5). A reduction in rate is expected because of the requirement for the diffusion of substrates into and products out of the crystals. To prove that the peptidyl transferase activity observed was not derived from soluble ribosomes contaminating the crystals, the large subunit crystals were physically removed from the reaction after 180 min. There was no further conversion of substrate to product (Fig. 5). As is the case with E. coli ribosomes, the reaction involving H. marismortui ribosomes is strongly influenced by pH. When the pH is raised to 7.0, the rate of reaction for ribosomes in solution increases approximately six-fold to 1.9 min⁻¹. In contrast to E.
coli ribosome preparations, which often include subunits that can be activated by heating6, we have observed no increase in the activity of H. marismortui large ribosomal subunits upon their incubation at 37 °C in the presence of K+ before performing the reaction at 4 °C, suggesting that they are fully active. Our crystallographic studies of the H. marismortui 50S ribosomal subunit have been carried out under two different salt conditions. The first is high in potassium chloride, which corresponds to the reaction conditions used above and is similar to the conditions used for crystal growth. The other has sodium as the dominant cation and was used in the 2.4 Å resolution structure determination. When the assay just described was repeated in high sodium buffer, the reaction rate was about the same as in the high potassium buffer. Further, comparison of the 50S subunit structure obtained in high sodium with that derived in high potassium (at 3.0 Å resolution) shows no significant differences. Therefore, because these subunits are active in both sodium and potassium dominated buffers and have the same structure in both, there is every reason to believe that the reported structure2 of the H. marismortui large ribosomal subunit represents an active conformation. We have also assayed in solution the peptidyl-transferase activity of the H. marismortui large ribosomal subunit in 3 M KCl and find that its activity in the fragment assay is equal to or slightly less than its activity in the lower salt concentrations used in the crystallographic experiments. Thus, at the salt concentrations used in these experiments, there is no influence of either the identity of the majority cation or ionic strength on the rate of peptide bond formation, contrary to previous suggestions16. Not only do the 50S subunits used in these studies have peptidyl transferase activity, they are also active in the poly U-directed synthesis of polyphenylalanine in the presence of 30S subunits17.
Averaged over many preparations, the large subunits used for crystallization catalyze the incorporation of 67 Phe residues per subunit per hour at 37 °C. Under the same conditions, subunits recovered from stabilized crystals incorporate 107 Phe residues per subunit per hour, which is remarkable considering that crystallization involves several weeks of incubation at 19 °C. Furthermore, large subunits in a preparation that fails to crystallize are significantly less active than those that do.

Expansion of the hybrid states model for translation

The currently accepted model for translation is termed the hybrid states model18. It posits that the anticodon and acceptor ends of a tRNA can simultaneously occupy different tRNA binding sites (A, P or E) on the 30S and 50S ribosomal subunits (Fig. 6)19. In the present version of this model, aminoacyl-tRNA binds to A-sites on both subunits (A/A) and peptidyl tRNA is in both P-sites (P/P) just before the peptidyl transferase reaction (Fig. 6b). After peptidyl transferase occurs, the acceptor ends of the tRNAs move spontaneously to the adjacent site on the large subunit to form the P/E and A/P hybrid states. However, by using a tRNA mimic that contains puromycin, we have trapped the system in a state after the peptidyl transferase reaction has occurred but before the hybrid state has been achieved. This establishes the pretranslocation state (Fig. 6c) as a step in the overall reaction, consistent with both crosslinking and kinetic data already in the literature20, 21. This state apparently has been observed here due to the low affinity of puromycin derivatives for the P-site. When the peptidyl product, CC-pmn-pcb (Fig. 1), is soaked into crystals, it binds to the A-site rather than the P-site (T.M.S., unpublished data). At 4 °C after a three hour soak, <1% of the substrate available had been converted to product.
This suggests that under these buffer conditions, product dissociation is much slower than the chemical step of peptide bond formation. Thus, the structure we have captured represents the complex just before the rate-limiting step of the peptidyl transferase reaction under these conditions rather than product binding to the ribosome after the reaction has reached equilibrium. The reaction occurring without the A-site tRNA analog moving to the P-site demonstrates that peptide bond formation is not concerted with the movement of products into the hybrid state, but rather that reaction occurs first, yielding the pretranslocation state we see in our crystals (Figs 3, 4, 6c) before movement into the hybrid states.

Conclusions

The experiments reported here establish that crystallographic studies of the H. marismortui 50S ribosomal subunit and its substrate complexes are relevant to understanding the structural basis of peptide bond formation. First, development of a version of the fragment assay for peptide bond synthesis that does not depend upon alcohol has made it possible to show that, in the absence of the 30S subunit, the isolated 50S subunit has a conformation that is correct for catalyzing peptide bond formation. Second, using this assay, the H. marismortui 50S subunit is shown to be active in the crystal at a level that is comparable to its activity in solution, independent of whether sodium or potassium is the primary cation. Third, the products of this fragment reaction are found bound to the peptidyl transferase center close to each other and to the N3 of A2486. Fourth, these complexes have allowed identification of an additional intermediate state that occurs during translocation.

Methods

Modified fragment assay with E. coli ribosomes. The peptidyl transferase reaction was performed by incubating 2 nM 5' [32P] C-pmn, 9 µM E.
coli 50S subunits and 30 µM CCA-pcb in the presence of 50 mM MOPS buffer, pH 7.1, 40 mM MgCl2 and 200 mM NH4Cl at 37 °C (the synthesis of CCA-pcb will be described elsewhere by S.A.S.). To determine the rate of reaction, aliquots of the reaction were removed at specified times and quenched in formamide loading buffer. Unreacted substrate was separated from product by electrophoresis using a 12% polyacrylamide gel. The fraction of unreacted substrate was plotted versus time and fit to a single exponential curve. The conditions that give the maximum rate of reaction were 2 nM 5' [32P]CCA-pcb, 9.6 µM 50S subunits, 960 µM C-pmn, 50 mM MOPS buffer, pH 8.3, 40 mM MgCl2 and 200 mM NH4Cl at 37 °C. The conditions used to determine the effect of chloramphenicol on the rate of reaction were 2 nM 5' [32P]CCA-pcb, 7.8 µM 50S subunits, 0.8 mM C-pmn, 50 mM MOPS buffer, pH 7.1, 40 mM MgCl2 and 200 mM NH4Cl. To confirm the identity of the peptidyl product, MALDI-TOF mass spectrometry was performed. The mass (positive-ion TOF) calculated for CC-pmn-pcb (C56 H76 N14 O16 PS [MH+]) was 1263.50 Da, and the observed mass was 1263.36 ± 0.02 Da.

Crystallography. For crystallography experiments, CC-pmn was used in place of C-pmn as the A-site substrate. Crystals were incubated in 1 mM CC-pmn, 1 mM CCA-pcb, 1.6 M NaCl, 30 mM MgCl2, 20% (v/v) ethylene glycol, 0.5 M NH4Cl, 100 mM KOAc and 12% (w/v) PEG 6000, pH 6.0, at 4 °C for 3 h before flash-freezing in liquid propane. The data were collected on crystals frozen at 100 K on beamline 19-ID of the Advanced Photon Source (Argonne National Laboratory, Argonne, Illinois) using a 3 × 3 charge-coupled device (CCD) detector, photons of 1.0 Å wavelength, 80 µm × 80 µm beam size and 0.4° oscillations. DENZO and SCALEPACK22 were used to process the data set. CNS23 was used for map calculations and refinement, and O24 was used for model building.
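The single-exponential fit described for the modified fragment assay above can be sketched as follows. This is a log-linear least-squares estimate, which may differ from the fitting procedure the authors actually used; the function name, the plateau parameter and the synthetic time course are all assumptions made for illustration.

```python
import math


def fit_single_exponential(times, fraction_unreacted, plateau=0.0):
    """Estimate k in f(t) = plateau + (1 - plateau) * exp(-k * t).

    The data are linearized as ln((f - plateau) / (1 - plateau)) = -k * t and
    k is taken from a least-squares line through the origin. Points at or
    below the plateau cannot be log-transformed and are skipped.
    """
    xs, ys = [], []
    for t, f in zip(times, fraction_unreacted):
        scaled = (f - plateau) / (1.0 - plateau)
        if scaled > 0.0:
            xs.append(t)
            ys.append(math.log(scaled))
    sum_xx = sum(x * x for x in xs)
    if sum_xx == 0.0:
        raise ValueError("need at least one usable time point with t > 0")
    return -sum(x * y for x, y in zip(xs, ys)) / sum_xx


# Synthetic, noise-free time course (minutes): k = 0.1 per minute with 95% of
# the limiting substrate converted, i.e. 5% remains unreacted at the end.
TRUE_K = 0.1
PLATEAU = 0.05
time_points = [0.0, 5.0, 10.0, 20.0, 30.0]
fractions = [PLATEAU + (1.0 - PLATEAU) * math.exp(-TRUE_K * t) for t in time_points]

fitted_k = fit_single_exponential(time_points, fractions, plateau=PLATEAU)

if __name__ == "__main__":
    print(f"fitted rate constant: {fitted_k:.3f} per minute")
```

With real gel data the log transform weights late, noisy points heavily, so a nonlinear fit of the untransformed fractions would usually be preferred.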
For comparison of the two complexes, models of the ribosome bound with acetylated minihelix3 and CC-pmn-pcb were superimposed using the least squares function in O24 with phosphates from the ribosomal RNA surrounding the active site. R.m.s. deviations between substrate and product were then calculated with CNS23 without direct superimposition of substrate and product, using all corresponding atoms in the RNA or RNA bases.

Modified fragment assay with H. marismortui ribosomes. Crystals of the large ribosomal subunit from H. marismortui were harvested from sitting drops and washed extensively with buffer. These crystals were incubated with 30 µM C-pmn spiked with trace amounts of 5' 32P-radiolabeled C-pmn and 45 µM CCA-pcb in 1.4 M KCl, 0.6 M NH4Cl, 120 mM KOAc, 40 mM MgCl2 and 10% (w/v) PEG 6000, pH 6.0, at 22 °C. When the reaction was performed using ribosomes from a soluble stock, the 50S concentration was 0.7 µM. Because the sizes and, thus, numbers of crystals used in each trial vary, getting the same concentration of ribosomes in each trial is not possible. Concentrations of ribosomes in crystal reactions, which were determined after the reactions were performed by dissolving the crystals and measuring their optical density at 260 nm, varied from 0.6 µM to 1.1 µM. The progress of the reaction was monitored by withdrawing aliquots of the reaction mixture at regular intervals, mixing the aliquots with formamide and later resolving the products on a 12% polyacrylamide gel. In some trials, the reaction mixture was removed from the crystals and put in a new tube for 180 min. The aliquots were then taken from this crystal-free solution. To determine the effect of high sodium on the reaction, the assay was performed using soluble ribosomes in 1.5 M NaCl, 0.5 M NH4Cl, 96 mM KOAc, 72 mM KCl and 23 mM MgCl2, pH 6.0, at 22 °C.
To determine the effect of potassium concentration, the reaction was performed in buffer containing 1.5 M KCl, 0.5 M sodium acetate and 55 mM 2-(N-morpholino)propanesulfonic acid (MES), pH 6.5, at 22 °C. This gave a rate of 0.85 min⁻¹. When the KCl concentration was raised to 3 M, the rate was unchanged within experimental error.

In vitro translation assay. To assess the activities of the H. marismortui 50S particles used in crystallization, 40–90 nM 50S particles, recovered by melting crystals, and 140 nM 30S particles were incubated in 1.5 M (NH4)2SO4, 0.4 M NH4Cl, 30 mM Tris, 30 mM magnesium acetate, 0.5% (v/v) glycerol, 1.3 mM ATP, 0.33 mM GTP, 3.3 mM phosphoenol pyruvate, 219 µg ml⁻¹ pyruvate kinase, 25 µM phenylalanine, 0.28 µCi ml⁻¹ of [14C]phenylalanine, 0.45 mg ml⁻¹ yeast tRNA, 5 mM β-mercaptoethanol, soluble proteins from the S100 fraction (the exact amount used was optimized for each preparation) and 15 µg ml⁻¹ polyuridine, pH 7.6, at 37 °C for 90 min. Polyphenylalanine was precipitated on Whatman 3MM filters by a cold solution of 5% (w/v) trichloroacetic acid and 1% (w/v) casamino acids. Filters were washed with this solution at 90 °C and then with ethanol before the polyphenylalanine was quantitated by scintillation counting.

Coordinates. Atomic coordinates of the product complex with the 50S subunit have been deposited in the Protein Data Bank (accession code 1KQS).

Received 9 October 2001; Accepted 10 January 2002; Published online 4 February 2002.

REFERENCES
1. Monro, R.E. Catalysis of peptide bond formation by 50S ribosomal subunits from Escherichia coli. J. Mol. Biol. 26, 147-151 (1967).
2. Ban, N., Nissen, P., Hansen, J., Moore, P.B. & Steitz, T.A. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289, 905-920 (2000).
3. Nissen, P., Hansen, J., Ban, N., Moore, P.B. & Steitz, T.A. The structural basis of ribosome activity in peptide bond synthesis. Science 289, 920-930 (2000).
4. Polacek, N., Gaynor, M., Yassin, A. & Mankin, A.S. Ribosomal peptidyl transferase can withstand mutations at the putative catalytic nucleotide. Nature 411, 498-501 (2001).
5. Xiong, L., Polacek, N., Sander, P., Boettger, E.G. & Mankin, A.S. pKa of adenine 2451 in the ribosomal peptidyl transferase center remains elusive. RNA 7, 1365-1369 (2001).
6. Bayfield, M.A., Dahlberg, A.E., Schulmeister, U., Dorner, S. & Barta, A. A conformational change in the ribosomal peptidyl transferase center upon active/inactive transition. Proc. Natl. Acad. Sci. USA 98, 10096-10101 (2001).
7. Thompson, J. et al. Analysis of mutations at residues A2451 and G2447 of 23S rRNA in the peptidyltransferase active site of the 50S ribosomal subunit. Proc. Natl. Acad. Sci. USA 98, 9002-9007 (2001).
8. Monro, R.E. & Marcker, K.A. Ribosome-catalysed reaction of puromycin with a formylmethionine-containing oligonucleotide. J. Mol. Biol. 25, 347-350 (1967).
9. Traut, R.R. & Monro, R.E. The puromycin reaction and its relation to protein synthesis. J. Mol. Biol. 10, 63-71 (1964).
10. Quiggle, K. & Chladek, S. The role of the cytidine residues of the tRNA 3'-terminus at the peptidyltransferase A- and P-sites. FEBS Lett. 118, 172-175 (1980).
11. Maden, B.E. & Monro, R.E. Ribosome-catalyzed peptidyl transfer. Effects of cations and pH value. Eur. J. Biochem. 6, 309-316 (1968).
12. Fernandez-Munoz, R., Monro, R.E., Torres-Pinedo, R. & Vazquez, D. Substrate- and antibiotic-binding sites at the peptidyltransferase centre of Escherichia coli ribosomes. Studies on the chloramphenicol, lincomycin and erythromycin sites. Eur. J. Biochem. 23, 185-193 (1971).
13. Pestka, S. Studies on the formation of transfer ribonucleic acid-ribosome complexes. Phenylalanyl-oligonucleotide binding to ribosomes and the mechanism of chloramphenicol action. Biochem. Biophys. Res. Commun. 36, 589-595 (1969).
14. Doscher, M.S. & Richards, F.M. The activity of an enzyme in the crystalline state: ribonuclease S*. J. Biol. Chem. 238, 2399-2406 (1963).
15. Ban, N. et al. A 9 Å resolution X-ray crystallographic map of the large ribosomal subunit. Cell 93, 1105-1115 (1998).
16. Bashan, A. et al. in Cold Spring Harbor symp. on quantitative biology 66 (Cold Spring Harbor Press, New York, in the press).
17. Marin, I., Sanz, J.L., Sanchez, M.E. & Amils, R. Archaea, A laboratory manual, Halophiles (eds Robb, F.T. et al.) (Cold Spring Harbor Press, New York; 1995).
18. Moazed, D. & Noller, H.F. Intermediate states in the movement of transfer RNA in the ribosome. Nature 342, 142-148 (1989).
19. Wilson, K.S. & Noller, H.F. Molecular movement inside the translational engine. Cell 92, 337-349 (1998).
20. Borowski, C., Rodnina, M.V. & Wintermeyer, W. Truncated elongation factor G lacking the G domain promotes translocation of the 3' end but not the anticodon domain of peptidyl-tRNA. Proc. Natl. Acad. Sci. USA 93, 4202-4206 (1996).
21. Green, R., Switzer, C. & Noller, H.F. Ribosome-catalyzed peptide-bond formation with an A-site substrate covalently linked to 23S ribosomal RNA. Science 280, 286-290 (1998).
22. Otwinowski, Z. & Minor, W. Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307-326 (1997).
23. Brünger, A.T. et al. Crystallography & NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr. D 54, 905-921 (1998).
24. Jones, T.A., Zou, J.A., Cowan, S. & Kjeldgaard, M. Improved method for building protein models in electron density maps and the locations of errors in these models. Acta Crystallogr. A 47, 110-119 (1991).
25. Yusupov, M.M. et al. Crystal structure of the ribosome at 5.5 Å resolution. Science 292, 883-896 (2001).

Figure 1: Schematic of the modified fragment assay. The substrates are shown on the left. CCA-phenylalanine-caproic acid-biotin (CCA-pcb) and C-puromycin (C-pmn) undergo a ribosome-dependent reaction in which a peptide bond is formed between the α-amino group of C-pmn and the carbonyl ester of the phenylalanine moiety of CCA-pcb, yielding the two products: C-puromycin-phenylalanine-caproic acid-biotin (C-pmn-pcb) and a deacylated CCA.

Figure 2: Demonstration of peptide bond formation catalyzed by 50S subunits in the absence of organic co-solvent. Lane 1 is 5' 32P-radiolabeled C-pmn alone; lane 2 is 5' 32P-radiolabeled C-pmn reacted with 10 µM P-site substrate in the presence of 6.3 µM 50S ribosomal subunits, 200 mM NH4Cl, 40 mM MgCl2 and 33% (v/v) methanol at 0 °C for 4 h. Lanes 3 and 4 are 5' 32P-radiolabeled C-pmn reacted with 10 µM P-site substrate in the presence of 6.3 µM 50S ribosomal subunits, 200 mM NH4Cl and 40 mM MgCl2, with no methanol, at 37 °C for 1.5 h. In lane 4, streptavidin was added to the reaction in lane 3 directly before loading onto the gel, which resulted in a supershift.

Figure 3: Stereo view of the difference electron density map showing the formation of the product bound in the A-site. The difference electron density map was calculated using experimental structure factor amplitudes from the complex and parent crystals; the starting experimental phases were improved by density modification. The map is contoured at 4σ. A skeletal model of CC-pmn-pcb (green bonds) shows its presence in the A-site. U2620 (U2585) from 23S rRNA (red bonds) shifts markedly and places its O4 within hydrogen bonding distance of the 2' OH from the dimethyl-A76 of puromycin. Atoms are colored by type (carbon is green; oxygen, red; nitrogen, blue; and phosphorus, pink).
Figure 4: Structure of the new fragment reaction products bound to the ribosome. a, A space-filling representation of the 50S particle (RNA in white and protein in yellow) in complex with products, with the three tRNAs as they were observed (ref. 25) binding to the Thermus thermophilus 70S ribosome superimposed for reference. The subunit has been split through the tunnel, and the front half was removed to reveal the tunnel and the peptidyl transferase site (boxed). The orientation is the crown view, with the L1 protein to the left and the L7–L12 stalk to the right. b, A close-up view of the active site shows that the peptidyl product (CC-pmn-pcb) (green) binds the A-loop (yellow), whereas the deacylated product (CCA) (violet) base pairs to the P-loop (blue). The N3 of A2486 (A2451) (light blue) is in proximity to the 3' OH of the deacylated product, and the base of U2620 (U2585) (red) has moved near to the newly formed peptidyl ester link and the 3' OH of dimethyl A76.

Figure 5: Activity of H. marismortui 50S ribosomal subunits in crystallized form. The fragment reaction is catalyzed by large ribosomal subunits from a soluble stock (i) or as washed crystals (ii and iii). After 180 min (arrow), the crystals were removed from the reaction (iii), and no further enzymatic activity was seen. Because the sizes and, thus, numbers of crystals used in each trial vary, using error bars in this graph is not possible. Instead, typical results are shown. Concentrations of ribosomes in this experiment were (i) 0.7 µM, (ii) 0.9 µM and (iii) 0.9 µM.

Figure 6: A new step in the hybrid states model of protein synthesis. a,b, Elongation factor-Tu (EF-Tu) (circle) delivers an aminoacyl-tRNA to the A-site. c, The peptidyl group is transferred to the A-site tRNA during the reaction, yielding the intermediate, d, before the hybrid state is adopted. e, EF-G (square) then catalyzes translocation.
In this study, substrate analogs that mimic state (b) on the 50S subunit are soaked into the crystals and react to yield state (c). For an expanded version of the hybrid states model, see ref. 19, after which this figure was modeled.

Table 1: Statistics for data collection and refinement

24 May 2001
Nature 411, 498-501 (2001); doi:10.1038/35078113

Ribosomal peptidyl transferase can withstand mutations at the putative catalytic nucleotide

NORBERT POLACEK, MARNE GAYNOR, AYMEN YASSIN & ALEXANDER S. MANKIN

Center for Pharmaceutical Biotechnology (MC 870), University of Illinois, 900 South Ashland Avenue, Chicago, Illinois 60607, USA

Correspondence and requests for materials should be addressed to A.S.M. (e-mail: shura@uic.edu).

Peptide bond formation is the principal reaction of protein synthesis. It takes place in the peptidyl transferase centre of the large (50S) ribosomal subunit. In the course of the reaction, the polypeptide is transferred from peptidyl transfer RNA to the α-amino group of amino acyl-tRNA. The crystallographic structure of the 50S subunit showed no proteins within 18 Å of the active site, revealing peptidyl transferase as an RNA enzyme1. Reported unique structural and biochemical features of the universally conserved adenine residue A2451 in 23S ribosomal RNA (Escherichia coli numbering) led to the proposal of a mechanism of rRNA catalysis that implicates this nucleotide as the principal catalytic residue2, 3. In vitro genetics allowed us to test the importance of A2451 for the overall rate of peptide bond formation. Here we report that large ribosomal subunits with mutated A2451 showed significant peptidyl transferase activity in several independent assays. Mutations at another nucleotide, G2447, which is essential to render catalytic properties to A2451 (refs 2, 3), also did not dramatically change the transpeptidation activity.
As alterations of the putative catalytic residues do not severely affect the rate of peptidyl transfer, the ribosome apparently promotes transpeptidation not through chemical catalysis, but by properly positioning the substrates of protein synthesis.

The proposed role of A2451 in the peptidyl transfer reaction is consistent with the results of experiments that implicate this nucleotide in interaction with peptidyl transferase substrates4-6; however, some biochemical and genetic data seem to be in conflict with the proposed catalytic mechanism7, 8. Mutations at A2451 (Fig. 1) are dominantly lethal in E. coli (ref. 3), making it difficult to ascertain directly the functional role of this critical nucleotide in vivo. Therefore, to investigate the importance of A2451 for peptide bond formation, we used an in vitro genetics approach9. We engineered three possible nucleotide substitutions at A2451 in the cloned Thermus aquaticus 23S rRNA gene. Mutant rRNAs were assembled with ribosomal proteins and 5S rRNA into large ribosomal subunits, and the activities of the mutant subunits were tested in one of the most commonly used peptidyl transferase assays, known as the 'fragment reaction'10 (assay I; Fig. 2a). In the fragment reaction, the large ribosomal subunit catalyses the transfer of a peptidyl analogue, formyl-[35S]methionyl, from formyl-methionyl-tRNA (fMet-tRNA) to puromycin, a structural analogue of the 3'-end of amino acyl-tRNA. Notably, all three mutants containing base changes at the proposed principal catalytic residue A2451 showed significant peptidyl transferase activity (Fig. 2a). When compared at the end of the linear range of the reaction (at 5 min), the mutant subunits showed activities ranging from 2% (2451G) to 44% (2451U) compared with the reconstituted wild-type 50S subunits (Table 1).

To confirm this finding, we investigated the activities of mutant ribosomal subunits in a substantially different assay. In contrast to the fragment reaction, where the donor substrate (fMet-tRNA) was present in limiting amounts and the acceptor substrate (puromycin) was in excess, in assay II, the relative concentrations of substrates (N-acetyl-Phe-tRNA as the donor and 5' [32P]pCpCp-puromycin (CC-puromycin) as the acceptor) were reversed. Again all three mutants showed markedly high activity, ranging from 7% (2451U) to 62% (2451G) compared with the wild-type subunits (Fig. 3c; and Table 1). The relative activities of the mutants were switched in these two assays: the 2451G mutant ('weakest' in the fragment reaction) showed the highest activity in assay II; the 2451U mutant ('best' in the fragment reaction) was the least active in assay II. The variation in the relative activities of the mutants in the two assays may reflect differences in interaction with the substrates (puromycin versus CC-puromycin and fMet-tRNA versus N-acetyl-Phe-tRNA).

In assays I and II, the peptidyl transfer is catalysed by the large ribosomal subunits in the presence of 33% methanol, which is required to improve the binding of the peptidyl-tRNA (ref. 10). It was important to test how mutant subunits would perform under more physiological conditions in the absence of methanol. In the following two assays (III and IV), reconstituted 50S subunits were re-associated with T. aquaticus 30S ribosomal subunits to form 70S ribosomes.
We then tested the peptidyl transferase activity in the presence of messenger RNA. Assay III employed the same substrates that were used in assay II, and poly(U) was used as mRNA (Fig. 2b). The relative activities of the mutants in the methanol-free assay III were comparable to those in assay II (Table 1). In assay IV, we used a principally new method of measuring peptidyl transferase activity. The method is based on the scintillation proximity assay (SPA)11, which uses streptavidin-coated scintillant-embedded beads. Formyl-[3H]Met-tRNA, which was bound to the ribosome in the presence of synthetic mRNA, served as the peptidyl donor, and biotin-puromycin served as the acceptor. Only after formyl-[3H]methionine is transferred to biotin-puromycin and the resulting compound binds to the streptavidin-coated beads is its radioactivity registered by the embedded scintillant. Both substrates of the peptidyl transferase reaction were present in excess. Under these conditions—where tRNA binding is additionally stimulated by interacting with the 30S subunit—reduction of activity of the mutants compared to the wild type was even less pronounced than in the isolated 50S subunits, ranging from 25 to 73% in assay IV (Fig. 3d; and Table 1). The actual effect of the mutations on the chemical step of the catalysis can be even smaller than that which follows from our measurements, as mutations at A2451 are expected to affect binding of donor and acceptor substrates to the peptidyl transferase centre. Modification of A2451 with dimethyl sulphate inhibits binding of peptidyl-tRNA to 50S subunits5. Thus, one would expect that mutations at this position should interfere with binding of the donor substrates. In agreement with this hypothesis, activity of the 2451C mutant in assay I could be significantly increased (to 78%) by using higher concentrations of fMet-tRNA. 
Additionally, we observed that activity of the 2451G mutant, which was relatively low in assay I, could be increased roughly 20-fold by using CC-puromycin instead of puromycin (data not shown), showing that the 2451G mutation interferes with binding of puromycin to the 50S subunits. The notion of substrate-binding deficiencies in mutant subunits is further supported by kinetic measurements. In assay I, which was performed under single turn-over conditions, at the time the reaction curves reached their respective plateaux (5 min) only a portion of fMet-tRNA was converted into the reaction product; less product was formed by the mutants than by wild-type 50S subunits (Fig. 3b). As neither 50S subunits nor reaction substrates are inactivated during this time, only a portion of fMet-tRNA may form reactive complexes with the reconstituted subunits. Assuming that the plateau levels of the reaction curves represent the amount of the reactive complexes at time zero, we found that the observed rate constant (Kobs) for wild-type 50S subunit was only two times higher than Kobs for the 2451U and 2451G mutants, and four times higher than for the 2451C mutant. Thus, it is possible that reduction of the yield of the peptidyl transfer reaction observed with the mutants comes primarily from impaired binding of the reaction substrates to mutant subunits, rather than from inhibition of chemical catalysis. The observed in vivo lethality of mutants at position 2451 in E. coli3 may also result from interference with positioning tRNAs on the ribosome. According to the proposed model2, the precise position of A2451 and its N3 group is fixed by a unique hydrogen-bond network with G2061 and G2447. This set of interactions is presumed to be essential for generating an unusually high pKa at N3 of A2451, thus making it suitable for functioning as a general acid–base in the proposed catalytic mechanism2, 3. 
Mutations of any of these guanines are therefore expected to eliminate the pKa shift and should severely disturb acid–base catalysis. However, when three possible base changes were introduced at G2447 in T. aquaticus 23S rRNA, the activities of reconstituted subunits tested in assays I, II and IV were again high: 18–70% for the adenine mutant, 39–82% for C and 5–48% for U. Even double mutants harbouring the 2451G mutation, in combination with mutations at position 2447, yielded subunits with high transpeptidation activities (25–42% in assay I, for the different mutants; 45–60% in assay II; and 13–22% in assay IV). Our in vitro results with the position 2447 mutants are corroborated by in vivo data because the G2447U mutation can render cells resistant to the ribosome-targeting antibiotic linezolid12, 13, indicating that mutant ribosomes can efficiently support protein synthesis.

Our experiments show that mutations of nucleotides critical for the proposed catalytic mechanism2, 3 do not abolish peptidyl transferase activity of the ribosome. Mutations of other nucleotide residues in the immediate area of the active site, which could be essential in alternative catalytic models, were also tolerated9, 14-17. These findings contrast with results obtained with other ribozymes where alterations of the catalytic residues either completely eliminated catalytic activity or reduced it by many orders of magnitude18, 19, indicating that the mode of action of the ribosomal peptidyl transferase differs from that of a number of other ribozymes.

The energy required for peptide bond formation is generated during amino acyl-tRNA synthesis coupled with ATP hydrolysis so that peptidyl transferase substrates are delivered to the ribosome in an activated form20. The properly positioned free α-amino group of amino acyl-tRNA is a strong enough nucleophile to spontaneously attack the ester carbonyl group of peptidyl-tRNA.
Therefore, the ribosome can potentially promote formation of peptide bonds without significant contribution of chemical catalysis by merely fitting the reaction components in a configuration suitable for the spontaneous reaction20, 21. Kinetic considerations suggest that the extent of the acceleration of peptidyl transfer achievable only by substrate orientation can be sufficient to allow for the known rates of protein synthesis21. In accordance with this model, the only mutations that have been shown so far to markedly reduce the ribosome's ability to catalyse peptide bond formation in vitro are mutations of G2252, a nucleotide that positions peptidyl-tRNA in the ribosomal P-site by base-pairing with the C74 of tRNA9, 14, 22. The idea that the ribosome catalyses peptidyl transfer primarily by fixing the proper position of the reaction substrates does not eliminate the possible minor contribution of chemical catalysis8; however, our results show that it may provide only a small acceleration factor and is apparently not rate limiting. Although it was not yet possible to demonstrate peptidyl transfer catalysis by isolated rRNA23, structural and biochemical data leave little doubt that the ribosome is essentially a ribozyme2, 22, 24, a remnant of the ancient protein-synthesizing machine that could function without the help of proteins25. In the course of evolution, many enzymatic functions thought to be originally performed by RNA enzymes have been taken over by protein catalysts, which provide a broader variety of functional groups that can be used for chemical catalysis. Amino-acid polymerization, however, remained the mission of rRNA. One of the functions that RNA is specifically suited to perform is binding other nucleic acids. The protein synthesis substrates are presented to the ribosome in an already activated form as peptidyl- and amino acyl-tRNAs. 
Thus, it is possible that, as one of the most ancient enzymes, ribosomal peptidyl transferase evolved simply as a template for binding and proper positioning of its substrates rather than as a chemical catalyst. The apparent limited importance of chemical catalysis explains why this function of rRNA in the ribosome has not been taken over by one of the ribosomal proteins.

Methods

Preparation of wild-type and mutant 50S subunits
Mutations were engineered at position A2451 (E. coli numbering) of the T. aquaticus 23S rRNA gene cloned in the pUC18 plasmid (ref. 26) by PCR mutagenesis using mutagenizing primers 5'-CCATCGATCAACGGATAAAAGTTACCCCGGGGATAXCAGGCT-3' (where X indicates a G, T or C) and a plasmid-specific universal sequencing primer 5'-GTTTTCCCAGTCACGAC-3'. The PCR fragment was cut with Bsu15I and EcoRI restriction enzymes and used to replace the corresponding fragment in the wild-type gene. The mutations at position 2447 were engineered using the same strategy and mutagenizing primers 5'-CCATCGATCAACGGATAAAAGTTACCCCGGGXATAAC-3', where X indicates A, T or C. Wild-type or mutant 23S rRNA was transcribed in vitro, purified, and assembled into 50S subunits as described9.

Peptidyl transferase assay I
Reconstituted 50S subunits (21 pmol) (corresponding to 0.5 A260 of 23S rRNA transcript in the reconstitution reaction) were combined with 0.5–1 pmol formyl-[35S]Met-tRNA in 50 µl of PT buffer (20 mM Tris/HCl pH 8, 20 mM MgCl2, 0.4 M KCl, 2.5 mM β-mercaptoethanol, 0.1 mM EDTA). After addition of 25 µl cold methanol, reactions were incubated on ice for 5 min to allow tRNA binding. The peptidyl transfer reaction was initiated by addition of 12.5 nmol puromycin. Reactions were incubated on ice (see figures for specified times), and the products were extracted into ethyl acetate and analysed by paper electrophoresis as described27. For the observed rate-constant comparison, experimental points within the first 5 min of incubation were included.
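Written out, the rate-constant comparison for assay I corresponds to a first-order model of the following form. The symbols $P(t)$, $P_\infty$ and $[\mathrm{ES}]$ are notation introduced here for illustration and do not appear in the original; the identification of the plateau with the amount of reactive complex at time zero is the assumption stated in the main text.

```latex
\begin{align}
[\mathrm{ES}](t) &= [\mathrm{ES}]_0 \, e^{-k_{\mathrm{obs}} t}, \\
P(t) &= [\mathrm{ES}]_0 - [\mathrm{ES}](t)
      = P_\infty \left(1 - e^{-k_{\mathrm{obs}} t}\right),
      \qquad P_\infty = [\mathrm{ES}]_0, \\
\ln\!\left(1 - \frac{P(t)}{P_\infty}\right) &= -k_{\mathrm{obs}} \, t .
\end{align}
```

Under this model a lower plateau $P_\infty$ reflects fewer reactive complexes, whereas $k_{\mathrm{obs}}$ reports on the reaction of the complexes that did form, which is why the two quantities can be compared separately between wild-type and mutant subunits.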
The data were fit to a first-order exponential decay curve of the enzyme–substrate complex using the TableCurve 2D program (SPSS).

Peptidyl transferase assay II
Reconstituted 50S subunits (12.6 pmol), N-acetyl-Phe-tRNA (ref. 28) (13.6 pmol) and 5'-[32P]CC-puromycin (4 pmol) (Dharmacon Research) were combined in 30 µl PT buffer containing 33% methanol. Reactions were incubated on ice for the specified time, stopped by addition of 15 µl 7 M urea, and loaded on a 22.5% polyacrylamide/7 M urea gel (ref. 29). Radioactive species were quantified using the Molecular Dynamics phosphor imager.

Peptidyl transferase assay III
Native or assembled T. aquaticus 50S subunits (12 pmol) were combined with 30S subunits (12 pmol) in 16 µl reconstitution buffer (refs 26, 30) (20 mM Tris/Cl pH 7.4, 20 mM MgCl2, 400 mM NH4Cl, 6 mM spermidine, 0.4 mM EDTA and 5 mM β-mercaptoethanol) for 1 h at 40 °C. Subsequently, the 70S ribosomes were incubated with 60 µg poly(U) and 13.6 pmol N-acetyl-Phe-tRNA in 30 µl buffer (final concentrations: 10 mM Tris/Cl pH 7.4, 10 mM HEPES/KOH pH 7.6, 13 mM MgCl2, 275 mM NH4Cl, 4 mM spermidine, 0.025 mM spermine, 4.5 mM β-mercaptoethanol, 0.2 mM EDTA). After incubation for 15 min at 37 °C, the reaction was initiated by addition of 4 pmol [32P]CC-puromycin and allowed to proceed for 2 h at 37 °C. The reaction was stopped and radioactive products were analysed as in assay II.

Peptidyl transferase assay IV
For assay IV we used the scintillation proximity assay (SPA) technique11 based on streptavidin-coated scintillant-embedded beads. The radioactivity of the sample is registered only when the 3H source is brought into immediate proximity of the scintillator by binding to the beads. The peptidyl transferase assay used is a modification of the protocol developed by S. Swaney and D. Shinabarger (personal communication). Reconstituted 50S subunits (21 pmol) were reassociated with an equimolar amount of native T. aquaticus 30S subunits in 27 µl reconstitution buffer (ref. 26) for 1 h at 40 °C.
Reconstituted ribosomes were combined with 76 pmol synthetic mRNA AAGGAGAUAUAACAAUGGGU (Dharmacon Research), 48 pmol formyl-[3H]MettRNA (5,900 c.p.m. per pmol) and 60 pmol of a biotin-puromycin derivative (Dharmacon Research) in a final volume of 60 µl of the adjusted reaction buffer (final concentrations: 20 mM Tris/Cl pH 7.6, 39 mM MgCl2, 230 mM NH4Cl, 2.2 mM -mercaptoethanol, 2.7 mM spermidine and 0.09 mM EDTA). The reaction was performed at 37 °C for 0.5– 8 h. The reaction was terminated by addition of 120 µl stop solution containing 125 mM EDTA, 1 PBS and 0.45 µg streptavidin-coated SPA beads (Amersham Pharmacia Biotech, catalogue number RPNQ0007), transferred to 96-well plates and counted in a 96well-plate scintillation counter (Wallac). The radioactivity readings of the samples containing only 30S subunits were subtracted for all experimental time points. Received 26 March 2001; accepted 30 April 2001 References 1. Ban, N., Nissen, P., Hansen, J., Moore, P. B. & Steitz, T. A. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289, 905-920 (2000). | Article | PubMed | ISI | 2. Nissen, P., Hansen, J., Ban, N., Moore, P. B. & Steitz, T. A. The structural basis of ribosome activity in peptide bond synthesis. Science 289, 920-930 (2000). | Article | PubMed | ISI | 3. Muth, G. W., Ortoleva-Donnelly, L. & Strobel, S. A. A single adenosine with a neutral pKa in the ribosomal peptidyl transferase center. Science 289, 947-950 (2000). | Article | PubMed | ISI | 4. Moazed, D. & Noller, H. F. Interaction of tRNA with 23S rRNA in the ribosomal A, P, and E sites. Cell 57, 585-597 (1989). | PubMed | ISI | 5. Bocchetta, M., Xiong, L. & Mankin, A. S. 23S rRNA positions essential for tRNA binding in ribosomal functional sites. Proc. Natl Acad. Sci. USA 95, 3525-3530 (1998). | Article | PubMed | ISI | 6. Steiner, G., Kuechler, E. & Barta, A. 
Photo-affinity labelling at the peptidyl transferase centre reveals two different positions for the A- and P-sites in domain V of 23S rRNA. EMBO J. 7, 3949-3955 (1988). | PubMed | ISI | 7. Kearsey, S. E. & Craig, I. W. Altered ribosomal RNA genes in mitochondria from mammalian cells with chloramphenicol resistance. Nature 290, 607-608 (1981). | PubMed | ISI | 8. Barta, A. et al. Mechanism of ribosomal peptide bond formation. Science 291, 203a (2001) (online). | Article | 9. Khaitovich, P., Tenson, T., Kloss, P. & Mankin, A. S. Reconstitution of functionally active Thermus aquaticus large ribosomal subunits with in vitro-transcribed rRNA. Biochemistry 38, 1780-1788 (1999). | Article | PubMed | ISI | 10. Monro, R. E. & Marcker, K. A. Ribosome-catalysed reaction of puromycin with a formylmethionine-containing oligonucleotide. J. Mol. Biol. 25, 347-350 (1967). | PubMed | ISI | 11. Hart, H. E. & Greenwald, E. B. Scintillation proximity assay (SPA)--a new method of immunoassay. Direct and inhibition mode detection with human albumin and rabbit antihuman albumin. Mol. Immunol. 16, 265-267 (1979). | PubMed | ISI | 12. Swaney, S. M. et al. in Abstracts of the 38th Interscience Conference on Antimicrobial Agents and Chemotherapy C-104 (American Society for Microbiology, Washington DC, 1998). 13. Xiong, L. Q. et al. Oxazolidinone resistance mutations in 23S rRNA of Escherichia coli reveal the central region of domain V as the primary site of drug action. J. Bacteriol. 182, 5325-5331 (2000). | PubMed | ISI | 14. Green, R. & Noller, H. F. Reconstitution of functional 50S ribosomes from in vitro transcripts of Bacillus stearothermophilus 23S rRNA. Biochemistry 38, 1772-1779 (1999). | Article | PubMed | ISI | 15. Green, R., Samaha, R. R. & Noller, H. F. Mutations at nucleotides G2251 and U2585 of 23 S rRNA perturb the peptidyl transferase center of the ribosome. J. Mol. Biol. 266, 40-50 (1997). | Article | PubMed | ISI | 16. Porse, B. T. & Garrett, R. A. 
Mapping important nucleotides in the peptidyl transferase centre of 23 S rRNA using a random mutagenesis approach. J. Mol. Biol. 249, 1-10 (1995). | Article | PubMed | ISI | 17. O'Connor, M. & Dahlberg, A. E. Mutations at U2555, a tRNA-protected base in 23S rRNA, affect translational fidelity. Proc. Natl Acad. Sci. USA 90, 9214-9218 (1993). | PubMed | ISI | 18. Nakano, S., Chadalavada, D. M. & Bevilacqua, P. C. General acid-base catalysis in the mechanism of a hepatitis delta virus ribozyme. Science 287, 1493-1497 (2001). | Article | 19. Yean, S. L., Wuenschell, G., Termini, J. & Lin, R. J. Metal-ion coordination by U6 small nuclear RNA contributes to catalysis in the spliceosome. Nature 408, 881-884 (2001). | Article | 20. Krayevsky, A. A. & Kukhanova, M. K. The peptidyltransferase center of ribosomes. Prog. Nucl. Acid Res. Mol. Biol. 23, 1-51 (1979). 21. Nierhaus, K. H., Schulze, H. & Cooperman, B. S. Molecular mechanisms of the ribosomal peptidyl transferase center. Biochem. Int. 1, 185-192 (1980). | ISI | 22. Samaha, R. R., Green, R. & Noller, H. F. A base pair between tRNA and 23S rRNA in the peptidyl transferase centre of the ribosome. Nature 377, 309-314 (1995). | PubMed | ISI | 23. Khaitovich, P., Tenson, T., Mankin, A. S. & Green, R. Peptidyl transferase activity catalyzed by protein-free 23S ribosomal RNA remains elusive. RNA 5, 605-608 (1999). | Article | PubMed | ISI | 24. Zhang, B. & Cech, T. R. Peptide bond formation by in vitro selected ribozymes. Nature 390, 96100 (1997). | Article | PubMed | ISI | 25. Crick, F. H. C., Brenner, S., Klug, A. & Pieczenik, G. A speculation on the origin of protein synthesis. Orig. Life. 7, 389-397 (1976). | PubMed | ISI | 26. Khaitovich, P. & Mankin, A. S. in The Ribosome. Structure, Function, Antibiotics and Cellular Interactions (eds Garrett, R. A. et al.) 229-243 (ASM, Washington DC, 2000). 27. Green, R. & Noller, H. F. 
In vitro complementation analysis localizes 23S rRNA posttranscriptional modifications that are required for Escherichia coli 50S ribosomal subunit assembly and function. RNA 2, 1011-1021 (1996).
28. Blaha, G. et al. Preparation of functional ribosomal complexes and effect of buffer conditions on tRNA positions observed by cryoelectron microscopy. Methods Enzymol. 317, 292-309 (2000).
29. Kim, D. F. & Green, R. Base-pairing between 23S rRNA and tRNA in the ribosomal A site. Mol. Cell 4, 859-864 (1999).
30. Nierhaus, K. H. in Ribosomes and Protein Synthesis. A Practical Approach (ed. Spedding, G.) 161-189 (Oxford Univ. Press, Oxford, 1990).

Figure 1 The secondary structure of the central loop of domain V of T. aquaticus 23S rRNA. Position A2451 (E. coli 23S rRNA numbering), the principal catalytic nucleotide in the proposed general acid–base catalytic mechanism of peptide bond formation (refs 2, 3), is shown in bold. Its tertiary interaction partners, guanine residues 2061 and 2447, suggested to be essential for rendering catalytic properties to A2451, are outlined. Arrows indicate the mutations engineered in 23S rRNA.

Figure 2 Peptidyl transferase activity of reconstituted large ribosomal subunits harbouring mutations of the proposed catalytic nucleotide A2451. a, Assay I. The transfer of formyl-[35S]methionine from tRNA to puromycin was catalysed by 50S subunits in the presence of 33% methanol (ref. 10; see Methods). The reactions were incubated on ice for 30 min; the product, formyl-Met-puromycin (formyl-Met-Pmn), was resolved by paper electrophoresis and visualized by phosphor-imaging. The control (no 50S) contained all the reaction components except for reconstituted 50S subunits. b, Assay III. Peptidyl transferase reaction between N-acetyl-Phe-tRNA (donor) and 5' [32P]-CC-puromycin (acceptor) was performed by re-associated 70S ribosomes in the presence of poly(U).
The reaction was carried out at 37 °C for 120 min and the reaction product, N-acetyl-Phe-CC-puromycin (CC-Pmn-AcPhe), was resolved from CC-puromycin (CC-Pmn) by gel electrophoresis (ref. 29). The 30S control contained the same amount of 30S subunits as the other samples, but no reconstituted 50S subunits; the observed product spot reflected the activity of minute amounts of native 50S subunits present in the 30S subunit preparation. The 70S control in assay III shows the activity of 30S subunits re-associated with native 50S subunits.

Figure 3 The time course of peptidyl transferase reactions catalysed by reconstituted 50S subunits containing mutations at position 2451. a, Phosphor-imager visualization of the appearance of the formyl-[35S]methionine-puromycin product on the electrophoregram in assay I. b–d, Time course of formation of peptidyl transferase reaction products in assays I (b), II (c) and IV (d). A (wild type), G, C and U denote subunits carrying the corresponding nucleotide at position 2451 of 23S rRNA. The inset in b represents activity of the A2451G mutant; the time points are the same as on the main graph but the scale of the y-axis was decreased to better reveal the kinetics of product accumulation. The y-axis values in assays I (b) and II (c) are phosphor-imager arbitrary counts. The radioactivity (c.p.m.) detected by streptavidin-coated scintillant-embedded beads in assay IV (d) represents formyl-[3H]methionyl-biotin-puromycin formed as a result of the peptidyl transferase reaction (see Methods). Radioactivity values of the corresponding time points of control samples containing only native 30S subunits were subtracted from the experimental values.

Vol. 95, Issue 7, 3525-3530, March 31, 1998
Biochemistry
23S rRNA positions essential for tRNA binding in ribosomal functional sites
Maurizio Bocchetta, Liqun Xiong, and Alexander S.
Mankin*
Center for Pharmaceutical Biotechnology-m/c 870, University of Illinois, 900 South Ashland Avenue, Chicago, IL 60607
Communicated by Emanuel Margoliash, University of Illinois, Chicago, IL, January 29, 1998 (received for review December 12, 1997)

rRNA plays an important role in the function of peptidyl transferase, the catalytic center of the ribosome responsible for peptide bond formation. Proper placement of the peptidyl transferase substrates, peptidyl-tRNA and aminoacyl-tRNA, is essential for catalysis of the transpeptidation reaction and protein synthesis. In this report, we define a small set of rRNA nucleotides that are most likely directly involved in binding of tRNA in the functional sites of the large ribosomal subunit. By binding biotinylated tRNA substrates to randomly modified large ribosomal subunits from Escherichia coli and capturing the resulting complexes on avidin resin, we identified four nucleotides in the large ribosomal subunit rRNA (positions G2252, A2451, U2506, and U2585) whose modifications prevent binding of a peptidyl-tRNA analog in the P site and one residue (U2555) whose modification interferes with transfer of the peptidyl moiety to puromycin. These nucleotides represent a subset of positions protected by tRNA analogs from chemical modification and significantly narrow the number of 23S rRNA nucleotides that may be directly involved in tRNA binding in the ribosomal functional sites.

* To whom reprint requests should be addressed. e-mail: shura@uic.edu.
Copyright © 1998 by The National Academy of Sciences 0027-8424/98/953525-6$2.00/0
http://www.pnas.org/cgi/content/abstract/95/7/3525

RNA (1999), 5:1200-1209
Cambridge University Press
Copyright © 1999 RNA Society
Research Article
mRNA localization signals can enhance the intracellular effectiveness of hammerhead ribozymes
NAN SOOK LEE (1), EDOUARD BERTRAND (2) and JOHN ROSSI (1, 3)
(1) Department of Molecular Biology, Beckman Research Institute of the City of Hope, Duarte, California 91010-3011, USA
(2) J. Monod, Tour 43, 2 Place Jussieu, 75251 Paris, France
(3) Graduate School of Biological Sciences, City of Hope and Beckman Research Institute of the City of Hope, Duarte, California 91010-3011, USA

Abstract
Subcellular localization signals for several mRNAs are positioned in their 3′ untranslated regions (UTR). We have utilized the human α- and β-actin 3′ UTRs as signals for colocalizing hammerhead ribozymes with a lacZ target mRNA.
Ribozyme and target genes containing matched or unmatched 3′ UTRs were cotransfected into 12-day-old chicken embryonic myoblast and fibroblast (CEMF) cultures and assayed by in situ hybridization (ISH) using a dual-label antibody sandwich procedure and dual fluorescence microscopy to monitor intracellular colocalization. β-galactosidase localization in transfectants was visualized by incubation with X-gal and also quantitated by an o-nitrophenyl β-D-galactopyranoside (ONPG) assay. We found that the percentage of colocalization using the matched α- or β-actin 3′ UTR (α–α or β–β) was enhanced approximately threefold relative to unmatched 3′ UTRs. The increase in ribozyme-mediated inhibition of β-galactosidase activity observed when matched 3′ UTRs were used was consistent with the observed percentage of colocalization. These results represent the first direct demonstration that mRNA localization signals (zipcodes) can be utilized to enhance intracellular ribozyme efficacy.
(Received February 5 1999) (Revised March 10 1999) (Accepted June 17 1999)
Key Words: 3′ UTR; colocalization; hammerhead ribozyme; human α-actins; human β-actins.
Correspondence: Reprint requests to: John Rossi, Department of Molecular Biology, Beckman Research Institute of the City of Hope, Duarte, California 91010-3011, USA; e-mail: jrossi@coh.org.

24 May 2001
Nature 411, 494-498 (2001); doi:10.1038/35078107
Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells
SAYDA M.
ELBASHIR*, JENS HARBORTH†, WINFRIED LENDECKEL*, ABDULLAH YALCIN*, KLAUS WEBER† & THOMAS TUSCHL*
* Department of Cellular Biochemistry; and † Department of Biochemistry and Cell Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, D-37077 Göttingen, Germany
Correspondence and requests for materials should be addressed to T.T. (e-mail: ttuschl@mpibpc.gwdg.de).

RNA interference (RNAi) is the process of sequence-specific, post-transcriptional gene silencing in animals and plants, initiated by double-stranded RNA (dsRNA) that is homologous in sequence to the silenced gene (refs 1-4). The mediators of sequence-specific messenger RNA degradation are 21- and 22-nucleotide small interfering RNAs (siRNAs) generated by ribonuclease III cleavage from longer dsRNAs (refs 5-9). Here we show that 21-nucleotide siRNA duplexes specifically suppress expression of endogenous and heterologous genes in different mammalian cell lines, including human embryonic kidney (293) and HeLa cells. Therefore, 21-nucleotide siRNA duplexes provide a new tool for studying gene function in mammalian cells and may eventually be used as gene-specific therapeutics.

Uptake of dsRNA by insect cell lines has previously been shown to 'knock-down' the expression of specific proteins, owing to sequence-specific, dsRNA-mediated mRNA degradation (refs 6, 10-12). However, it has not been possible to detect potent and specific RNA interference in commonly used mammalian cell culture systems, including 293 (human embryonic kidney), NIH/3T3 (mouse fibroblast), BHK-21 (Syrian baby hamster kidney), and CHO-K1 (Chinese hamster ovary) cells, applying dsRNA that varies in size between 38 and 1,662 base pairs (bp) (refs 10, 12). This apparent lack of RNAi in mammalian cell culture was unexpected, because RNAi exists in mouse oocytes and early embryos (refs 13, 14), and because RNAi-related, transgene-mediated co-suppression was also observed in cultured Rat-1 fibroblasts (ref. 15).
But it is known that dsRNA in the cytoplasm of mammalian cells can trigger profound physiological reactions that lead to the induction of interferon synthesis (ref. 16). In the interferon response, dsRNA > 30 bp binds and activates the protein kinase PKR (ref. 17) and 2',5'-oligoadenylate synthetase (2',5'-AS; ref. 18). Activated PKR stalls translation by phosphorylating the translation initiation factor eIF2α, and activated 2',5'-AS causes mRNA degradation by 2',5'-oligoadenylate-activated ribonuclease L. These responses are intrinsically nonspecific with respect to the sequence of the inducing dsRNA.

Base-paired 21- and 22-nucleotide (nt) siRNAs with overhanging 3' ends mediate efficient sequence-specific mRNA degradation in lysates prepared from Drosophila embryos (ref. 9). To test whether siRNAs are also capable of mediating RNAi in cell culture, we synthesized 21-nt siRNA duplexes with symmetric 2-nt 3' overhangs directed against reporter genes coding for sea pansy (Renilla reniformis, RL) and two sequence variants of firefly (Photinus pyralis, GL2 and GL3) luciferases (Fig. 1a, b). The siRNA duplexes were co-transfected with the reporter plasmid combinations pGL2/pRL or pGL3/pRL into Drosophila S2 cells or mammalian cells using cationic liposomes. Luciferase activities were determined 20 h after transfection. In Drosophila S2 cells (Fig. 2a and b), the specific inhibition of luciferases was complete and similar to results previously obtained for longer dsRNAs (refs 6, 10, 12, 19). In mammalian cells, where the reporter genes were 50- to 100-fold more strongly expressed, the specific suppression was less complete (Fig. 2c–j). In NIH/3T3, monkey COS-7 and HeLa S3 cells (Fig. 2c–h), GL2 expression was reduced 3- to 12-fold, GL3 expression 9- to 25-fold, and RL expression 2- to 3-fold, in response to the cognate siRNAs. For 293 cells, targeting of RL luciferase by RL siRNAs was ineffective, although GL2 and GL3 targets responded specifically (Fig. 2i and j).
The lack of reduction of RL expression in 293 cells may be due to its 5- to 20-fold higher expression than in any other mammalian cell line tested and/or to limited accessibility of the target sequence due to RNA secondary structure or associated proteins. Nevertheless, specific targeting of GL2 and GL3 luciferase by the cognate siRNA duplexes indicated that RNAi is also functioning in 293 cells.

Figure 1 Reporter constructs and siRNA duplexes.
Figure 2 RNA interference by siRNA duplexes.

The 2-nucleotide 3' overhang in all siRNA duplexes was composed of (2'-deoxy) thymidine, except for uGL2, which contained uridine residues. The thymidine overhang was chosen because it reduces costs of RNA synthesis and may enhance nuclease resistance of siRNAs in the cell culture medium and within transfected cells. As in the Drosophila in vitro system (data not shown), substitution of uridine by thymidine in the 3' overhang was well tolerated in cultured mammalian cells (Fig. 2a, c, e, g and i), and the sequence of the overhang appears not to contribute to target recognition (ref. 9).

In co-transfection experiments, 25 nM siRNA duplexes were used (Figs 2 and 3; concentration is given with respect to the final volume of tissue culture medium). Increasing the siRNA concentration to 100 nM did not enhance the specific silencing effects, but started to affect transfection efficiencies, perhaps due to competition for liposome encapsulation between plasmid DNA and siRNA (data not shown). Decreasing the siRNA concentration to 1.5 nM did not reduce the specific silencing effect (data not shown), even though the siRNAs were now only 2- to 20-fold more concentrated than the DNA plasmids; the silencing effect vanished completely only when the siRNA concentration was dropped below 0.05 nM.
This indicates that siRNAs are extraordinarily powerful reagents for mediating gene silencing, and that siRNAs are effective at concentrations that are several orders of magnitude below the concentrations applied in conventional antisense or ribozyme gene-targeting experiments (ref. 20).

Figure 3 Effects of 21-nucleotide siRNAs, 50-bp, and 500-bp dsRNAs on luciferase expression in HeLa cells.

To monitor the effect of longer dsRNAs on mammalian cells, 50- and 500-bp dsRNAs that are cognate to the reporter genes were prepared. As a control for nonspecific inhibition, dsRNA from humanized GFP (hG; ref. 21) was used. In these experiments, the reporter plasmids were co-transfected with either 0.21 µg siRNA duplexes or 0.21 µg longer dsRNAs. The siRNA duplexes only reduced the expression of their cognate reporter gene, while the longer dsRNAs strongly and nonspecifically reduced reporter-gene expression. The effects are illustrated for HeLa S3 cells as a representative example (Fig. 3a and b). The absolute luciferase activities were decreased nonspecifically 10- to 20-fold by 50-bp dsRNA, and 20- to 200-fold by 500-bp dsRNA co-transfection, respectively. Similar nonspecific effects were observed for COS-7 and NIH/3T3 cells. For 293 cells, a 10- to 20-fold nonspecific reduction was observed only for 500-bp dsRNAs. Nonspecific reduction in reporter-gene expression by dsRNA > 30 bp was expected as part of the interferon response (ref. 16). Interestingly, superimposed on the nonspecific interferon response, we detected additional sequence-specific, dsRNA-mediated silencing. The sequence-specific silencing effect of long dsRNAs, however, became apparent only when the relative reporter-gene activities were normalized to the hG dsRNA controls (Fig. 3c). Sequence-specific silencing by 50- or 500-bp dsRNAs reduced the targeted reporter-gene expression by an additional 2- to 5-fold.
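
The normalization that separates sequence-specific silencing from the nonspecific interferon effect can be made explicit with a short calculation. The following Python sketch is illustrative only: the function name and all luminescence readings are hypothetical (they are not data from the paper), chosen so that the control reporter is equally depressed in both samples while the targeted reporter drops a further fourfold with the cognate dsRNA.

```python
def normalized_ratio(target, control, ref_target, ref_control):
    """Target/control luciferase ratio, expressed relative to a reference sample."""
    return (target / control) / (ref_target / ref_control)

# Hypothetical luminescence readings in arbitrary units (not data from the paper).
hg_dsrna = {"target": 100.0, "control": 50.0}  # nonspecific control dsRNA (hG)
cognate = {"target": 25.0, "control": 50.0}    # dsRNA cognate to the target reporter

# Dividing by the hG ratio cancels whatever reduction both reporters share.
specific = normalized_ratio(cognate["target"], cognate["control"],
                            hg_dsrna["target"], hg_dsrna["control"])
fold_reduction = 1.0 / specific
print(f"normalized ratio = {specific:.2f}, specific reduction = {fold_reduction:.1f}-fold")
```

Because both samples are expressed as target-to-control ratios, any reduction common to the two reporters drops out, leaving only the sequence-specific component.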
Similar effects were also detected in the other three mammalian cell lines tested (data not shown). Specific silencing effects with dsRNAs (356–1,662 bp) were previously reported in CHO-K1 cells, but the amounts of dsRNA required to detect a 2- to 4-fold specific reduction were about 20-fold higher than in our experiments (ref. 12). Also, CHO-K1 cells appear to be deficient in the interferon response. In another report, 293, NIH/3T3 and BHK-21 cells were tested for RNAi using luciferase/β-galactosidase (lacZ) reporter combinations and 829-bp specific lacZ or 717-bp nonspecific green fluorescent protein (GFP) dsRNA (ref. 10). The lack of detected RNAi in this case may be due to the less sensitive luciferase/lacZ reporter assay and the length differences of target and control dsRNA. Taken together, our results indicate that RNAi is active in mammalian cells, but that the silencing effect is difficult to detect if the interferon system is activated by dsRNA > 30 bp.

To test for silencing of endogenous genes, we chose four genes coding for cytoskeletal proteins: lamin A/C, lamin B1, nuclear mitotic apparatus protein (NuMA) and vimentin (ref. 22). The selection was based on the availability of antibodies needed to quantitate the silencing effect. Silencing was monitored 40 to 45 h after transfection to allow for turnover of the protein of the targeted genes. As shown in Fig. 4, the expression of lamin A/C was specifically reduced by the cognate siRNA duplex (Fig. 4a), but not when nonspecific siRNA directed against firefly luciferase (Fig. 4b) or buffer (Fig. 4c) was used. The expression of a non-targeted gene, NuMA, was unaffected in all treated cells (Fig. 4d–f), demonstrating the integrity of the targeted cells. The reduction in lamin A/C proteins was more than 90% complete as quantified by western blotting (Fig. 4j, k). We note that lamin A/C 'knock-out' mice are viable for a few weeks after birth (ref. 23) and that the lamin A/C knockdown in cultured cells was not expected to cause cell death.
Lamin A and C are produced by alternative splicing in the 3' region and are present in equal amounts in the lamina of mammalian cells (Fig. 4j, k). Transfection of siRNA duplexes targeting lamin B1 and NuMA reduced the expression of these proteins to low levels (data not shown), but we were not able to observe a reduction in vimentin expression. This could be due to the high abundance of vimentin in the cells (several per cent of total cell mass) or because the siRNA sequence chosen was not optimal for targeting of vimentin.

Figure 4 Silencing of nuclear envelope proteins lamin A/C in HeLa cells.

The mechanism of the 21-nucleotide siRNA-mediated interference process in mammalian cells remains to be uncovered, and silencing might occur post-transcriptionally and/or transcriptionally. In Drosophila lysate, siRNA duplexes mediate post-transcriptional gene silencing by reconstitution of siRNA-protein complexes (siRNPs), which guide mRNA recognition and targeted cleavage (refs 6, 7, 9). In plants, dsRNA-mediated post-transcriptional silencing has also been linked to DNA methylation, which may also be directed by 21-nucleotide siRNAs (ref. 24). Methylation of promoter regions can lead to transcriptional silencing (ref. 25), but methylation in coding sequences does not (ref. 26). DNA methylation and transcriptional silencing in mammals are well documented processes (ref. 27), yet their mechanisms have not been linked to that of post-transcriptional silencing. Methylation in mammals is predominantly directed towards CpG dinucleotide sequences. There is no CpG sequence in the RL or lamin A/C siRNA, although both siRNAs mediate specific silencing in mammalian cell culture, so it is unlikely that DNA methylation is essential for the silencing process.

Thus we have shown, for the first time, siRNA-mediated gene silencing in mammalian cells.
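
The CpG argument above reduces to a string check on both strands of a duplex. The Python sketch below is a minimal illustration under stated assumptions: the duplex is modelled as a 19-nt base-paired core with a 2-nt TT overhang at each 3' end (consistent with the 21-nt strands and 2-nt 3' overhangs described in the text), and the core sequence is a made-up placeholder, not one of the siRNAs used in the paper. The function names are likewise hypothetical.

```python
from typing import Tuple

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def build_duplex(core: str, overhang: str = "TT") -> Tuple[str, str]:
    """Return (sense, antisense) 21-nt strands, both written 5' to 3'."""
    if len(core) != 19:
        raise ValueError("expected a 19-nt core")
    return core + overhang, reverse_complement(core) + overhang

def has_cpg(strand: str) -> bool:
    """True if the strand contains a CpG dinucleotide (C followed by G)."""
    return "CG" in strand

# Hypothetical placeholder core, NOT a sequence from the paper.
HYPOTHETICAL_CORE = "GAUCAUGUACCUGAAGUCA"
sense, antisense = build_duplex(HYPOTHETICAL_CORE)
for name, strand in (("sense", sense), ("antisense", antisense)):
    print(name, strand, "CpG present" if has_cpg(strand) else "no CpG")
```

Because CG is its own reverse complement, a CpG within the paired core of one strand always implies a CpG in the other, so checking the sense strand is sufficient in this model.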
The use of exogenous 21-nucleotide siRNAs holds great promise for analysis of gene function in human cell culture and the development of gene-specific therapeutics. It will also be of interest in understanding the potential role of endogenous siRNAs in the regulation of mammalian gene function.

Methods

RNA preparation
21-nucleotide RNAs were chemically synthesized using Expedite RNA phosphoramidites and thymidine phosphoramidite (Proligo, Germany). Synthetic oligonucleotides were deprotected and gel-purified (ref. 9). The accession numbers given below are from GenBank. The siRNA sequences targeting GL2 (Acc. No. X65324) and GL3 luciferase (Acc. No. U47296) corresponded to the coding regions 153–173 relative to the first nucleotide of the start codon; siRNAs targeting RL (Acc. No. AF025846) corresponded to region 119–139 after the start codon. The siRNA sequence targeting lamin A/C (Acc. No. X03444) was from position 608–630 relative to the start codon; lamin B1 (Acc. No. NM_005573) siRNA was from position 672–694; NuMA (Acc. No. Z11583) siRNA from position 3,988–4,010, and vimentin (Acc. No. NM_003380) from position 346–368 relative to the start codon. Longer RNAs were transcribed with T7 RNA polymerase from polymerase chain reaction (PCR) products, followed by gel purification. The 49- and 484-bp GL2 or GL3 dsRNAs corresponded to positions 113–161 and 113–596, respectively, relative to the start of translation; the 50- and 501-bp RL dsRNAs corresponded to positions 118–167 and 118–618, respectively. PCR templates for dsRNA synthesis targeting humanized GFP (hG) were amplified from pAD3 (ref. 21), whereby the 50- and 501-bp hG dsRNAs corresponded to positions 121–170 and 121–621, respectively, relative to the start codon. For annealing of siRNAs, 20 µM single strands were incubated in annealing buffer (100 mM potassium acetate, 30 mM HEPES-KOH at pH 7.4, 2 mM magnesium acetate) for 1 min at 90 °C followed by 1 h at 37 °C.
The 37 °C incubation step was extended overnight for the 50- and 500-bp dsRNAs, and these annealing reactions were performed at 8.4 µM and 0.84 µM strand concentrations, respectively.

Cell culture
S2 cells were propagated in Schneider's Drosophila medium (Life Technologies) supplemented with 10% fetal bovine serum (FBS), 100 units ml⁻¹ penicillin, and 100 µg ml⁻¹ streptomycin at 25 °C. 293, NIH/3T3, HeLa S3, HeLa SS6 and COS-7 cells were grown at 37 °C in Dulbecco's modified Eagle's medium supplemented with 10% FBS, 100 units ml⁻¹ penicillin, and 100 µg ml⁻¹ streptomycin. Cells were regularly passaged to maintain exponential growth. Twenty-four hours before transfection, at 50–80% confluency, mammalian cells were trypsinized and diluted 1:5 with fresh medium without antibiotics (1–3 × 10⁵ cells ml⁻¹) and transferred to 24-well plates (500 µl per well). S2 cells were not trypsinized before splitting. Co-transfection of reporter plasmids and siRNAs was carried out with Lipofectamine 2000 (Life Technologies) as described by the manufacturer for adherent cell lines. Per well, 1.0 µg pGL2-Control (Promega) or pGL3-Control (Promega), 0.1 µg pRL-TK (Promega), and 0.21 µg siRNA duplex or dsRNA, formulated into liposomes, were applied; the final volume was 600 µl per well. Cells were incubated 20 h after transfection and appeared healthy thereafter. Luciferase expression was subsequently monitored with the Dual luciferase assay (Promega). Transfection efficiencies were determined by fluorescence microscopy for mammalian cell lines after co-transfection of 1.1 µg hGFP-encoding pAD3 (ref. 21) and 0.21 µg inverted GL2 siRNA, and were 70–90%. Reporter plasmids were amplified in XL-1 Blue (Stratagene) and purified using the Qiagen EndoFree Maxi Plasmid Kit.
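
The per-well mass of siRNA (0.21 µg in a final volume of 600 µl) can be related to the 25 nM concentration quoted in the main text by a simple unit conversion. The Python sketch below assumes an average mass of about 330 g/mol per nucleotide, which is a rough rule of thumb and not a value given in the paper; the true mass of a given duplex depends on its sequence and on the deoxythymidine overhangs. The function names are hypothetical.

```python
AVG_NT_MASS_G_PER_MOL = 330.0  # assumed average mass per nucleotide (rule of thumb)

def duplex_molar_mass(strand_length_nt: int = 21) -> float:
    """Approximate molar mass (g/mol) of a duplex made of two equal-length strands."""
    return 2 * strand_length_nt * AVG_NT_MASS_G_PER_MOL

def concentration_nm(mass_ug: float, volume_ul: float, strand_length_nt: int = 21) -> float:
    """Molar concentration in nM for a given duplex mass (µg) in a given volume (µl)."""
    moles = mass_ug * 1e-6 / duplex_molar_mass(strand_length_nt)
    litres = volume_ul * 1e-6
    return moles / litres * 1e9

def mass_ug_for(conc_nm: float, volume_ul: float, strand_length_nt: int = 21) -> float:
    """Duplex mass (µg) needed to reach a target concentration (nM) in a given volume (µl)."""
    moles = conc_nm * 1e-9 * volume_ul * 1e-6
    return moles * duplex_molar_mass(strand_length_nt) * 1e6

per_well = concentration_nm(mass_ug=0.21, volume_ul=600.0)
print(f"0.21 µg in 600 µl is roughly {per_well:.0f} nM duplex")
```

Under this assumption, 0.21 µg of a 21-nt duplex in 600 µl works out to roughly 25 nM, in line with the concentration stated for the co-transfection experiments.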
Transfection of siRNAs for targeting endogenous genes was carried out using Oligofectamine (Life Technologies) and 0.84 µg siRNA duplex per well, but it was recently found that as little as 0.01 µg siRNA per well is sufficient to mediate silencing. HeLa SS6 cells were transfected one to three times at approximately 15-h intervals and were assayed 40 to 45 h after the first transfection. It appears, however, that a single transfection is as efficient as multiple transfections. Transfection efficiencies as determined by immunofluorescence of targeted cells were in the range of 90%. Specific silencing of targeted genes was confirmed by at least three independent experiments.

Western blotting and immunofluorescence microscopy
Monoclonal 636 lamin A/C specific antibody (ref. 28) was used as undiluted hybridoma supernatant for immunofluorescence and at 1/100 dilution for western blotting. Affinity-purified polyclonal NuMA protein 705 antibody (ref. 29) was used at a concentration of 10 µg ml⁻¹ for immunofluorescence. Monoclonal V9 vimentin-specific antibody was used at 1/2,000 dilution. For western blotting, transfected cells grown in 24-well plates were trypsinized and harvested in SDS sample buffer. Equal amounts of total protein were separated on 12.5% polyacrylamide gels and transferred to nitrocellulose. Standard immunostaining was carried out using the ECL enhanced chemiluminescence technique (Amersham Pharmacia). For immunofluorescence, transfected cells grown on glass coverslips in 24-well plates were fixed in methanol for 6 min at -10 °C. Target gene specific and control primary antibodies were added and incubated for 80 min at 37 °C. After washing in phosphate buffered saline (PBS), Alexa 488-conjugated anti-rabbit (Molecular Probes) and Cy3-conjugated anti-mouse (Dianova) antibodies were added and incubated for 60 min at 37 °C. Finally, cells were stained for 4 min at room temperature with Hoechst 33342 (1 µM in PBS) and embedded in Mowiol 488 (Hoechst).
Pictures were taken using a Zeiss Axiophot camera with a Fluar 40×/1.30 oil objective and MetaMorph Imaging Software (Universal Imaging Corporation) with equal exposure times for the specific antibodies.

Received 20 February 2001; accepted 26 April 2001

References
1. Fire, A. RNA-triggered gene silencing. Trends Genet. 15, 358-363 (1999).
2. Sharp, P. A. RNA interference 2001. Genes Dev. 15, 485-490 (2001).
3. Hammond, S. M., Caudy, A. A. & Hannon, G. J. Post-transcriptional gene silencing by double-stranded RNA. Nature Rev. Genet. 2, 110-119 (2001).
4. Tuschl, T. RNA interference and small interfering RNAs. Chem. Biochem. 2, 239-245 (2001).
5. Hamilton, A. J. & Baulcombe, D. C. A species of small antisense RNA in posttranscriptional gene silencing in plants. Science 286, 950-952 (1999).
6. Hammond, S. M., Bernstein, E., Beach, D. & Hannon, G. J. An RNA-directed nuclease mediates post-transcriptional gene silencing in Drosophila cells. Nature 404, 293-296 (2000).
7. Zamore, P. D., Tuschl, T., Sharp, P. A. & Bartel, D. P. RNAi: Double-stranded RNA directs the ATP-dependent cleavage of mRNA at 21 to 23 nucleotide intervals. Cell 101, 25-33 (2000).
8. Bernstein, E., Caudy, A. A., Hammond, S. M. & Hannon, G. J. Role for a bidentate ribonuclease in the initiation step of RNA interference. Nature 409, 363-366 (2001).
9. Elbashir, S. M., Lendeckel, W. & Tuschl, T. RNA interference is mediated by 21 and 22 nt RNAs. Genes Dev. 15, 188-200 (2001).
10. Caplen, N. J., Fleenor, J., Fire, A. & Morgan, R. A. dsRNA-mediated gene silencing in cultured Drosophila cells: a tissue culture model for the analysis of RNA interference. Gene 252, 95-105 (2000).
11. Clemens, J. C. et al.
Use of double-stranded RNA interference in Drosophila cell lines to dissect signal transduction pathways. Proc. Natl Acad. Sci. USA 97, 6499-6503 (2000).
12. Ui-Tei, K., Zenno, S., Miyata, Y. & Saigo, K. Sensitive assay of RNA interference in Drosophila and Chinese hamster cultured cells using firefly luciferase gene as target. FEBS Lett. 479, 79-82 (2000).
13. Wianny, F. & Zernicka-Goetz, M. Specific interference with gene function by double-stranded RNA in early mouse development. Nature Cell Biol. 2, 70-75 (2000).
14. Svoboda, P., Stein, P., Hayashi, H. & Schultz, R. M. Selective reduction of dormant maternal mRNAs in mouse oocytes by RNA interference. Development 127, 4147-4156 (2000).
15. Bahramian, M. B. & Zarbl, H. Transcriptional and posttranscriptional silencing of rodent alpha(I) collagen by a homologous transcriptionally self-silenced transgene. Mol. Cell. Biol. 19, 274-283 (1999).
16. Stark, G. R., Kerr, I. M., Williams, B. R., Silverman, R. H. & Schreiber, R. D. How cells respond to interferons. Annu. Rev. Biochem. 67, 227-264 (1998).
17. Manche, L., Green, S. R., Schmedt, C. & Mathews, M. B. Interactions between double-stranded RNA regulators and the protein kinase DAI. Mol. Cell. Biol. 12, 5238-5248 (1992).
18. Minks, M. A., West, D. K., Benvin, S. & Baglioni, C. Structural requirements of double-stranded RNA for the activation of 2',5'-oligo(A) polymerase and protein kinase of interferon-treated HeLa cells. J. Biol. Chem. 254, 10180-10183 (1979).
19. Clemens, M. & Williams, B. Inhibition of cell-free protein synthesis by pppA2'p5'A2'p5'A: a novel oligonucleotide synthesized by interferon-treated L cell extracts. Cell 13, 565-572 (1978).
20. Macejak, D. G. et al.
Inhibition of hepatitis C virus (HCV)-RNA-dependent translation and replication of a chimeric HCV poliovirus using synthetic stabilized ribozymes. Hepatology 31, 769-776 (2000). | PubMed | ISI | 21. Kehlenbach, R. H., Dickmanns, A. & Gerace, L. Nucleocytoplasmic shuttling factors including Ran and CRM1 mediate nuclear export of NFAT In vitro. J. Cell. Biol. 141, 863-874 (1998). | PubMed | ISI | 22. Kreis, T. & Vale, R. Guidebook to the Cytoskeletal and Motor Proteins, Parts 2b and 3a (Oxford Univ. Press, Oxford, 1999). 23. Sullivan, T. et al. Loss of A-type lamin expression compromises nuclear envelope integrity leading to muscular dystrophy. J. Cell Biol. 147, 913-920 (1999). | PubMed | ISI | 24. Wassenegger, M. RNA-directed DNA methylation. Plant Mol. Biol. 43, 203-220 (2000). | Article | PubMed | ISI | 25. Mette, M. F., Aufsatz, W., van der Winden, J., Matzke, M. A. & Matzke, A. J. M. Transcriptional silencing and promoter methylation triggered by double-stranded RNA. EMBO J. 19, 5194-5201 (2000). | Article | PubMed | ISI | 26. Wang, M.-B., Wesley, S. V., Finnegan, E. J., Smith, N. A. & Waterhouse, P. M. Replicating 27. 28. 29. 30. satellite RNA induces sequence-specific DNA methylation and truncated transcripts in plants. RNA 7, 16-28 (2001). | Article | PubMed | ISI | Razin, A. CpG methylation, chromatin structure and gene silencing--a three-way connection. EMBO J. 17, 4905-4908 (1998). | Article | PubMed | ISI | Röber, R. A., Gieseler, R. K., Peters, J. H., Weber, K. & Osborn, M. Induction of nuclear lamins A/C in macrophages in in vitro cultures of rat bone marrow precursor cells and human blood monocytes, and in macrophages elicited in vivo by thioglycollate stimulation. Exp. Cell Res. 190, 185-194 (1990). | PubMed | ISI | Harborth, J., Wang, J., Gueth-Hallonet, C., Weber, K. & Osborn, M. Self assembly of NuMA: multiarm oligomers as structural units of a nuclear lattice. EMBO J. 18, 1689-1700 (1999). 
30. Parrish, S., Fleenor, J., Xu, S., Mello, C. & Fire, A. Functional anatomy of a dsRNA trigger: differential requirement for the two trigger strands in RNA interference. Mol. Cell 6, 1077-1087 (2000).

Figure 1 Reporter constructs and siRNA duplexes. a, The firefly (Pp-luc) and sea pansy (Rr-luc) luciferase reporter-gene regions from plasmids pGL2-Control, pGL3-Control, and pRL-TK (Promega) are illustrated; simian virus 40 (SV40) promoter (prom.); SV40 enhancer element (enh.); SV40 late polyadenylation signal (poly(A)); herpes simplex virus (HSV) thymidine kinase promoter, and two introns (lines) are indicated. The sequence of GL3 luciferase is 95% identical to GL2, but RL is completely unrelated to both. Luciferase expression from pGL2 is approximately 10-fold lower than from pGL3 in transfected mammalian cells. The region targeted by the siRNA duplexes is indicated as a black bar below the coding region of the luciferase genes. b, The sense (top) and antisense (bottom) sequences of the siRNA duplexes targeting GL2, GL3, and RL luciferase are shown. The GL2 and GL3 siRNA duplexes differ by only three single-nucleotide substitutions (boxed in grey). As a nonspecific control, a duplex with the inverted GL2 sequence, invGL2, was synthesized. The 2-nucleotide 3' overhang of 2'-deoxythymidine is indicated as TT; uGL2 is similar to GL2 siRNA but contains ribo-uridine 3' overhangs.

Figure 2 RNA interference by siRNA duplexes. Ratios of target to control luciferase were normalized to a buffer control (Bu, black bars); grey bars indicate ratios of Photinus pyralis (Pp-luc) GL2 or GL3 luciferase to Renilla reniformis (Rr-luc) RL luciferase (left axis), white bars indicate RL to GL2 or GL3 ratios (right axis). a, c, e, g and i, Experiments performed with the combination of pGL2-Control and pRL-TK reporter plasmids; b, d, f, h and j, experiments performed with the combination of pGL3-Control and pRL-TK reporter plasmids.
The cell line used for the interference experiment is indicated at the top of each plot. The ratios of Pp-luc/Rr-luc for the buffer control (Bu) varied between 0.5 and 10 for pGL2/pRL, and between 0.03 and 1 for pGL3/pRL, respectively, before normalization and between the various cell lines tested. The plotted data were averaged from three independent experiments ± s.d.

Figure 3 Effects of 21-nucleotide siRNAs, 50-bp, and 500-bp dsRNAs on luciferase expression in HeLa cells. The exact length of the long dsRNAs in base pairs is indicated below the bars. Experiments were performed with pGL2-Control and pRL-TK reporter plasmids. The data were averaged from two independent experiments ± s.d. a, Absolute Pp-luc expression, plotted in arbitrary luminescence units (a.u.). b, Rr-luc expression, plotted in arbitrary luminescence units. c, Ratios of normalized target to control luciferase. The ratios of luciferase activity for siRNA duplexes were normalized to a buffer control (Bu, black bars); the luminescence ratios for 50- or 500-bp dsRNAs were normalized to the respective ratios observed for 50- and 500-bp dsRNAs from humanized GFP (hG, black bars). We note that the overall differences in sequence between the 49- and 484-bp GL2 and GL3 dsRNAs are not sufficient to confer specificity for targeting GL2 and GL3 targets (43-nucleotide uninterrupted identity in 49-bp segment, 239-nucleotide longest uninterrupted identity in 484-bp segment)30.

Figure 4 Silencing of nuclear envelope proteins lamin A/C in HeLa cells. Triple fluorescence staining of cells transfected with lamin A/C siRNA duplex (a, d, g), with GL2 luciferase siRNA duplex (nonspecific siRNA control) (b, e, h), and with buffer only (c, f, i). a–c, Staining with lamin A/C specific antibody; d–f, staining with NuMA-specific antibody; g–i, Hoechst staining of nuclear chromatin. Bright fluorescent nuclei in a represent untransfected cells.
j, k, Western blots of transfected cells using lamin A/C- (j) or vimentin-specific (k) antibodies. The Western blot was stripped and re-probed with vimentin antibody to check for equal loading of total protein.

Technical Reports
DOI:10.1038/nbt0502-500
May 2002 Volume 20 Number 5 pp 500-505

Expression of small interfering RNAs targeted against HIV-1 rev transcripts in human cells

Nan Sook Lee1, Taikoh Dohjima1, Gerhard Bauer1, Haitang Li1, Ming-Jie Li1, Ali Ehsani1, 3, Paul Salvaterra2, 3 & John Rossi1, 3

1. Division of Molecular Biology, Graduate School of Biological Sciences, City of Hope, Duarte, CA 91010.
2. Division of Neuroscience, Graduate School of Biological Sciences, City of Hope, Duarte, CA 91010.
3. Division of Beckman Research Institute of the City of Hope, Graduate School of Biological Sciences, City of Hope, Duarte, CA 91010.
Correspondence should be addressed to J. Rossi. e-mail: jrossi@bricoh.edu

RNA interference (RNAi) is the process of sequence-specific, posttranscriptional gene silencing in animals and plants initiated by double-stranded (ds) RNA that is homologous to the silenced gene1-7. This technology has usually involved injection or transfection of dsRNA in model nonvertebrate organisms. The longer dsRNAs are processed into short (19–25 nucleotides) small interfering RNAs (siRNAs) by a ribonucleotide–protein complex that includes an RNAse III–related nuclease (Dicer)7, a helicase family member8, and possibly a kinase9 and an RNA-dependent RNA polymerase (RdRP)10, 11. In mammalian cells it is known that dsRNA 30 base pairs or longer can trigger interferon responses that are intrinsically sequence-nonspecific12, thus limiting the application of RNAi as an experimental and therapeutic agent. Duplexes of 21-nucleotide siRNAs with short 3' overhangs, however, can mediate RNAi in a sequence-specific manner in cultured mammalian cells12, 13.
One limitation in the use of siRNA as a therapeutic reagent in vertebrate cells is that short, highly defined RNAs need to be delivered to target cells, a feat thus far accomplished only by the use of synthetic, duplex RNAs delivered exogenously to cells12, 13. In this report, we describe a mammalian Pol III promoter system capable of expressing functional double-stranded siRNAs following transfection into human cells. In the case of the 293 cells cotransfected with the HIV-1 pNL4-3 proviral DNA and the siRNA-producing constructs, we were able to achieve up to 4 logs of inhibition of expression from the HIV-1 DNA.

Our system takes advantage of the human U6 snRNA promoter and its simple termination signal (a short stretch of uridines), which we have previously used for expression of short, defined ribozyme transcripts in human cells14, 15. Appropriately selected siRNA sequences can be easily inserted into a transcriptional cassette, providing an optimal system for testing endogenous expression and function of siRNAs. As an assay for the siRNAs, we fused rev to enhanced green fluorescent protein (EGFP) to provide a reporter system for monitoring siRNA function (Fig. 1A). After inserting the rev-EGFP fusion gene into the ecdysone-inducible pIND vector system (Fig. 1A), we obtained temporal control of target expression by transfecting this construct into a 293/EcR cell line engineered to respond to induction by addition of the insect hormone analogue Ponasterone A. When the pIND-rev-EGFP vector was transfected into these cells, EGFP fluorescence was observable as early as 3 h after addition of Ponasterone A and continued for >100 h. In the absence of Ponasterone A, EGFP fluorescence was not observable (data not shown).

To choose target sequences for the siRNAs, we utilized a synthetic oligodeoxyribonucleotide/RNAse H method that takes advantage of the protein-associated state of the target RNA in a cell extract16, 17.
Endogenous RNAse H in the extracts will effectively cleave RNAs that are base-paired with DNA oligomers. A highly accessible site in the rev sequence was identified using an oligonucleotide library screen17 in extracts prepared from rev-EGFP-expressing cells (data not shown). This sequence, 5'-GCCTGTGCCTCTTCAGCTACC-3', is located 213 nucleotides downstream from the AUG codon of rev, and 494 nucleotides downstream of the site for pIND transcription initiation. We designated this as site II because of its position in the transcript relative to an additional site we chose for testing (see below). We next chose a second 21-nucleotide sequence that has a total GC content similar to site II as well as a 3' cytosine. The requirement for a 3' cytosine is based on the first nucleotide of our pTZ U6+1 transcript, which initiates with a G, and therefore requires a C for base pairing (Fig. 1D, E). The target sequence 5'-GCGGAGACAGCGACGAAGAGC-3', designated as site I, in addition to being part of the rev transcript, is also located in the HIV-1 tat transcripts. The first base of this sequence is positioned 20 nucleotides downstream of the rev translational initiation codon, and 301 nucleotides downstream of the vector transcriptional initiation site. To compare the relative accessibility of these two sequences for Watson–Crick base pairing, we synthesized two 21-mer DNA oligonucleotides complementary to each of the two siRNA target sites, and used these as probes for RNAse H–mediated cleavage of the rev-EGFP message in extracts prepared from cells constitutively expressing this gene product. It was demonstrated (Fig. 1B) that site II is highly accessible to base pairing with its cognate oligo (89% reduction in RT-PCR product relative to the no-oligo control) whereas site I is not (27% reduction relative to the control).
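The sequence criteria described above (comparable GC content, and a 3' cytosine so that the complementary strand can begin with the G at which the U6+1 transcript initiates) can be checked mechanically. The following Python sketch is illustrative only: the two site sequences are those quoted in the text, whereas the helper names `gc_fraction` and `reverse_complement` are hypothetical and are not part of the reported method.

```python
# Illustrative check of the two rev target sites quoted in the text.
# Only the two sequences come from the report; the helpers are hypothetical.

SITE_I = "GCGGAGACAGCGACGAAGAGC"
SITE_II = "GCCTGTGCCTCTTCAGCTACC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


for name, site in (("site I", SITE_I), ("site II", SITE_II)):
    antisense = reverse_complement(site)
    print(
        f"{name}: length={len(site)}, GC={gc_fraction(site):.3f}, "
        f"sense starts with {site[0]}, sense ends with {site[-1]}, "
        f"antisense starts with {antisense[0]}"
    )
```

Because each target sequence both begins with G and ends with C, the sense insert and its reverse complement each start with G, which is the property the 3' cytosine requirement is meant to secure.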
The marked differences in the accessibility to antisense pairing between these two sites provided a test for the potential role that target accessibility plays in siRNA-mediated targeting. Genes encoding an siRNA targeted to site I or II were inserted behind the Pol III U6 snRNA promoter of pTZ U6+1 (Fig. 1C, D). The transcriptional cassettes were constructed in such a way that they were either in the same or in different vectors. The constructs in separate vectors provided a set of sense and antisense controls. We cotransfected our siRNA expression constructs or controls with an ecdysone-inducible rev-EGFP expression plasmid into 293/EcR cells, which stably express the ecdysone receptor. We initiated rev-EGFP expression 16–20 h after transfection by addition of Ponasterone A to the culture medium. After an additional 48 h incubation, we subjected the cells to fluorescence microscopic analyses and fluorescence-activated cell sorting (FACS). The combined sense and antisense RNA oligomers targeted to site II in the rev sequence reduced the EGFP signal by 90% relative to the controls, whereas the combined sense and antisense RNA oligomers targeted against site I gave a 25% reduction in fluorescence (Fig. 2A, B). The inhibition mediated by the site II siRNAs was similar whether both sense and antisense RNA oligomers were expressed from the same plasmid or from different plasmid backbones. The control constructs (sense, antisense, an irrelevant siRNA construct, or a ribozyme targeted to site II), each expressed from the U6 promoter, gave no significant reduction of EGFP expression relative to the vector backbone control (Fig. 2A, B). A combination of the site I and II siRNA genes gave an intermediate level of inhibition (Fig. 1B), which is due to a 50% reduction of the siRNA II-encoding DNA in the cotransfection assay.
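The percentage reductions quoted here and the "logs of inhibition" reported for the HIV-1 cotransfection are two ways of expressing the same control-to-treated ratio. As a minimal sketch, assuming a signal measured in a control and in a treated sample, the conversions can be written as follows; the function names and the numerical inputs are hypothetical illustrations, not data from the study.

```python
import math


def percent_reduction(control: float, treated: float) -> float:
    """Percentage by which the treated signal falls below the control."""
    return 100.0 * (1.0 - treated / control)


def log10_inhibition(control: float, treated: float) -> float:
    """Inhibition expressed as log10 of the control/treated ratio."""
    return math.log10(control / treated)


def percent_from_logs(logs: float) -> float:
    """Percentage reduction equivalent to a given number of logs."""
    return 100.0 * (1.0 - 10.0 ** (-logs))


# Hypothetical signals chosen only to illustrate the arithmetic.
print(percent_reduction(100.0, 10.0))   # a tenfold drop, as a percentage
print(log10_inhibition(100.0, 10.0))    # the same drop, in logs
print(percent_from_logs(4.0))           # percentage equivalent of 4 logs
```

On this scale a 90% reduction corresponds to one log, so 4 logs of inhibition corresponds to a residual signal of one ten-thousandth of the control.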
We obtained similar results, after correction for transfection efficiencies, when the same sets of constructs were transfected into a cell line stably expressing the rev-EGFP construct from a cytomegalovirus (CMV) promoter (data not shown).

To test the potential anti-HIV-1 inhibitory activity of the various constructs described earlier, we cotransfected each of the siRNA expression vectors (site I and II) as well as sense, antisense, ribozyme, and irrelevant siRNA control constructs with HIV-1 pNL4-3 proviral DNA into 293 cells. At the intervals indicated in Figure 2C, we withdrew supernatant samples from the cell cultures and measured HIV-1 p24 viral antigen levels. Both the site I and site II siRNAs strongly inhibited HIV-1 replication in this assay, but no inhibition was observed using the irrelevant siRNA construct or any of the other controls. The combination of both site I and II siRNA constructs was the most potent, providing 4 logs of inhibition relative to the control constructs. The relative differences in the siRNA targeted against site I in the context of the rev-EGFP fusion mRNA versus the entire HIV-1 genome (Fig. 2B, C) are most likely a consequence of site I being present in both the tat and rev transcripts (Fig. 1A). The differential susceptibility of the site I target sequence in the context of tat relative to rev, or possible synergistic effects mediated by downregulation of both the tat and rev transcripts by this siRNA, could be important determinants of this differential behavior.

Northern gel analyses were carried out to examine the expression patterns and sizes of the siRNAs transcribed from the U6 RNA Pol III promoter system in cells that harbored the siRNA expression vectors and the target constructs. These analyses demonstrated strong expression of sense and antisense RNAs as monitored by hybridization to the appropriate probes (Fig. 3). All control RNAs were detected at the expected sizes (Fig. 3A, B).
RNAs prepared from cells simultaneously expressing sense and antisense constructs generated hybridization signals of the size expected for the individual short RNAs (~23 nucleotides). In addition to the monomer-sized RNAs, a strong hybridization product approximately twice the size of the short RNA oligomers (~46 nucleotides) was obtained under nondenaturing gel electrophoresis conditions (Fig. 3C). This product was only observed in samples derived from cells expressing both sense and antisense constructs (Fig. 3C). The 46-nucleotide products were not visible under stringent denaturing gel conditions (Fig. 3A–D). The RNA samples (either unheated or heat-denatured at 95°C and quick-chilled) were treated with a mixture of the single-strand-specific RNAses A and T1 before denaturing gel electrophoresis and blotting. The nonheated sample treated with the RNAse mix generated a hybridizing product, whereas the heat-treated samples did not. The hybridizing product was slightly shorter than the RNAs not treated with RNAse, most likely a result of RNAse trimming of the non-base-paired ends in the RNA duplexes. In addition to the 46-nucleotide product, there is a band at 65 nucleotides seen prominently in Figure 3D and weakly in Figure 3A and B. A probe that would detect a site II primer-extended product did not hybridize to this transcript (data not shown). Thus, this product is not likely to be the result of an antisense primed extension by an RNA-dependent RNA polymerase (RdRP)10, 18, but is most probably a readthrough transcript terminating at the next string of uridines.

Synthetic and endogenous siRNAs direct targeted mRNA degradation6, 9, 10, 12, 13. To determine whether our siRNA complexes directed degradation of the rev-EGFP mRNA, we carried out northern hybridization analyses probing for the rev-EGFP transcripts (Fig. 3E). We also probed for the glyceraldehyde phosphodehydrogenase (GAPDH) mRNA and U6 snRNA, as internal controls.
These data demonstrated selective destruction of the fusion transcript only in cells expressing the combination of site II sense and antisense siRNAs. Northern gel analyses of the HIV-1 RNAs on day 4 after the cotransfection experiments with the siRNA constructs revealed no hybridization to HIV-1-specific transcripts in the samples prepared from the siRNA site I- and II-expressing cells (data not shown), whereas there was strong expression of the siRNAs and both sense and antisense transcripts (Fig. 3B).

We are presently inserting our siRNA constructs into a lentiviral vector backbone for functional testing in acute HIV-1 infections. By combining siRNA constructs targeted to different sites in the HIV-1 genome and/or the cellular CCR5 coreceptor, it may be possible to circumvent genetic resistance of the virus, thereby creating a potent gene therapy approach for the treatment of HIV-1 infection.

Experimental protocol

Accessibility assay. To evaluate the accessibility of target sequences for antisense base pairing, we employed endogenous RNAse H activity present in the cell extracts prepared from 293 cells stably expressing rev-EGFP. Two DNA oligonucleotides, complementary to each of the two target sites, were synthesized and used as probes for accessibility assays in cell extracts as described16, 17, with some minor modifications.

Constructs. A SacII (filled in)–EcoRI fragment containing the rev-EGFP fusion gene of CMV-rev-EGFP was inserted into HindIII (filled in)–EcoRI sites of the pIND vector (Invitrogen, Carlsbad, CA), yielding the pIND-rev-EGFP construct of Figure 1A. The siRNA expression vectors were prepared using the pTZ U6+1 vector (Fig. 1C). One cassette harbors the 21-nucleotide sense sequence and the other a 21-nucleotide antisense sequence. These sequences were designed to target either site I or site II (Fig. 1A). A string of six thymidines was inserted at the 3' terminus of each of the 21-mers followed by an XbaI restriction site.
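The cassette layout just described (a 21-mer, six thymidines, then an XbaI site) can be sketched in a few lines. This is a simplified model under stated assumptions, not the authors' cloning procedure: it ignores the blunt-end 5' cloning steps, and it assumes that transcription ends within the thymidine stretch leaving two 3' uridines, which is an inference from the ~23-nucleotide transcript size and the 3'-UU overhangs reported elsewhere in the text. The function names are hypothetical; the XbaI recognition sequence TCTAGA and the site II sequence are the only fixed inputs.

```python
# Simplified model of the U6 cassette inserts; helper names are hypothetical.

SITE_II = "GCCTGTGCCTCTTCAGCTACC"   # site II target sequence from the text
TERMINATOR = "T" * 6                # string of six thymidines
XBAI_SITE = "TCTAGA"                # XbaI recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def cassette(insert_21mer: str) -> str:
    """DNA downstream of the promoter: 21-mer, terminator, XbaI site."""
    return insert_21mer + TERMINATOR + XBAI_SITE


def predicted_transcript(insert_21mer: str, terminal_u: int = 2) -> str:
    """RNA assumed to result if termination leaves `terminal_u` uridines."""
    return insert_21mer.replace("T", "U") + "U" * terminal_u


sense = predicted_transcript(SITE_II)
antisense = predicted_transcript(reverse_complement(SITE_II))

print("sense cassette:      ", cassette(SITE_II))
print("antisense cassette:  ", cassette(reverse_complement(SITE_II)))
print("sense transcript:    ", sense, len(sense))
print("antisense transcript:", antisense, len(antisense))
```

Under these assumptions the two transcripts would pair over 21 base pairs and each would carry a 2-nucleotide 3' overhang of uridines.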
The first G of the inserts was provided by the SalI site of the vector that was rendered blunt-ended by mung bean nuclease. The inserts were digested by BsrBI for sense and AluI for antisense (site I), StuI for sense and SnaBI for antisense (site II) for blunt-end cloning immediately downstream of the U6 promoter sequence. The 3' ends of the inserts were digested with XbaI for insertion in the SalI (blunted)–XbaI digested pTZ U6+1 vector to create the desired transcription units (Fig. 1C, D). As a control to verify an siRNA mechanism, irrelevant S/AS sequences lacking complementarity to HIV-1 (S/AS (IR)) were subcloned in pTZ U6+1 as described earlier. To create plasmids in which both sense and antisense sequences were in the same vector, the pTZ U6+1 sense sequence–harboring vectors were digested with BamHI (which was filled in using T4 DNA polymerase) and HindIII. The digested fragments containing the sense sequences were subcloned into the SphI (filled in by T4 DNA polymerase)–HindIII sites of the antisense AS(I) or AS(II) constructs, generating both sense and antisense transcription units (S/AS(I) or S/AS(II)) (Fig. 1D). The DNA sequences for each of the aforementioned constructs were confirmed before use.

Cell culture. 293/EcR cells were grown at 37°C in Eagle's minimal essential medium (EMEM) supplemented with 10% FBS, 2 mM L-glutamine, and 0.4 mg/ml of Zeocin (Invitrogen). Cells were replated, 24 h before transfection, on 24- or 6-well plates at 50–70% confluency with fresh medium without antibiotic. For the 293/EcR cells, cotransfection of target plasmids (pIND-rev-EGFP) and siDNAs was carried out at a 1:1 ratio with Lipofectamine Plus reagent (Life Technologies, GibcoBRL, Gaithersburg, MD) as described by the manufacturer. For each 6-well culture, we applied 0.5 µg pIND-rev-EGFP and 0.5 µg siDNAs and 0.1 µg pCMV-lacZ (for transfection efficiency), formulated into Lipofectamine Plus.
Cells were incubated overnight, and on the following day 5 µM Ponasterone A (Invitrogen) was added to induce expression of pIND-rev-EGFP. Two days post induction, the transfected cells were harvested to measure EGFP fluorescence by FACS, using a modular flow cytometer (Cytomation, Fort Collins, CO). Transfection efficiencies were normalized using a fluorescent β-galactosidase assay (Diagnostic Chemicals Ltd., Oxford, CT)19. Fluorescent imaging was also carried out to monitor EGFP expression. Images were collected using an Olympus BX50 microscope and a DEI-750 video camera (Optronics, Goleta, CA) at 40× magnification with an exposure time of 1/8 s. Specific silencing of target genes was confirmed in at least three independent experiments.

Northern blotting. RNA samples were prepared from 293/EcR cells transiently cotransfected with pIND-rev-EGFP and siRNA genes or control constructs, and subjected to Ponasterone A induction as described earlier. RNAs were also prepared from 293 cells cotransfected with HIV-1 pNL4-3 proviral DNA and siRNA genes or control constructs. Total RNA isolation was done using the RNA STAT-60 (TEL-TEST "B", Friendswood, TX) according to the manufacturer's instructions. For denatured RNA samples, total RNAs were resolved in a 10 or 15% (wt/vol) polyacrylamide–8 M urea gel. To detect double-stranded siRNAs, the samples were electrophoresed in a 15% (wt/vol) polyacrylamide gel lacking urea or at low voltage (<5 V/cm) in a 10% (wt/vol) polyacrylamide–urea gel. A 1% (wt/vol) agarose–formaldehyde gel was used for analyzing the target mRNAs. In all cases, RNAs were transferred by electroblotting onto Hybond-N+ membrane (Amersham Pharmacia Biotech, Piscataway, NJ). The hybridization and wash steps were carried out at 37°C. To detect the sense or antisense siRNAs, we used radiolabeled 21-mer DNA probes. We also probed for human U6 snRNA and GAPDH mRNA as internal standards.
For characterization of the 46-nucleotide RNAs, total RNAs from S/AS (I+II) were treated with a mixture of RNAse A and T1 for 30 min at 37°C, either with or without preheating at 90°C for 5 min, followed by electrophoresis in a 15% denaturing gel. For detection of rev-EGFP mRNA, we used a 25-mer deoxyribonucleotide probe that was complementary to the EGFP mRNA of the rev-EGFP fusion protein. A 29-mer deoxyribooligonucleotide probe was used for detection of the GAPDH transcript.

HIV-1 antiviral assay. For determination of anti-HIV-1 activity of the siRNAs, transient assays were done by cotransfection of siDNAs and infectious HIV-1 proviral DNA, pNL4-3, into 293 cells as described15. Before transfection, the cells were grown for 24 h in six-well plates in 2 ml EMEM supplemented with 10% (vol/vol) FBS and 2 mM L-glutamine, and transfected using Lipofectamine Plus reagent (Life Technologies, GibcoBRL) as described by the manufacturer. The DNA mixtures consisting of 0.5 µg siDNAs or controls, and 0.5 µg pNL4-3 were formulated into cationic lipids and applied to the cells. After one, two, three, and four days, supernatants were collected and analyzed for HIV-1 p24 antigen (Beckman Coulter, Hialeah, FL). The p24 values were calculated with the aid of the Dynatech MR5000 ELISA plate reader (Dynatech Labs Inc., Chantilly, VA). Cell viability was also assessed using a Trypan Blue dye exclusion count at four days after transfection.

Received 21 December 2001; Accepted 15 March 2002.

REFERENCES
1. Hammond, S.M., Bernstein, E., Beach, D. & Hannon, G.J. An RNA-directed nuclease mediates post-transcriptional gene silencing in Drosophila cells. Nature 404, 293-296 (2000).
2. Fire, A. RNA-triggered gene silencing. Trends Genet. 15, 358-363 (1999).
3. Svoboda, P., Stein, P., Hayashi, H. & Schultz, R.M. Selective reduction of dormant maternal mRNAs in mouse oocytes by RNA interference. Development 127, 4147-4156 (2000).
4. Sharp, P.A. RNA interference--2001. Genes Dev. 15, 485-490 (2001).
5. Clemens, J.C. et al. Use of double-stranded RNA interference in Drosophila cell lines to dissect signal transduction pathways. Proc. Natl. Acad. Sci. USA 97, 6499-6503 (2000).
6. Elbashir, S.M., Lendeckel, W. & Tuschl, T. RNA interference is mediated by 21- and 22-nucleotide RNAs. Genes Dev. 15, 188-200 (2001).
7. Bernstein, E., Caudy, A.A., Hammond, S.M. & Hannon, G.J. Role for a bidentate ribonuclease in the initiation step of RNA interference. Nature 409, 363-366 (2001).
8. Dalmay, T., Horsefield, R., Braunstein, T.H. & Baulcombe, D.C. SDE3 encodes an RNA helicase required for post-transcriptional gene silencing in Arabidopsis. EMBO J. 20, 2069-2078 (2001).
9. Nykanen, A., Haley, B. & Zamore, P.D. ATP requirements and small interfering RNA structure in the RNA interference pathway. Cell 107, 309-321 (2001).
10. Lipardi, C., Wei, Q. & Paterson, B.M. RNAi as random degradative PCR. siRNA primers convert mRNA into dsRNAs that are degraded to generate new siRNAs. Cell 107, 297-307 (2001).
11. Smardon, A. et al. EGO-1 is related to RNA-directed RNA polymerase and functions in germ-line development and RNA interference in C. elegans. Curr. Biol. 10, 169-178 (2000).
12. Elbashir, S.M. et al. Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells. Nature 411, 494-498 (2001).
13. Caplen, N.J., Parrish, S., Imani, F., Fire, A. & Morgan, R.A. Specific inhibition of gene expression by small double-stranded RNAs in invertebrate and vertebrate systems. Proc. Natl. Acad. Sci. USA 98, 9742-9747 (2001).
14. Bertrand, E. et al.
The expression cassette determines the functional activity of ribozymes in mammalian cells by controlling their intracellular localization. RNA 3, 75-88 (1997).
15. Good, P.D. et al. Expression of small, therapeutic RNAs in human cell nuclei. Gene Ther. 4, 45-54 (1997).
16. Scherr, M. & Rossi, J.J. Rapid determination and quantitation of the accessibility to native RNAs by antisense oligodeoxynucleotides in murine cell extracts. Nucleic Acids Res. 26, 5079-5085 (1998).
17. Scherr, M. et al. Detection of antisense and ribozyme accessible sites on native mRNAs: application to NCOA3 mRNA. Mol. Ther. 4, 454-460 (2001).
18. Sijen, T. et al. On the role of RNA amplification in dsRNA-triggered gene silencing. Cell 107, 465-476 (2001).
19. Lee, N.S. et al. Functional colocalization of ribozymes and target mRNAs in Drosophila oocytes. FASEB J. 15, 2390-2400 (2001).

Figure 1: Target rev-EGFP and U6 promoter-driven siRNA constructs. (A) The relative locations of the two siRNA target sites in the rev-EGFP target are indicated, as are the locations of these two target sites in HIV-1 transcripts from pNL4-3. (B) Accessibility assays for sites I and II in cell extracts prepared from rev-EGFP-expressing cells. The ethidium bromide–stained bands represent RT-PCR products from rev-EGFP (top, 673 bp) or β-actin (bottom, 348 bp) mRNAs. Lanes (from left to right): control, no added oligo, minus (-) or plus (+) RT; oligonucleotide probing of site I (-) or (+) RT; oligonucleotide probing of site II (-) or (+) RT. The reduction in target mRNA is elicited by endogenous RNAse H activity as described16, 17. (C) The schematic presentation of the upstream promoter and transcript portion of the U6 expression cassette is shown with the sequences and depicted structure of the expected primary transcript. (D) The sequences of the 21-base sense and antisense inserts with a string of six thymidines are shown.
The first G came from the mung bean-treated, cleaved SalI site of the pTZ U6+1 vector. (E) The putative siRNAs derived from coexpression of the sense and antisense 21-mers (S/AS (I) or (II)), with 3'-UU overhangs are depicted.

Figure 2: (A) Fluorescence imaging of the effect of siRNA on EGFP expression. 293/EcR cells were cotransfected with pIND-rev-EGFP, and various siRNA constructs as indicated. Cells were examined microscopically for rev-EGFP expression following Ponasterone A addition as described in the Experimental Protocol. Panel 5 (left side) shows fluorescent cells after transfection with an irrelevant siRNA-expressing control construct (S/AS(IR)). Other controls (S(II), AS(II), vector) gave similarly negative results (Panels 1, 2, and 4 on the left). Panel 3 (left) shows 90% reduction in fluorescent cells when 293/EcR cells were transfected with S/AS(II). The right-hand panels 1–5 are DAPI-stained images showing that approximately the same numbers of cells are present in each field. Specific silencing of the rev-EGFP target was confirmed in at least three independent experiments. (B) Inhibition of EGFP expression by siRNAs. 293/EcR cells were cotransfected with pIND-rev-EGFP and siRNA constructs. Cells were analyzed for EGFP expression by FACS, and the level of fluorescence relative to cells transfected with pIND-rev-EGFP alone was quantitated. Data are the average ± s.d. of three separate experiments. Only the siRNA construct containing both sense and antisense sequences directed at accessible site II (S/AS(II) or S+AS(II)) showed 90% reduction relative to the controls or vector only. The various combinations of U6-driven siRNA constructs cotransfected with pIND-rev-EGFP are indicated. S/AS indicates the vector with both sense and antisense siRNA sequences, while S+AS indicates the siRNA sequences in separate vectors. S/AS(I+II) depicts two separate vectors containing both sense and antisense siRNA sequences for site I or II.
Rbz indicates the hammerhead ribozyme targeted to cleave site II. Specific silencing of target genes was confirmed in at least three independent experiments. (C) siRNA-mediated inhibition of HIV-1 expression. pNL4-3 proviral DNA was cotransfected with the various U6+1-driven siRNA expression constructs or controls at a 1:1 (wt/wt) ratio of the respective DNAs. At 24 h post transfection, and at the indicated times, supernatant aliquots were withdrawn for HIV-1 p24 antigen assays. The various constructs used are indicated on the figure.

Figure 3: Northern gel analyses. RNA samples were prepared from 293/EcR cells transiently cotransfected with pIND-rev-EGFP and various siRNA constructs, or from 293 cells cotransfected with HIV-1 pNL4-3 and the various siRNA constructs, as indicated. (A) Samples were then subjected to Ponasterone A induction (Experimental Protocol). (B) RNA samples were also prepared from day 4 of the HIV-1 cotransfections of the various constructs with HIV-1 pNL4-3 DNA into 293 cells. In (A) and (B), the combined hybridization results using probes for sites I, II, and the irrelevant siRNAs are presented. The total RNAs were resolved using either denaturing (A, B, and D) or nondenaturing gel electrophoresis conditions (C). (D) The RNAs from a combined transfection with pIND-rev-EGFP using S/AS(I+II) were treated with a mixture of RNAse A and T1. Samples were either heated (+) or not heated (–) at 90°C before RNAse treatment. The hybridizing product in the lane treated with RNAses in the absence of preheating is 21 nt in length. Hybridizations were done using 32P-labeled DNA oligonucleotide probes complementary to the sense or antisense transcripts as indicated in each panel. (E) The northern gel analyses for monitoring reduction in rev-EGFP transcripts utilized a 1% agarose–formaldehyde gel. Human GAPDH and U6 RNAs were probed as internal controls for each experiment.

16 March 2000 Nature 404, 293-296 (2000) © Macmillan Publishers Ltd.
An RNA-directed nuclease mediates post-transcriptional gene silencing in Drosophila cells

SCOTT M. HAMMOND*, EMILY BERNSTEIN†‡, DAVID BEACH*§ & GREGORY J. HANNON‡

* Genetica, Inc., P.O. Box 99, Cold Spring Harbor, New York 11724, USA
† Graduate Program in Genetics, State University of New York at Stony Brook, Stony Brook, New York 11794, USA
§ Wolfson Institute for Biological Sciences, University College London, Gower Street, London WC1E 6BT, UK
‡ Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA
Correspondence and requests for materials should be addressed to G.J.H. (e-mail: hannon@cshl.org).

In a diverse group of organisms that includes Caenorhabditis elegans, Drosophila, planaria, hydra, trypanosomes, fungi and plants, the introduction of double-stranded RNAs inhibits gene expression in a sequence-specific manner1-7. These responses, called RNA interference or post-transcriptional gene silencing, may provide anti-viral defence, modulate transposition or regulate gene expression1, 6, 8-10. We have taken a biochemical approach towards elucidating the mechanisms underlying this genetic phenomenon. Here we show that 'loss-of-function' phenotypes can be created in cultured Drosophila cells by transfection with specific double-stranded RNAs. This coincides with a marked reduction in the level of cognate cellular messenger RNAs. Extracts of transfected cells contain a nuclease activity that specifically degrades exogenous transcripts homologous to transfected double-stranded RNA. This enzyme contains an essential RNA component. After partial purification, the sequence-specific nuclease co-fractionates with a discrete, 25-nucleotide RNA species which may confer specificity to the enzyme through homology to the substrate mRNAs.

Although double-stranded RNAs (dsRNAs) can provoke gene silencing in numerous biological contexts including Drosophila11, 12, the mechanisms underlying this phenomenon have remained mostly unknown.
We therefore wanted to establish a biochemically tractable model in which such mechanisms could be investigated. Transient transfection of cultured Drosophila S2 cells with a lacZ expression vector resulted in β-galactosidase activity that was easily detectable by an in situ assay (Fig. 1a). This activity was greatly reduced by co-transfection with a dsRNA corresponding to the first 300 nucleotides of the lacZ sequence, whereas co-transfection with a control dsRNA (CD8) (Fig. 1a) or with single-stranded RNAs of either sense or antisense orientation (data not shown) had little or no effect. This indicated that dsRNAs could interfere, in a sequence-specific fashion, with gene expression in cultured cells.

Figure 1 RNAi in S2 cells.

To determine whether RNA interference (RNAi) could be used to target endogenous genes, we transfected S2 cells with a dsRNA corresponding to the first 540 nucleotides of Drosophila cyclin E, a gene that is essential for progression into S phase of the cell cycle. During log-phase growth, untreated S2 cells reside primarily in G2/M (Fig. 1b). Transfection with lacZ dsRNA had no effect on cell-cycle distribution, but transfection with the cyclin E dsRNA caused a G1-phase cell-cycle arrest (Fig. 1b). The ability of cyclin E dsRNA to provoke this response was length-dependent. Double-stranded RNAs of 540 and 400 nucleotides were quite effective, whereas dsRNAs of 200 and 300 nucleotides were less potent. Double-stranded cyclin E RNAs of 50 or 100 nucleotides were inert in our assay, and transfection with a single-stranded, antisense cyclin E RNA had virtually no effect (see Supplementary Information).

One hallmark of RNAi is a reduction in the level of mRNAs that are homologous to the dsRNA. Cells transfected with the cyclin E dsRNA (bulk population) showed diminished endogenous cyclin E mRNA as compared with control cells (Fig. 1c).
Similarly, transfection of cells with dsRNAs homologous to fizzy, a component of the anaphase-promoting complex (APC), or cyclin A, a cyclin that acts in S, G2 and M, also caused reduction of their cognate mRNAs (Fig. 1c). The modest reduction in fizzy mRNA levels in cells transfected with cyclin A dsRNA probably resulted from arrest at a point in the division cycle at which fizzy transcription is low (refs 14, 15). These results indicate that RNAi may be a generally applicable method for probing gene function in cultured Drosophila cells.

The decrease in mRNA levels observed upon transfection of specific dsRNAs into Drosophila cells could be explained by effects at transcriptional or post-transcriptional levels. Data from other systems have indicated that some elements of the dsRNA response may affect mRNA directly (reviewed in refs 1 and 6). We therefore sought to develop a cell-free assay that reflected, at least in part, RNAi. S2 cells were transfected with dsRNAs corresponding to either cyclin E or lacZ. Cellular extracts were incubated with synthetic mRNAs of lacZ or cyclin E. Extracts prepared from cells transfected with the 540-nucleotide cyclin E dsRNA efficiently degraded the cyclin E transcript; however, the lacZ transcript was stable in these lysates (Fig. 2a). Conversely, lysates from cells transfected with the lacZ dsRNA degraded the lacZ transcript but left the cyclin E mRNA intact. These results indicate that RNAi ablates target mRNAs through the generation of a sequence-specific nuclease activity. We have termed this enzyme RISC (RNA-induced silencing complex). Although we occasionally observed possible intermediates in the degradation process (see Fig. 2), the absence of stable cleavage end-products indicates an exonuclease (perhaps coupled to an endonuclease). However, it is possible that the RNAi nuclease makes an initial endonucleolytic cut and that non-specific exonucleases in the extract complete the degradation process (ref. 16).
In addition, our ability to create an extract that targets lacZ in vitro indicates that the presence of an endogenous gene is not required for the RNAi response.

Figure 2 RNAi in vitro.

To examine the substrate requirements for the dsRNA-induced, sequence-specific nuclease activity, we incubated a variety of cyclin-E-derived transcripts with an extract derived from cells that had been transfected with the 540-nucleotide cyclin E dsRNA (Fig. 2b, c). Just as a length requirement was observed for the transfected dsRNA, the RNAi nuclease activity showed a dependence on the size of the RNA substrate. Both a 600-nucleotide transcript that extends slightly beyond the targeted region (Fig. 2b) and a 1-kilobase (kb) transcript that contains the entire coding sequence (data not shown) were completely destroyed by the extract. Surprisingly, shorter substrates were not degraded as efficiently. Reduced activity was observed against either a 300- or a 220-nucleotide transcript, and a 100-nucleotide transcript was resistant to nuclease in our assay. This was not due solely to position effects because 100-nucleotide transcripts derived from other portions of the transfected dsRNA behaved similarly (data not shown). As expected, the nuclease activity (or activities) present in the extract could also recognize the antisense strand of the cyclin E mRNA. Again, substrates that contained a substantial portion of the targeted region were degraded efficiently whereas those that contained a shorter stretch of homologous sequence (~130 nucleotides) were recognized inefficiently (Fig. 2c, as600). For both the sense and antisense strands, transcripts that had no homology with the transfected dsRNA (Fig. 2b, Eout; Fig. 2c, as300) were not degraded.
Although we cannot exclude the possibility that nuclease specificity could have migrated beyond the targeted region, the resistance of transcripts that do not contain homology to the dsRNA is consistent with data from C. elegans. Double-stranded RNAs homologous to an upstream cistron have little or no effect on a linked downstream cistron, despite the fact that unprocessed, polycistronic mRNAs can be readily detected (refs 17, 18). Furthermore, the nuclease was inactive against a dsRNA identical to that used to provoke the RNAi response in vivo (Fig. 2b). In the in vitro system, neither a 5' cap nor a poly(A) tail was required, as such transcripts were degraded as efficiently as uncapped and non-polyadenylated RNAs.

Gene silencing provoked by dsRNA is sequence specific. A plausible mechanism for determining specificity would be incorporation of nucleic-acid guide sequences into the complexes that accomplish silencing (ref. 19). In accord with this idea, pre-treatment of extracts with a Ca2+-dependent nuclease (micrococcal nuclease) abolished the ability of these extracts to degrade cognate mRNAs (Fig. 3). Activity could not be rescued by addition of non-specific RNAs such as yeast transfer RNA. Although micrococcal nuclease can degrade both DNA and RNA, treatment of the extract with DNase I had no effect (Fig. 3). Sequence-specific nuclease activity, however, did require protein (data not shown). Together, our results support the possibility that the RNAi nuclease is a ribonucleoprotein, requiring both RNA and protein components. Biochemical fractionation (see below) is consistent with these components being associated in extract rather than being assembled on the target mRNA after its addition.

Figure 3 Substrate requirements of the RISC.

In plants, the phenomenon of co-suppression has been associated with the existence of small (~25-nucleotide) RNAs that correspond to the gene that is being silenced (ref. 19).
To address the possibility that a similar RNA might exist in Drosophila and guide the sequence-specific nuclease in the choice of substrate, we partially purified our activity through several fractionation steps. Crude extracts contained both sequence-specific nuclease activity and abundant, heterogeneous RNAs homologous to the transfected dsRNA (Figs 2 and 4a). The RNAi nuclease fractionated with ribosomes in a high-speed centrifugation step. Activity could be extracted by treatment with high salt, and ribosomes could be removed by an additional centrifugation step. Chromatography of soluble nuclease over an anion-exchange column resulted in a discrete peak of activity (Fig. 4b, cyclin E). This retained specificity as it was inactive against a heterologous mRNA (Fig. 4b, lacZ). Active fractions also contained an RNA species of 25 nucleotides that is homologous to the cyclin E target (Fig. 4b, northern). The band observed on northern blots may represent a family of discrete RNAs because it could be detected with probes specific for both the sense and antisense cyclin E sequences and with probes derived from distinct segments of the dsRNA (data not shown). At present, we cannot determine whether the 25-nucleotide RNA is present in the nuclease complex in a double-stranded or single-stranded form.

Figure 4 The RISC contains a potential guide RNA.

RNA interference allows an adaptive defence against both exogenous and endogenous dsRNAs, providing something akin to a dsRNA immune response. Our data, and those of others (ref. 19), are consistent with a model in which dsRNAs present in a cell are converted, either through processing or replication, into small specificity determinants of discrete size in a manner analogous to antigen processing.
Our results suggest that the post-transcriptional component of dsRNA-dependent gene silencing is accomplished by a sequence-specific nuclease that incorporates these small RNAs as guides that target specific messages based upon sequence recognition. The identical size of putative specificity determinants in plants (ref. 19) and animals predicts a conservation of both the mechanisms and the components of dsRNA-induced, post-transcriptional gene silencing in diverse organisms. In plants, dsRNAs provoke not only post-transcriptional gene silencing but also chromatin remodelling and transcriptional repression (refs 20, 21). It is now critical to determine whether conservation of gene-silencing mechanisms also exists at the transcriptional level and whether chromatin remodelling can be directed in a sequence-specific fashion by these same dsRNA-derived guide sequences.

Note added in proof: Recently, Tuschl et al. have reported the development of cell-free extracts from Drosophila embryos that can carry out RNAi (T. Tuschl, P. D. Zamore, D. P. Bartel and P. A. Sharp, Genes Dev. 13, 3191–3197; 1999). Their results also indicate that the RNAi is accomplished at least in part by nuclease degradation of targeted mRNAs.

Methods

Cell culture and RNA methods
S2 (ref. 22) cells were cultured at 27 °C in 90% Schneider's insect media (Sigma), 10% heat-inactivated fetal bovine serum (FBS). Cells were transfected with dsRNA and plasmid DNA by calcium phosphate co-precipitation (ref. 23). Identical results were observed when cells were transfected using lipid reagents (for example, Superfect, Qiagen). For FACS analysis, cells were additionally transfected with a vector that directs expression of a green fluorescent protein (GFP)–US9 fusion protein (ref. 13). These cells were fixed in 90% ice-cold ethanol and stained with propidium iodide at 25 µg ml-1. FACS was performed on an Elite flow cytometer (Coulter).
For northern blotting, equal loading was ensured by over-probing blots with a control complementary DNA (RP49). For the production of dsRNA, transcription templates were generated by polymerase chain reaction such that they contained T7 promoter sequences on each end of the template. RNA was prepared using the RiboMax kit (Promega). Confirmation that RNAs were double stranded came from their complete sensitivity to RNase III (a gift from A. Nicholson). Target mRNA transcripts were synthesized using the Riboprobe kit (Promega) and were gel purified before use.

Extract preparation
Log-phase S2 cells were plated on 15-cm tissue culture dishes and transfected with 30 µg dsRNA and 30 µg carrier plasmid DNA. Seventy-two hours after transfection, cells were harvested in PBS containing 5 mM EGTA, washed twice in PBS and once in hypotonic buffer (10 mM HEPES pH 7.3, 6 mM β-mercaptoethanol). Cells were suspended in 0.7 packed-cell volumes of hypotonic buffer containing Complete protease inhibitors (Boehringer) and 0.5 units ml-1 of RNasin (Promega). Cells were disrupted in a dounce homogenizer with a type B pestle, and lysates were centrifuged at 30,000g for 20 min. Supernatants were used in an in vitro assay containing 20 mM HEPES pH 7.3, 110 mM KOAc, 1 mM Mg(OAc)2, 3 mM EGTA, 2 mM CaCl2, 1 mM DTT. Typically, 5 µl extract was used in a 10 µl assay that also contained 10,000 c.p.m. synthetic mRNA substrate.

Extract fractionation
Extracts were centrifuged at 200,000g for 3 h and the resulting pellet (containing ribosomes) was extracted in hypotonic buffer also containing 1 mM MgCl2 and 300 mM KOAc. The extracted material was spun at 100,000g for 1 h and the resulting supernatant was fractionated on a Source 15Q column (Pharmacia) using a KCl gradient in buffer A (20 mM HEPES pH 7.0, 1 mM dithiothreitol, 1 mM MgCl2). Fractions were assayed for nuclease activity as described above.
For northern blotting, fractions were proteinase K/SDS treated, phenol extracted, and resolved on 15% acrylamide 8 M urea gels. RNA was electroblotted onto Hybond N+ and probed with strand-specific riboprobes derived from cyclin E mRNA. Hybridization was carried out in 500 mM NaPO4 pH 7.0, 15% formamide, 7% SDS, 1% BSA. Blots were washed in 1× SSC at 37–45 °C.

Supplementary information is available on Nature's World-Wide Web site (http://www.nature.com) or as paper copy from the London editorial office of Nature.

Received 26 November 1999; accepted 26 January 2000

References
1. Sharp, P. A. RNAi and double-strand RNA. Genes Dev. 13, 139-141 (1999).
2. Sanchez-Alvarado, A. & Newmark, P. A. Double-stranded RNA specifically disrupts gene expression during planarian regeneration. Proc. Natl Acad. Sci. USA 96, 5049-5054 (1999).
3. Lohmann, J. U., Endl, I. & Bosch, T. C. Silencing of developmental genes in Hydra. Dev. Biol. 214, 211-214 (1999).
4. Cogoni, C. & Macino, G. Gene silencing in Neurospora crassa requires a protein homologous to RNA-dependent RNA polymerase. Nature 399, 166-169 (1999).
5. Waterhouse, P. M., Graham, M. W. & Wang, M. B. Virus resistance and gene silencing in plants can be induced by simultaneous expression of sense and antisense RNA. Proc. Natl Acad. Sci. USA 95, 13959-13964 (1998).
6. Montgomery, M. K. & Fire, A. Double-stranded RNA as a mediator in sequence-specific genetic silencing and co-suppression. Trends Genet. 14, 225-228 (1998).
7. Ngo, H., Tschudi, C., Gull, K. & Ullu, E. Double-stranded RNA induces mRNA degradation in Trypanosoma brucei. Proc. Natl Acad. Sci. USA 95, 14687-14692 (1998).
8. Tabara, H. et al. The rde-1 gene, RNA interference, and transposon silencing in C. elegans. Cell 99, 123-132 (1999).
9. Ketting, R. F., Haverkamp, T. H. A., van Luenen, H. G. A. M. & Plasterk, R. H. A. mut-7 of C. elegans, required for transposon silencing and RNA interference, is a homolog of Werner Syndrome helicase and RnaseD. Cell 99, 133-141 (1999).
10. Ratcliff, F., Harrison, B. D. & Baulcombe, D. C. A similarity between viral defense and gene silencing in plants. Science 276, 1558-1560 (1997).
11. Kennerdell, J. R. & Carthew, R. W. Use of dsRNA-mediated genetic interference to demonstrate that frizzled and frizzled 2 act in the wingless pathway. Cell 95, 1017-1026 (1998).
12. Misquitta, L. & Paterson, B. M. Targeted disruption of gene function in Drosophila by RNA interference: a role for nautilus in embryonic somatic muscle formation. Proc. Natl Acad. Sci. USA 96, 1451-1456 (1999).
13. Kalejta, R. F., Brideau, A. D., Banfield, B. W. & Beavis, A. J. An integral membrane green fluorescent protein marker, Us9-GFP, is quantitatively retained in cells during propidium iodide-based cell cycle analysis by flow cytometry. Exp. Cell Res. 248, 322-328 (1999).
14. Wolf, D. A. & Jackson, P. K. Cell cycle: oiling the gears of anaphase. Curr. Biol. 8, R637-R639 (1998).
15. Kramer, E. R., Gieffers, C., Holz, G., Hengstschlager, M. & Peters, J. M. Activation of the human anaphase-promoting complex by proteins of the CDC20/fizzy family. Curr. Biol. 8, 1207-1210 (1998).
16. Shuttleworth, J. & Colman, A. Antisense oligonucleotide-directed cleavage of mRNA in Xenopus oocytes and eggs. EMBO J. 7, 427-434 (1988).
17. Tabara, H., Grishok, A. & Mello, C. C. RNAi in C. elegans: soaking in the genome sequence. Science 282, 430-432 (1998).
18. Bosher, J. M., Dufourcq, P., Sookhareea, S. & Labouesse, M. RNA interference can target pre-mRNA. Consequences for gene expression in a Caenorhabditis elegans operon. Genetics 153, 1245-1256 (1999).
19. Hamilton, J. A. & Baulcombe, D. C. A species of small antisense RNA in posttranscriptional gene silencing in plants. Science 286, 950-952 (1999).
20. Jones, L. A., Thomas, C. L. & Maule, A. J. De novo methylation and co-suppression induced by a cytoplasmically replicating plant RNA virus. EMBO J. 17, 6385-6393 (1998).
21. Jones, L. A. et al. RNA-DNA interactions and DNA methylation in post-transcriptional gene silencing. Plant Cell 11, 2291-2301 (1999).
22. Schneider, I. Cell lines derived from late embryonic stages of Drosophila melanogaster. J. Embryol. Exp. Morphol. 27, 353-365 (1972).
23. Di Nocera, P. P. & Dawid, I. B. Transient expression of genes introduced into cultured cells of Drosophila. Proc. Natl Acad. Sci. USA 80, 7095-7098 (1983).

Figure 1 RNAi in S2 cells. a, Drosophila S2 cells were transfected with a plasmid that directs lacZ expression from the copia promoter in combination with dsRNAs corresponding to either human CD8 or lacZ, or with no dsRNA, as indicated. b, S2 cells were co-transfected with a plasmid that directs expression of a GFP–US9 fusion protein (ref. 13) and dsRNAs of either lacZ or cyclin E, as indicated. Upper panels show FACS profiles of the bulk population. Lower panels show FACS profiles from GFP-positive cells. c, Total RNA was extracted from cells transfected with lacZ, cyclin E, fizzy or cyclin A dsRNAs, as indicated. Northern blots were hybridized with sequences not present in the transfected dsRNAs.

Figure 2 RNAi in vitro. a, Transcripts corresponding to either the first 600 nucleotides of Drosophila cyclin E (E600) or the first 800 nucleotides of lacZ (Z800) were incubated in lysates derived from cells that had been transfected with either lacZ or cyclin E (cycE) dsRNAs, as indicated. Time points were 0, 10, 20, 30, 40 and 60 min for cyclin E and 0, 10, 20, 30 and 60 min for lacZ.
b, Transcripts were incubated in an extract of S2 cells that had been transfected with cyclin E dsRNA (cross-hatched box, below). Transcripts corresponded to the first 800 nucleotides of lacZ or the first 600, 300, 220 or 100 nucleotides of cyclin E, as indicated. Eout is a transcript derived from the portion of the cyclin E cDNA not contained within the transfected dsRNA. E-ds is identical to the dsRNA that had been transfected into S2 cells. Time points were 0 and 30 min. c, Synthetic transcripts complementary to the complete cyclin E cDNA (Eas) or the final 600 nucleotides (Eas600) or 300 nucleotides (Eas300) were incubated in extract for 0 or 30 min.

Figure 3 Substrate requirements of the RISC. Extracts were prepared from cells transfected with cyclin E dsRNA. Aliquots were incubated for 30 min at 30 °C before the addition of either the cyclin E (E600) or lacZ (Z800) substrate. Individual 20-µl aliquots, as indicated, were preincubated with 1 mM CaCl2 and 5 mM EGTA; 1 mM CaCl2, 5 mM EGTA and 60 U of micrococcal nuclease; 1 mM CaCl2 and 60 U of micrococcal nuclease; or 10 U of DNase I (Promega) and 5 mM EGTA. After the 30-min pre-incubation, EGTA was added to those samples that lacked it. Yeast tRNA (1 µg) was added to all samples. Time points were at 0 and 30 min.

Figure 4 The RISC contains a potential guide RNA. a, Northern blots of RNA from either a crude lysate or the S100 fraction (containing the soluble nuclease activity, see Methods) were hybridized to a riboprobe derived from the sense strand of the cyclin E mRNA. b, Soluble cyclin-E-specific nuclease activity was fractionated as described in Methods. Fractions from the anion-exchange resin were incubated with the lacZ control substrate (upper panel) or the cyclin E substrate (centre panel). Lower panel, RNA from each fraction was analysed by northern blotting with a uniformly labelled transcript derived from the sense strand of the cyclin E cDNA. DNA oligonucleotides were used as size markers.

Vol. 15, No. 5, pp. 485-490, March 1, 2001

REVIEW

RNA interference 2001

Phillip A. Sharp
Center for Cancer Research and Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, USA

Introduction
In the few years since the discovery of RNA interference (RNAi; Fire et al. 1998), it has become clear that this process is ancient. RNAi, the oldest and most ubiquitous antiviral system, appeared before the divergence of plants and animals. Because aspects of RNAi, known as cosuppression, also control the expression of transposable elements and repetitive sequences (Ketting et al. 1999; Tabara et al. 1999), the interplay of RNAi and transposon activities has almost certainly shaped the structure of the genome of most organisms. Surprisingly, we are only now beginning to explore the molecular processes responsible for RNAi and to appreciate the breadth of its function in biology. Practical applications of this knowledge have allowed rapid surveys of gene functions (see Fraser et al. 2000 and Gönczy et al. 2000 for RNAi analysis of genes on chromosomes I and III of Caenorhabditis elegans) and will possibly result in new therapeutic interventions.

Genetic studies have expanded the biology of RNAi to cosuppression, transposon silencing, and the first hints of relationships to regulation of translation and development. The possible roles of RNA-dependent RNA polymerase (RdRp) in RNAi have been expanded. Many experiments indicate that dsRNA directs gene-specific methylation of DNA and, thus, regulation at the stage of transcription in plants. Cosuppression may involve regulation by polycomb complexes at the level of transcription in C. elegans and Drosophila. This article will review these topics and primarily summarize advances in the study of RNAi over the past year.
Sequence and strand specificity of RNAi
Restriction of virus growth in plants is mediated by posttranscriptional gene silencing (PTGS), which can be initiated by production of dsRNA replicative intermediates. This silencing of expression is gene specific, and Hamilton and Baulcombe (1999) discovered that tissue manifesting PTGS contained small RNAs (25 nt) complementary to both strands of the gene. Using extracts of Drosophila embryos that had been shown previously to be active for RNAi (Kennerdell and Carthew 1998), Tuschl et al. (1999) were able to reproduce RNAi in a soluble reaction. dsRNA added to this reaction is cleaved into 21-23-nt RNAs, which leads to cleavage of the target mRNA at 21-23-nt intervals (Zamore et al. 2000). Hammond et al. (2000) also concluded that small RNAs directed cleavage of mRNAs in Drosophila extracts prepared from Schneider cells. These experiments are best explained by a model for RNAi where dsRNA is processed to 21-23-nt RNAs that direct the cleavage of mRNA through sequence complementarity. These RNAs are referred to as siRNAs, or short interfering RNAs (see below).

Fire and Mello have continued their collaboration studying the functional anatomy of dsRNA for induction of RNAi (Parrish et al. 2000). They first concluded, using short RNAs synthesized chemically and assayed by injection into C. elegans, that any dsRNA segment greater than ~26 bp can generate RNAi. Thus, the process for generation of siRNAs is probably sequence nonspecific. This was confirmed by the observation that individual short dsRNA formed from sequences that did not contain adenosine, uridine, or cytidine were active for RNAi. Long dsRNAs were more active than short dsRNAs; a 250-fold higher concentration of 26-bp dsRNA generated the equivalent gene silencing activity as an 81-bp dsRNA.
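The processing model described above (a long dsRNA cut into short RNAs of roughly 21-23 nt that then find their targets by sequence complementarity) can be illustrated with a small toy script. This is only a sketch under stated assumptions: the function names `dice` and `cleavage_sites` are hypothetical, the dsRNA is cut in a single fixed 22-nt register from one end, targets are recognized only by perfect complementarity, and the reported cut position is simply the midpoint of the paired region. None of these simplifications is claimed to reflect the actual enzymology.

```python
# Toy model of siRNA generation and target recognition (illustrative only).

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def dice(sense_strand: str, size: int = 22) -> list[str]:
    """Cut the sense strand of a dsRNA into consecutive windows of `size` nt
    and return the antisense guide for each window.

    Assumption: one fixed register starting at the 5' end; any trailing
    fragment shorter than `size` is discarded.
    """
    guides = []
    for start in range(0, len(sense_strand) - size + 1, size):
        window = sense_strand[start:start + size]
        guides.append(reverse_complement(window))
    return guides


def cleavage_sites(mrna: str, guides: list[str]) -> list[int]:
    """Return 0-based mRNA positions at the midpoint of every region that is
    perfectly complementary to one of the guides.

    Assumption: perfect pairing is required and the cut is placed at the
    midpoint of the paired region.
    """
    sites = []
    for guide in guides:
        target = reverse_complement(guide)
        pos = mrna.find(target)
        while pos != -1:
            sites.append(pos + len(target) // 2)
            pos = mrna.find(target, pos + 1)
    return sorted(sites)


if __name__ == "__main__":
    # Hypothetical 66-nt trigger embedded in a longer hypothetical mRNA.
    trigger = (
        "AUGGCUAGCUUAGGCAUCGAUC"
        "GGAUCCGAUUACGCGAUAGCUA"
        "GGCUUAACGGAUCGCUAGCAUC"
    )
    mrna = "GGGAAA" + trigger + "AAACCC"
    guides = dice(trigger)
    print(len(guides), "guides;", "cut positions:", cleavage_sites(mrna, guides))
```

With a fixed 22-nt register the predicted cut positions fall one guide length apart, which is the qualitative behaviour the model is meant to capture: cleavage of the target at intervals matching the size of the small RNAs.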
dsRNA from a related but not identical gene can be used to target a gene for silencing if the two share segments of identical and uninterrupted sequences of significant length, probably >30-35 nt in length. Silencing was inefficient when the largest uninterrupted segments were 14 and 23 nt in length but efficient when 41 nt of such sequences of identity were shared. These results suggest that silencing will probably occur if long dsRNAs are used and the two related genes are >90% homologous. Assuming that dsRNA is processed to 21-23-nt segments, these results indicate that single base-pair mismatches between the siRNA and target RNA dramatically reduce gene targeting and silencing.

In the C. elegans assay used by Mello and Fire, it is likely that the injected dsRNA is directly processed to the targeting siRNAs and that these are not replicated by an endogenous RNA-dependent RNA polymerase. This conclusion rests on the effects of asymmetric modifications of the input dsRNA. Substitution of either 2'-amino uracil for uracil or 2'-amino cytidine for cytidine in the sense strand of the dsRNA had little effect on the RNAi activity, while the same substitutions in the antisense strand rendered the RNA inactive. If the input dsRNA were replicated before targeting, it would be expected to lose this asymmetry. As the above assays were done in somatic tissue of C. elegans, it is possible that the long-term RNAi observed through multiple generations (Grishok et al. 2000) could involve replication in the germ-line tissues. Mutations in a C. elegans gene with sequence relationship to RdRp, EGO-1, have been reported to affect some aspects of RNAi (Smardon et al. 2000).

Genesis of RNAi
The structure of siRNAs is probably the same in all organisms, as the 21-23-nt length of siRNAs seems to be universal.
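The cross-silencing criterion discussed earlier in this section, the length of the longest uninterrupted stretch of identity shared by a trigger dsRNA and a related gene, is straightforward to compute. The following is a minimal sketch, assuming both sequences are given in the same strand orientation; the function name `longest_shared_run` is hypothetical, and the algorithm is a plain quadratic-time longest-common-substring scan rather than anything used in the studies cited. The lengths reported in the text (inefficient silencing at 14 and 23 nt, efficient silencing at 41 nt) are empirical observations and are deliberately not encoded as a cutoff.

```python
# Longest uninterrupted identical segment between two sequences (sketch).


def longest_shared_run(a: str, b: str) -> tuple[int, int, int]:
    """Return (length, start_in_a, start_in_b) of the longest stretch of
    identical, uninterrupted sequence shared by `a` and `b`.

    Uses dynamic programming over all pairs of end positions, keeping only
    the previous row in memory. Returns (0, 0, 0) if nothing is shared.
    """
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous pair of positions.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best[0]:
                    best = (curr[j], i - curr[j], j - curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Two hypothetical related sequences sharing a 10-nt identical core.
    trigger = "GGGG" + "ACGUACGUAC" + "UUUU"
    related = "CCCC" + "ACGUACGUAC" + "AAAA"
    length, start_a, start_b = longest_shared_run(trigger, related)
    print(f"longest identical run: {length} nt "
          f"(trigger offset {start_a}, related-gene offset {start_b})")
```

Comparing the reported run length with the 21-23-nt size of siRNAs gives a quick, rough indication of whether a related gene is likely to be co-targeted.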
Furthermore, siRNAs might be the best candidates for use in targeted gene silencing because their structure would match the biochemical components of the RNAi system. The complex generating the siRNAs from short dsRNAs primarily recognizes the 3' termini of the duplex (Elbashir et al. 2001). Internal cleavage of the dsRNA occurs at a distance of ~22 nt, and a complex of siRNA and proteins targets cleavage of the complementary target RNA at a position ~10-12 nt from the terminus of the original dsRNA (see top panel of Fig. 1). The siRNA duplex probably remains associated with the initial complex because it asymmetrically targets a strand for cleavage and not its partner (the sense strand in the example illustrated in Fig. 1). This asymmetry was not observed when symmetric siRNAs with 2-nt tails on both strands were added to the reaction. Both strands of the target RNAs were cleaved within the region covered by the siRNA duplex, indicating that the siRNA duplex can bind to the complex responsible for cleavage in either orientation (see bottom panel of Fig. 1). In general, a siRNA duplex with 2-nt 3' tails is thought to be the primary intermediate of RNAi. In fact, addition of RNAs with this structure to reactions in vitro can silence translation of a target mRNA with a similar efficiency (within 10-fold) on a molar basis to dsRNAs of >50 bp. Addition of either one of the two single strands constituting a siRNA duplex generates no activity.

Figure 1. Comparison of the cleavage patterns on sense and antisense target strands when either a short double-strand RNA (top) or a siRNA (bottom) is added to a reaction in vitro. The dsRNA generates a siRNA complex which only cleaves the sense strand. Processing of siRNAs from the opposite end of the dsRNA would only cleave the antisense strand (not shown). Addition of a siRNA with the same structure as that processed from the dsRNA generates cleavage of both the sense and antisense strands, suggesting that the siRNA can bind the complex in either orientation.

Tuschl's lab developed methods for cloning of siRNAs using T4 RNA ligase to add linker segments to their 5' and 3' termini (Elbashir et al. 2001). The predominant structure is a 19-20-bp duplex RNA with both termini possessing 2-nt 3' single-strand segments, and the total length of each strand is predominantly 21-22 nt. RNase III-type endonucleases cleave dsRNA releasing RNA with 2-nt 3' tails, indicating that this type of activity is probably involved in generating siRNAs (a possibility first suggested by Bass [2000]). Although the results were not described in the paper, Elbashir et al. (2001) reported the cloning of siRNAs that were endogenous to the Drosophila extract. This foretells future studies where analysis of the sequence of siRNAs in cells will indicate which genes are naturally silenced by RNAi.

How are the siRNAs related to the site of cleavage on the target mRNA? As shown in Figure 1, the siRNAs direct cleavage of the target RNA in the middle of the paired segments, ~12 bp from the 3' terminus of the siRNA. This positions the site of cleavage of the target RNA about one turn of an A-type duplex helix from the cleavages that generated the siRNAs. This could indicate a rearrangement of the RNase III-type domains contacting the siRNA duplex before the second cleavage.

Genetic analysis of RNAi
Several groups are actively pursuing the identification and characterization of enzymes implicated in RNAi and cosuppression. In C.
elegans, initial mutant screens have generated ~80 candidates, of which five have been specifically identified: RDE-1, RDE-2, RDE-3, RDE-4, and Mut-7 (Ketting et al. 1999; Tabara et al. 1999; Ketting and Plasterk 2000; Grishok et al. 2000). Selection of mutations in cosuppression in Arabidopsis has identified homologs of the same genes (Dalmay et al. 2000; Fagard et al. 2000; Mourrain et al. 2000). Testing of previously identified mutations for defects in RNAi in C. elegans and other organisms has expanded this list.

Enzymes of RNAi

RNase III proteins and RNAi
What type of RNase III-like activity might be active in RNAi? Bacterial RNase III and its homologs in Saccharomyces cerevisiae and Schizosaccharomyces pombe function in processing of rRNA and other structural RNAs (Chanfreau et al. 2000). There are two general families of RNase III homologs in plants and animals. One family is represented by the drosha Drosophila gene, which is composed of two RNase III domains and one dsRNA binding domain (Filippov et al. 2000). Antisense experiments suggest that a ubiquitously expressed human family member closely related to drosha is important for rRNA processing (Wu et al. 2000). The second family of RNase III proteins contains an N-terminal ATP-dependent helicase-type domain as well as two RNase III-type domains and a dsRNA motif (Filippov et al. 2000). Perhaps these represent the best candidates for the RNase III activity in RNAi (Elbashir et al. 2001). Recent results from Bernstein et al. (2001) describe the cleavage of dsRNA into 22-nt segments by a Drosophila protein of the RNase III type. Furthermore, RNA interference was used to indicate that this protein is important for RNAi activity. Mutations in an Arabidopsis gene in this family result in unregulated cell division in floral meristems (Jacobsen et al.
1999). This would be consistent with a relationship between RNAi and development. Interestingly, the presence of two RNase III domains in this family of proteins suggests that it might cleave dsRNA as a monomer. The dsRNA-binding domain could position the enzyme on the substrate, and the two catalytic domains could hydrolyze bonds in both strands.

RNA-dependent RNA polymerase

Mutations in genes encoding a protein related to RNA-dependent RNA polymerase (RdRp) affect RNAi-type processes in Neurospora (QDE-1), C. elegans (EGO-1), and plants (SGS2, Mourrain et al. 2000; and SDE-1, Dalmay et al. 2000). It has been generally assumed that this type of polymerase would replicate siRNAs as epigenetic agents permitting their spread throughout plants and between generations in C. elegans. This may still be the case; however, results from Arabidopsis indicate that SDE-1 is important for gene silencing mediated by the presence of transgenes but not for posttranscriptional gene silencing (PTGS) induced by a replicating RNA virus (Dalmay et al. 2000). The efficient generation of siRNAs from transgenes was dependent upon SDE-1, whereas siRNAs were generated in SDE-1 mutant plants by viral replication, which generates dsRNA. The authors conjecture that aberrant RNAs from the transgenes are recognized by the RNA-dependent RNA polymerase, SDE-1, generating dsRNA that is processed to siRNAs.

RNA-dependent RNA helicase

Another type of RNA helicase of the DEAH-box helicase superfamily has also recently been shown to be important for RNAi or PTGS in Chlamydomonas reinhardtii (Wu-Scharf et al. 2000). Mutations in this gene, Mut-6, relieve silencing by a transgene and also activate transposons. Helicases of the same family are important for RNA splicing in yeast; however, Mut-6 is not thought to be involved in RNA splicing. A closely related yeast gene that is involved in RNA splicing, PRP16, has been shown to have ATP-dependent RNA helicase activity (Wang et al. 1998).
Perhaps Mut-6 unwinds duplex RNA in some step of RNAi.

Processes related to RNAi

Nonsense-mediated decay of mRNA

A link between RNAi and nonsense-mediated decay was revealed by screening of mutants in the latter process (Domeier et al. 2000). mRNAs containing nonsense mutations upstream of an intron are rapidly degraded in organisms as diverse as worms and vertebrates. Seven genes, SMG 1-7, are important for this process in C. elegans (Page et al. 1999). Surprisingly, mutants of C. elegans with lesions in either smg-2, smg-5, or smg-6 failed to efficiently maintain RNAi over the course of 4 d following injection of dsRNA. Both mutant and wild-type animals showed equivalent levels of RNAi on the first day, and this level was essentially unchanged in the wild-type animals over the same 4-d interval. Smg-1, and probably smg-3 and smg-4, are not important for maintenance of RNAi over the 4-d interval. Smg-2, based on homology, is thought to encode an ATPase with RNA binding and helicase activity (Page et al. 1999). Its specific role in nonsense-mediated decay of mRNA is unknown.

Regulation of translation during development

RDE-1, which is important for RNAi in C. elegans, is a member of a family of 23 related genes in this organism (Tabara et al. 1999). There are four family members in Drosophila and several in humans. In Drosophila, two of the most closely related genes have unknown functions, whereas the other two, piwi and aubergine (aub), function in oogenesis (Wilson et al. 1996; Cox et al. 1998). Specifically, aub is required for translation of two mRNAs, oskar and gurken. Arabidopsis encodes eight genes related to RDE-1. Mutations in two of these genes, Argonaute 1 (AGO1) and ZWILLE/PINHEAD (ZLL/PNH), result in defects in development.
Mutations in the two genes have distinct phenotypes although they are expressed in many of the same tissues. A relationship between RNAi and development is suggested by the observation that mutants of AGO1 are also defective for cosuppression (Fagard et al. 2000). These results strongly suggest that multiple RDE-1 family members are likely to be involved in RNAi, perhaps in different tissues and in a redundant fashion. They also suggest that RNAi will share some processes in common with regulation of development. Interestingly, the C. elegans small RNAs lin-4 and let-7, which are 22 and 21 nt long, respectively, are known to regulate translation during development in C. elegans. These RNAs are possibly processed from dsRNA regions of a precursor RNA and are thought to pair with the 3' UTR of their targets in regulation of translation. The let-7 RNA is conserved between C. elegans, Drosophila, and humans (Pasquinelli et al. 2000). The similarity in lengths of siRNAs and lin-4 and let-7 suggests that these systems might share components.

Regulation of transcription

Three gene-silencing phenomena, cosuppression, transposon silencing, and DNA methylation, are related to RNAi by dependence on a common set of genes. For example, in C. elegans, both transposon silencing and cosuppression depend on RDE-2, RDE-3, and Mut-7, which are critical for RNAi (Ketting et al. 1999; Tabara et al. 1999; Ketting and Plasterk 2000). Cosuppression is generally defined as suppression of an endogenous locus following introduction of homologous transgenes. This trans-suppression requires transcription of the transgenes but is independent of the specific promoter sequence used to direct transcription (Dernburg et al. 2000). Loss of a transgene array from the germ line of C. elegans by deletion results in reactivation of the endogenous locus after a few generations.
Thus, the endogenous locus is not mutated during silencing by cosuppression as it is during a related phenomenon, called quelling, in Neurospora. There is no evidence for pairing of the transgenic array and the endogenous locus during cosuppression in C. elegans (Dernburg et al. 2000). Thus, the silencing of the endogenous locus is probably mediated by a trans-acting factor that is sequence specific and dependent on transcription. This, and its dependence upon the RNAi-related genes RDE-2, RDE-3, and Mut-7, strongly indicates that cosuppression is mediated by trans-acting RNA, probably siRNAs (see Fig. 2).

Figure 2. Proposal that siRNAs might be a regulatory intermediate in mRNA cleavage, mRNA translation, DNA methylation, and suppression of transcription by the polycomb group. See text for discussion of evidence for these potential relationships.

Cosuppression and the polycomb complex

The silencing of tandem arrays in C. elegans is dependent on the set of mes genes (maternal-effect sterile; Holdeman et al. 1998; Kelly and Fire 1998; Korf et al. 1998). Two of these genes are homologs of enhancer of zeste and extra sex combs in Drosophila and are in the polycomb group of genes. In Drosophila, endogenous loci silenced by cosuppression are bound by a polycomb complex (Pal-Bhadra et al. 1997, 1999), indicating that this process directs the gene-specific binding of this epigenetic regulatory machine. Polycomb complexes are thought to silence genes at the stage of transcription by forming inactive chromatin. Once associated with a gene, the polycomb complex and the transcriptionally suppressed state are stable through DNA replication and cell division. This suggests a model where siRNAs target specific genomic DNA sequences, probably by base pairing, thus directing the binding of the polycomb complex to adjacent sites, resulting in silencing of the locus.
This attractive but speculative model awaits direct evidence that dsRNA or siRNAs can silence endogenous genes at the stage of transcription with concomitant association of polycomb complexes.

Double-strand RNA-directed methylation of DNA

Double-strand RNA-initiated gene-specific methylation of endogenous loci is a well-established phenomenon in plants. An early observation of the specific methylation of chromosomal DNA dependent on RNA replication in plants was described in Wessenegger et al. (1994). This work has been extended to demonstrate that genomic sequences as short as 30 bp can be specifically methylated when present in cells with replicating viral RNA containing homologous sequences (Pélissier and Wessenegger 2000). Replicating recombinant viral RNA vectors containing different segments of an expressed gene have been used to demonstrate homology-based RNA-directed methylation (Jones et al. 1999; Merrett et al. 2000). Methylation was directed to different portions of either the body of the gene or to the promoter when the corresponding segment was part of the replicating RNA. This would be consistent with conversion of the dsRNA of the replicating intermediate into siRNAs and targeting of methylation by these short RNAs (Merrett et al. 2000). Interestingly, a viral protein (Hc-Pro) that suppresses PTGS (RNAi) when introduced into cells inhibited the maintenance of siRNAs, and a concomitant decrease in methylation of the corresponding specific genome sequence was observed (Llave et al. 2000). DNA methylation results in suppression of transcription probably by recruitment of histone deacetylases. The modified and silenced state is epigenetically transmitted, reducing expression of the gene in daughter cells. This is strikingly similar to the conjectured role of polycomb proteins in cosuppression in C. elegans and Drosophila.
At present, there is no known relationship between polycomb suppression of gene expression and subsequent DNA methylation, but the possibility does not seem unreasonable. The analysis to date of cosuppression, RNAi, and PTGS strongly indicates that RNAs can specify regulation of transcription of genomic sequences. These processes probably account for suppression of expression of repetitive sequences in genomes, such as transposons and retroelements. RNAi/cosuppression has been demonstrated to be active in germ-line tissue and should be considered a ubiquitous process shaping the sequence content and structure of the genome of eukaryotic organisms.

Acknowledgments

I thank Tom Tuschl for the preprint; Michael McManus, Carl Novina, Tom Tuschl, Hristo Houbaviy, and Chris Burge for comments; and Helen Cargill for illustrations.

Footnotes

1 Corresponding author. E-MAIL sharppa@mit.edu; FAX (617) 253-3867. Article and publication are at www.genesdev.org/cgi/doi/10.1101/gad.880001.

References

Bass, B.L. 2000. Double-stranded RNA as a template for gene silencing. Cell 101: 235-238.
Bernstein, E., Caudy, A.A., Hammond, S.M., and Hannon, G.J. 2001. Role for a bidentate ribonuclease in the initiation step of RNA interference. Nature 409: 363-366.
Chanfreau, G., Buckle, M., and Jacquier, A. 2000. Recognition of a conserved class of RNA tetraloops by Saccharomyces cerevisiae RNase III. Proc. Natl. Acad. Sci. 97: 3142-3147.
Cox, D.N., Chao, A., Baker, J., Chang, L., Qiao, D., and Lin, H.A. 1998. A novel class of evolutionarily conserved genes defined by piwi are essential for stem cell renewal. Genes & Dev. 12: 3715-3727.
Dalmay, T., Hamilton, A., Rudd, S., Angell, S., and Baulcombe, D.C. 2000.
An RNA-dependent RNA polymerase gene in Arabidopsis is required for posttranscriptional gene silencing mediated by a transgene but not by a virus. Cell 101: 543-553.
Dernburg, A.F., Zalevsky, J., Colaiacovo, M.P., and Villeneuve, A.M. 2000. Transgene-mediated co-suppression in the C. elegans germ line. Genes & Dev. 14: 1578-1583.
Domeier, M.E., Morse, D.P., Knight, S.W., Portereiko, M., Bass, B.L., and Mango, S.E. 2000. A link between RNA interference and nonsense-mediated decay in Caenorhabditis elegans. Science 289: 1928-1930.
Elbashir, S., Lendeckel, W., and Tuschl, T. 2001. RNA interference is mediated by 21 and 22 nt RNAs. Genes & Dev. 15: 188-200.
Fagard, M., Boutet, S., Morel, J.-B., Bellini, C., and Vaucheret, H. 2000. AGO1, QDE1, and RDE-1 are related proteins required for post-transcriptional gene silencing in plants, quelling in fungi, and RNA interference in animals. Proc. Natl. Acad. Sci. 97: 11650-11654.
Filippov, V., Solovyev, V., Filippova, M., and Gill, S.S. 2000. A novel type of RNase III family proteins in eukaryotes. Gene 245: 213-221.
Fire, A., Xu, S., Montgomery, M.K., Kostas, S.A., Driver, S.E., and Mello, C.C. 1998. Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature 391: 806-811.
Fraser, A.G., Kamath, R.S., Zipperlen, P., Martinez-Campos, M., Sohrmann, M., and Ahringer, J. 2000. Functional genomic analysis of C. elegans chromosome I by systematic RNA interference. Nature 408: 325-330.
Gönczey, P., Echeverri, C., Oegema, K., Coulson, A., Jones, S.J.M., Copley, R.R., Duperon, J., Oegema, J., Brehm, M., Cassin, E. et al. 2000. Functional genomic analysis of cell division in C. elegans using RNAi of genes on chromosome III. Nature 408: 331-336.
Grishok, A., Tabara, H., and Mello, C.C. 2000.
Genetic requirements for inheritance of RNAi in C. elegans. Science 287: 2494-2497.
Hamilton, A.J. and Baulcombe, D.C. 1999. A species of small antisense RNA in post-transcriptional gene silencing in plants. Science 286: 950-952.
Hammond, S.M., Bernstein, E., Beach, D., and Hannon, G.J. 2000. An RNA-directed nuclease mediates post-transcriptional gene silencing in Drosophila cells. Nature 404: 293-296.
Holdeman, R., Nehrt, S., and Strome, S. 1998. MES-2, a maternal protein essential for viability of the germline in Caenorhabditis elegans, is homologous to a Drosophila polycomb group protein. Development 125: 2457-2467.
Jacobsen, S.E., Running, M.P., and Meyerowitz, E.M. 1999. Disruption of an RNA helicase/RNase III gene in Arabidopsis causes unregulated cell division in floral meristems. Development 126: 5231-5243.
Jones, L., Hamilton, A.J., Voinnet, O., Thomas, C.L., Maule, A.J., and Baulcombe, D.C. 1999. RNA-DNA interactions and DNA methylation in post-transcriptional gene silencing. Plant Cell 11: 2291-2302.
Kelly, W.G. and Fire, A. 1998. Chromatin silencing and the maintenance of a functional germline in Caenorhabditis elegans. Development 125: 2451-2456.
Kennerdell, J.R. and Carthew, R.W. 1998. Use of dsRNA-mediated genetic interference to demonstrate that frizzled and frizzled 2 act in the wingless pathway. Cell 95: 1017-1026.
Ketting, R.F. and Plasterk, R.H. 2000. A genetic link between co-suppression and RNA interference in C. elegans. Nature 404: 296-298.
Ketting, R.F., Haverkamp, T.H., van Luenen, H.G., and Plasterk, R.H. 1999. Mut-7 of C. elegans, required for transposon silencing and RNA interference, is a homolog of Werner syndrome helicase and RNaseD. Cell 99: 133-141.
Korf, I., Fan, Y., and Strome, S. 1998. The polycomb group in Caenorhabditis elegans and maternal control of germline development.
Development 125: 2469-2478.
Llave, C., Kasschau, K.D., and Carrington, J.C. 2000. Virus-encoded suppressor of post-transcriptional gene silencing targets a maintenance step in the silencing pathway. Proc. Natl. Acad. Sci. 97: 13401-13406.
Merrett, M.F., Aufsatz, W., van Der Winden, J., Matzke, M.A., and Matzke, A.J. 2000. Transcriptional silencing and promoter methylation triggered by double-stranded RNA. EMBO J. 19: 5194-5201.
Mourrain, P., Béclin, C., Elmayan, T., Feuerbach, F., Godon, C., Morel, J.B., Jouette, D., Lacombe, A.M., Nikic, S., Picault, N. et al. 2000. Arabidopsis SGS2 and SGS3 genes are required for post-transcriptional gene silencing and natural virus resistance. Cell 101: 533-542.
Page, M.F., Carr, B., Anders, K.R., Grimson, A., and Anderson, P. 1999. SMG-2 is a phosphorylated protein required for mRNA surveillance in Caenorhabditis elegans and related to Upf1p of yeast. Mol. Cell. Biol. 19: 5943-5951.
Pal-Bhadra, M., Bhadra, U., and Birchler, J.A. 1997. Co-suppression in Drosophila: Gene silencing of Alcohol dehydrogenase by white-Adh transgenes is Polycomb dependent. Cell 90: 479-490.
-----. 1999. Co-suppression of nonhomologous transgenes in Drosophila involves mutually related endogenous sequences. Cell 99: 35-46.
Parrish, S., Fleenor, J., Xu, S., Mello, C., and Fire, A. 2000. Functional anatomy of a dsRNA trigger: Differential requirement for the two trigger strands in RNA interference. Mol. Cell 6: 1077-1087.
Pasquinelli, A.E., Reinhart, B., Slack, F., Martindale, M.Q., Kuroda, M.I., Maller, B., Hayward, D.C., Ball, E.E., Degnan, B., Müller, P. et al. 2000. Conservation of the sequence and temporal expression of let-7 heterochronic regulatory RNA. Nature 408: 86-89.
Pélissier, T. and Wessenegger, M. 2000. A DNA target of 30 bp is sufficient for RNA-directed DNA methylation. RNA 6: 55-65.
Smardon, A., Spoerke, J., Stacey, S., Klein, M., Mackin, N., and Maine, E. 2000. EGO-1 is related to RNA-directed RNA polymerase and functions in germ-line development and RNA interference in C. elegans. Curr. Biol. 10: 169-178.
Tabara, H., Sarkissian, M., Kelly, W.G., Fleenor, J., Grishok, A., Timmons, L., Fire, A., and Mello, C.C. 1999. The rde-1 gene, RNA interference, and transposon silencing in C. elegans. Cell 99: 123-132.
Tuschl, T., Zamore, P.D., Lehmann, R., Bartel, D.P., and Sharp, P.A. 1999. Targeted mRNA degradation by double-stranded RNA in vitro. Genes & Dev. 13: 3191-3197.
Wang, Y., Wagner, J.D., and Guthrie, C. 1998. The DEAH-box splicing factor Prp16 unwinds RNA duplexes in vitro. Curr. Biol. 8: 441-451.
Wessenegger, M., Heimes, S., Riedel, L., and Sanger, H.L. 1994. RNA-directed de novo methylation of genomic sequences in plants. Cell 76: 567-576.
Wilson, J.E., Connell, J.E., and MacDonald, P.M. 1996. aubergine enhances oskar translation in the Drosophila ovary. Development 122: 1631-1639.
Wu, H., Xu, H., Miraglia, L.J., and Crooke, S.T. 2000. Human RNase III is a 160 kDa protein involved in preribosomal RNA processing. J. Biol. Chem. 275: 36957-36965.
Wu-Scharf, D., Jeong, B.-R., Zhang, C., and Cerutti, H. 2000. Transgene and transposon silencing in Chlamydomonas reinhardtii by a DEAH-box RNA helicase. Science 290: 1159-1162.
Zamore, P.D., Tuschl, T., Sharp, P.A., and Bartel, D.P. 2000. RNAi: Double-stranded RNA directs the ATP-dependent cleavage of mRNA at 21 to 23 nucleotide intervals. Cell 101: 25-33.

GENES & DEVELOPMENT 15:485-490 © 2001 by Cold Spring Harbor Laboratory Press ISSN 0890-9369/01 $5.00