Nucleotides and Nucleic Acids

Nucleotides have a wide variety of functions. One major function is to provide the thermodynamic driving force for a number of chemical reactions. This is especially well known for ATP, but GTP is also used for a variety of reactions, UTP is used in glycogen and complex carbohydrate biosynthesis, and CTP is used in complex lipid synthesis. Nucleotides are used to form intracellular signaling molecules such as cAMP and cGMP. In addition, ATP, ADP, and AMP act as signals to modulate energy metabolism. Nucleotides form parts of some cofactors, including NAD, FAD, and coenzyme A. Finally, nucleotides are the monomer units that make up the nucleic acids RNA and DNA.

Cells maintain pools of free nucleotides for a variety of purposes. Adenosine derivatives are the most common free nucleotides, because ATP is used in the largest number of reactions. In addition, ATP is converted into S-adenosylmethionine and a number of other molecules involved in metabolic reactions. As mentioned above, pools of other free nucleotides are also important in some types of reactions, although these pools tend to be much smaller than those of ATP. Synthesis of nucleic acids (and especially synthesis of DNA) requires synthesis of nucleotides, because the cellular pools of free nucleotides are insufficient to provide all of the monomer units required.

Nucleotide nomenclature and structure

Nucleotides are composed of a nitrogen-containing molecule, called a base, attached to a ribose ring. The bases are derivatives of two possible ring structures, purine and pyrimidine, and are numbered according to their parent compound.
The bases all contain significant conjugated π-systems, which absorb ultraviolet light.[16]

[Figure: ring numbering of purine and pyrimidine; structures of a nucleoside, a nucleotide monophosphate, and a 2´-deoxyribonucleotide monophosphate, with the ribose carbons labeled 1´ to 5´.]

The base and ribose ring together are termed a nucleoside (the suffix “-oside” means a compound covalently bonded to carbohydrate). The base and the ribose with one or more phosphates attached are termed a nucleotide. The ribose ring numbering is from 1´ to 5´. The “prime” modifier to the ribose ring number designates the ribose carbons as distinct from the atoms of the purine or pyrimidine ring. Nucleotides all have the base attached to the 1´ carbon. Deoxyribonucleotides, the nucleotides present in DNA, lack the 2´-hydroxyl.

The base in the nucleoside can have one of two possible conformations: syn and anti. Both are present, although the anti conformation is more common physiologically, due both to the lower steric strain and to the fact that this conformation is required for nucleic acid structure. The rotation about the base-ribose bond is restricted by steric hindrance, so the nucleotide must be synthesized in one form or the other. (For purines, the syn structure is drawn most frequently; this is not a reflection of the common conformation, but instead merely allows the drawing to occupy less space on the page!)

Copyright © 2000-2011 Mark Brandt, Ph.D.

[16] DNA and RNA exhibit a maximum absorption at about 260 nm, due to the absorbance of the nucleotide bases. Most proteins have a maximum absorption at 280 nm, due to the absorbance of tryptophan and tyrosine. One method for assessing the relative amount of protein and nucleic acid in a sample is to measure the A260/A280 ratio; a ratio above 1.8 indicates that the solution contains nucleic acid with little protein.
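The A260/A280 check described in the footnote above can be sketched in a few lines of Python. This is only an illustration: the function names and the absorbance readings are hypothetical, and the 1.8 cutoff is simply the value quoted in the footnote, not a universal standard.

```python
def a260_a280_ratio(a260: float, a280: float) -> float:
    """Return the ratio of absorbance at 260 nm to absorbance at 280 nm."""
    if a280 <= 0:
        raise ValueError("A280 must be a positive absorbance reading")
    return a260 / a280


def mostly_nucleic_acid(a260: float, a280: float, threshold: float = 1.8) -> bool:
    """True when the ratio exceeds the threshold quoted in the text (1.8)."""
    return a260_a280_ratio(a260, a280) > threshold


if __name__ == "__main__":
    # Hypothetical spectrophotometer readings, not measured data.
    sample_a260, sample_a280 = 0.95, 0.50
    print(round(a260_a280_ratio(sample_a260, sample_a280), 2))
    print(mostly_nucleic_acid(sample_a260, sample_a280))
```

A sample dominated by protein absorbs more strongly at 280 nm, which drives the ratio down, so the same function returns False for such a sample.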
[Figure: syn and anti conformations of a nucleotide.]

The most common bases are the two purine and three pyrimidine derivatives shown below.

Purines
    Base        Nucleoside
    Adenine     Adenosine
    Guanine     Guanosine

Pyrimidines
    Base        Nucleoside
    Cytosine    Cytidine
    Uracil      Uridine
    Thymine     Thymidine

[Figure: structures of adenine, guanine, cytosine, uracil, and thymine.]

These are not the only bases used in physiology. Others include xanthine and hypoxanthine (intermediates in purine metabolism), a methylated version of adenine (with the methyl group on the nitrogen attached to C6), a methylated version of cytosine (5-methylcytosine), pseudouridine (which has the ribose attached to C5 instead of N1 of uracil), and 1,3,7-trimethylxanthine, better known as caffeine. Uric acid is a major excretion product following purine degradation; accumulation of uric acid in tissues results in gout.

[Figure: structures of hypoxanthine, xanthine, uric acid, 1,3,7-trimethylxanthine, and pseudouracil.]

The phosphates in nucleotide triphosphates are most commonly attached to the 5´ carbon. If the site of phosphate attachment is not given, it is assumed to be the 5´ position; any other site of attachment must be stated explicitly. Thus, AMP is 5´-AMP, while 3´-AMP has to be written out.

[Figure: structures of AMP and 3´-AMP.]

RNA contains adenosine, cytidine, guanosine, and uridine (commonly abbreviated as A, C, G, and U); DNA contains deoxyadenosine, deoxycytidine, deoxyguanosine, and deoxythymidine (commonly abbreviated as A, C, G, and T). Note that in nucleic acids, thymine is therefore only present in DNA, and uracil is only present in RNA. In humans, most purines and pyrimidines are present as nucleotides (the major exceptions are xanthine and uric acid).
Some organisms use purines for other purposes, and maintain the purines in the free form. An example of this is provided by the series of methylated xanthine derivatives produced in plants that have biological activity in humans: 1,3-dimethylxanthine (theophylline, found in tea), 3,7-dimethylxanthine (theobromine, an active ingredient in chocolate), and 1,3,7-trimethylxanthine (caffeine).

The structure and role of nucleic acids

Nucleic acids are polymers of nucleotides, in which the phosphate from the 5´ position of one nucleotide is attached to the 3´ hydroxyl of the preceding nucleotide. This phosphodiester link is created using the energy from the triphosphate form of the nucleotide being added, driven by the release of inorganic pyrophosphate. Because of this intrinsic directionality, nucleic acid sequences are typically written from 5´ to 3´ unless otherwise specified. Ribonucleic acid (RNA) is synthesized from ribonucleotide triphosphates, while deoxyribonucleic acid (DNA) is synthesized from deoxyribonucleotide triphosphates (as shown below).

[Figure: synthesis of a DNA strand. An incoming deoxyribonucleotide triphosphate is added to the free 3´ hydroxyl of the growing strand, which is base-paired to an antiparallel template strand; the 5´ carbon, 3´ carbon, 3´ hydroxyl, and direction of DNA synthesis are labeled.]

As with proteins, nucleic acids can vary widely in the number of monomer units present. However, most DNA molecules have much higher numbers of monomer units than are present in any known protein.

Originally, nucleic acids were thought to be too simple to contain information, because they were thought to be repeating polymers of four nucleotides. As a result, nucleic acids were thought to act as structural molecules. This concept was
challenged in 1944 when a series of experiments by Oswald Avery suggested that DNA was capable of acting as genetic material.[17] Avery discovered that nonpathogenic pneumococcal bacteria could be made pathogenic by adding material extracted from killed, pathogenic pneumococcal bacteria. The transforming material was destroyed by nucleases (which degrade DNA), but not by proteases (which degrade proteins). While open to criticism, the Avery experiment was a clear indicator that proteins were unlikely to be the genetic material, and that DNA, in spite of its apparent simplicity, was capable of carrying information.[18]

In a classic one-page 1953 paper in Nature, “Molecular Structure of Nucleic Acids,” James Watson and Francis Crick[19] proposed a structure for DNA. Their proposed structure was based on molecular models constructed from relatively limited data. Some of the data that Watson and Crick used were obtained (in a somewhat controversial manner) from the fiber X-ray diffraction work of Rosalind Franklin and Maurice Wilkins (who were acknowledged in the Nature paper, but published their own results separately). Watson and Crick also took into account experiments by Erwin Chargaff that showed that, in all organisms tested, the A content of the DNA was comparable to the T content, and the G content was comparable to the C content.

Watson and Crick proposed that DNA exists as an anti-parallel double helix, in which the phosphate backbone is on the outside, and in which the strands are held together by hydrogen bonding interactions between pairs of A and T and pairs of G and C. Examination of the structure below reveals that the ribose-phosphate backbone has the same shape for both A:T pairs and G:C pairs.

[Figure: hydrogen bonding in the thymidine:adenosine and cytidine:guanosine base pairs.]

Studies performed since the original Watson and Crick proposal have revealed that at least six different structures are possible for DNA.
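The pairing rules and the antiparallel arrangement described above can be illustrated with a short Python sketch. The function names and the example sequence are hypothetical; the code only encodes the A:T and G:C pairing and the 5´ to 3´ writing convention from the text, and then checks that a duplex built this way necessarily satisfies Chargaff's observation.

```python
from collections import Counter

# Watson-Crick pairing: A pairs with T, and G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'.

    The strands are antiparallel, so the complement is read in reverse.
    """
    return "".join(PAIR[base] for base in reversed(seq.upper()))


def duplex_base_counts(seq: str) -> Counter:
    """Count each base over both strands of the duplex formed by seq."""
    return Counter(seq.upper()) + Counter(reverse_complement(seq))


if __name__ == "__main__":
    strand = "GATTACAGGC"  # hypothetical sequence, written 5' to 3'
    print(reverse_complement(strand))
    counts = duplex_base_counts(strand)
    # In any fully base-paired duplex, A equals T and G equals C.
    print(counts["A"] == counts["T"], counts["G"] == counts["C"])
```

Note that the equalities hold for the duplex as a whole, not for either strand alone: the single strand used here has three A residues but only two T residues.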
The Watson-Crick double helix is known as B form DNA. The B form (shown at right) is the major form under physiological conditions. B form DNA has a rise of 3.4 Å per base pair, with each full turn requiring a total of 10.5 base pairs, and the helix is ~20 Å in diameter. The major groove and minor groove indicated allow proteins to bind to specific sequences without separating the strands of the double helix.

[Figure: B form DNA double helix, with the major groove and minor groove indicated.]

Some DNA bases are modified after synthesis of the nucleic acid molecule. Some of these modifications, such as the methyl and hydroxymethyl derivatives shown below, are thought to play a role in proof-reading the DNA for errors during synthesis, and to have regulatory functions that assist in controlling gene expression. Examination of the structures of these modified bases suggests that they are still capable of base-pairing normally, and that the role of the modification is to change the shape of the base to alter interactions with other molecules.

[Figure: structures of N6-methyladenosine, 5-methylcytidine, and 5-hydroxymethylcytidine.]

[17] Avery, et al., “Studies on the Chemical Nature of the Substance Inducing Transformation of Pneumococcal Types: Induction of Transformation by a Desoxyribonucleic Acid Fraction Isolated From Pneumococcus Type III.” J. Exp. Med. 79, 137-158 (1944).

[18] The apparent simplicity of DNA was in part because no hypotheses had been proposed for how DNA might store information. In addition, the true size of DNA molecules had yet to be established; in fact, DNA molecules are the largest molecules known. The largest molecule in the human body, chromosome 1, has a molecular weight of ~1.5 × 10^11 g/mol (0.15 million metric tonnes per mole).

[19] Watson, J.D. and Crick, F.H.C., Nature 171, 737 (1953).

RNA

RNA differs from DNA in both structural and functional respects.
RNA has two major structural differences: each of the ribose rings contains a 2´-hydroxyl, and RNA uses uracil in place of thymine. RNA molecules are capable of base pairing, but generally will not form large regions of stable RNA-RNA double helix. RNA can act as a genetic material (although this role, at least for current organisms, seems to be restricted to viruses).

RNA bases

The bases used for RNA are attached to ribose. However, many are significantly modified from the typical four bases normally considered to be part of RNA. This is particularly true for tRNA. The modified bases include pseudouracil and methylated versions of cytosine and adenine.

RNA Structure

Unlike DNA, RNA can form complex three-dimensional structures. As a result, RNA can also exhibit catalytic activity. The combination of the ability to store genetic information with the ability to catalyze reactions has resulted in a proposal for the origin of life: the “RNA World”. The RNA world hypothesis proposes that RNA molecules once filled all of the roles of protein and nucleic acid macromolecules, and acted both as an information store and as the source of the enzymatic activity required for metabolic reactions.

In general, RNA is less suited to acting as genetic material than DNA, and is less suited to forming efficient catalysts than proteins. Assuming that the RNA world once existed, nearly all of its functions have been taken over by other biological molecules. However, some vestiges of the RNA world may still exist.[20] The vast majority of RNA functions are concerned with protein synthesis.

The major types of RNA:

Ribosomal RNA (rRNA)

Ribosomal RNA molecules make up 65 to 70% of the mass of the ribosome (the machinery responsible for protein synthesis). Ribosomes are very large objects; prokaryotic ribosomes have molecular weights of about 2500 kg/mol, while eukaryotic ribosomes have molecular weights of about 4000 kg/mol.
The eukaryotic 40 S ribosomal subunit contains one rRNA (18 S rRNA = 1900 bases) and about 35 different proteins. The 60 S subunit contains three rRNAs (5 S = 120 bases, 5.8 S = 160 bases, and 28 S = 4700 bases), and about 50 proteins. The 5 S rRNA has its own gene; the others are synthesized as a single transcript that is then cleaved to release the mature RNA molecules that become part of the ribosome.

The original studies on ribosomes used relatively crude techniques that were unable to measure size in terms of molecular weight. Instead, the sizes of the ribosomal particles and their components were measured by their rate of sedimentation (movement driven by gravitational acceleration or centrifugal acceleration). Sedimentation is a function of size, shape, and density, with larger objects tending to sediment faster than smaller ones. Sizes measured in this way are reported in Svedberg units (S). Prokaryotic ribosomes are 70 S particles, each composed of a large (50 S) and a small (30 S) subunit. Eukaryotic ribosomes are 80 S particles, composed of a large (60 S) and a small (40 S) subunit. You will notice that the Svedberg units are not additive for the particle sizes; this is due to the effects of shape on sedimentation.

Until relatively recently, it was assumed that the ribosomal RNA performed a largely structural function. However, more recent data strongly suggest that the rRNA acts as the enzyme, with the protein acting as the structural scaffolding. These data include results from the recent high-resolution (2.4 Å) X-ray diffraction structure of the large subunit[21] and low-resolution (5 Å) structure of the complete ribosome from the archaeon Haloarcula marismortui. Examination of the high-resolution structure of the large subunit, and of subsequent structures of the entire particle, strongly suggests that only RNA is present at the catalytic site.

In the structure above, proteins are shown in black, and the orientation of the large subunit is similar to the center cartoon below. Examination of this structure suggests that protein is absent from the catalytic site.

[20] Side Note: Many of the assumptions and data concerning the RNA-world hypothesis are discussed in Penny, D., “An Interpretive Review of the Origin of Life Research.” Biol. Phil. 20, 633-671 (2005). Assuming that the “RNA world” hypothesis is correct, it would seem that the phosphate-based RNA structure had a major role in the origin of life. Phosphate has useful properties that make it a good candidate for participating in a variety of biologically important chemical reactions. The advantages of phosphate are summarized in: Westheimer, F.H., “Why Nature Chose Phosphates.” Science 235, 1173-1178 (1987). Recently, a controversial paper reporting the discovery of an organism (GFAJ-1) was published: Wolfe-Simon, F., et al., “A Bacterium That Can Grow by Using Arsenic Instead of Phosphorus.” Science 332, 1163-1166 (2011). The results reported in this paper have been questioned at length, in part because arsenate esters are hydrolyzed much more rapidly than phosphate esters, which suggests that nucleic acids using arsenate instead of phosphate would be too unstable to support life. Critics have also faulted the authors for making extraordinary claims without extraordinary evidence. However, if the GFAJ-1 organism is actually capable of using arsenic in place of phosphorus, GFAJ-1 may become a useful model system for exploring issues related to phosphate chemistry and the origin of life.

[21] Ban, N., et al., “The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution.” Science 289, 905–920 (2000). The structure figure was created in VMD from PDB ID 1FFK.
The two ribosome subunits work together with a variety of accessory protein factors and with the various tRNA molecules to synthesize new protein molecules from mRNA templates. A cartoon summarizing the basic functions is shown below.[22]

[Figure: cartoon summarizing the basic functions of the two ribosome subunits during protein synthesis.]

Transfer RNA (tRNA)

Transfer RNA (tRNA) molecules are a family of different nucleic acid molecules that are used for protein synthesis. Each tRNA is a ~75 base molecule that carries an amino acid, and is used by the ribosome to transfer the amino acid to the growing protein. tRNAs are thought to have a common tertiary structure (a structure based on X-ray diffraction analysis is shown below). Analysis of the tRNA sequence suggests a cloverleaf secondary structure formed by regions of base pairing between sections of the RNA strand, with this cloverleaf folding into the three-dimensional structure.

[Figure: tRNA cloverleaf secondary structure and the three-dimensional structure determined by X-ray diffraction.]

[22] Protein synthesis is a topic for CHEM 331 (Biochemistry II), and will not be covered here. The textbook has more information. A recent review article by one of the Nobel Laureates involved in ribosome structural analysis is another good source of information about ribosome function and protein synthesis: Schmeing, T.M. & Ramakrishnan, V., “What recent ribosome structures have revealed about the mechanism of translation.” Nature 461, 1234-124 (2009).

Messenger RNA (mRNA)

mRNA molecules contain the coding sequence for proteins. The mRNA molecules can vary considerably in size, with eukaryotic transcripts including the largest known ribonucleic acids. This is most obvious before splicing of introns, because many transcripts exceed 100 kb in length.

MicroRNA (miRNA)

Relatively recently, it has become apparent that RNA has a much larger role in protein synthesis than was previously thought.
In addition to the roles of rRNA, mRNA, and tRNA, smaller RNA molecules appear to have regulatory functions. Small RNA molecules called microRNA (miRNA) can inhibit translation of mRNA both with and without causing degradation of the mRNA. Different miRNA sequences are transcribed at different times; they probably have functions in development and differentiation of cell lineages, and may have a protective role against viruses, especially viruses with RNA genomes.

Other RNA

Some RNA molecules do not fall into the classes listed above. One example is the RNA portion of some enzymes, including RNaseP and telomerase, which require the RNA for function. In the case of RNaseP, the RNA has the catalytic activity, although the protein portion is required for full activity.

The types of functions exhibited by RNA are thus far more diverse than they were thought to be a few years ago, and are far more diverse than those of DNA.

Summary

Nucleotides are widely used molecules, both as participants in reactions and as starting materials for synthesis of other compounds.

Nucleotides are composed of the aldopentose ribose, a nitrogenous base, and one or more phosphates. The most commonly used bases are the purines adenine and guanine, the pyrimidines cytosine, thymine, and uracil, and slightly modified forms of these compounds.

The nucleotide polymer DNA, synthesized from 2´-deoxyribonucleotides, is an antiparallel duplex composed of two complementary strands. DNA is a storage medium for information.

RNA is largely used as part of the mechanism for elaboration of the information into more directly usable form. Unlike DNA, many types of RNA molecules form complex three-dimensional structures that are important for their function. The types of RNA forming complex three-dimensional structures clearly include ribosomal RNA and transfer RNA, and may include messenger RNA as well.
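As a worked example of the B form parameters quoted earlier (a rise of 3.4 Å per base pair and 10.5 base pairs per turn), the following Python sketch estimates the helical pitch and the end-to-end length of a fully extended duplex. The function names are hypothetical, and the figure of roughly 2.5 × 10^8 base pairs for human chromosome 1 is an approximate value assumed here for illustration; it does not come from the text.

```python
RISE_PER_BP_ANGSTROM = 3.4  # rise per base pair in B form DNA (from the text)
BP_PER_TURN = 10.5          # base pairs per full helical turn (from the text)
ANGSTROM_IN_METERS = 1e-10


def helix_pitch_angstrom() -> float:
    """Length of one full helical turn along the helix axis, in angstroms."""
    return RISE_PER_BP_ANGSTROM * BP_PER_TURN


def helical_turns(n_bp: float) -> float:
    """Number of helical turns in a duplex of n_bp base pairs."""
    return n_bp / BP_PER_TURN


def contour_length_m(n_bp: float) -> float:
    """Length in meters of a fully extended B form duplex of n_bp base pairs."""
    return n_bp * RISE_PER_BP_ANGSTROM * ANGSTROM_IN_METERS


if __name__ == "__main__":
    # Assumed, approximate size of human chromosome 1 (not from the text).
    chromosome_1_bp = 2.5e8
    print(f"pitch: {helix_pitch_angstrom():.1f} angstroms per turn")
    print(f"turns: {helical_turns(chromosome_1_bp):.2e}")
    print(f"length: {contour_length_m(chromosome_1_bp) * 100:.1f} cm")
```

Under these assumptions a single chromosome-sized duplex would be several centimeters long if fully extended, which gives a sense of how much larger DNA molecules are than any known protein.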