Chapter 6: Gene Expression - Translation

Translation = conversion of a messenger RNA sequence into the amino acid sequence of a polypeptide (i.e., protein synthesis).

Topics to be covered today:
• Peptide bond
• Amino acid biochemical properties
• Protein structure
• Genetic code

Topics to be covered Thursday 18th:
• Translation mechanism

Review session on Wednesday 17th, 3:30 – 4:45 PM, Whitten LC 130

Topics to be covered Tuesday 23rd:
• DNA mutation & repair (lecture notes already on website)

Exam on Wednesday 24th, 3:30 – 4:45 PM, Whitten LC 130

Protein:
• High-molecular-weight, nitrogen-containing organic compound.
• Composed of one or more polypeptides.
• Polypeptides are composed of amino acids (AA).
• The sequence of AA gives the polypeptide its 3D shape and its properties in the cell.

Amino acid: contains the following bonded to a central carbon atom:
• Amino group (NH2)
• Carboxyl group (COOH)
• Hydrogen atom
• R group (different in each amino acid)
The amino and carboxyl groups are typically charged in the cell (-NH3+ and -COO-).
Fig. 6.1

20 different amino acids occur in living cells:
• Abbreviated with 3- and 1-letter codes.
• Classified into four chemical groups based on the composition of the R group:
1. Acidic (n = 2)
2. Basic (n = 3)
3. Neutral and polar, hydrophilic (n = 6)
4. Neutral and non-polar, hydrophobic (n = 9)
Fig. 6.2. Acidic and basic amino acids.
Fig. 6.2. Neutral, non-polar (hydrophobic) amino acids.
Fig. 6.2. Neutral, polar (hydrophilic) amino acids.

Amino acids are joined by peptide bonds to form unbranched polypeptides.
Peptide bond = covalent bond formed by a dehydration synthesis reaction between the carboxyl group of one amino acid and the amino group of the next amino acid.
Fig. 6.3
The N terminus is at the beginning of the polypeptide chain, and the C terminus is at the end of the polypeptide chain.

Proteins show four hierarchical levels of structural organization:
1. Primary structure = amino acid sequence. Determined by the genetic code of the mRNA.
2.
Secondary structure = folding and twisting of a single polypeptide chain. Result of weak H-bonds and electrostatic interactions, e.g., α-helix (coiled) and β-pleated sheet (zig-zag).
3. Tertiary structure = three-dimensional shape (or conformation) of a single polypeptide chain. Results from the different R groups.
4. Quaternary structure = association between polypeptides in multi-subunit proteins (e.g., hemoglobin). Occurs only with two or more polypeptides.
Fig. 6.4

The genetic code: how do nucleotides specify 20 amino acids?
1. 4 different nucleotides (A, G, C, U)
2. Possible codes:
• 1-letter code: 4 AAs (< 20)
• 2-letter code: 4 x 4 = 16 AAs (< 20)
• 3-letter code: 4 x 4 x 4 = 64 AAs (>> 20)
3. A three-letter code with 64 possibilities for 20 amino acids suggests that the genetic code is degenerate (i.e., more than one codon specifies the same amino acid).

The genetic code is a triplet code
A set of 3 consecutive nucleotides makes up a codon in the mRNA, which corresponds to one amino acid in a polypeptide chain.
1. 1960s: Francis Crick et al.
2. Studied frameshift mutations in bacteriophage T4 (grown on E. coli), induced by the mutagen proflavin.
3. Proflavin caused the insertion or deletion (indel) of a base pair in the DNA.
4. Two ways to identify mutant T4:
Growth with E. coli B:
• r+ (wild type): turbid plaques
• rII (mutant): clear plaques
Growth with E. coli K12 (λ):
• r+ (wild type): growth
• rII (mutant): no growth

1. Discovered that frameshift mutations (insertion or deletion) resulted in a different sequence of amino acids.
Fig. 6.5
2. Also discovered that rII mutants treated with proflavin could be restored to the wild type (revertants): a deletion (-) corrects an insertion (+), or vice versa.
3. Combination of three rII mutations of the same sign routinely yielded revertants, unlike other multiple combinations.
Fig. 6.6 - Three nearby insertions (+) restore the reading frame, giving normal or near-normal function.

How was the genetic code deciphered?
1.
Cell-free, protein-synthesizing machinery isolated from E. coli (ribosomes, tRNAs, protein factors, radio-labeled amino acids). Synthetic mRNA containing only one type of base: UUU = Phe, CCC = Pro, AAA = Lys, GGG = ? (unstable)
2. Synthetic copolymers composed of two different bases (e.g., A and C, giving the codons CCC, CCA, CAC, ACC, CAA, ACA, AAC, AAA): Pro, Lys (already defined) + Asn, Gln, His, & Thr. The proportion of A to C was varied to determine which codon specified which amino acid.
3. Synthetic polynucleotides of known repeating sequence:
UCU CUC UCU CUC
Ser Leu Ser Leu
Nobel Prize, 1968: Robert Holley (Cornell), H. G. Khorana (Wisconsin-Madison), and Marshall Nirenberg (NIH).

How was the genetic code deciphered (cont.):
4. Ribosome binding assays of Nirenberg and Leder (1964) (ribosomes, tRNAs charged w/AAs, RNA trinucleotides). Protein synthesis does not occur. Only one type of charged tRNA will bind to a given trinucleotide (anti-codons written 3’ to 5’):
mRNA codon    tRNA anti-codon    amino acid
UUU           AAA                Phe
UCU           AGA                Ser
CUC           GAG                Leu
5. Identified 50 codons using this method.
Combination of many different methods eventually identified 61 codons; the other 3 do not specify amino acids (stop codons).
Fig. 6.7 Universal Genetic Code

Characteristics of the genetic code (written as in mRNA, 5’ to 3’):
1. Code is triplet. Each 3-nucleotide codon in mRNA specifies 1 amino acid.
2. Code is comma free. mRNA is read continuously, 3 bases at a time, without skipping bases (not always true; translational frameshifting is known to occur).
3. Code is non-overlapping. Each nucleotide is part of only one codon and is read only once.
4. Code is almost universal. Most codons have the same meaning in different organisms (exceptions exist, e.g., in the mitochondria of mammals).
5. Code is degenerate. 18 of 20 amino acids are coded by more than one codon. Met and Trp are the only exceptions. Many amino acids are four-fold degenerate at the third position.
6. Code has start and stop signals.
AUG (ATG in DNA) codes for Met and is the usual start signal. UAA, UAG, and UGA (TAA, TAG, and TGA in DNA) are stop codons and specify the end of translation of a polypeptide.
7. Wobble occurs in the tRNA anti-codon. Pairing with the 3rd codon base is less constrained and less specific.

Examples of variation in the mtDNA genetic code:
http://en.wikipedia.org/wiki/File:MtDNA_Genetic_Code_variation_for_mammals,_fruit_flies_and_yeasts.jpg

Wobble hypothesis:
• Proposed by Francis Crick in 1966.
• Occurs at 3’ end of codon/5’ end of anti-codon.
• Result of the arrangement of H-bonds of base pairs at the 3rd position.
• Degeneracy of the code is such that wobble always results in translation of the same amino acid.
• Complete set of codons can be read by fewer than 61 tRNAs.

5’ anti-codon base    pairs with 3’ codon base
G                     U or C
C                     G
A                     U
U                     A or G
I (inosine)           A, U, or C
I = post-transcriptionally modified purine
Fig. 6.8

Effects of two types of substitutions:
1. Transitions
• Convert a purine-pyrimidine base pair to the other purine-pyrimidine base pair.
• 4 types of transitions: A ↔ G and T ↔ C (each in both directions); biochemically similar (1-ring to 1-ring, or 2-ring to 2-ring structure)
• Transitions often result in synonymous substitutions, particularly at the 3rd codon position, because of the degeneracy of the genetic code (the amino acid does NOT change).
• Common
2. Transversions
• Convert a purine-pyrimidine base pair to a pyrimidine-purine base pair.
• 8 types of transversions: A ↔ T, G ↔ C, A ↔ C, and G ↔ T (each in both directions); biochemically dissimilar (1-ring to 2-ring, or vice versa, in all cases)
• Transversions are more likely to result in nonsynonymous substitutions (the amino acid DOES change).
• Rarer than transitions
http://en.wikipedia.org/wiki/Human_mitochondrial_molecular_clock

Figure (panels: Transitions, Transversions): the genetic code written as DNA codons, with amino acids grouped by chemical class (acidic, basic, neutral-polar, neutral-nonpolar).
TTT Phe    TCT Ser    TAT Tyr     TGT Cys
TTC Phe    TCC Ser    TAC Tyr     TGC Cys
TTA Leu    TCA Ser    TAA Stop    TGA Stop
TTG Leu    TCG Ser    TAG Stop    TGG Trp
CTT Leu    CCT Pro    CAT His     CGT Arg
CTC Leu    CCC Pro    CAC His     CGC Arg
CTA Leu    CCA Pro    CAA Gln     CGA Arg
CTG Leu    CCG Pro    CAG Gln     CGG Arg
ATT Ile    ACT Thr    AAT Asn     AGT Ser
ATC Ile    ACC Thr    AAC Asn     AGC Ser
ATA Ile    ACA Thr    AAA Lys     AGA Arg
ATG Met    ACG Thr    AAG Lys     AGG Arg
GTT Val    GCT Ala    GAT Asp     GGT Gly
GTC Val    GCC Ala    GAC Asp     GGC Gly
GTA Val    GCA Ala    GAA Glu     GGA Gly
GTG Val    GCG Ala    GAG Glu     GGG Gly

Evolution of the genetic code:
• Each codon possesses an inherent set of possible 1-step amino acid changes, precluding all others.
• As a result, some codons are inherently conservative, whereas others are more radical.
• Phe, Leu, Ile, Met, Val (16 codons with T at 2nd pos.) possess 104 possible evolutionary pathways. Only 12 (11.5%) result in moderately or radically dissimilar amino acid changes.
• Most changes (most transitions and some transversions) are nearly neutral because they result in substitution of the same or a similar amino acid.
• DNA sequences with different codon compositions have different properties, and may evolve along different evolutionary trajectories with different rates of substitution.

Evolution of the genetic code (cont.):
On average, similar codons specify similar amino acids, such that single base changes result in small chemical changes to polypeptides.
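The transition/transversion comparison can be checked directly against the standard code. The following is a minimal Python sketch, not part of the lecture: it enumerates every single-base change from each of the 61 sense codons and tallies how many leave the amino acid unchanged. The names CODE, is_transition, and tally_single_base_changes are illustrative assumptions, and a change to a stop codon is counted here as non-synonymous.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code as one-letter amino acids ("*" = stop), in the order
# TTT, TTC, TTA, TTG, TCT, ... (third base varies fastest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

PURINES = {"A", "G"}  # 2-ring bases; T and C are 1-ring pyrimidines


def is_transition(old: str, new: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return old != new and (old in PURINES) == (new in PURINES)


def tally_single_base_changes() -> dict:
    """Return {kind: [synonymous, total]} over all sense-codon point changes."""
    counts = {"transition": [0, 0], "transversion": [0, 0]}
    for codon, aa in CODE.items():
        if aa == "*":
            continue  # assumption: only changes starting from sense codons
        for pos, new in product(range(3), BASES):
            old = codon[pos]
            if new == old:
                continue
            mutant = codon[:pos] + new + codon[pos + 1:]
            kind = "transition" if is_transition(old, new) else "transversion"
            counts[kind][1] += 1
            counts[kind][0] += int(CODE[mutant] == aa)
    return counts


if __name__ == "__main__":
    for kind, (syn, total) in tally_single_base_changes().items():
        print(f"{kind}: {syn} of {total} changes are synonymous")
```

Dividing the synonymous count by the total for each class gives the fraction of silent changes, which can then be compared between transitions and transversions. This tally weights every possible change equally; it says nothing about how often each change actually occurs.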
For example, single base changes in the existing code have a smaller average effect on the polarity of amino acids (hydropathy/hydrophily) than all but 0.02% of randomly generated genetic codes with the same level of degeneracy (Haig and Hurst 1991, J. Mol. Evol. 33:412-417).
The code has evolved to minimize the severe deleterious effects of substituting hydrophilic for hydrophobic amino acids and vice versa. This is also true for other biochemical properties.
This is a good thing!

Translation-protein synthesis (Overview):
1. Protein synthesis occurs on ribosomes.
2. mRNA is translated 5’ to 3’.
3. Protein is synthesized N-terminus to C-terminus.
4. Amino acids bound to tRNAs are transported to the ribosome.
Facilitated by:
• Specific binding of amino acids to their tRNAs.
• Complementary base-pairing between the mRNA codon and the tRNA anti-codon.
• The mRNA codon recognizes the tRNA anti-codon (not the amino acid).

Translation - 4 main steps
1. Charging of tRNA
2. Initiation
3. Elongation (3 steps)
   1. Binding of the aminoacyl tRNA to the ribosome.
   2. Formation of the peptide bond.
   3. Translocation of the ribosome to the next codon.
4. Termination

Step 1-Charging of tRNA (aminoacylation)
1. Amino acids are attached to tRNAs by aminoacyl-tRNA synthetase.
2. Produces a charged tRNA (aminoacyl-tRNA).
3. Uses energy derived from ATP hydrolysis.
4. At least 20 different aminoacyl-tRNA synthetases (one for each AA).
5. tRNAs possess enzyme-specific recognition sites.
6. Sequence of events:
   1. ATP and amino acid bind to aminoacyl-tRNA synthetase to form aminoacyl-AMP + pyrophosphate (2 phosphates).
   2. tRNA binds to aminoacyl-AMP.
   3. Amino acid transfers to tRNA, displacing AMP.
   4. The amino acid is always attached by its carboxyl group to the adenine nucleotide at the 3’ end of the tRNA, forming aminoacyl-tRNA.
Fig. 6.10

Step 2-Initiation-requirements:
1. mRNA
2. Ribosome
3. Initiator tRNA (fMet tRNA in prokaryotes)
4. 3 initiation factors (IF1, IF2, IF3)
5. Mg2+
6. GTP (guanosine triphosphate)

Step 2-Initiation-steps (e.g., prokaryotes):
1. 30S ribosomal subunit + IFs/GTP bind to the AUG start codon and the Shine-Dalgarno sequence, composed of 8-12 purine-rich nucleotides upstream (e.g., AGGAGG).
2. Shine-Dalgarno sequence is complementary to the 3’ end of the 16S rRNA.
3. Initiator tRNA (fMet tRNA) binds AUG (with 30S subunit). All new prokaryote proteins begin with fMet (later removed). fMet = formylmethionine (Met modified by transformylase; AUG at all other codon positions simply codes for Met).
mRNA    5’-AUG-3’    start codon
tRNA    3’-UAC-5’    anti-codon
4. IF3 is removed and recycled.
5. IF1 & IF2 are released and GTP is hydrolyzed, driving the binding of the 50S ribosomal subunit.
6. Results in a 70S initiation complex (mRNA, 70S ribosome, fMet-tRNA).
7. The ribosome is assembled on the mRNA! See Fig. 6.15

Step 2-Initiation, differences between prokaryotes and eukaryotes:
1. Initiator Met is not modified in eukaryotes (but eukaryotes possess initiator tRNAs).
2. No Shine-Dalgarno sequence; instead, an initiation factor (eIF-4F) binds to the 5’-cap on the mature mRNA.
3. Eukaryote AUG codon is embedded in a short initiation sequence called the Kozak sequence.
4. Eukaryote poly-A tail stimulates translation by interacting with the 5’-cap/eIF-4F, forming an mRNA circle; this is facilitated by poly-A binding protein (PABP).

Step 3-Elongation of a polypeptide:
1. Binding of the aminoacyl tRNA (charged tRNA) to the ribosome.
2. Formation of the peptide bond.
3. Translocation of the ribosome to the next codon.

3-1. Binding of the aminoacyl tRNA to the ribosome.
• Ribosomes have two sites, P site (5’) and A site (3’), relative to the mRNA.
• Synthesis begins with fMet-tRNA (prokaryotes) in the P site, hydrogen-bonded to the AUG initiation codon.
• Next codon to be translated (downstream) is in the A site.
• Incoming aminoacyl-tRNA (aa-tRNA) bound to elongation factor EF-Tu + GTP binds to the A site.
• Hydrolysis of GTP releases EF-Tu, which is recycled.
• Another elongation factor, EF-Ts, removes GDP from EF-Tu, allowing EF-Tu to bind a new GTP and the next aa-tRNA.
• Cycle repeats after peptide bond formation and translocation.
Fig. 6.17

3-2. Formation of the peptide bond.
• Two aminoacyl-tRNAs are positioned in the ribosome, one in the P site (5’) and another in the A site (3’).
• Bond is cleaved between the amino acid and the tRNA in the P site.
• Peptidyl transferase (catalytic RNA molecule - ribozyme) forms a peptide bond between the freed amino acid from the P site and the aminoacyl-tRNA in the A site.
• tRNA in the A site now has the growing polypeptide attached to it (peptidyl-tRNA).
Fig. 6.18

3-3. Translocation of the ribosome to the next codon.
• Final step of the elongation cycle.
• Ribosome advances one codon on the mRNA using EF-G (prokaryotes) or EF-2 (eukaryotes) and GTP.
• Binding of a charged tRNA in the A site (3’) is blocked.
• Uncharged tRNA in the P site (5’) is released.
• Peptidyl-tRNA moves from the A site to the P site.
• Vacant A site now contains a new codon.
• A charged tRNA binds the codon in the A site through its anti-codon, and the process is repeated until a stop codon is encountered.
• Numbers and types of EFs differ between prokaryotes and eukaryotes.
• 8-10 ribosomes (a polyribosome) can simultaneously translate one mRNA.
Fig. 6.17
Fig. 6.19

Step 4-Termination of translation:
1. Signaled by a stop codon (UAA, UAG, UGA).
2. Stop codons have no corresponding tRNA.
3. Release factors (RFs) bind to the stop codon and assist the ribosome in terminating translation.
• RF1 recognizes UAA and UAG
• RF2 recognizes UAA and UGA
• RF3 stimulates termination

Four termination events are triggered by release factors:
1. Peptidyl transferase (same enzyme that forms the peptide bond) releases the polypeptide from the tRNA in the P site.
2. tRNA is released.
3.
Ribosome recycling factor (RRF) binds to the A site, and the ribosomal subunits and RF separate from the mRNA.
4. fMet or Met usually is cleaved from the polypeptide.
See Fig. 6.20
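
The reading of an mRNA described in this chapter can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a model of the ribosome: it expects an upper-case RNA string containing only A, C, G, and U, starts at the first AUG it finds (ignoring the Shine-Dalgarno or Kozak context), and reads non-overlapping triplets 5’ to 3’ until a stop codon. The names CODON_TABLE and translate are hypothetical, and the example sequences are made up.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code as one-letter amino acids ("*" = stop), in the order
# UUU, UUC, UUA, UUG, UCU, ... (third base varies fastest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA (5' to 3') into a peptide (N- to C-terminus).

    Assumption: translation starts at the first AUG in the string.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    peptide = []
    # Triplet, comma-free, non-overlapping: step 3 bases at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: no tRNA, translation ends
            break
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    wild_type = "AUGUUUCCCAAAUAA"
    print(translate(wild_type))    # MFPK  (Met-Phe-Pro-Lys)

    # A single-base insertion after the start codon shifts the reading frame,
    # so every downstream codon changes and the stop codon is no longer read.
    frameshift = "AUG" + "G" + wild_type[3:]
    print(translate(frameshift))   # MVSQI
```

The second call shows why the frameshift mutations studied by Crick et al. were so informative: one inserted base changes every downstream amino acid.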