Laboratory #9B/10A: Molecular genetics simulations¹

Objectives:
This is partly an observational lab and partly an experimental lab, in which you will run simulations that help you understand the genetic code. When you finish this lab, you should:
1. Have an understanding of the experiments that deciphered the genetic code.
2. Understand how changes in DNA sequence produce changes in the primary structure of a protein.
3. Understand how hypotheses regarding genetically based diseases can be tested using DNA sequencing and protein analysis.

Background:
Even after Watson and Crick had proposed a molecular structure for DNA, it remained a mystery how 4 nucleotides could code for 20 amino acids. In the decades prior to the development of modern mechanized molecular analysis, such as amino acid sequencing and DNA sequencing, researchers had to use other experimental techniques to piece the puzzle together. Even before these experiments, scientists were fairly certain that the genetic code would involve sequences of three nucleotides for each amino acid. You can follow their logic by answering these questions:

a) If the code consisted of 2 nucleotides per amino acid, how many different amino acids could be uniquely identified? Is there any redundancy (more unique nucleotide sequences than amino acids)? Hint: to calculate the number of unique “words” in a particular alphabet, use the equation N = l^x (l raised to the power x), where N is the number of words, l is the total number of letters in the alphabet, and x is the number of letters per word.
b) If the code consisted of 4 nucleotides per amino acid, how many different amino acids could be uniquely identified? Is there any redundancy?
c) If the code consisted of 3 nucleotides per amino acid, how many different amino acids could be uniquely identified? Is there any redundancy?

Although scientists were fairly certain that the code would consist of 3-nucleotide codons, the arrangement of those codons was unknown.
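
The counting rule given in the hint can be sketched in a few lines of Python. This is only an illustration and is not part of the BiologyLabsOnline simulation. The function name count_words is made up for this sketch, and the example alphabets are deliberately not the nucleotide alphabet, so questions a) through c) are still yours to work out.

```python
def count_words(letters, word_length):
    """Number of distinct words: N = l^x (l letters, x letters per word)."""
    return letters ** word_length


# A two-letter alphabet (say, dots and dashes) with three-letter words:
print(count_words(2, 3))   # 8

# The 26-letter English alphabet with two-letter words:
print(count_words(26, 2))  # 676
```

A code has redundancy whenever N is larger than the number of items that need a unique word, which for the genetic code means comparing N with the 20 amino acids.
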
In particular, would the codons be overlapping or non-overlapping? An overlapping code would mean that if codon 1 started at position 1 in the DNA, codon 2 would start at position 2 or 3. A non-overlapping code would mean that codon 2 would start at DNA position 4. And, not least, scientists still did not know which 3-nucleotide “words” would correspond to each of the 20 amino acids. Our first two exercises will address these questions.

In the early 1960s, Nirenberg published his discovery that cell-free extracts of the bacterium E. coli could translate synthetic RNA into small proteins, or polypeptides. Initially, synthetic RNA could be “composed” only randomly, but later developments allowed researchers to synthesize RNA with specific nucleotide sequences. By providing specific RNA nucleotide sequences to the cell-free extracts, researchers were able to decipher the genetic code completely. We will first simulate these experiments.

After the code was deciphered, the question remained of whether and how small changes in a gene, such as single nucleotide substitutions, could affect the protein. We will explore this question in the second exercise of the lab.

¹ Modified from BiologyLabsOnline, Translation and Hemoglobin exercises.

Exercise I:
General instructions: Go to the Translation Lab in BiologyLabsOnline, and click on “Start Experiment”. For each of the four bottles of ribonucleotides, you can click on an arrow to select a nucleotide. If you then click Make RNA, an RNA macromolecule will be synthesized using the ribonucleotides in the sequence you have selected. Clicking on Translation Mix will produce the amino acid sequence(s) synthesized from your RNA sequence. We have found that the program may crash if you write to the simulation notebook, so we recommend that you record each RNA sequence and the corresponding protein sequence(s) in your laboratory notebook instead.

Part a: Simple mononucleotide sequences.
The simplest codons will be those involving only one ribonucleotide, such as UUU. Synthesize each of these mononucleotide RNA molecules and determine the corresponding amino acids for these codons.

Can you use these results to determine if the codons are overlapping or non-overlapping?

Part b: Simple dinucleotide sequences.
Synthesize an RNA molecule with two alternating nucleotides (“XYXYXY”), and translate it to determine the amino acid sequence.

Your nucleotide sequence here_________________
Your protein sequence here____________________

Now reverse the RNA (“YXYXYX”). Do the results change?

Can you use these results to determine whether the codons are overlapping or non-overlapping? Hint: what would the codons be if the code were overlapping? What if the code were non-overlapping?

Part c: Are codons overlapping? A test using trinucleotide sequences.
Make an RNA molecule with three repeating nucleotides (“XYZXYZ”). Before synthesizing the protein, consider the possible outcomes.

Question: Are codons overlapping?

Hypothesis 1: Overlapping codons in RNA are “read” to produce an amino acid sequence.
Prediction 1:
Your nucleotide sequence_______________________
The successive codons if read in overlapping fashion:
How many different amino acids will be incorporated into a single protein?

Hypothesis 2: Non-overlapping codons in RNA are “read” to produce an amino acid sequence.
Prediction 2:
Your nucleotide sequence_______________________
The successive codons if read in non-overlapping fashion:
How many different amino acids will be incorporated into a single protein?

Repeat this experiment with three more combinations of ribonucleotides.

Results: Fill in your results in this table.

Experimental RNA sequences        Protein sequence(s) produced
A)
B)
C)
D)

Evaluation and inferences: Use your results to answer the following questions:
1. Do your results cause you to reject either the “overlapping codon” or the “non-overlapping codon” hypothesis? Explain your reasoning.
2. From parts a and b, which codons have you now linked to a specific amino acid?
3. You likely got qualitatively different results in parts b and c, with different numbers of resulting protein sequences for each RNA sequence. What does this tell you about how translation is initiated in the cell-free synthesis?
4. Did any of your sequences from part c produce a different number of proteins than the others? Why might this be?

Exercise II: The effect of changes in nucleotide sequence on protein structure and function.
Modern molecular techniques combined with medical science have greatly enhanced our understanding of the bases for many genetically linked diseases. Hemoglobin is the protein that, bound to iron in 4-protein complexes, binds oxygen and carries it from the lungs to the rest of the body. The complex is so large that its structure largely determines the shape of the red blood cells in which it is synthesized. In this section, we will explore how changes in nucleotide sequence (mutations) in the gene for hemoglobin B alter the structure and function of the protein.

Before you start: Because this laboratory alternates with a wet lab, some of you may not yet have discussed the “Central Dogma” of molecular genetics: that DNA is transcribed to RNA, which is translated to a protein. If this is the case, or if you are still confused by these concepts, please take some time right now to work through a transcription and translation simulation at the University of Utah Genetics web site, http://gslc.genetics.utah.edu/units/basics/index.cfm

Part a: Gel electrophoresis
Gel electrophoresis is a technique that separates proteins by size and shape. Longer, more bulky, or highly charged proteins move through the gel more slowly. In a study, a different sample is placed into each well on the gel, and an external standard of “normal protein” is included in one well for comparison. Go to the Hemoglobin lab in BiologyLabsOnline. There are a variety of “cases” listed.
Click on “Gel Electrophoresis” and view several different cases. Choose two cases for further study.

Part b: Observations
Explore the “patient history”, “blood samples”, and “microscope samples” for one case. Briefly note your observations below. Finally, click on “peptide sequence”. Search for differences in sequence by scanning or by clicking on “find difference”. Note the difference below, and then repeat for a second patient.

                                        Case 1:            Case 2:
family history (do any relatives
have similar conditions?):
gel electrophoresis:
blood color:
cell structure:
protein sequence:

Part c: Testing hypotheses.
Based upon your observations and the genetic code, develop an explicit hypothesis concerning how the gene for each patient’s hemoglobin might differ from normal. In particular, note what changes in the gene you expect to find and where (which codon) you expect to find them. Note that you will be working with DNA sequences, not RNA as in Exercise I.

Hypothesis, case 1:

Hypothesis, case 2:

Use the “Edit DNA” function of the lab to test your hypothesis. The general protocol is as follows:
1. Find the beginning of the coding region of the gene by typing in the DNA sequence that codes for “start” (methionine) and pressing Return.
2. Note the location of “start”, and then click on “bracket codons”. The gene is now bracketed into 3-base codons.
3. Calculate which base location(s) will correspond to your predicted differences for case 1, and scroll or jump to that number (jump by typing in the base number and pressing Return).
4. Make the change(s) that you predicted by clicking on a base, then clicking on the base you wish to change it to in the “change to” box. Repeat until you have made all of your desired changes.
5. Click “translate”. The patient’s protein sequence, the normal sequence, and the protein sequence produced by your experimental DNA will all appear.
6. Find the appropriate amino acid location, and note your results.
Repeat for the second patient.
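
The bookkeeping in the protocol above (locating “start”, bracketing codons, working out which base to change, and translating) can be sketched in Python. This is a rough illustration under stated assumptions, not a description of how the BiologyLabsOnline program works internally. All function names are made up for this sketch, the sequence is an invented toy example rather than the hemoglobin gene, bases are numbered from 1, the start codon is counted as codon 1, and amino acids are written with one-letter codes. Check how the lab itself numbers bases and amino acids before relying on the arithmetic.

```python
from itertools import product

# Standard genetic code written for DNA (T instead of U); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def find_start(dna):
    """1-based position of the first ATG ("start"), or 0 if there is none."""
    return dna.find("ATG") + 1


def bracket_codons(dna, start):
    """Split the sequence into 3-base codons, beginning at the start position."""
    coding = dna[start - 1:]
    return [coding[i:i + 3] for i in range(0, len(coding) - 2, 3)]


def codon_positions(start, codon_number):
    """1-based positions of a codon's three bases (assumes start codon = codon 1)."""
    first = start + 3 * (codon_number - 1)
    return (first, first + 1, first + 2)


def change_base(dna, position, new_base):
    """Return a copy of the sequence with the base at a 1-based position replaced."""
    return dna[:position - 1] + new_base + dna[position:]


def translate(dna, start):
    """Translate from the start position until a stop codon or the end."""
    protein = []
    for codon in bracket_codons(dna, start):
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def find_differences(normal, altered):
    """List (position, normal residue, altered residue) where two proteins differ."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(normal, altered), start=1) if a != b]


# Toy sequence invented for illustration; it is NOT the hemoglobin gene.
toy_dna = "GGCATGAAAGAATGGTAACC"
start = find_start(toy_dna)                    # 4
target = codon_positions(start, 2)             # (7, 8, 9)
edited = change_base(toy_dna, target[1], "C")  # change the middle base of codon 2
print(translate(toy_dna, start))               # MKEW
print(translate(edited, start))                # MTEW
print(find_differences(translate(toy_dna, start), translate(edited, start)))  # [(2, 'K', 'T')]
```

The same arithmetic applies to your own hypotheses: the first base of codon n sits 3 × (n − 1) bases after the first base of “start”.
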
Results: What changes did your alterations of the DNA sequence generate in the protein sequence?

Evaluation: For each patient and corresponding hypothesis, did you correctly predict the changes in gene sequence that produced the differences in protein sequence (in other words, do your results cause you to reject your hypothesis)?