Exam Protein Engineering (8S080)
27 January 2012

The exam consists of 4 questions and 5 pages. Each subquestion yields 5 points, unless indicated otherwise. Good luck!

Question 1. 20 points

Draw the chemical structure of a peptide with the sequence EIWIT at pH 7. Also draw a graph that shows how the charge of this peptide depends on the pH between pH 2 and pH 12. Show charge on the y-axis and pH on the x-axis.

At pH 2 the overall charge will be +1 (N-terminus protonated, carboxylic acids neutral). Around pH 3-4 both the glutamic acid (E) side chain and the C-terminal carboxylic acid become deprotonated, resulting in an overall charge of -1. Around pH 8 the N-terminal amine becomes deprotonated (rendering it neutral), resulting in an overall charge of -2 for the peptide above pH 9.

Question 2. 40 points

Fluorescent labeling of proteins has many applications in molecular imaging, as well as in studies of protein folding and in intracellular imaging. Genetically encoded fluorescent groups such as GFP are often used for in vivo labeling of proteins. However, a disadvantage of GFP is its size, which could affect the function of the protein of interest. Synthetic fluorescent dyes are much smaller, but site-specific labeling of proteins with them is often difficult. The group of Peter Schultz therefore developed a new amino acid that contains the fluorescent group 7-hydroxycoumarin as its side chain (Figure 1).

Figure 1.

a. Describe how one could incorporate such a non-natural amino acid site-specifically into proteins that are recombinantly expressed in E. coli. Explain in detail which parts of the normal protein synthesis machinery have to be adjusted, and how one could do that (10 points).
Issues that should be discussed:
- Use of a 4-base codon or the amber stop codon.
- Make a tRNA that is complementary to this codon, using an orthogonal tRNA that is not recognized by the endogenous aminoacyl-tRNA synthetases.
- Develop a tRNA synthetase that recognizes this orthogonal tRNA and engineer its amino acid binding site such that (A) it recognizes the 7-hydroxycoumarin amino acid and (B) it does not catalyze the conjugation of other amino acids.
- This engineering is best done using directed evolution experiments that combine positive selection (incorporation of the non-natural amino acid suppresses an amber stop codon, yielding an essential gene product, e.g. antibiotic resistance) and negative selection (introduction of an endogenous amino acid at the amber stop codon results in the production of a toxic gene product, e.g. barnase).
- Need to alter the gene encoding GFP with a 4-base codon or amber stop codon at the place of the original tyrosine.

To test their method the researchers decide to replace the 4th amino acid in myoglobin (Ser) by the coumarin amino acid. Because the expression level of the mutant protein is quite low, the researchers decide to attach a His-tag to the protein.

b. What is a His-tag, and why did they introduce this His-tag?

A His-tag is a repeat sequence of 6 or 10 histidine residues. It is typically attached at the N- or C-terminus of a protein, where it allows easy purification of the protein using metal affinity chromatography. The His-tag has a high and specific affinity for divalent cations and thus binds readily to resins functionalized with Ni2+. Most other proteins will not bind to the resin and will flow through the column. After washing, the His-tagged protein can be eluted from the column using a high concentration of imidazole (0.1-1 M), which competes with the His-tag for binding to the Ni ions.
Affinity chromatography methods such as this typically result in large enrichments of the protein of interest and allow one to purify a protein from a large excess of other proteins.

To prove that the mutant protein indeed contained the non-natural amino acid, the isolated protein was characterized using Electrospray Ionization Mass Spectrometry (ESI-MS) (Figure 2).

c. Explain how ESI-MS works.

Students should discuss: method of ionization, typical number of charges on the protein.

d. The average molecular weight of wild-type myoglobin (containing Ser at position 4) is 18354 Da. Determine the molecular weight of the mutant protein on the basis of the mass spectrum shown below (Figure 2). Please show your calculation. Was the incorporation of the non-natural amino acid successful? (10 points)

(MW + n)/n = 1099.0
(MW + (n+1))/(n+1) = 1029.3

Two equations with two unknowns (n and MW): n = 17 -> MW = 18513 Da.

The side chains of serine (CH3O) and the 7-hydroxycoumarin amino acid (C11H9O3) differ by C10H6O2, which corresponds to a mass difference of 10 x 12 + 6 x 1 + 2 x 16 = 158 Da. So replacement of the serine would be expected to give a protein with a MW of 18354 + 158 = 18512 Da. Close enough!

Figure 2. ESI mass spectrum of the mutant protein (intensity versus m/z).

The researchers actually made 2 mutants of myoglobin. In addition to the mutant in which Ser4 was replaced, they also prepared a mutant in which His37 was replaced by the coumarin amino acid (myoglobin-TAG37) (see Figure 3 on the right). To compare the stability of each of these mutants with that of wild-type myoglobin, the researchers use circular dichroism spectroscopy, measuring the molar ellipticity at 222 nm as a function of urea concentration (see Figure 4).

d. Explain what circular dichroism spectroscopy is and how it can be used to obtain information on protein folding. Based on these measurements, what can you conclude about the effect of each mutation on the stability of protein folding in myoglobin?
CD measures the difference in absorption of left and right circularly polarized light, which is a characteristic of chiral structures. The amide bonds are in a chiral environment and therefore give rise to CD spectra between 190 and 250 nm, where the shape of the CD spectrum is characteristic of the secondary structure. CD spectroscopy can therefore be used to monitor protein folding. Figure 4 shows titration experiments in which CD is used to monitor at which concentration of urea the different Mb variants unfold. Since the curves of wt Mb and both mutant variants coincide, one can conclude that the overall stability of Mb was not affected by the incorporation of the coumarin at positions 4 and 37.

Figure 4.

The fluorescence of 7-hydroxycoumarin depends on the hydrophobicity of its surroundings, which makes it a suitable probe to monitor protein folding. The stability of each mutant can therefore also be studied by monitoring coumarin fluorescence (450 nm) as a function of urea concentration (see Figure 4).

e. Compare the unfolding curves for both mutants as determined using coumarin fluorescence with the unfolding curves determined using CD spectroscopy. Give an explanation for the different unfolding curve observed for myoglobin-TAG4 when determined using fluorescence.

Unlike CD spectroscopy, which reports on the overall secondary structure content of the entire protein, the coumarin fluorescence is sensitive to the local environment around the fluorophore. The fluorescence of coumarin in TAG4 starts to change at a lower concentration of urea, which suggests that in this case urea induces local unfolding around the N-terminus before the protein becomes globally unfolded. That this is only observed at position 4 and not at position 37 may be because the former is close to the N-terminus of the protein, which is typically more flexible than the core of a protein.
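The mass determination in Question 2 (the subquestion on the ESI-MS spectrum) solves two equations for two adjacent charge states. The sketch below shows that calculation in general form. The function names `mz` and `deconvolute` are hypothetical, and the two peak positions are synthetic values generated from the model answer's mass of 18513 Da at charges 17 and 18; they are not read from Figure 2. The proton mass of 1.00728 Da is used where the exam's equations round it to 1.

```python
PROTON_MASS = 1.00728  # Da; the exam's equations approximate this as 1


def mz(mass, charge, adduct=PROTON_MASS):
    """m/z of a protein with neutral mass `mass` carrying `charge` protons."""
    return (mass + charge * adduct) / charge


def deconvolute(mz_high, mz_low, adduct=PROTON_MASS):
    """Recover (n, mass) from two adjacent peaks of one charge-state series.

    mz_high is the peak with n charges, mz_low the neighbouring peak with
    n + 1 charges. Setting n*(mz_high - a) = (n + 1)*(mz_low - a) gives
    n = (mz_low - a) / (mz_high - mz_low).
    """
    n = round((mz_low - adduct) / (mz_high - mz_low))
    mass_from_high = n * (mz_high - adduct)
    mass_from_low = (n + 1) * (mz_low - adduct)
    # Average the two estimates to reduce the effect of peak-reading errors.
    return n, (mass_from_high + mass_from_low) / 2.0


# Synthetic example (assumption): peaks of an 18513 Da protein at 17+ and 18+.
peak_17 = mz(18513.0, 17)
peak_18 = mz(18513.0, 18)
charge, mass = deconvolute(peak_17, peak_18)
print(f"n = {charge}, MW = {mass:.1f} Da")
```

With real spectra the non-integer value of n before rounding is a useful sanity check: if it is far from an integer, the two peaks are probably not neighbours in the same charge-state series.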
Question 3. 40 points

Most native proteins with a well-defined tertiary structure have a length of at least 50-60 amino acids.

a. Why do proteins in general need at least 50 amino acids to form a well-defined folded structure?

The main driving force for protein folding is the hydrophobic effect: upon folding, hydrophobic side chains that are exposed in the unfolded state become buried in the core of the protein, whereas hydrophilic amino acids remain on the outside of the protein. This shielding of hydrophobic amino acids can only be effective if the protein chain is long enough to form a hydrophobic core that is shielded by more hydrophilic amino acids.

There is a lot of interest from the pharmaceutical industry in developing so-called mini proteins. Mini proteins consist of approx. 20 amino acids and still have a well-defined 3D structure. The transcription factor p53 is an important protein that directs a cell towards apoptosis in case of severe DNA damage or cellular stress. This protein therefore plays an important role in the prevention of uncontrolled cell division. p53 is found to be inactive in many forms of cancer, among other reasons because of its inhibition by enhanced expression of the protein MDM2. MDM2 binds to an α-helix close to the N-terminus of p53.

b. Give 2 combinatorial protein engineering approaches that one could use to develop mini proteins that bind to MDM2 and specifically interfere with the interaction between MDM2 and p53. Explain how both techniques work and discuss how selection and/or screening works in both cases.

Possible techniques: phage display, ribosome display, yeast display etc. In all these cases a library of mini proteins is created in which each mini protein is displayed on the surface of the phage, ribosome, yeast cell etc. Those library members that have an affinity for MDM2 can be selected on a surface functionalized with MDM2.
Members that bind specifically at the p53 interaction site could be selected by eluting the bound library members with p53 protein. Successful selection of the best binder requires multiple rounds of selection and amplification, with increasing stringency in later rounds. Yeast two-hybrid could also be used, but is less ideal because the competition with p53 cannot be included.

Because the interaction between p53 and MDM2 involves only a single α-helix of 15 amino acids on p53, people have also tried to synthesize a peptide with this sequence and test its ability to inhibit the interaction between p53 and MDM2 (Figure 2).

Figure 2. Structure of the complex of MDM2 with the p53 peptide.

Last year Verdine et al. published a paper (J. Am. Chem. Soc. (2007) vol. 129, pp. 2456-2457) on an original approach to stabilize an α-helical peptide in the absence of the rest of the p53 protein. First they synthesized the peptide shown below by Solid Phase Peptide Synthesis (SPPS) using Fmoc chemistry.

Ac-LSQETFSDLWKLLPEN-NH2

c. Explain in detail how SPPS works. What is the function of Fmoc? (10 points)

Issues that should be discussed are the overall strategy of SPPS (advantages for purification etc.), N-terminal protection groups, activation of carboxylic acids, deprotection of the N-terminus, protection and deprotection of amino acid side chains, and cleavage of the peptide from the resin.

Phenylalanine at position 6 is known to directly contact MDM2.

d. Which of the other amino acids in the sequence shown above do you expect to also participate in binding? Explain your answer.

Since the peptide forms an α-helix, residues at positions n ± 3 and n ± 4 (as well as n ± 7) will be on the same side of the helix. If the amino acid at position 6 faces MDM2, residues 2 (S), 3 (Q), 9 (L), 10 (W), and 13 (L) would also be able to interact.
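The same-face argument can be checked with a simple helical-wheel calculation. The sketch below assumes an ideal α-helix with 3.6 residues per turn (100 degrees per residue) and treats residues within 60 degrees of the reference residue as lying on the same face; that cutoff and the function names are illustrative assumptions, not part of the exam.

```python
SEQUENCE = "LSQETFSDLWKLLPEN"  # peptide from the exam, numbered from 1
DEGREES_PER_RESIDUE = 100.0    # ideal alpha-helix: 360 / 3.6 residues per turn


def angular_offset(position, reference):
    """Angle in degrees, in (-180, 180], of `position` relative to `reference`."""
    angle = ((position - reference) * DEGREES_PER_RESIDUE) % 360.0
    return angle - 360.0 if angle > 180.0 else angle


def same_face(reference, max_angle=60.0):
    """Residues (other than the reference) within `max_angle` of the reference."""
    hits = []
    for position, residue in enumerate(SEQUENCE, start=1):
        if position == reference:
            continue
        if abs(angular_offset(position, reference)) <= max_angle:
            hits.append(f"{residue}{position}")
    return hits


# Residues expected to share a helix face with Phe6.
print(same_face(6))
```

For Phe6 this selects S2, Q3, L9, W10 and L13, the same residues listed in the answer above. A stricter cutoff (for example 40 degrees) would keep only S2, W10 and L13, so the list depends on how wide one takes the binding face to be.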
Since protein-protein interactions are dominated by hydrophobic interactions, L9, W10, and L13 are the most likely residues to also contribute to binding.

The peptide was found to bind less strongly to MDM2 than the full p53 protein. One reason for this is that the peptide does not form a stable α-helix in solution but instead forms a random coil. Verdine et al. tried to stabilize the helix by introducing 2 non-natural amino acids with double bonds in their side chains, Fmoc-R5 and Fmoc-S5. In the presence of a Ru-catalyst these groups can be coupled via the so-called metathesis reaction (Figure 3).

Figure 3.

e. Which amino acids of the original sequence would you replace by these two non-natural amino acids? Explain!

Since the staple should not interfere with the binding to MDM2, it would be best to have it on the opposite face of the helix, e.g. D8 and K12, or E4 and K12.

Question 4. 10 points

True or not true? Please explain your answer!

a. The active form of a protein is also the thermodynamically most stable form of the protein.

Most of the time true, but not always. An example would be an enzyme that is folded as an inactive pre-protein, which subsequently becomes activated after removal of an inhibiting N-terminal peptide. One could also argue here that amyloid aggregates are thermodynamically more stable than the folded protein monomer.

b. The formation of hydrogen bonds between NH and C=O groups of the main chain is the most important driving force for protein folding.

Not true. The main driving force is the shielding of hydrophobic amino acid side chains, which results in a favorable entropy gain for the water molecules that are ordered around hydrophobic groups in the unfolded state. It is important that the NH and C=O groups form hydrogen bonds in the folded protein as well, but only to compensate for the hydrogen bonds that they can easily form with water in the unfolded state.
1-letter code of amino acids

A = alanine
C = cysteine
D = aspartate
E = glutamate
F = phenylalanine
G = glycine
H = histidine
I = isoleucine
K = lysine
L = leucine
M = methionine
N = asparagine
P = proline
Q = glutamine
R = arginine
S = serine
T = threonine
V = valine
W = tryptophan
Y = tyrosine
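As a numerical companion to the charge profile described in Question 1, the sketch below estimates the net charge of EIWIT from the Henderson-Hasselbalch equation. The three pKa values are assumed textbook-style numbers that are not given in the exam, and the side chains of I, W and T are treated as non-ionizable between pH 2 and pH 12.

```python
# Assumed pKa values (not from the exam); real values depend on the peptide.
PKA_N_TERMINUS = 8.0      # alpha-amino group, +1 when protonated
PKA_C_TERMINUS = 3.5      # alpha-carboxyl group, -1 when deprotonated
PKA_GLU_SIDE_CHAIN = 4.2  # Glu side-chain carboxyl, -1 when deprotonated


def fraction_protonated(ph, pka):
    """Henderson-Hasselbalch: fraction of a group in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def net_charge(ph):
    """Approximate net charge of the peptide EIWIT at a given pH."""
    positive = fraction_protonated(ph, PKA_N_TERMINUS)
    negative = (1.0 - fraction_protonated(ph, PKA_C_TERMINUS)) + (
        1.0 - fraction_protonated(ph, PKA_GLU_SIDE_CHAIN)
    )
    return positive - negative


# Tabulate the charge over the pH range asked for in the question.
for ph in range(2, 13):
    print(f"pH {ph:2d}: net charge {net_charge(ph):+.2f}")
```

Under these assumptions the curve starts near +1 at pH 2, passes through a plateau near -1 between roughly pH 6 and pH 7, and approaches -2 above pH 10, in line with the qualitative answer to Question 1.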