Biology 5357
Chemistry & Physics of Biological Molecules
Examination #1
Protein Structure, Folding & Design
X-Ray Crystallography
NMR Spectroscopy
October 20, 2011

1. (10 points) A wide variety of unnatural amino acids and non-standard structures occur in biologically important molecules.

(A) Pyroglutamic acid, also known as 5-oxoproline, is an unusual amino acid derivative in which the free amino group of glutamic acid cyclizes to form a lactam. It is found in many proteins including bacteriorhodopsin. Draw the structure, with correct stereochemistry and protonation state, of this modified amino acid at pH = 7. Where in a protein sequence would you expect to find a pyroglutamic acid residue?

(B) Collagen is the most abundant protein in mammals, making up about 25-35% of the total protein content. The sequence contains repeats of Gly-Pro-X or Gly-X-Hyp (Hyp is hydroxyproline) with repeating phi-psi angles of (-51°, 153°). The structure is composed of three peptide strands, each possessing the conformation of a helix. These individual helices are twisted together into a triple-helix coiled coil. Are the individual helical strands right- or left-handed? What about the overall coiled coil?

2. (20 points) A ribbon structure and packing diagram for TNfn3, the 3rd fibronectin type III domain of human tenascin, are shown below. In the packing diagram, circled and numbered side chains are packed against each other in the protein interior, while black dots represent side chains facing toward the exterior. The dotted lines divide the protein into seven levels vertically, and indicate residues that interact within each level.

(A) Which protein structural family does TNfn3 belong to? What common motif does the structure contain? Which strands comprise the motif?

(B) A chevron plot for wild-type TNfn3 and three single-site mutants is provided below. Determine the stability of the folded state relative to the unfolded state, $\Delta G_{U-F}$, for each of the four proteins.
(C) Assume the folding of TNfn3 is 2-state. We will denote the unfolded form by U, the folded form by F, and the transition state for folding by ‡. The change in the free energy difference between the U and ‡ states for folding upon mutation is $\Delta\Delta G_{U-\ddagger} = RT \ln(k_f / k_f')$, where $k_f$ and $k_f'$ are the refolding rate constants at 0 M denaturant for the wild-type and mutant proteins, respectively. Compute $\Delta\Delta G_{U-\ddagger}$ for the mutants in (B).

(D) The Φ-value for folding is defined as: $\Phi = \Delta\Delta G_{U-\ddagger} / \Delta\Delta G_{U-F}$. What is the interpretation of a Φ-value of 0 for a mutation? What about a value of 1? Find the Φ-value for the three mutants in the chevron plot. What do these values suggest about the folding of TNfn3?

3. (10 points) Structure-based design of small molecules (e.g., ligands, inhibitors, drugs) that bind to proteins is a common application of protein structural data.

(A) Adding groups to a ligand molecule to provide “favorable” free energy interactions with the protein does not always lead to an increased binding constant. Explain. (Hint: Use an appropriate free energy cycle to illustrate protein-ligand binding.)

(B) Binding of a series of similar ligands to a protein often exhibits “enthalpy-entropy compensation”. For example, if ligand A has a more favorable ΔH of binding than ligand B, then the TΔS of binding will often be more unfavorable for ligand A than for ligand B. Suggest a physical explanation.

(C) The compensation effect described in part (B) is controversial. Critics argue it is a statistical artifact arising from the limited range of ΔG values that can be measured experimentally and the procedure used in van’t Hoff analysis of the resulting data. What do you think? Justify your answer.

4. (9 points) Provide a brief answer for each of the following X-ray crystallography questions.

(A) You are trying to solve the X-ray structure of a protein-ligand complex. Assume the structure of the native unliganded protein is already known.
What is a “difference Fourier” map (also known as an “Fo–Fc” map)? How might such a map be useful in determining the protein-ligand structure?

(B) The diagram below shows two unit cells intersected by sets of Bragg planes (also known as “Miller” planes). What are the Miller indices corresponding to each set of planes?

(C) What is the formula used to compute the crystallographic “R-factor”? Define each term in the equation. How is the R-factor used?

5. (12 points) A small protein crystallizes in an orthorhombic unit cell with dimensions a=20 Å, b=30 Å and c=40 Å. The sulfur atom of the single CYS residue is located at position a=10 Å, b=10 Å and c=10 Å within the unit cell. Diffraction data were collected on a suitable protein crystal using X-rays generated from a copper source (λ = 1.54 Å).

(A) What is the “resolution” associated with the reflection having Miller indices of (4,0,8)?

(B) What is the “phase angle” associated with the contribution of the sulfur atom to the (4,0,8) reflection?

(C) What property is most associated with the intensity of the diffraction due to an individual atom within the unit cell (for example, the CYS sulfur above)?

(D) Is the total intensity associated with the (4,0,8) reflection just the sum of the intensity contribution from each atom in the unit cell? Explain.

6. (9 points) Difference Patterson maps of the mercury acetate derivative of muscle fatty-acid binding protein are shown below. The protein crystallizes in the P2₁2₁2₁ space group with unit cell parameters a=35.4 Å, b=56.7 Å and c=72.7 Å. The symmetry operators for P2₁2₁2₁ are (x, y, z), (-x+½, -y, z+½), (x+½, -y+½, -z) and (-x, y+½, -z+½). In the Harker section with x=½, the large peaks denoted by A and A′ are at (y=0.61, z=0.5) and (y=0.39, z=0.5), respectively. In the Harker section for y=½, peaks A and A′ are located at (x=0.21, z=0) and (x=0.79, z=0).

(A) What symmetry element(s) are present in the P2₁2₁2₁ space group?
(B) Describe briefly how the difference Patterson maps shown above are computed.

(C) The mercury atom position was determined to have coordinates of (x=0.605, y=0.445, z=0.750). Demonstrate algebraically that this is a correct location for the heavy metal site.

7. (15 points) Assign the following segment of a protein using the spectra provided below.

[Figure: two sets of four spectral strips, each plotting 13C (ppm, 40-70) against 1H (ppm, 7.5-10.0), taken at 15N = 110, 118, 120 and 123 ppm.]

(A) What is the order for residues A-D? Draw lines to illustrate the sequential connectivities used in determining the order and label the resonances corresponding to residues i and i-1.

(B) What NMR interaction provides the basis for the magnetization transfers used in these experiments? Is it a through-bond or through-space interaction? What type of structural restraint can you obtain by quantitatively measuring this interaction?

8. (15 points) NMR spectroscopists like to run experiments at pH 5 to 5.5 if possible due to amide exchange rates.

(A) At what pH is amide exchange slowest? Why don’t we run solution NMR experiments at this pH?

(B) Using the graph above, what is the exchange rate, $k_{ex}$, at pH 6?

(C) A typical amide proton has a chemical shift near 8.5 ppm. Water resonates at about 4.7 ppm. What is the value of Δν on a 600 MHz spectrometer?

(D) Draw the 1H NMR spectrum showing the peak or peaks you expect to see for water and your amide if they are in slow exchange.
(E) Draw the 1H NMR spectrum showing the peak or peaks you expect to see for water and your amide if they are in fast exchange.

(F) Compare your values of Δν and $k_{ex}$ in parts (B) and (C). Is your amide proton in slow exchange or fast exchange under these conditions? Why would you not want to run an experiment under the other exchange regime?
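
The arithmetic behind parts (C) and (F) of question 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this sketch, the exchange rate of 10 s^-1 is a placeholder and not a value read from the exam's graph, and the factor-of-10 margin separating the regimes is an arbitrary choice. The one firm relation used is that 1 ppm on a 600 MHz spectrometer corresponds to 600 Hz.

```python
def shift_difference_hz(shift_a_ppm, shift_b_ppm, spectrometer_mhz):
    """Frequency separation in Hz between two resonances.

    One ppm is 1e-6 of the 1H Larmor frequency, so on a spectrometer
    running at `spectrometer_mhz` MHz, 1 ppm equals `spectrometer_mhz` Hz.
    """
    return abs(shift_a_ppm - shift_b_ppm) * spectrometer_mhz


def exchange_regime(k_ex_per_s, delta_nu_hz, margin=10.0):
    """Crude regime label from the ratio of exchange rate to shift difference.

    `margin` is an assumed, arbitrary factor: rates within a factor of
    `margin` of the shift difference are labelled "intermediate".
    """
    if k_ex_per_s * margin <= delta_nu_hz:
        return "slow"
    if k_ex_per_s >= delta_nu_hz * margin:
        return "fast"
    return "intermediate"


if __name__ == "__main__":
    # Amide at 8.5 ppm, water at 4.7 ppm, 600 MHz spectrometer (from the question).
    delta_nu = shift_difference_hz(8.5, 4.7, 600.0)
    print(f"delta_nu = {delta_nu:.0f} Hz")

    # Placeholder exchange rate in s^-1; NOT taken from the exam's graph.
    k_ex_placeholder = 10.0
    print(exchange_regime(k_ex_placeholder, delta_nu))
```

Some treatments compare the exchange rate with the angular frequency difference, 2πΔν, rather than with Δν itself; the sketch uses Δν because that is the quantity the question asks for.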