Prediction of protein functional states by multi-resolution protein modeling
Cecilia Clementi
Department of Chemistry, Rice University, Houston, Texas

The challenges in molecular biophysics: the “middle way”, in between a few small molecules and bulk
  One water molecule: quantum chemistry gives molecular orbitals
  …in between… (large water clusters, wet/dry interfaces, interaction with solutes): what are the relevant variables? What is the intrinsic dimensionality?
  Bulk water: thermodynamics describes the system
  Empirical approach / Theoretical approach
C.Clementi, Curr. Opin. Struct. Biol. 2008, vol.18(1), 10-15

Physicists and biochemists often perceive molecular structure and function differently
  Example: representation of a heme group
  Biochemist view / Physicist view [figure labels: protoporphyrin ring, central iron, 1 nm]
  Biophysics should reconcile the two!

Outline
  Our toolbox to explore protein landscapes at multiple resolutions
  Application to characterize a protein functional state

Photoactive Yellow Protein (PYP)
PYP transforms light into a biological signal
  PYP is believed to be responsible for H. halophila's ability to respond to blue light. How?
  PYP is interesting to study because:
    It is the prototype for the PAS domain (a ubiquitous domain in signaling proteins)
    Its photochemistry is directly analogous to that of rhodopsin

Basic outline of the photocycle
  PYP’s native state.
  We know the structure of these states.
  But the structure of this state is unknown. How?

The signaling state is elusive:
  It’s difficult to observe experimentally (because it partially unfolds)
  It’s difficult to predict computationally (broad range of time scales)
PYP’s signaling state?
The signaling process can be characterized using a multiscale approach:
  1) Coarse graining
  2) All-atom reconstruction
  3) All-atom / quantum calculations

The signaling state ensemble can be characterized using a multiscale approach: 1) Coarse graining
P.Das, S.Matysiak & C.Clementi PNAS 102, 10141-10146 (2005)

What’s the role of a protein coarse-grained model?
  Simplified models are widely used to test general ideas and principles on toy systems
  Recently they have been applied to make predictions on real protein systems
  To what extent can protein coarse-grained models be used as predictive tools on real systems?
C.Clementi, Curr. Opin. Struct. Biol. 2008, vol.18(1), 10-15

Building a coarse-grained protein model
A realistic coarse-grained protein model
  1 bead per residue (Cα model)
  20 amino acid “colors”
P.Das, S.Matysiak & C.Clementi PNAS 102, 10141-10146 (2005)

We “photoactivate” the coarse-grained model by perturbing the coarse-grained force field at the chromophore.
  [Panels: Dark PYP, Photoactivated PYP]
  The free energy is computed as a function of the “Diffusion Coordinates” [“Determination of reaction coordinates via locally scaled diffusion map”, M.A.Rohrdanz, W.Zheng, M.Maggioni & C.Clementi, J.Chem.Phys. 134, 124116 (2011)]
  This perturbation has a strong effect on the free energy landscape, creating an on-pathway intermediate.
P.J. Ledbetter, B.P. Lambeth & C.Clementi, unpublished results (2011)

It is interesting to compare the results of this model (DMC) to a simpler model (GO)
  The difference is the inclusion of non-native interactions in the DMC model
  [Panels: GO model, DMC model; Dark PYP, Photoactivated PYP]
P.J. Ledbetter, B.P.
Lambeth & C.Clementi, unpublished results (2011)

Comparison with available experimental data (on Δ25)
  [Plot axis: Fluctuations (Å)]
  Experimental data from Bernard et al., Structure 13, 953–962 (2005)
P.J. Ledbetter, B.P. Lambeth & C.Clementi, unpublished results (2011)

How much can we push a prediction from a protein coarse-grained model?
How accurate is the prediction? How can we test it quantitatively?
  [Energy diagram. Axis: Energy.
   States: folded state ensemble, chromophore in trans configuration; folded state ensemble, chromophore in cis configuration; activated state, chromophore in cis configuration.
   Minima: folded minimum; unfolded minimum; “activated” minimum?
   Transitions: photo-isomerization; protein “quake”; recovery.]

The signaling state ensemble can be characterized using a multiscale approach: 2) All-atom reconstruction
  Start from only C-alpha atoms
  Reconstruct backbone atoms
  Reconstruct side-chain atoms
  Optimize structure (locally and globally)
A.P.Heath, L.E.Kavraki & C.Clementi, Proteins 2007, 68, 646-661

An example rotational isomer (rotamer)
  Different rotamers can be obtained by twisting around the bonds of the residue.
  [Figure labels: Lysine; alpha-carbon; along backbone…]
P.J. Ledbetter, B.P. Lambeth & C.Clementi, unpublished results (2011)

Problem: photo-isomerization changes the electronic structure of the chromophore
Solution: use quantum chemistry to correct the force field (collaboration with Gustavo Scuseria’s group at Rice)

The signaling state ensemble can be characterized using a multiscale approach: 3) All-atom/quantum computations
  The chromophore is responsible for triggering the conformational change, but there are no standard force fields for this residue.
  The force field needs to be derived from quantum chemical computations for the cis, trans, and protonated forms.
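As a rough illustration of what deriving force-field terms from quantum chemical data can involve, the sketch below fits a cosine-series torsion term to reference energies by weighted linear least squares. Everything in it is an assumption made for illustration: the function names `fit_torsion` and `torsion_energy`, the Amber-style functional form with zero phase, the number of terms, and the synthetic reference energies that stand in for real quantum chemical results. It is not the procedure used in this work.

```python
import numpy as np


def torsion_energy(phi, offset, v):
    """Evaluate E(phi) = offset + sum_n (V_n / 2) * (1 + cos(n * phi)).

    phi is in radians; v holds the amplitudes V_1, V_2, ... (hypothetical form,
    phases fixed to zero so the model stays linear in the parameters).
    """
    phi = np.asarray(phi, dtype=float)
    energy = np.full_like(phi, offset)
    for n, v_n in enumerate(v, start=1):
        energy += 0.5 * v_n * (1.0 + np.cos(n * phi))
    return energy


def fit_torsion(phi, e_ref, n_terms=3, weights=None):
    """Weighted linear least-squares fit of the torsion series to e_ref.

    weights are per-structure statistical weights (all ones if omitted).
    Returns (offset, amplitudes).
    """
    phi = np.asarray(phi, dtype=float)
    e_ref = np.asarray(e_ref, dtype=float)
    if weights is None:
        w = np.ones_like(phi)
    else:
        w = np.asarray(weights, dtype=float)

    # Design matrix: a constant column plus one column per cosine term.
    columns = [np.ones_like(phi)]
    for n in range(1, n_terms + 1):
        columns.append(0.5 * (1.0 + np.cos(n * phi)))
    design = np.column_stack(columns)

    # Scaling rows by sqrt(w) turns weighted least squares into ordinary least squares.
    sqrt_w = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design * sqrt_w[:, None], e_ref * sqrt_w, rcond=None)
    return coef[0], coef[1:]


if __name__ == "__main__":
    # Synthetic "reference" energies (kcal/mol) generated from known amplitudes.
    angles = np.linspace(-np.pi, np.pi, 37)
    reference = torsion_energy(angles, 0.2, [1.0, 6.0, 0.5])
    offset, amplitudes = fit_torsion(angles, reference, n_terms=3)
    print("offset:", offset)
    print("amplitudes:", amplitudes)
```

Because the phases are fixed, the fit is linear and needs no iterative optimizer. Passing non-uniform `weights` lets some structures count more than others, for example structures that are visited more often in a simulation.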
Existing parameters fail to reproduce the isomerization energy
  [Panels: Trans (ground state) results; Cis results]
  Amber predicts ~14 kcal/mol, while PBE1PBE/6-31++G** predicts ~6 kcal/mol
P.J. Ledbetter & C.Clementi, unpublished results (2011)

Parameter Fitting Procedure
  Cycle: MD Simulations, Cluster, Quantum Calculations, New Parameters
  Goal: Converge to parameters which approximate the molecule’s free energy
P.J. Ledbetter & C.Clementi, unpublished results (2011)

New Parameter Fitting Procedure: MD Simulations
  What: With initial parameters, run very long molecular dynamics simulations.
  Goal: Generate an ensemble large enough for statistical properties to converge.

New Parameter Fitting Procedure: Cluster
  What: Select subensembles by clustering the MD trajectory, using each cluster’s size as an estimate of its free energy.
  Goal: Choose a few structures on which to calculate the quantum chemical energy.

New Parameter Fitting Procedure: Quantum Calculations
  What: Use Gaussian to calculate the quantum chemical energy of the molecule (PBE1PBE/6-311G**).
  Goal: Calculate the energy of the molecules in a reliable way.
P.J. Ledbetter & C.Clementi, unpublished results (2011)

New Parameter Fitting Procedure: New Parameters
  What: Vary the parameters in a least-squares fit to the energies of the structures, weighted by the free energy estimate.
  If the parameters are realistic enough, stop.
P.J. Ledbetter & C.Clementi, unpublished results (2011)

New Parameter Fitting Procedure: Results
P.J. Ledbetter & C.Clementi, unpublished results (2011)

The signaling process can be characterized using a multiscale approach:
  1) Coarse graining
  2) All-atom reconstruction: all-atom structures of the 25 most populated intermediate structures
  3) QM parameter fitting for the chromophore force field
  Diffusion dynamics from the 25 reconstructed structures
  Lowest energy structures are solvated
P.J. Ledbetter, B.P.
Lambeth & C.Clementi, unpublished results (2011)

Structural Analysis of the Results
  [Panels: Native (dark) state; Photoactivated ensemble]
P.J. Ledbetter, B.P. Lambeth & C.Clementi, unpublished results (2011)

How accurate is the prediction? How can we test it quantitatively?
  [Energy diagram labels:
   pG: folded state ensemble, chromophore in trans configuration
   pR: folded state ensemble, chromophore in cis configuration
   pB: activated state, chromophore in cis configuration
   comparable energy
   Conformational entropy in pB much larger than pR
   Minima: folded minimum; unfolded minimum; “activated” minimum?]
  Next: design experimental tests (collaboration with Thomas Kiefhaber)
P.J. Ledbetter, B.P. Lambeth & C.Clementi, unpublished results (2011)

Cecilia Clementi’s research group
http://leonardo.rice.edu/~cecilia/research/

Clementi’s group:
  Dr. Mary Rohrdanz    (Rice Chemistry)
  Paul Ledbetter       (Rice Applied Physics)
  Brad Lambeth         (Rice Chem. Eng.)
  Wenwei Zheng         (Rice Chemistry)
  Amarda Shehu         (now: GMU)
  Payel Das            (now: IBM Watson)
  Silvina Matysiak     (now: U Maryland)

Collaborators:
  Prof. Kathy Matthews     (Rice - Biochemistry)
  Prof. Lydia Kavraki      (Rice - Computer Science)
  Prof. Gustavo Scuseria   (Rice - Chemistry)
  Prof. Kurt Kremer        (MPIP Mainz)
  Prof. Mauro Maggioni     (Duke - Math)

Graduate Students and Postdoctoral Positions Available

Funding:
  NSF (CAREER CHE-0349303, CCF-0523908, CNS-0454333)
  Texas Advanced Technology Program (003604-0010-2003)
  Norman Hackerman Welch Young Investigator Award
  Welch Foundation C-1570
  Hamill Innovation Award