Structural biology should be computable!
• Protein structures are determined by their amino acid sequences
• Protein structures and complexes correspond to global free energy minima
• Fundamental test of understanding and huge practical relevance

ROSETTA: model of the energetics of inter- and intramolecular interactions
• Prediction (given sequence, optimize structure): ab initio structure prediction, protein-protein docking
• Design (given structure, optimize sequence): protein design, interface design
• Applied to: protein structure, protein-protein interactions

Model of macromolecular interactions
• Removal of single methyl groups can destabilize proteins --> jigsaw puzzle-like packing is crucial
• Buried polar atoms are almost always hydrogen bonded --> treat hydrogen bonding as accurately as possible
• Exposed charge substitutions generally have little effect --> damp long-range electrostatics
• Focus on short-range interactions!

Conformational sampling
1. Random start
2. Low-resolution Monte Carlo search (integrate out side-chain degrees of freedom)
3. High-resolution refinement with full atomic detail (×10^5)
4. Select lowest energy models
5. Predictions
Jeff Gray (Hopkins), Ora Furman (Hebrew University), Chu Wang

Docking Low-Resolution Search
• Monte Carlo search
• Rigid-body translations and rotations
• Residue-scale interaction potentials
Protein representation: backbone atoms + average centroids
[Figure: backbone N and O atoms with side-chain centroids]

Docking Protocol (Target 12: cohesin-dockerin; unbound-bound)
[Figure: energy plots. Panel 1, Initial Search: energy vs. RMSD to arbitrary starting structure (Å). Panel 2, Refinement: energy vs. RMSD to starting structure of refinement.]

Side Chain Flexibility
Target 12: cohesin-dockerin
• 0.46 Å interface rmsd
• 87% native contacts
• 6% wrong contacts
Ora Furman, Chu Wang
[Figure: cohesin-dockerin complex. Red, orange: X-ray; blue: model; green: unbound cohesin]

Details of T12 Interface
[Figure: dockerin-cohesin interface; labeled residues R53, S45, D39, L22, Y74, N37, L83, E86. Red, orange: X-ray; blue: model]

Accurate Side Chain Modeling
Target 15: immunity protein D and colicin D tRNase
• 0.23 Å interface rmsd
Science 310, 638-642
[Figure: colicin and immunity protein. Red, orange: X-ray; blue: model]

Details of T15 Interface
[Figure: labeled residues, colicin: H611, K610, K607, K608; immunity protein: E56, E68, D61, E59. Red, orange: X-ray; blue: model]

Modeling Backbone Movement
Target 20: HemK-RF1
• 2.34 Å interface rmsd
• 36% native contacts
Chu Wang
[Figure: RF1 and HemK, with the loop carrying the methylated Gln. Red, orange: X-ray; blue: model; green: unbound]

CASP6 T0198: PhoU domain repeat
Phil Bradley
Model 2: 4 Å over 210 residues (Model 1: 3.94 Å over 198)

CASP6 T0212
Model 2: 3.97 Å over 109 residues (Model 1: 4.0 Å over 104)

T0281 ab initio prediction (1.59 Å)
Phil Bradley
1r69, 1ubq, 2REB
Science 309, 1868-1871

Boinc.bakerlab.org/rosetta
David Kim
High resolution ab initio structure prediction from single sequences by enhanced diversity “barcode” directed sampling
Outreach!

High Resolution Refinement of CASP target 199 - remote homology model
Bin Qian
Calculations performed on SDSC TeraGrid clusters

High Resolution NMR Model Refinement
Vatson Raman
Disulfide Bond Formation Protein
[Figure: Blue: X-ray structure; Green: NMR models; Red: Rosetta models]

Computing Structural Biology
• Free energy function is reasonable => computing simple protein structures and interactions now appears to be within reach
• Implications for structural genomics?
• More CPU power => more accurate predictions for larger proteins
• For larger complexes, experimental data essential (low-resolution electron density!)
• Symmetry helps!

Modeling accuracy also illustrated by structures of designed proteins
Top7 X-ray structure has correct topology. Backbone RMSD to design only 1.2 Å!!
C-α Backbone Overlay
[Figure: Red: X-ray structure; Blue: design model]
Brian Kuhlman, Gautam Dantas; Science 302, 1364-8

Design of novel H bond network
Lukasz Joachimiak
[Figure: design vs. X-ray structure at the interface; labeled residues Q51, Y35, Q169, Q180, G177]

Design of new protein functions
• Design of new protein-protein interactions
• Design of enzymes catalyzing novel chemical reactions
• Design of new transcription factor and endonuclease specificities
• Design of HIV vaccine

HIV vaccine design
• Present HIV coat protein epitopes, locked into the conformation observed in complexes with neutralizing antibodies, using designed scaffolds
• Preliminary results: designed proteins fold and bind neutralizing antibodies (5 nM affinity). One design confirmed crystallographically.
Bill Schief in collaboration with Peter Kwong

Computational design of non-HIV immunogens to elicit broadly-neutralizing antibodies
Bill Schief
[Figure panels: crystal structure of Mab 2F5 in complex with its HIV epitope; model of non-HIV scaffold-epitope (red)]

Redesign of DNA cleavage specificity of MsoI homing endonuclease using ROSETTA
Justin Ashworth, Jim Havranek
Nature, in press
[Figure panels: WT-WT, Design-WT, WT-Design, Design-Design]

Specific DNA cleavage by designed nuclease
[Figure: cleavage assays for wild-type I-Mso and the design against wild-type and design cleavage targets; nuclease dilution series 1, 1/2, 1/4, ... 1/2^n ... 1/2^9; 5 µM nuclease]

Acknowledgements
Protein structure prediction
• Phil Bradley (MIT)
• Rhiju Das
• Lars Marlstrom
• Bin Qian
• Vatson Raman
Protein-protein docking
• Ora Furman (Hebrew University)
• Chu Wang
• Jeff Gray (Johns Hopkins)
Design
• Brian Kuhlman (UNC)
• Gautam Dantas
• Justin Ashworth
• Jim Havranek

Robetta.bakerlab.org prediction and design server: David Kim (domain parsing, boinc) and Dylan Chivian
Rosetta software freely available for academic use
Boinc.bakerlab.org/rosetta
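
Sketch: low-resolution docking search
The docking slides describe a Monte Carlo search over rigid-body translations and rotations, scored with residue-scale potentials, repeated from many random starts before the lowest energy models are selected. The Python below is a minimal toy sketch of that idea, not the Rosetta implementation. Everything in it is an assumption made for illustration: the 2D point "centroids", the contact/clash score and its weights, the step sizes, the temperature kT, and the function names transform, score, monte_carlo_dock and dock_many.

```python
import math
import random

def transform(points, dx, dy, theta):
    """Rotate 2D points by theta about the origin, then translate by (dx, dy)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]

def score(receptor, ligand, contact=1.5, clash=0.8):
    """Toy residue-scale score (hypothetical weights): -1 per contact, +10 per clash."""
    total = 0.0
    for rx, ry in receptor:
        for lx, ly in ligand:
            d = math.hypot(rx - lx, ry - ly)
            if d < clash:
                total += 10.0
            elif d < contact:
                total -= 1.0
    return total

def monte_carlo_dock(receptor, ligand, steps=2000, kT=0.5, seed=0):
    """One trajectory: random start, then Metropolis rigid-body moves."""
    rng = random.Random(seed)
    # Random start: random position and orientation of the ligand.
    pose = (rng.uniform(-5, 5), rng.uniform(-5, 5), rng.uniform(0, 2 * math.pi))
    energy = score(receptor, transform(ligand, *pose))
    best_pose, best_energy = pose, energy
    for _ in range(steps):
        # Small random rigid-body perturbation (translation and rotation).
        trial = (pose[0] + rng.gauss(0, 0.5),
                 pose[1] + rng.gauss(0, 0.5),
                 pose[2] + rng.gauss(0, 0.2))
        trial_energy = score(receptor, transform(ligand, *trial))
        delta = trial_energy - energy
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            pose, energy = trial, trial_energy
            if energy < best_energy:
                best_pose, best_energy = pose, energy
    return best_pose, best_energy

def dock_many(receptor, ligand, n_starts=20, **kwargs):
    """Run independent trajectories (seeds 0..n_starts-1) and keep the lowest energy."""
    runs = [monte_carlo_dock(receptor, ligand, seed=s, **kwargs)
            for s in range(n_starts)]
    return min(runs, key=lambda run: run[1])

if __name__ == "__main__":
    receptor = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # toy centroid positions
    ligand = [(0.0, 0.0), (1.0, 0.0)]
    pose, energy = dock_many(receptor, ligand, n_starts=10, steps=1000)
    print("lowest-energy pose (dx, dy, theta):", pose, "energy:", energy)
```

In this sketch the per-trajectory random seed stands in for the random start, and dock_many stands in for the "select lowest energy models" step; a full-atom refinement stage, as on the conformational sampling slide, is deliberately left out.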