Systematic calculation and characterization of local motions in allosteric proteins

Michael D. Daily (1) and Jeffrey J. Gray (1,2)
(1) Program in Molecular & Computational Biophysics and (2) Chemical & Biomolecular Engineering, Johns Hopkins University

Abstract

We have exploited the large number of known allosteric crystal structures to systematically characterize local conformational changes in allosteric proteins, with the goal of improving the theoretical understanding of the structural basis of protein allostery at the atomic scale. We have compiled a set of 51 pairs of known inactive and active allosteric protein structures from the Protein Data Bank. For each protein, we have measured changes in dihedral angles and Cartesian displacements for backbones and side chains, as well as rearrangements in residue-residue contacts. Several examples show that these automated calculations reveal functionally interesting pictures of local motions that corroborate many features previously observed manually by crystallographers. In addition, statistical analysis of the calculated motions shows that, on average, 20 percent of residues differ significantly between the two crystal structures of an allosteric protein, in addition to possible changes in dynamics. Allosteric motion is more probable in weakly constrained local structural environments such as loops and solvent-exposed regions than in strongly constrained environments such as helices, strands, and buried regions. Backbone and contact motions are correlated at separations of up to 20 residues in sequence space and up to 20 Å in Cartesian space. Together, these observations suggest structural rules for designing allosteric protein systems.

Introduction

Importance and applications
• Major mechanism of control and regulation in biology
• Improved high-resolution understanding of allostery will aid in
  − understanding / treating diseases caused by malfunctioning allosteric proteins
  − designing novel allosteric proteins as biological control devices
[Figure: designing an allosteric response into a nonallosteric protein; labels: novel effector, substrate]

A wide variety of targets
• DNA-binding proteins: control transcription by binding DNA (example: PurR, 1WET.pdb)
• Enzymes: regulate reactions and biochemical pathways (example: phosphofructokinase, 4PFK.pdb)
• Signaling proteins: control signaling pathways (example: G protein ran, 1IBR.pdb)

Previous work in allostery
• MWC "pre-existing equilibrium" model
• KNF "sequential transition" model
[Figure: schematic of the two models; labels: T (inactive) state, R (active) state, ligand L]
• Manually compare I and A states → qualitative mechanistic models
• Low-resolution structural models
[Figure from Choe et al. (2000), fig. 1]

Goals and new contributions
• Precisely identify local motions in known allosteric protein structures
• Statistically investigate amount and structural distribution of motions

Dataset of 51 Allosteric Proteins

Three functional classes

Signaling proteins
class                protein    PDB 1      PDB 2
G proteins           arf1       1HUR       NA*
                     arf6       1E0S       1HFV
                     cdc42      1AN0       1NF3(A)
                     EF-Tu      1TUI       1EFT
                     rab11      1OIV       1OIW
                     rab7       1VG1       1VG8
                     rac1       1HH4(A)    1MH1
                     ran        NA*        1IBR(A)
                     rap2a      1KAO       2RAP
                     ras        4Q21       6Q21
                     rheb       1XTQ       1XTS
                     rhoA       1FTN       1A2B
                     sec4       1G16       1G17
                     ypt7p      1KY3       1KY2
                     YsxC       1SVI       1SVW
                     Giα1       1GDD       1GIA
                     Gtα        1TAG       1TND
protein kinases      ERK2       1ERK       2ERK
                     IGF-1R     1P4O       1K3A
                     IRK        1IRK       1IR3
                     PKB        1GZK       1O6K
response regulators  CheY       3CHY       1FQW
                     DctD       1L5Z       1L5Y
                     fixJ       1DBW       1D5W
                     SpoIIAA    1H4Y       1H4X
*obtained directly from authors; not in PDB

Enzymes
protein                  PDB 1      PDB 2
anthranilate synthase    1I7S       1I7Q
ATCase                   1RAC       1D09
ATP sulfurylase          1M8P       1I2D
ATP-PRT                  1NH8       1NH7
caspase                  1SHJ       1F1J
chorismate mutase        2CSM       1CSM
DAHP synthase            1KFL       1N8F
FBPase-1                 1EYJ       1EYI
glcN-6-P deaminase       1CD5       1HOT
glycogen phosphorylase   1GPB       7GPB
GTP cyclohydrolase I     1WPL       1IS7
hemoglobin               4HHB       1HHO
lactate DH               1LTH(T)    1LTH(R)
NAD-malic enzyme         1QR6       1PJ2
phosphofructokinase      6PFK       4PFK
phosphoglycerate DH      1PSD       1YBA
PTP1B                    1T48       1PTY
uracil PRT               1XTU       1XTT

DNA-binding proteins
protein            PDB 1    PDB 2
AraC               2ARA     2ARC
arg repressor      1XXC     1XXA
biotin repressor   1BIA     1HXD
lac repressor      1TLF     1EFA
met repressor      1CMB     1CMA
OxyR               1I69     1I6A
PurR               1DBQ     1WET
tet repressor      2TRT     1QPI

Precise, diverse motion calculations
[Figure: schematic of the motion measures X in three types of degrees of freedom: displacements, dihedral changes, and contact changes. Measures shown: max(|Δφ|, |Δψ|), ΔCα, θαβ, max(|Δχ1|, |Δχ2|), ΔSC, max(fI, fA). Panel labels: local backbone conformation; backbone position relative to protein core; side-chain orientation; local side-chain conformation; atomic interactions with other residues.]

Discriminating true motions from crystallographic noise
[Figure: motion value histograms for five non-allosteric protein pairs (control 1, black), nine allosteric protein pairs in the same state (control 2), and the allosteric benchmark, with the threshold marked.]

                                  99th percentile
measure X          statistic      control 1   control 2   average   threshold
max(|Δφ|, |Δψ|)    |Δφ| (°)       27.1        32.7        29.9      30°
                   |Δψ| (°)       30.5        30.3        30.4
ΔCα                ΔCα (Å)        1.38        1.01        1.20      1.2 Å
θαβ                θαβ (°)        23.9        31.8        27.8      28°
max(|Δχ1|, |Δχ2|)  |Δχ1| (°)      42.8        49.5        46.1      46°
ΔSC                ΔSC (Å)        4.12        3.69        3.91      2.0 Å
max(fI, fA)        fI             0.213       0.190       0.202     0.20
                   fA             0.200       0.190       0.195

• Set thresholds to exclude ~99% of background motion in controls
• Thresholds are intuitively reasonable cutoffs for large motions
• Smaller motions may be functionally significant in some allosteric proteins

Allosteric motions in protein space

Sequence space
Motions for ras G protein (4Q21.pdb vs. 6Q21.pdb); X marks residues flagged as moving by each measure.

                10        20        30        40        50        60
        MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAG
Δφ,ψ    ---------XX-------------XX---XX--XXXX----------------------X
ΔCα     ----------------------------------XXX---------------------XX
θαβ     ---------.-..-.-------------------XXXX---------.----------X.
Δχ1,2   ----XX--X....-.--.--X--XXXX-XX--X.XXXXX-----X--.---XX-X--X..
ΔSC     ----XX--X....-.--.--X---XXX-X-XX-.--XXX-X---X--.---XX----X..
fI,A    ----X----XXXX---X-------------XXXXXXXX-----------------X-XXX

                70        80        90       100       110       120
        QEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDDVPMVLVGNKCDL
Δφ,ψ    XXXXXXXXXXXXX-------------------------------XXXXX-----------
ΔCα     XXXXXXXXXXXX--------------------------------X--X------------
θαβ     XXXXX-XX-X-XX-.-.-----------------------------XX------.-----
Δχ1,2   XXXXX.XX-XXXX-.-.-----.----X--X---X-X-X--XXX-XXXX.-XXX.--X-X
ΔSC     XXXXX.-X-XXXX-.-.-----.-----------X---X--XX--XXXX.-XXX.----X
fI,A    X-XXXXXXXXX--X------------------------X---X-----------------

               130       140       150       160
        AARTVESRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLVREIRQHKL
Δφ,ψ    XX----------------------------------------------
ΔCα     ------------------------------------------------
θαβ     -----------------.----------X-.-----------------
Δχ1,2   ..---X-XX.XXX.XXX.-.-X---.--XX.---.-X--X----X--X
ΔSC     ..---X-XX.X-X.XX-.-.-----.---X.---.X---X---XX--X
fI,A    X------X----------------------------------------

• Switch I and II previously identified by Milburn et al.
• Most of protein is rigid
• Motions tend to occur in contiguous segments (especially backbone and contacts)
• Strong consensus between measures in most flexible regions

Three-dimensional space
[Figure: motions mapped onto the structures of ras G protein, phosphofructokinase, and purine repressor. Legend entries: max(|Δφ|, |Δψ|) ≥ 30° and ≥ 65°; ΔCα ≥ 1.2 Å and ≥ 3.0 Å; max(|Δχ1|, |Δχ2|) ≥ 46°; ΔSC ≥ 2.0 Å; max(fI, fA) ≥ 0.2, 0.3, 0.4, 0.5.]
• PFK: motions localize between catalytic and allosteric sites (esp. contact changes) → possible allosteric pathway
• Backbone displacements, dihedral changes, and contact motions localize to similar regions of structures
• Most of protein is structurally conserved

Coupling among local allosteric motions
• Correlation functions statistically measure strength and significance of local cooperative effects and distance ranges over which they occur
  − Sequence correlation function Cm(Δi)
  − Cartesian correlation function Cm(r)
• Correlations are normalized against reference correlation (correlation expected if moving residues are distributed randomly)
• Contact changes, backbone motions cluster strongly in space
• Backbone and contact motions strongly correlated at short separations (< 20 res. or < 20 Å) → local clustering of motions
• Apparent long-range correlations for backbone and side-chain motions present only in a small minority of allosteric proteins (data not shown)
• Summary: distinctly non-random organization of backbone and contact motions in allosteric proteins

Statistics of allosteric transitions

Allosteric transitions comprise 10-20% of protein residues

set             max(|Δφ|, |Δψ|)  ΔCα          θαβ          max(|Δχ1|, |Δχ2|)  ΔSC          max(fI, fA)
control 1       0.03 ± 0.03      0.02 ± 0.03  0.01 ± 0.02  0.13 ± 0.09        0.11 ± 0.08  0.03 ± 0.02
control 2       0.03 ± 0.02      0.01 ± 0.01  0.01 ± 0.01  0.16 ± 0.08        0.10 ± 0.05  0.02 ± 0.02
signaling       0.16 ± 0.06      0.22 ± 0.09  0.13 ± 0.05  0.33 ± 0.07        0.26 ± 0.06  0.24 ± 0.09
transcription   0.20 ± 0.08      0.12 ± 0.08  0.07 ± 0.03  0.35 ± 0.12        0.24 ± 0.09  0.18 ± 0.07
enzymes         0.12 ± 0.07      0.21 ± 0.11  0.06 ± 0.03  0.30 ± 0.09        0.21 ± 0.07  0.16 ± 0.06
all allosteric  0.15 ± 0.07      0.20 ± 0.10  0.09 ± 0.05  0.32 ± 0.09        0.24 ± 0.07  0.20 ± 0.09

Average fraction of residues moving (mean ± s.d.) by six measures in six sets of proteins: two control datasets, three classes of allosteric proteins, and all allosteric proteins.
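As a rough illustration of how a per-protein fraction of moving residues like those reported above could be obtained, the following Python sketch flags residues whose backbone dihedral change meets the 30° threshold and reports the fraction flagged. It is a minimal sketch, not the authors' code: the function names, the handling of undefined values as NaN, and the toy angles are all assumptions made for illustration; only the threshold value and the use of max(|Δφ|, |Δψ|) ≥ threshold come from the poster.

```python
import numpy as np

# Threshold quoted on the poster for max(|dphi|, |dpsi|), in degrees.
BACKBONE_DIHEDRAL_THRESHOLD = 30.0


def angle_change(a_deg, b_deg):
    """Absolute change between two angles in degrees, wrapped to [0, 180]."""
    diff = np.asarray(b_deg, dtype=float) - np.asarray(a_deg, dtype=float)
    wrapped = (diff + 180.0) % 360.0 - 180.0  # maps any difference into [-180, 180)
    return np.abs(wrapped)


def backbone_dihedral_motion(phi_1, psi_1, phi_2, psi_2):
    """Per-residue max(|dphi|, |dpsi|) between two structures of one protein."""
    return np.maximum(angle_change(phi_1, phi_2), angle_change(psi_1, psi_2))


def fraction_moving(values, threshold):
    """Fraction of residues with a defined value that meet the threshold.

    NaN marks residues where the measure is undefined (an assumption of
    this sketch); such residues are left out of the denominator.
    """
    values = np.asarray(values, dtype=float)
    defined = ~np.isnan(values)
    if not defined.any():
        return float("nan")
    return float(np.mean(values[defined] >= threshold))


# Toy dihedrals (degrees) for three residues in two states; invented values.
phi_inactive = [-60.0, 170.0, -120.0]
psi_inactive = [-45.0, 150.0, 130.0]
phi_active = [-65.0, -170.0, -60.0]
psi_active = [-40.0, 160.0, 135.0]

motion = backbone_dihedral_motion(phi_inactive, psi_inactive, phi_active, psi_active)
print(motion)  # per-residue max(|dphi|, |dpsi|)
print(fraction_moving(motion, BACKBONE_DIHEDRAL_THRESHOLD))
```

The wrap in `angle_change` matters because dihedral angles are periodic: a change from 170° to -170° is a 20° motion, not 340°. The same `fraction_moving` helper would apply to the other measures with their own thresholds (for example 1.2 Å for ΔCα).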
• 10-20% of an allosteric protein changes backbone conformation, moves relative to the core, or changes interactions with other residues
• Side-chain motion is significant in controls (10-15%) but more common in allosteric proteins (25-30%)
• Extents of motion have high σ → significant variability in allosteric mechanisms
• Proteins in three classes change conformation to similar extents

Local structural environment influences allosteric motion
• Polar residues (especially side chains) more likely to move than apolar residues
• Secondary structure (backbone H-bonds) constrains backbone but not side chain
• Burial constrains all motions (contact constraints)
• Polar vs. apolar effect is a proxy for exposed vs. buried (data not shown)
• Summary: motion most likely in weakly constrained environments
Definitions:
  − apolar: A, C, F, I, L, M, P, V, W, Y; polar: D, E, G, H, K, N, Q, R, S, T
  − secondary structure (helix, strand, loop) assigned by DSSP
  − buried, exposed: all-atom ASA ≤ 30% or > 30%, respectively (naccess)

Theoretical implications
• Calculate motions in three types of degrees of freedom important to protein structure
• Calculated motions do not in themselves constitute comprehensive mechanistic models
• Statistical analyses reveal basic insights into structural basis of protein allostery:
  − Significant changes in average structure (~20%) are common in allosteric proteins → not just a dynamic phenomenon
  − Protein structures use constraints to control location of motion, possibly to direct signal propagation between allosteric and functional sites
  − Local motions are correlated up to 20 Å distance, enough to bridge two spatially distinct sites over several residues
  − Mechanical communication is an important, general phenomenon in protein allostery
• Possible test resource for flexibility prediction algorithms such as COREX (Hilser & Freire 1996), elastic network models (Bahar et al. 1997), FIRST (Jacobs et al. 2001), and statistical coupling analysis (Suel et al. 2003)

Future Directions

Domain and subunit motion
• An important part of allostery in many oligomers
[Figure: domain and subunit motion in an oligomer; labels: effector, substrate, loop 1, loop 2, tetramer interface, 7° rotation]

Allosteric mechanisms
• Quantify organization of moving parts, connectivity between sites

Funding / Acknowledgements
• ARCS fellowship (M. Daily)
• JHU Program in Molecular and Computational Biophysics NIH training grant (M. Daily)
• NIH award K01-HG02316 (J. Gray)
• Pymol and R (figures and calculations)
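
Supplementary sketch: the poster names a sequence correlation function Cm(Δi), normalized against the correlation expected if moving residues were distributed randomly, but does not give its formula. The Python sketch below shows one plausible form under a stated assumption: the observed fraction of residue pairs at sequence separation Δi in which both residues move, divided by the square of the overall fraction of moving residues. This definition, the function name, and the toy input are assumptions for illustration and may differ from the calculation actually used.

```python
import numpy as np


def sequence_correlation(moving, max_separation):
    """Normalized co-occurrence of motion at each sequence separation.

    moving: per-residue booleans (True = residue flagged as moving).
    Returns {separation: ratio}. A ratio above 1 means moving residues are
    paired at that separation more often than a random placement with the
    same overall fraction would give (the assumed reference correlation).
    """
    flags = np.asarray(moving, dtype=bool)
    overall = flags.mean()
    reference = overall * overall  # chance that two independent residues both move
    result = {}
    for sep in range(1, max_separation + 1):
        if sep >= flags.size or reference == 0.0:
            result[sep] = float("nan")
            continue
        both_move = np.mean(flags[:-sep] & flags[sep:])  # pairs (i, i + sep)
        result[sep] = float(both_move / reference)
    return result


# Toy example (invented): one contiguous moving segment in a 10-residue chain.
toy_flags = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
for sep, ratio in sequence_correlation(toy_flags, 4).items():
    print(sep, round(ratio, 2))
```

With a single contiguous moving segment, the ratio is well above 1 at separations shorter than the segment and falls to 0 beyond it, which is the qualitative signature of local clustering. A Cartesian analogue would bin residue pairs by inter-residue distance r instead of sequence separation.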