Ruben Abagyan
Structural Bioinformatics, Pharm 201
[Figure labels: S203, S204, S207, N293, Y308; TM3, TM5, TM6]

Computational chemical biology and structure based drug discovery
• Compiling comprehensive Structural Pocketome
• New methods for receptor flexible docking & screening
• Omit models for kinase targeting (Dolphins)
• New GPCR structures and implications
• Publishing and data exchange: activeICM

Bacteriorhodopsin: Docking & Scoring
• 7 thousand chemicals
• Enrichment = 99.98%
• Retinal: rank 1!
• Resolution 1.43 Å
Cavasotto, Orry, Totrov, Abagyan, Proteins, 2003

Structural ChemoGenomics: Docking, Screening, Profiling
• Pockets: QC and continuous re-clustering of the entire PDB + pocket refinements
• Ligands: bio-substrates and metabolites, chem. libraries, drugs, etc.
• Pocketome. Each pocket is a distinct functional state of a protein with conformers
Ensembles profiling
• Docking: site location, induced fit, sampling, energy function
• Screening: ligand specific errors and entropy loss
• Profiling: protein specific errors and contributions (e.g. entropy loss).
Abagyan, Kufareva, 2009. The Flexible Pocketome Engine for Structural Chemical Genomics. In: Chemogenomics: Concepts and applications of a new design and screening paradigm. Methods in Molecular Biology, Wiley,
Kufareva, Ilatovsky, Abagyan, 2011, Pocketome in 4D, NAR db issue, in press

Predicting Binding/Druggable Pockets
Challenge: Predicting Ligand Binding Sites Without Knowing the Ligand
Method:
1. Calculate this potential:
   $P_0(\mathbf{r}) = \sum_a \left( \frac{A_{aC}}{r_{ag}^{12}} - \frac{B_{aC}}{r_{ag}^{6}} \right)$
   $P(\mathbf{r}) = \int e^{-\left(|\mathbf{r} - \boldsymbol{\rho}| / \lambda\right)^2} P_0(\boldsymbol{\rho})\, d\boldsymbol{\rho}$, with $\lambda$ = 2.6 Å
2. Contour the potential
3. Filter out the small blobs
Pocket Database: 17,000 pockets
In 82.3% of apo-cases the predicted pocket covers > 80% of the ligand contact atoms!
An, Totrov, Abagyan. (2005) “Pocketome: Comprehensive Identification & Classification of Ligand Binding Envelopes”, Mol.
Cell Proteomics

From PDB & electron density to a relevant full-atom model
Ambiguities and gaps
• Missing/unclear ligand density
• Missing/amb. loop or side chains
• Rotations of Asn, Gln, His
• Symmetry, bio-molecule, water, UNLs
Fantasy Heavy Atoms
[Figure: "I will make it fit!" Ligand density, 2o5r]
• Invisible atoms deposited with full occupancy and low B-factors
• Ligands: wrong identity! Ambiguous placement
Protonation and Tautomerization
• ε and δ histidines, His rotations, and ligand tautomers
• Protons in His, Asp, Glu, Arg, Lys, Cys
• Protonation and tautomerization of the ligand
Full Atom Models provide quantitative understanding of ligand binding
UNLs (unrecognized ligands)
67K PDB analysis: only ~65% of ligand poses are unambiguously density justified

PDB: Ligand Issues. Classification
• Dens: electron density
• LigID: ligand identity
• Lig2D: stereoisomers, protons, charges, and tautomers
• LigCG: ligand covalent geometry
• Lig3D: ligand placement into density
• LigShft: shifts of ligands and/or pocket atoms
• Prot2D3D: protons, tautomers of side chains in pocket
• LigInt: ligand interactions with surrounding atoms (protein, water, metals, cofactors)
• LigDyn: ligand disorder and multiple poses
• LigViz: visualization and telling a story

The Seven Recipes for Receptor Flexibility
1. ICM docking with softened potential
2. Multiple Experimental Conformations (MRC)
3. 4D docking (concurrent sampling against multiconformational grids)
4. Systematic omit models and refinement (SCARE)
5. Fumigation: simulation with repulsive density, selection by pocket density
6. Omit models with attractive pharmacophoric density (DOLPHIN)
7. Ligand Guided pocket generation

Docking Chemicals to a Pocket
Mazur, Abagyan (1989) “General equations of motion for multiple branched polymers in internal coordinates”, JBSD.
Abagyan, Totrov, Kuznetsov (1994) “ICM - a new method for protein modeling and docking” J. Comp. Chem.
15, 488-506
Abagyan and Totrov (1994) “Biased Probability Monte Carlo searches and Electrostatics” J. Mol. Biol. 235, 983-1002
Totrov, Abagyan (1997) Flexible Ligand Docking
Bottegoni, Kufareva, Totrov, Abagyan, 2008, “SCARE docking to flexible receptors”, JCAMD, 22, 311; 2009, “4D Docking”, J. Med. Chem., 52, 397

4D Docking to Flexible Pocket Ensembles
Recipe 3. 4D Grid Docking
• 1000 conformers, 100 proteins, 300 ligands
• 77% success
• Performance deteriorates as Nconf increases
Bottegoni, Kufareva, Totrov, Abagyan, 2009, “4D Docking”, J. Med. Chem.

Ligand Binding Score
BPA docks into Estrogen Receptor Pocket
$S_{bind} = S_{Pocket} + E^{int}_{VW} + E_{ligStrain} + T\Delta S_{tor} + \alpha_1 E_{HBond} + \alpha_2 E_{HBDesol} + \alpha_3 E_{SolEl} + \alpha_4 E_{HPhob} + \alpha_5 Q_{Size}$
(Totrov, Abagyan, 1999, 2001, 2003)

Open Eye Challenge Cup 2010-11: 165 Docking and 40 Virtual Screening Tasks
         Mean     SD       Median   Min      Max      < 0.5 Å   < 1 Å   < 2 Å
Top 1    0.91 Å   1.1 Å    0.54 Å   0.18 Å   8.2 Å    43 %      78 %    91 %
Top 3    0.67 Å   0.62 Å   0.48 Å   0.18 Å   3.8 Å    53 %      86 %    95 %
Organized by Greg Warren (OE), Neysa Nevins (GSK), Georgia McGaughey (Merck)
Marco Neves, Max Totrov, R.A., ACS 2011, full paper to JCAMD submitted

Docking poses: the outliers

OEChallenge Cup 2010-11: 40 VLS Tasks (DUD)
                     Mean    SD      Median   Min      Max
True                 76.4    13.3    77.89    33.7     96.8
Null                 49.9    15.2    44.8     19.4     84.7
Difference (T - N)   26.5    20.3    27.8     -20.9    58.5

Performance of tk (thymidine kinase) restored by pocket refinement with ICM sampling
[Figure: original pocket vs refined pocket at 0.1% FP, 1% FP, 2% FP]

Fixing the bad performers:
Limitations of the benchmark:
• self docking (need CROSS-docking)
• docking to a single conformer (need multiple)
• Pocket refinement (up to ½ of the AUC gap)
• Selecting better conformers
• Finding multiple complementary conformations (consistently gets it over 90%)

Selecting Multiple Conformations for Docking & VLS
• A single X-ray structure:
  – Insufficient: ligands undetected due to induced fit
  – Variable VLS performance (refine and/or select)
• Many/all X-ray structures
  – Increased noise, reduced performance
  – Needs a
few high-perf. complementary models
• Bottegoni et al., 2009, “4D Docking”, J. Med. Chem., 52, 397
• Bottegoni G, Rocchia W, Rueda M, Abagyan R, Cavalli A. Systematic exploitation of multiple receptor conformations for virtual ligand screening. PLoS One, 2011

Pocketome: encyclopedia of binding sites in 4D
http://www.pocketome.org/
activeICM, iMolView for iPad
An J, Totrov M, Abagyan R. Pocketome via comprehensive identification and classification of ligand binding envelopes. Mol Cell Proteomics, 2005 Jun, 4, 752-61
Abagyan R, Kufareva I. The flexible pocketome engine for structural chemogenomics. Methods Mol Biol, 2009, 575, 249-79
Kufareva, Ilatovsky, Abagyan, 2011, Pocketome in 4D, NAR db issue, in press

Pocketome: encyclopedia of binding sites in 4D
• Freely available
• Experimental pockets with different partners (+manual)
• Comprehensive and automatically updated
• Multiple co-crystal and apo structures
• Interactions analyzed and classified
• Consensus derived
• Pockets clustered and ordered by conformations and compatibility with ligands
• Ready for the ICM 4D docking
• Ligands -> APF models
• Docking benchmarks
http://www.pocketome.org/

MDM2_HUMAN_21_113
E3 ubiquitin-protein ligase Mdm2. SWIB domain [MDM2/MDM4 Family]
• Pairwise comparison
• Pocket-ligand steric clashes
• Pocket clash dissimilarity (4 clusters)
• ActiveICM
• Composition of the binding pockets: protein chains A1 (MDM2_HUMAN) 16,17,19,24,50,51,54,55,57:59,61,62,67,69,72,73,75,…
• Full PDB list: 1rv1, 1t4e, 1t4f, 1ycr, 2axi, 2gv2, 3eqs, 3g03, 3iux, 3iwy, 3jzk, 3jzr, 3jzs, 3lbk, 3lbl, 3lnj, 3lnz
• Pocket contact map (cognate ligands)
• Binding site backbone RMSD (0.90 Å on average)
• Site contact map (ligand ensemble)
• Binding site full-atom RMSD (1.69 Å on average)

Nuclear Receptor pockets in 4D
Park, Kufareva, Abagyan, 2010 JCAMD.
Improved docking, screening and selectivity prediction for small molecule nuclear receptor modulators

PPARγ benchmark using the DUD test set
• PDB entry 1FM9 (PPARγ bound to the agonist GI262570)
• ICM Docking AUC = 96.3
• 85 PPARγ agonists (13 clusters @ Tanimoto distance = 0.3) + 3127 decoys

Compound Specificity Profiling. Missing protein contribution to Sbinding
Receptor   Initial energy offset   Optimized energy offset
ERα-        -6.9                    -5.7
ERα+       -12.8                    -9.2
ERβ-        -5.1                    -1.3
ERβ+        -7.9                   -12.9
GR+          0.1                     6.4
MR1+         2.0                     1.7
MR2+         4.8                     7.2
PR+          0.7                     5.8
RARγ+        7.0                    -1.1
RXRα+        8.3                     9.0
VDR+         1.1                    -3.4
AR1+        -3.3                    -3.1
AR2+        -6.7                    -4.9
PPARα+      -0.2                     1.4
PPARδ+       8.0                     4.2
PPARγ+       2.4                     0.1
TR+          8.4                     5.7

4D Pocket Omit Models of Alternative Kinase States: DOLPHINs
Need a dockable model of inactive kinase?
Kufareva, Abagyan. Type-II Kinase Inhibitor Docking, Screening, and Profiling Using Modified Structures of Active Kinase States. J. Med. Chem. 2008; 51(24):7921-7932
Ligand-independent Pocket – score offsets derived and used

icmPocketFinder
• Druggability of a single conformation
• Gaussian spatial averaging of the van der Waals potential near the protein surface
• Contour at a fixed level and find envelopes
An, Totrov, Abagyan. (2005) “Pocketome: Comprehensive Identification & Classification of Ligand Binding Envelopes”, Mol. Cell Proteomics

Finding transient pockets
The Fumigation Flowchart
• Start from several backbone conformations
• [Use NMA and/or loop generator if necessary]
• Run a side-chain simulation with fumes on each backbone
• Select conformations with largest icmPocketFinder envelopes
• VLS against best pockets
Abagyan R, Kufareva I. The flexible pocketome engine for structural chemogenomics. Methods Mol Biol., Wiley, 2009; 575:249-79.
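
The icmPocketFinder steps listed above (Gaussian averaging of a van der Waals potential, contouring at a fixed level, keeping only sizeable envelopes) can be illustrated with a small pure-Python sketch. This is not the ICM implementation: the function names, the 12-6 constants `a` and `b`, the cap on the repulsive wall, the kernel truncation and the kernel normalization are all assumptions made for illustration.

```python
import math
from collections import deque


def probe_potential(point, atoms, a=1.0e5, b=6.0e2, cap=1.0):
    """12-6 potential felt by a probe at `point` (a, b, cap are placeholders)."""
    total = 0.0
    for atom in atoms:
        r = max(math.dist(point, atom), 0.5)  # clamp r to avoid dividing by zero
        total += a / r**12 - b / r**6
    return min(total, cap)  # truncate the repulsive wall before smoothing


def gaussian_smooth(field, spacing, lam=2.6):
    """Gaussian-weighted average of a {(i, j, k): value} grid, kernel-normalized."""
    reach = int(2 * lam / spacing)  # kernel truncated at 2 * lam (assumption)
    steps = range(-reach, reach + 1)
    kernel = [
        (di, dj, dk, math.exp(-(di * di + dj * dj + dk * dk) * spacing**2 / lam**2))
        for di in steps for dj in steps for dk in steps
    ]
    smoothed = {}
    for (i, j, k) in field:
        num = den = 0.0
        for di, dj, dk, w in kernel:
            value = field.get((i + di, j + dj, k + dk))
            if value is not None:  # skip neighbours outside the grid
                num += w * value
                den += w
        smoothed[(i, j, k)] = num / den
    return smoothed


def envelopes(field, level, min_points):
    """Connected groups of grid points with value below `level`, largest first."""
    inside = {p for p, v in field.items() if v < level}
    seen, blobs = set(), []
    for start in inside:
        if start in seen:
            continue
        seen.add(start)
        queue, blob = deque([start]), []
        while queue:  # breadth-first flood fill over face neighbours
            i, j, k = queue.popleft()
            blob.append((i, j, k))
            for n in ((i + 1, j, k), (i - 1, j, k), (i, j + 1, k),
                      (i, j - 1, k), (i, j, k + 1), (i, j, k - 1)):
                if n in inside and n not in seen:
                    seen.add(n)
                    queue.append(n)
        if len(blob) >= min_points:  # drop the small blobs
            blobs.append(blob)
    return sorted(blobs, key=len, reverse=True)
```

In use, `probe_potential` would be evaluated on a grid around the protein, the result passed through `gaussian_smooth`, and `envelopes` called with a contour level suited to the chosen constants. For the fumigation flowchart, the size of the largest envelope returned for each conformation is one possible selection criterion.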
Generating Gas for the Fumigation
• Shave: trim side chains starting from Cβ (!Cys)
• Make free volume map
• Spatial averaging (Gaussian convolution) of the map to smoothen and enlarge the repulsion in a drug site
• Subtraction of the original map

Ligand-Guided-Models of Pockets
• Generate low-energy variations with Normal Modes or ICM Sampling
• Dock, Score, ROC them
• Select better discriminating models
[Figure labels: Actives, Decoys]
Bisson, Cheltsov et al. 2007, Discovery of antiandrogen activity of nonsteroidal scaffolds of marketed drugs. PNAS, 104, 11927
Katritch V, Rueda M, Lam PC, Yeager M, Abagyan R. GPCR 3D homology models for ligand screening: lessons learned from blind predictions of adenosine A2a receptor complex. Proteins. 2010 Jan;78(1):197-211.

Ligand Guided Agonist-Bound Models of β2 Adrenergic Receptor
Reynolds, Katritch, Abagyan, “Identifying conformational changes of the β2 adrenoceptor that enable accurate prediction of ligand/receptor interactions and screening for GPCR modulators”, JCAMD, 2009.
Katritch, Reynolds, Cherezov, Hanson, Roth, Yeager, Abagyan. Analysis of full and partial agonists binding to beta(2)-adrenergic receptor suggests a role of transmembrane helix V in agonist-specific conformational changes. J Mol Recognit. 2009 Apr 7;22(4):307-318
Katritch V, Abagyan R. GPCR agonist binding revealed by modeling and crystallography. Trends Pharmacol Sci, 2011 Sep 6

LGM Models partition β2AR modulators
• 14K ligand set
• Top ligands retrieved by the antagonist model
• Top ligands retrieved by the agonist model
Reynolds, Katritch, Abagyan, “Identifying conformational changes of the β2 adrenoceptor that enable accurate prediction of ligand/receptor interactions and screening for GPCR modulators”, JCAMD, 2009.

What makes an agonist: X-ray 2011
• Warne, et al., Schertler G, Tate C, The structural basis for agonist and partial agonist action on a β1 adrenergic receptor, Nature, 2011.
• 0.8 Å RMSD

Omit Models for GPCRs
• Loops are impossible to predict, what could be a solution?
• Deleting ECL2 in a model can be tolerated
Reynolds, Katritch, Abagyan, “Identifying conformational changes of the β2 adrenoceptor that enable accurate prediction of ligand/receptor interactions and screening for GPCR modulators”, JCAMD, 2009.

Histamine H1 receptor
• Shimamura et al., Structure of the human histamine H1 receptor complex with doxepin, Nature, 2011
• A first-generation H1R antagonist, doxepin, can cause many types of side effects due to its antagonistic effects on histamine H2, serotonin 5-HT2, α1-adrenergic, and muscarinic acetylcholine receptors

GPCR modeling & docking challenge
• 6 new GPCR structures from the Stevens lab
• 270 models
• ~25 groups per target
• Model server: http://ablab.ucsd.edu/GPCRDock2010
• Contact sim. server: http://ablab.ucsd.edu/SimiCon/
Kufareva I, Rueda M, Katritch V, Stevens RC, Abagyan R. Status of GPCR Modeling and Docking as Reflected by Community-wide GPCR Dock 2010 Assessment. Structure, 2011 Aug 10, 19, 1108-26

Ligand RMSD & contacts
• $A_{ij}$: contact of atoms i and j in model A, a number between 1 and 0 (no contact)
• $B_{ij}$: contacts in B
$CD = \frac{\sum_{ij} |A_{ij} - B_{ij}|}{\sum_{ij} \max(A_{ij}, B_{ij})}, \qquad CS = 1 - CD$
[Figure labels: Increasing modeling difficulty; Contact Similarity CS]

Are we there yet? Naïve models & acceptable variation
170,000 protein pairs

GPCR models explain subtype specificity
• A2a based models of A2b, A1 and A3 adenosine receptor subtypes built
• 88 ligands for 4 adenosine receptors were used
• Ligand-guided optimization results in 3D models that explain observed specificity
Katritch V, Kufareva I, Abagyan R. Structure based prediction of subtype-selectivity for adenosine receptor antagonists. Neuropharmacology. 2010 Jul 15.

Discovery of potent new A2a antagonists
• Out of 56 high ranking compounds tested in A2a AdR binding assays, 23 showed affinities under 10 μM, 11 of those had sub-μM affinities and two compounds had affinities under 60 nM.
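
The contact similarity measure from the "Ligand RMSD & contacts" slide above (contact distance CD as the summed absolute difference of contact strengths divided by the summed maximum, and CS = 1 - CD) can be sketched as follows. This is a minimal illustration, not the SimiCon server code: the dictionary representation of contacts and the choice to return 1.0 when neither model has any contact are assumptions.

```python
def contact_similarity(a, b):
    """Contact similarity CS = 1 - CD between two models.

    a, b: dicts mapping an atom pair (i, j) to a contact strength in [0, 1];
    a missing pair means no contact (strength 0).
    """
    pairs = set(a) | set(b)
    num = sum(abs(a.get(p, 0.0) - b.get(p, 0.0)) for p in pairs)
    den = sum(max(a.get(p, 0.0), b.get(p, 0.0)) for p in pairs)
    if den == 0.0:
        return 1.0  # no contacts in either model: treated as identical (assumption)
    return 1.0 - num / den


# Hypothetical contact maps for a reference structure and a model.
reference = {(1, 2): 1.0, (1, 3): 1.0}
model = {(1, 2): 1.0, (1, 3): 0.5, (2, 4): 0.5}
print(round(contact_similarity(reference, model), 3))  # prints 0.6
```

Unlike ligand RMSD, this score stays bounded between 0 and 1 and rewards a pose that reproduces the right contacts even when the ligand is slightly shifted.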
Katritch V, Jaakola VP, Lane JR, Lin J, Ijzerman AP, Yeager M, Kufareva I, Stevens RC, Abagyan R. Structure-based discovery of novel chemotypes for adenosine A(2A) receptor antagonists. J Med Chem. 2010 Feb 25;53(4):1799-809

What do we know about 48 Nuclear Receptors? 3D vs Chemistry (Mar 2011)
Gene     SwissProt code   N of PDBs   Actives   Decoys
NR0B2    NR0B2_HUMAN       3             4       6346
NR1A1    THA_HUMAN         5           307       6042
NR1A2    THB_HUMAN        17           429       5922
NR1B1    RARA_HUMAN        4           216       6114
NR1B2    RARB_HUMAN        1           210       6132
NR1B3    RARG_HUMAN        9           191       6140
NR1C1    PPARA_HUMAN      12           810       5533
NR1C2    PPARD_HUMAN      15           466       5862
NR1C3    PPARG_HUMAN      73          1063       5198
NR1F1    RORA_HUMAN        2            11       6338
NR1H2    NR1H2_HUMAN       8           481       5868
NR1H3    NR1H3_HUMAN       4           470       5875
NR1H4    NR1H4_HUMAN      10            51       6297
NR1I1    VDR_HUMAN        18           170       6179
NR1I2    NR1I2_HUMAN       9            25       6296
NR2B1    RXRA_HUMAN       27           262       6087
NR2B2    RXRB_HUMAN        2           102       6243
NR2B3    RXRG_HUMAN        1           124       6235
NR3A1    ESR1_HUMAN       64          1131       5099
NR3A2    ESR2_HUMAN       21          1003       5218
NR3B1    ERR1_HUMAN        4            46       6303
NR3B3    ERR3_HUMAN       18            17       6332
NR3C1    GCR_HUMAN         9          1075       5092
NR3C2    MCR_HUMAN        11          1062       5127
NR3C3    PRGR_HUMAN       13          1062       5127
NR3C4    ANDR_HUMAN       43          1335       5014
NR5A1    STF1_HUMAN        2            19       6330

• 48 Nuclear Receptors: Accumulated Knowledge in PDB and Chembl
• 38 NRs with crystal structures in Protein Data Bank (PDB)
• 28 with Chembl actives, 27 with both (no 3D: ERR2, steroid hr, deafness)

Ligand Only 3D Models: APF Method
• Build self-consistent flexible superposition
• Compute average atomic property fields
• Dock & Score a new chemical into that field
Chem Biol Drug Des. 2008 Jan;71(1):15-27. Atomic property fields: generalized 3D pharmacophoric potential for automated ligand superposition, pharmacophore elucidation and 3D QSAR. Max Totrov
J Comput Aided Mol Des. 2010 Mar;24(3):173-82. Spatial chemical distance based on atomic property fields. Grigoryan et al.
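
The three APF steps above (superimpose known ligands, average their atomic property fields, score a new chemical in that field) can be illustrated with a toy sketch. It is not the published APF implementation: the Gaussian width `sigma`, the short property vectors, and the convention that a higher score means a better match are assumptions, and the superposition step is taken as already done.

```python
import math


def apf_field(point, templates, sigma=1.2):
    """Average property vector at `point` from superimposed template ligands.

    templates: list of ligands; each ligand is a list of (xyz, property_vector).
    sigma is a hypothetical Gaussian width in Angstroms.
    """
    n_props = len(templates[0][0][1])
    field = [0.0] * n_props
    for ligand in templates:
        for xyz, props in ligand:
            w = math.exp(-math.dist(point, xyz) ** 2 / (2.0 * sigma**2))
            for idx, p in enumerate(props):
                field[idx] += w * p
    return [f / len(templates) for f in field]  # average over the ensemble


def apf_score(ligand, templates, sigma=1.2):
    """Sum over ligand atoms of (atom properties) . (field at that atom)."""
    score = 0.0
    for xyz, props in ligand:
        field = apf_field(xyz, templates, sigma)
        score += sum(p * f for p, f in zip(props, field))
    return score


# Hypothetical two-atom ligand with (donor, acceptor, lipophilic) properties.
template = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)), ((3.0, 0.0, 0.0), (0.0, 1.0, 0.0))]
shifted = [((x + 3.0, y, z), props) for (x, y, z), props in template]
print(apf_score(template, [template]) > apf_score(shifted, [template]))  # prints True
```

A pose that places like properties on top of like properties scores highest, which is what lets the same field serve for superposition, pharmacophore elucidation and ligand-based screening.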
Unifying Docking & Ligand based methods
[Figure labels: ligand ensemble; binding site ensemble; 3D APF pharmacophores; 1D fingerprints]
Neves, Yu-Chen Chen

APF docking vs Pocket Docking
PPARγ activity models:
• Docking AUC = 96.3
• 3D-Chem-Super AUC = 94.0
Anaheim ACS 2011
• DUD competition: Test set: 85 actives + 3127 decoys
• Training set: 15 actives (co-crystals)
• Test set: 35 actives + 1745 decoys

Three principal types of compound activity models:
• 4D pockets (no bias)
• 4D ligand superposition/APF (small bias)
• 2D chemical machine learning (set dependent)

ICMFF: better force field for protein structure prediction
• A correcting “hump” potential for phi-psi
• QM fit and X-ray fit, ε = 2
• N-C-C angle flexible
• Separate hydrogen and non-H combination rules
[Figure labels: ICM, HLP]
Arnautova, Abagyan, Totrov, 2011. Development of a new physics-based internal coordinate mechanics force field (ICMFF) and its application to protein loop modeling. Proteins, 79, 477

Modeling with implicit membrane
$E_{solv} = \sum_{i=1}^{N_{atoms}} \sigma_i(z_i) A_i$
• Peptides folded with ICM near membrane.
• Membrane interaction parameters derived
[Figure: (a) acetylcholine receptor M2, (b) fd bacteriophage coat protein, (c) MerF mercury transport protein, (d) phospholamban, and (e) influenza A virus M2 proton channel; NMR orientation in red]
Efficient molecular mechanics simulations of the folding, orientation, and assembly of peptides in lipid bilayers using an implicit atomic solvation model. Andrew J.
Bordner, Barry Zorman, and Ruben Abagyan, JCAMD, 2011
$\sigma_i(z) = \begin{cases} \sigma_i^{mem} & z \le a \\ \frac{1}{b - a}\left[(b - z)\,\sigma_i^{mem} + (z - a)\,\sigma_i^{aq}\right] & a < z < b \\ \sigma_i^{aq} & z \ge b \end{cases}$

Conclusions & Future
• 77,000 Protein Data Bank (PDB) structures translate into 2000+ experimental pocket ensembles
• Docking/scoring to a few diverse pockets recognizes an activity of a chemical compound (no training or bias)
• ICM docks ligands in seconds with 91% of correct top scoring poses (95% top 3)
• Ligand-guided pocket models help SBDD (GPCRs)
• Ligand-based pharmacophoric field models (APF) may reach a comparable performance
• Promiscuous receptors/enzymes with broad specificity, e.g. PXR, Cyps, HERG, transporters etc., are presently undockable, but APF/docking models may still be useful.

Acknowledgements
Lab Members
• Irina Kufareva (P4D, gpcr2011)
• Marco Neves (docking)
• Andrey Ilatovsky (P4D)
• Manuel Rueda
• Yu-chen Chen
Former Lab Members
• William Bisson (LGM)
• Giovanni Bottegoni (4D)
• Seva Katritch (GPCR)
• So-Jung Park (NR)
• Kim Reynolds
• J.H. An
Molsoft
• Maxim Totrov (ICM)
• Eugene Raush (activeICM)
• Polo Lam (GPCR models)
• Elena Arnautova (force field)
Funding from NIGMS

Collaborators
4D: Andrea Cavalli, Walter R., IIT
Membranes: Andy Bordner, AZ
GPCRs: Ray Stevens, Vadim Cherezov Lab @ Scripps, La Jolla; Tracy Handel Lab, UCSD; Larry Miller, Mayo, AZ; Patrick Sexton, Arthur Christopoulos, Monash, Melbourne. Family B modeling
Ligand Discovery & Modeling: John Reed, Alex Terskikh, and Bob Liddington labs, Burnham Institute, LJ; Claude Cochet, Grenoble, France; Susan Taylor, Palmer Taylor, UCSD; Edmond Ma, KY Wong, C.M. Che, Hong Kong; Phil Cole, Hopkins; Mark Yeager, Virginia; Tim Cardozo, NYU