Computational chemical biology and structure based drug discovery Ruben Abagyan

Ruben Abagyan
Structural Bioinformatics, Pharm 201
[Figure: receptor binding pocket with labeled residues S203, S204, S207, N293, Y308 on helices TM3, TM5, TM6]
Computational chemical biology and structure based drug discovery
• Compiling comprehensive Structural Pocketome
• New methods for receptor flexible docking & screening
• Omit models for kinase targeting (Dolphins)
• New GPCR structures and implications
• Publishing and data exchange: activeICM
Bacteriorhodopsin : Docking & Scoring
7 thousand chemicals
Enrichment = 99.98%
Retinal: Rank 1!
Resolution: 1.43 Å
Cavasotto, Orry, Totrov, Abagyan, Proteins, 2003
Structural ChemoGenomics
Docking, Screening, Profiling
• Pockets: QC and continuous re-clustering of the entire PDB + pocket refinements
• Ligands: bio-substrates and metabolites, chemical libraries, drugs, etc.
• Pocketome: each pocket is a distinct functional state of a protein with conformers
[Figure labels: Ensembles, profiling]
• Docking: site location, induced fit, sampling, energy function
• Screening: ligand-specific errors and entropy loss
• Profiling: protein-specific errors and contributions (e.g. entropy loss)
Abagyan, Kufareva, 2009. The Flexible Pocketome Engine for Structural Chemical Genomics.
In: Chemogenomics: Concepts and applications of a new design and screening paradigm.
Methods in Molecular Biology, Wiley,
Kufareva, Ilatovsky, Abagyan, 2011, Pocketome in 4D NAR db issue, in press
Predicting Binding/Druggable Pockets
Challenge: Predicting Ligand Binding Sites Without Knowing the Ligand
Method:
1. Calculate this potential:
   $P_0(\mathbf{r}_g) = \sum_a \left( \frac{A_{aC}}{r_{ag}^{12}} - \frac{B_{aC}}{r_{ag}^{6}} \right)$
   $P(\mathbf{r}) = \int e^{-\left((\boldsymbol{\rho} - \mathbf{r})/\lambda\right)^2} P_0(\boldsymbol{\rho})\, d\boldsymbol{\rho}, \quad \lambda = 2.6\ \text{Å}$
2. Contour the potential
3. Filter out the small blobs
Pocket Database
17,000 pockets
In 82.3% of apo-cases the predicted
pocket covers > 80% of the ligand
contact atoms!
An, Totrov, Abagyan. (2005) “Pocketome: Comprehensive Identification & Classification
of Ligand Binding Envelopes”, Mol. Cell Proteomics
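The three steps of this method (a smoothed van der Waals potential, contouring, removal of small blobs) can be sketched on a toy grid. This is only an illustration, not the icmPocketFinder implementation: the Lennard-Jones coefficients, grid size, contour level, repulsion cap and minimum blob size are made-up values, all function names are hypothetical, and only the 2.6 Å smoothing width comes from the slide.

```python
import math
from itertools import product

LAMBDA = 2.6  # Gaussian smoothing width in Å (value quoted on the slide)

def lj_potential(point, atoms, a_coef=1.0e4, b_coef=1.0e2):
    """Step 1a: Lennard-Jones probe potential at one point.
    a_coef and b_coef are placeholder coefficients, not ICM parameters."""
    total = 0.0
    for atom in atoms:
        r = max(math.dist(point, atom), 1.0)  # clamp r to avoid the singularity at r = 0
        total += a_coef / r**12 - b_coef / r**6
    return total

def blobs(mask):
    """Group selected grid indices into face-connected blobs (flood fill)."""
    remaining = set(mask)
    found = []
    steps = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    while remaining:
        stack = [remaining.pop()]
        blob = set(stack)
        while stack:
            x, y, z = stack.pop()
            for dx, dy, dz in steps:
                n = (x + dx, y + dy, z + dz)
                if n in remaining:
                    remaining.remove(n)
                    blob.add(n)
                    stack.append(n)
        found.append(blob)
    return found

def find_pockets(atoms, n=8, spacing=1.0, level=-0.5, min_size=3, cap=10.0):
    """Return blobs of grid indices where the smoothed potential is below `level`."""
    idx = list(product(range(n), repeat=3))
    pos = {i: tuple(c * spacing for c in i) for i in idx}
    # Step 1a: raw potential, with the repulsive wall capped (an assumption of this sketch)
    raw = {i: min(lj_potential(pos[i], atoms), cap) for i in idx}
    # Step 1b: discrete version of the Gaussian integral over the grid
    smoothed = {}
    for i in idx:
        smoothed[i] = spacing**3 * sum(
            math.exp(-(math.dist(pos[i], pos[j]) / LAMBDA) ** 2) * raw[j] for j in idx
        )
    # Step 2: contour at a fixed level (low values are treated as attractive here)
    mask = {i for i in idx if smoothed[i] < level}
    # Step 3: filter out the small blobs
    return [b for b in blobs(mask) if len(b) >= min_size]
```

Treating the most attractive (lowest) smoothed values as the pocket envelope is a choice of this sketch; the sign convention and contour level would have to match the actual potential used.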
From PDB & electron density to a relevant
full-atom model
Ambiguities and gaps
• Missing/unclear ligand density
• Missing/ambiguous loops or side chains
• Rotations of Asn, Gln, His
• Symmetry, bio-molecule, water, UNLs
Fantasy Heavy Atoms
I will make it fit!
[Figure: ligand placed into electron density, PDB entry 2o5r]
• Invisible atoms deposited with full occupancy and low B-factors
• Ligands: wrong identity! Ambiguous placement
Protonation and Tautomerization
• ε and δ histidines, His rotations, and ligand tautomers
• Protons in His, Asp, Glu, Arg, Lys, Cys
• Protonation and tautomerization of the ligand
Full Atom Models provide quantitative
understanding of ligand binding
UNLs (unrecognized ligands)
67K PDB analysis:
Only ~ 65% of ligand poses
are unambiguously density
justified
PDB: Ligand Issues. Classification
• Dens: electron density
• LigID: ligand identity
• Lig2D: stereoisomers, protons, charges, and tautomers
• LigCG: ligand covalent geometry
• Lig3D: ligand placement into density
• LigShft: shifts of ligands and/or pocket atoms
• Prot2D3D: protons, tautomers of side chains in pocket
• LigInt: ligand interactions with surrounding atoms (protein, water, metals, cofactors)
• LigDyn: ligand disorder and multiple poses
• LigViz: visualization and telling a story
The Seven Recipes for Receptor Flexibility
1. ICM docking with softened potential
2. Multiple Experimental Conformations (MRC)
3. 4D docking (concurrent sampling against multiconformational grids)
4. Systematic omit models and refinement (SCARE)
5. Fumigation: simulation with repulsive density, selection by pocket density
6. Omit models with attractive pharmacophoric density (DOLPHIN)
7. Ligand-guided pocket generation
Docking Chemicals to a Pocket
Mazur, Abagyan (1989) “General Equations for multiple branched polymers of motion in internal coordinates”, JBSD.
Abagyan, Totrov, Kuznetsov (1994) “ICM - a new method for protein modeling and docking”, J. Comp. Chem. 15, 488-506
Abagyan, Totrov (1994) “Biased Probability Monte Carlo searches and Electrostatics”, J. Mol. Biol. 235, 983-1002
Totrov, Abagyan (1997) Flexible Ligand Docking
Bottegoni, Kufareva, Totrov, Abagyan (2008) “SCARE docking to flexible receptors”, JCAMD, 22, 311
Bottegoni, Kufareva, Totrov, Abagyan (2009) “4D Docking”, J. Med. Chem., 52, 397
4D Docking to
Flexible Pocket Ensembles
Recipe 3. 4D Grid Docking
• 1000 conformers, 100 proteins, 300 ligands
• 77% success
• Performance deteriorates as Nconf increases
Bottegoni, Kufareva, Totrov, Abagyan, 2009, “4D Docking”, J.Med.Chem
Ligand Binding Score
BPA docks into Estrogen Receptor Pocket
$S_{bind} = S_{Pocket} + E_{VWint} + E_{ligStrain} + T\Delta S_{tor} + \alpha_1 E_{HBond} + \alpha_2 E_{HBDesol} + \alpha_3 E_{SolEl} + \alpha_4 E_{HPhob} + \alpha_5 Q_{Size}$
(Totrov, Abagyan, 1999,2001,2003)
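The binding score on this slide is a sum of a pocket-specific term, three unweighted energy/entropy terms, and five weighted terms. A minimal sketch of that bookkeeping follows, assuming the terms are simply added and that the five weights carry any sign. The class, field names and all numbers are hypothetical; in practice the individual terms and weights come from the docking program, not from this code.

```python
from dataclasses import dataclass

@dataclass
class ScoreTerms:
    """Per-pose score components, named after the terms on the slide."""
    s_pocket: float      # ligand-independent pocket offset
    e_vw_int: float      # van der Waals interaction
    e_lig_strain: float  # ligand internal strain
    t_s_tor: float       # torsional entropy term
    e_hbond: float       # hydrogen bonding
    e_hb_desol: float    # hydrogen-bond desolvation
    e_sol_el: float      # solvation electrostatics
    e_hphob: float       # hydrophobic term
    q_size: float        # size correction

def binding_score(t: ScoreTerms, weights) -> float:
    """Weighted sum of the score terms; `weights` holds the five coefficients."""
    w1, w2, w3, w4, w5 = weights
    return (
        t.s_pocket + t.e_vw_int + t.e_lig_strain + t.t_s_tor
        + w1 * t.e_hbond + w2 * t.e_hb_desol + w3 * t.e_sol_el
        + w4 * t.e_hphob + w5 * t.q_size
    )

# Example with placeholder values only
terms = ScoreTerms(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
score = binding_score(terms, (2.0, 0.0, 0.0, 0.0, 0.0))
```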
Open Eye Challenge Cup 2010-11: 165 Docking and
40 Virtual Screening Tasks
        Mean     SD       Median   Min      Max     < 0.5 Å   < 1 Å   < 2 Å
Top 1   0.91 Å   1.1 Å    0.54 Å   0.18 Å   8.2 Å   43 %      78 %    91 %
Top 3   0.67 Å   0.62 Å   0.48 Å   0.18 Å   3.8 Å   53 %      86 %    95 %
Organized by Greg Warren (OE), Neysa Nevins (GSK), Georgia McGaughey (Merck)
Marco Neves, Max Totrov, R.A., ACS 2011, full paper to JCAMD submitted
Docking poses: the outliers
OEChallenge Cup 2010-11: 40 VLS Tasks (DUD)
                     Mean   SD     Median   Min     Max
True                 76.4   13.3   77.89    33.7    96.8
Null                 49.9   15.2   44.8     19.4    84.7
Difference (T - N)   26.5   20.3   27.8     -20.9   58.5
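Assuming the screening values above are ROC AUC in percent (the next slide speaks of an "AUC gap"), the metric can be computed directly from docking scores of actives and decoys. The sketch below uses the pairwise (Mann-Whitney) form of AUC and assumes that a lower score means a better predicted binder; the function name and example scores are illustrative.

```python
def roc_auc(active_scores, decoy_scores):
    """ROC AUC in percent: share of (active, decoy) pairs in which the active
    has the lower (better) score; ties count as half a win."""
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return 100.0 * wins / (len(active_scores) * len(decoy_scores))

# Perfect separation of two actives from two decoys
perfect = roc_auc([-30.0, -25.0], [-20.0, -10.0])
# One active ranked between two decoys
middling = roc_auc([-10.0], [-20.0, -5.0])
```

An AUC of 50 corresponds to random ranking, which is the natural reference point for reading a "True" versus "Null" difference.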
Performance of tk (thymidine kinase) restored by pocket refinement with ICM sampling
[Figure: original pocket vs. refined pocket; markers at 0.1% FP, 1% FP, 2% FP]
Limitations of the benchmark:
• self docking (need CROSS-docking)
• docking to a single conformer (need multiple)
Fixing the bad performers:
• Pocket refinement (up to ½ of the AUC gap)
• Selecting better conformers
• Finding multiple complementary conformations (consistently gets it over 90%)
Selecting Multiple Conformations for
Docking & VLS
• A single X-ray structure:
– Insufficient: ligands undetected due to induced fit
– Variable VLS performance (refine and/or select)
• Many/all X-ray structures
– Increased noise, reduced performance
– Needs a few high-perf. complementary models
• Bottegoni et al.,2009, “4D Docking”, J.Med.Chem, 52, 397
• Bottegoni G, Rocchia W, Rueda M, Abagyan R, Cavalli A
Systematic exploitation of multiple receptor conformations for virtual ligand
screening. PLoS One, 2011
Pocketome: encyclopedia of binding sites in 4D
http://www.pocketome.org/
activeICM, iMolView for iPad
An J, Totrov M, Abagyan R
Pocketome via comprehensive identification and classification of ligand binding envelopes.
Mol Cell Proteomics, 2005 Jun, 4, 752-61
Abagyan R, Kufareva I
The flexible pocketome engine for structural chemogenomics.
Methods Mol Biol, 2009, 575, 249-79
Kufareva, Ilatovsky, Abagyan, 2011, Pocketome in 4D NAR db issue, in press
Pocketome: encyclopedia of binding sites in 4D
• Freely available
• Experimental pockets with different partners (+manual)
• Comprehensive and automatically updated
• Multiple co-crystal and apo structures
• Interactions analyzed and classified
• Consensus derived
• Pockets clustered and ordered by conformations and compatibility with ligands
• Ready for the ICM 4D docking
• Ligands -> APF models
• Docking benchmarks
http://www.pocketome.org/
MDM2_HUMAN_21_113
E3 ubiquitin-protein ligase Mdm2. SWIB domain [MDM2/MDM4 Family]
Pairwise comparison
Pocket-ligand steric clashes
Pocket clash dissimilarity (4 clusters)
ActiveICM
Composition of the binding pockets
Protein chains
A1 (MDM2_HUMAN)16,17,19,24,50,51,54,55,57:59,61,62,67,69,72,73,75,…
Full PDB list
1rv1,1t4e,1t4f,1ycr,2axi,2gv2,3eqs,3g03,3iux,3iwy,3jzk,3jzr,3jzs,3lbk,3lbl,3lnj,3lnz
Pocket contact map (cognate ligands)
Binding site backbone RMSD (0.90 Å on average)
Site contact map (ligand ensemble)
Binding site full-atom RMSD (1.69 Å on average)
Nuclear Receptor pockets in 4D
Park, Kufareva, Abagyan, 2010, JCAMD. Improved docking, screening and selectivity prediction for small molecule nuclear receptor modulators
PPARγ benchmark using the DUD test set
PDB entry 1FM9 (PPARγ bound to the
agonist GI262570)
ICM Docking
AUC= 96.3
85 PPARγ agonists (13 clusters @Tanimoto distance = 0.3) + 3127 decoys
Compound Specificity Profiling. Missing protein contribution to Sbinding
Receptor   Initial energy offset   Optimized energy offset
ERα-       -6.9                    -5.7
ERα+       -12.8                   -9.2
ERβ-       -5.1                    -1.3
ERβ+       -7.9                    -12.9
GR+        0.1                     6.4
MR1+       2.0                     1.7
MR2+       4.8                     7.2
PR+        0.7                     5.8
RARγ+      7.0                     -1.1
RXRα+      8.3                     9.0
VDR+       1.1                     -3.4
AR1+       -3.3                    -3.1
AR2+       -6.7                    -4.9
PPARα+     -0.2                    1.4
PPARδ+     8.0                     4.2
PPARγ+     2.4                     0.1
TR+        8.4                     5.7
4D Pocket
Omit Models of Alternative Kinase States: DOLPHINs
Need a dockable model
of inactive kinase?
Kufareva, Abagyan Type-II Kinase Inhibitor Docking,
Screening, and Profiling Using Modified Structures of Active
Kinase States. J. Med. Chem. 2008; 51(24):7921-7932
Ligand-independent Pocket –
score offsets derived and
used
icmPocketFinder
• Druggability of a single conformation
• Gaussian spatial averaging of the van der Waals potential near protein surface
• Contour at a fixed level and find envelopes
An, Totrov, Abagyan. (2005) “Pocketome: Comprehensive Identification & Classification of Ligand Binding Envelopes”, Mol. Cell Proteomics
Finding transient pockets
The Fumigation Flowchart
• Start from several backbone conformations
• [ Use NMA and/or loop generator if necessary ]
• Run a side-chain simulation with fumes on each backbone
• Select conformations with largest icmPocketFinder envelopes
• VLS against best pockets
Abagyan R, Kufareva I. The flexible pocketome engine for structural chemogenomics. Methods Mol Biol., Wiley, 2009;575:249-79.
Generating Gas for the Fumigation
• Shave: trim side chains starting from Cβ (!Cys)
• Make free volume map
• Spatial averaging (Gaussian convolution) of the map to smooth and enlarge the repulsion in a drug site
• Subtraction of the original map
Ligand-Guided-Models of Pockets
• Generate low-energy variations with
Normal Modes or ICM Sampling
• Dock, Score, ROC them
• Select better discriminating models
Actives
Decoys
Bisson, Cheltsov et al. 2007, Discovery of antiandrogen activity of nonsteroidal
scaffolds of marketed drugs. PNAS, 104,11927
Katritch V, Rueda M, Lam PC, Yeager M, Abagyan R. GPCR 3D homology models for ligand screening:
lessons learned from blind predictions of adenosine A2a receptor complex. Proteins. 2010
Jan;78(1):197-211.
Ligand Guided Agonist-Bound Models of β2 Adrenergic Receptor
Reynolds, Katritch, Abagyan, “Identifying conformational changes of the β2 adrenoceptor that enable accurate prediction of ligand/receptor interactions and screening for GPCR modulators”, JCAMD, 2009.
Katritch, Reynolds, Cherezov, Hanson, Roth, Yeager, Abagyan. Analysis of full and partial agonists binding to beta(2)-adrenergic receptor suggests a role of transmembrane helix V in agonist-specific conformational changes. J Mol Recognit. 2009 Apr 7;22(4):307-318
Katritch V, Abagyan R. GPCR agonist binding revealed by modeling and crystallography. Trends Pharmacol Sci, 2011 Sep 6
LGM Models partition β2AR modulators
• 14K ligand set
• Top ligands retrieved by the antagonist model
• Top ligands retrieved by the agonist model
Reynolds, Katritch, Abagyan, “Identifying conformational changes of the β2 adrenoceptor that enable accurate prediction of ligand/receptor interactions and screening for GPCR modulators”, JCAMD, 2009.
What makes an agonist: X-ray 2011
• Warne, et al., Schertler G, Tate C, The structural basis for agonist and partial agonist action on a β1 adrenergic receptor, Nature, 2011.
• 0.8 Å RMSD
Omit Models for GPCRs
• Loops are impossible to predict; what could be a solution?
• Deleting ECL2 in a model can be tolerated
Reynolds, Katritch, Abagyan, “Identifying conformational changes of the β2 adrenoceptor that enable accurate prediction of ligand/receptor interactions and screening for GPCR modulators”, JCAMD, 2009.
Histamine H1 receptor
• Shimamura et al., Structure of the human histamine H1 receptor complex with doxepin, Nature, 2011
• A first-generation H1R antagonist, doxepin, can cause many types of side effects due to its antagonistic effects on histamine H2, serotonin 5-HT2, α1-adrenergic, and muscarinic acetylcholine receptors
GPCR modeling & docking challenge
• 6 new GPCR structures from the Stevens lab
• 270 models
• ~25 groups per target
• Model server: http://ablab.ucsd.edu/GPCRDock2010
• Contact sim. server: http://ablab.ucsd.edu/SimiCon/
Kufareva I, Rueda M, Katritch V, Stevens RC, Abagyan R. Status of GPCR Modeling and Docking as Reflected by Community-wide GPCR Dock 2010 Assessment. Structure, 2011 Aug 10, 19, 1108-26
Ligand RMSD & contacts
A_ij: contact of atoms i and j in model A, a number between 1 and 0 (no contact)
B_ij: contacts in B
$CS = 1 - CD$
$CD = \frac{\sum_{i,j} |A_{ij} - B_{ij}|}{\sum_{i,j} \max(A_{ij}, B_{ij})}$
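The contact similarity defined on this slide compares two models through their atom-pair contact strengths: the contact difference CD is the summed absolute difference of contact strengths divided by the summed pairwise maximum, and CS = 1 - CD. A small sketch follows, assuming contacts are stored as dictionaries keyed by atom pair, with missing pairs meaning no contact (strength 0). The function name and data layout are illustrative and are not taken from the SimiCon server.

```python
def contact_similarity(a, b):
    """CS = 1 - CD for two contact maps.
    a, b: dicts mapping an (i, j) atom pair to a contact strength in [0, 1]."""
    pairs = set(a) | set(b)
    diff = sum(abs(a.get(p, 0.0) - b.get(p, 0.0)) for p in pairs)
    norm = sum(max(a.get(p, 0.0), b.get(p, 0.0)) for p in pairs)
    if norm == 0.0:
        return 1.0  # no contacts in either model: treated as identical (assumption)
    return 1.0 - diff / norm

model_a = {(1, 2): 1.0, (1, 3): 0.5}
model_b = {(1, 2): 1.0}
cs = contact_similarity(model_a, model_b)  # CD = 0.5 / 1.5, so CS = 2/3
```

Unlike ligand RMSD, this measure stays bounded between 0 and 1 and does not depend on how the two models are superimposed.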
[Figure axes: increasing modeling difficulty; contact similarity CS]
Are we there yet? Naïve models & acceptable
variation
170,000 protein pairs
GPCR models explain subtype specificity
• A2a based models of A2b, A1 and A3 adenosine receptor subtypes built
• 88 ligands for 4 adenosine receptors were used
• Ligand-guided optimization results in 3D models that explain observed specificity
Katritch V, Kufareva I, Abagyan R. Structure based prediction of subtype-selectivity for adenosine receptor antagonists. Neuropharmacology. 2010 Jul 15.
Discovery of potent new A2a antagonists
• Out of 56 high-ranking compounds tested in A2a AdR binding assays, 23 showed affinities under 10 μM, 11 of those had sub-μM affinities, and two compounds had affinities under 60 nM.
Katritch V, Jaakola VP, Lane JR, Lin J, Ijzerman AP, Yeager M, Kufareva I, Stevens RC, Abagyan R. Structure-based discovery of novel chemotypes for adenosine A(2A) receptor antagonists. J Med Chem. 2010 Feb 25;53(4):1799-809
What do we know about 48 Nuclear
Receptors? 3D vs Chemistry (Mar 2011)
Gene    SwissProt code   N of PDBs   Actives   Decoys
NR0B2   NR0B2_HUMAN      3           4         6346
NR1A1   THA_HUMAN        5           307       6042
NR1A2   THB_HUMAN        17          429       5922
NR1B1   RARA_HUMAN       4           216       6114
NR1B2   RARB_HUMAN       1           210       6132
NR1B3   RARG_HUMAN       9           191       6140
NR1C1   PPARA_HUMAN      12          810       5533
NR1C2   PPARD_HUMAN      15          466       5862
NR1C3   PPARG_HUMAN      73          1063      5198
NR1F1   RORA_HUMAN       2           11        6338
NR1H2   NR1H2_HUMAN      8           481       5868
NR1H3   NR1H3_HUMAN      4           470       5875
NR1H4   NR1H4_HUMAN      10          51        6297
NR1I1   VDR_HUMAN        18          170       6179
NR1I2   NR1I2_HUMAN      9           25        6296
NR2B1   RXRA_HUMAN       27          262       6087
NR2B2   RXRB_HUMAN       2           102       6243
NR2B3   RXRG_HUMAN       1           124       6235
NR3A1   ESR1_HUMAN       64          1131      5099
NR3A2   ESR2_HUMAN       21          1003      5218
NR3B1   ERR1_HUMAN       4           46        6303
NR3B3   ERR3_HUMAN       18          17        6332
NR3C1   GCR_HUMAN        9           1075      5092
NR3C2   MCR_HUMAN        11          1062      5127
NR3C3   PRGR_HUMAN       13          1062      5127
NR3C4   ANDR_HUMAN       43          1335      5014
NR5A1   STF1_HUMAN       2           19        6330
• 48 Nuclear Receptors: accumulated knowledge in PDB and ChEMBL
• 38 NRs with crystal structures in Protein Data Bank (PDB)
• 28 with ChEMBL actives, 27 with both (no 3D: ERR2, steroid hr, deafness)
Ligand Only 3D Models: APF
Method
• Build self-consistent flexible superposition
• Compute average atomic property fields
• Dock & Score a new chemical into that field
Totrov M. Atomic property fields: generalized 3D pharmacophoric potential for automated ligand superposition, pharmacophore elucidation and 3D QSAR. Chem Biol Drug Des. 2008 Jan;71(1):15-27.
Grigoryan et al. Spatial chemical distance based on atomic property fields. J Comput Aided Mol Des. 2010 Mar;24(3):173-82
Unifying Docking & Ligand based
methods
ligand
ensemble
binding site
ensemble
3D APF pharmacophores
1D
fingerprints
Neves, Yu-Chen Chen
APF docking vs Pocket Docking
PPARγ activity models,
Docking
AUC= 96.3
3D-Chem-Super
AUC= 94.0
Anaheim ACS 2011 DUD competition:
Test set: 85 actives + 3127 decoys
Training set: 15 actives (co-crystals)
Test set: 35 actives + 1745 decoys
Gene    SwissProt code   N of PDBs   Actives   Decoys
NR1I2   NR1I2_HUMAN      9           25        6296
NR2B1   RXRA_HUMAN       27          262       6087
NR2B2   RXRB_HUMAN       2           102       6243
NR2B3   RXRG_HUMAN       1           124       6235
NR3A1   ESR1_HUMAN       64          1131      5099
NR3A2   ESR2_HUMAN       21          1003      5218
NR3B1   ERR1_HUMAN       4           46        6303
NR3B3   ERR3_HUMAN       18          17        6332
NR3C1   GCR_HUMAN        9           1075      5092
NR3C2   MCR_HUMAN        11          1062      5127
NR3C3   PRGR_HUMAN       13          1062      5127
NR3C4   ANDR_HUMAN       43          1335      5014
NR5A1   STF1_HUMAN       2           19        6330
Three principal types of compound activity models:
• 4D pockets (no bias)
• 4D ligand superposition/APF (small bias)
• 2D chemical machine learning (set dependent)
ICMFF: better force field for protein
structure prediction
• A correcting “hump” potential for phi-psi
• QM fit and X-ray fit, ε=2
• N-C-C angle flexible
• Separate hydrogen and non-H combination rules
[Figure: ICM vs. HLP comparison panels]
Arnautova, Abagyan, Totrov, 2011 Development of a new physics-based internal
coordinate mechanics force field (ICMFF) and its application to protein loop
modeling. Proteins , 79,477
Modeling with implicit membrane
$E_{solv} = \sum_{i=1}^{N_{atoms}} \sigma_i(z_i)\, A_i$
$\sigma_i(z) = \begin{cases} \sigma_i^{mem}, & |z| \le a \\ \frac{1}{b-a}\left[ (b - |z|)\,\sigma_i^{mem} + (|z| - a)\,\sigma_i^{aq} \right], & a < |z| < b \\ \sigma_i^{aq}, & |z| \ge b \end{cases}$
• Peptides folded with ICM near membrane.
• Membrane interaction parameters derived
[Figure: (a) acetylcholine receptor M2, (b) fd bacteriophage coat protein, (c) MerF mercury transport protein, (d) phospholamban, and (e) influenza A virus M2 proton channel; NMR orientation in red]
Bordner AJ, Zorman B, Abagyan R. Efficient molecular mechanics simulations of the folding, orientation, and assembly of peptides in lipid bilayers using an implicit atomic solvation model. JCAMD, 2011
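The implicit membrane model on this slide assigns each atom a solvation parameter that equals its membrane value inside the bilayer core, its aqueous value outside, and a linear blend in the transition zone between depths a and b; the solvation energy is the sum of these parameters times the atomic areas. A minimal sketch follows, assuming the membrane is centered at z = 0 so that depth is measured as |z|. The function names and all numeric values are placeholders, not the published parameters.

```python
def sigma(z, sigma_mem, sigma_aq, a, b):
    """Depth-dependent atomic solvation parameter (piecewise linear in |z|)."""
    depth = abs(z)  # assumes a membrane slab symmetric about z = 0
    if depth <= a:
        return sigma_mem
    if depth >= b:
        return sigma_aq
    return ((b - depth) * sigma_mem + (depth - a) * sigma_aq) / (b - a)

def solvation_energy(atoms, a, b):
    """Sum of sigma_i(z_i) * A_i over atoms.
    atoms: iterable of (z, area, sigma_mem, sigma_aq) tuples."""
    return sum(sigma(z, s_mem, s_aq, a, b) * area for z, area, s_mem, s_aq in atoms)

# Placeholder example: one atom in the membrane core, one in water
example_atoms = [(0.0, 10.0, -1.0, 2.0), (20.0, 5.0, -1.0, 2.0)]
energy = solvation_energy(example_atoms, a=10.0, b=15.0)
```

The linear blend keeps the energy continuous at both boundaries, which matters when peptides are sampled across the interface.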
Conclusions & Future
• 77,000 Protein Data Bank (PDB) structures translate into 2000+ experimental pocket ensembles
• Docking/scoring to a few diverse pockets recognizes an activity of a chemical compound (no training or bias)
• ICM docks ligands in seconds with 91% of correct top scoring poses (95% top 3)
• Ligand-guided pocket models help SBDD (GPCRs)
• Ligand-based pharmacophoric field models (APF) may reach a comparable performance
• Promiscuous receptors/enzymes with broad specificity, e.g. PXR, Cyps, HERG, transporters etc., are presently undockable, but APF/docking models may still be useful.
Acknowledgements
Lab Members
• Irina Kufareva (P4D,
gpcr2011 )
• Marco Neves (docking)
• Andrey Ilatovsky (P4D)
• Manuel Rueda
• Yu-chen Chen
Former Lab Members
• William Bisson (LGM)
• Giovanni Bottegoni (4D)
• Seva Katritch (GPCR)
• So-Jung Park (NR)
• Kim Reynolds
• J.H.An
Molsoft
• Maxim Totrov (ICM)
• Eugene Raush (activeICM)
• Polo Lam (GPCR models)
• Elena Arnautova (force
field)
Funding from NIGMS
Collaborators
4D: Andrea Cavalli, Walter R., IIT
Membranes: Andy Bordner, AZ
GPCRs Ray Stevens, Vadim
Cherezov Lab @ Scripps, La
Jolla
Tracy Handel Lab, UCSD
Larry Miller, Mayo, AZ
Patrick Sexton, Arthur
Christopoulos, Monash,
Melbourne. Family B modeling
Ligand Discovery & Modeling
John Reed, Alex Terskikh, and Bob Liddington labs, Burnham Institute, LJ
Claude Cochet, Grenoble, France
Susan Taylor, Palmer Taylor, UCSD
Edmond Ma, KY Wong, C.M.Che Hong Kong
Phil Cole, Hopkins
Mark Yeager, Virginia
Tim Cardozo, NYU