Sternberg_Movie

In silico small molecule discovery
[Figure: drug discovery pipeline. Target gene → Discover hit → Hit to lead → Optimise lead → Clinical → Sales. Hit discovery starts from a target gene identified with a viable assay and proceeds by high throughput screen or in silico.]
Case 1 – receptor structure known
[Figure: database of 100,000+ molecules → computer docks molecules into receptor → ~100 novel in silico hits → secondary assay → ? hits with IC50 < 10 µM]
How successful is this method?
• From Shoichet's group, on the target protein tyrosine phosphatase 1B

Method                            Compounds tested     Hits with IC50 < 10 µM   Hit rate
High throughput screening (HTS)   400,000              6                        0.001%
In silico docking                 365 (from docking)   18                       5%
• None of the in silico hits were found by HTS
• But the method is unpredictable: other systems yield hit rates < 1%
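
The hit rates quoted for the two methods follow directly from the counts on the slide. A minimal sketch of the arithmetic is given here; the function name `hit_rate_percent` is hypothetical, and the enrichment factor is an illustrative derived quantity rather than a figure from the talk.

```python
def hit_rate_percent(hits: int, tested: int) -> float:
    """Return the number of hits as a percentage of the compounds tested."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * hits / tested


# Counts taken from the slide (hits with IC50 < 10 µM).
hts_rate = hit_rate_percent(6, 400_000)    # high throughput screening
docking_rate = hit_rate_percent(18, 365)   # in silico docking

# Illustrative only: how many times higher the docking hit rate is.
enrichment = docking_rate / hts_rate

print(f"HTS hit rate:     {hts_rate:.4f}%")
print(f"Docking hit rate: {docking_rate:.1f}%")
print(f"Enrichment:       {enrichment:.0f}x")
```

The slide rounds these values to 0.001% and 5%.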
How does one get the receptor structure?
• X-ray structure already available at the RCSB Protein Data Bank
• Set up a structure determination
• Predict the structure
X-ray crystallography pipeline
[Figure: Cloning → Expression → Recombinant protein → Protein purification (mg quantities) → Crystallization → Protein crystals → X-ray diffraction pattern → Electron density map → Protein structure]
Predicting protein structure by homology
[Figure: query sequence → match sequence against library of known folds → matched fold]
• Phyre: www.sbg.bio.ic.ac.uk
• Phyre and its predecessor 3D-PSSM: > 1,000 citations
Case 2: Ligand activity data available
[Figure: observed activity → structure-activity rules → screen database → novel in silico hits]
INDDEx™: a logic-based method
• Muggleton & Sternberg developed a logic-based strategy
• The method is now incorporated into INDDEx within an Imperial spin-out, Equinox Pharma
• INDDEx is designed to exploit active and inactive data on at least c. 5 ligands, but ideally more
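Logic rules lead to new chemotypes
[Figure: two molecules of different chemotypes, each containing fragments A, B, C and D, with A 7 Å from B]
• Fragment C is bonded to fragment D
• Fragment B is bonded to fragment C
• Fragment A is 7 Å from fragment B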
INDDEx can learn complex rules from simpler facts
[Figure: fragments A, B, C and D, with A 7 Å from B]
Fragment A is 7 Å from fragment B, which is bonded to fragment C, which is bonded to fragment D
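
To make the shape of such a rule concrete, here is a minimal sketch of how it could be checked against a single molecule. This is not INDDEx's implementation: the molecule representation (a dictionary of fragment name to type and 3D coordinates, plus a set of bonded pairs), the function name `matches_rule` and the 0.5 Å distance tolerance are all assumptions made for illustration.

```python
import math


def matches_rule(fragments, bonds, target=7.0, tol=0.5):
    """Check the rule: A is ~7 Å from B, B is bonded to C, C is bonded to D.

    fragments: dict mapping fragment name -> (fragment type, (x, y, z))
    bonds:     set of frozenset({name1, name2}) for bonded fragment pairs
    target, tol: distance in Å and allowed deviation (tol is an assumption)
    """
    # Group fragment names by their type ("A", "B", "C", "D").
    by_type = {}
    for name, (ftype, _) in fragments.items():
        by_type.setdefault(ftype, []).append(name)

    for a in by_type.get("A", []):
        for b in by_type.get("B", []):
            gap = math.dist(fragments[a][1], fragments[b][1])
            if abs(gap - target) > tol:
                continue  # simple fact 1 fails: A is not ~7 Å from B
            for c in by_type.get("C", []):
                if frozenset((b, c)) not in bonds:
                    continue  # simple fact 2 fails: B is not bonded to C
                for d in by_type.get("D", []):
                    if frozenset((c, d)) in bonds:
                        return True  # simple fact 3 holds: C is bonded to D
    return False


# Toy molecules with made-up coordinates (in Å).
active = {
    "f1": ("A", (0.0, 0.0, 0.0)),
    "f2": ("B", (7.0, 0.0, 0.0)),
    "f3": ("C", (8.5, 0.0, 0.0)),
    "f4": ("D", (10.0, 0.0, 0.0)),
}
active_bonds = {frozenset(("f2", "f3")), frozenset(("f3", "f4"))}

# Same fragments, but A sits only 4 Å from B.
inactive = dict(active, f1=("A", (3.0, 0.0, 0.0)))

print(matches_rule(active, active_bonds))
print(matches_rule(inactive, active_bonds))
```

The nested loops mirror the way the complex rule is a conjunction of the three simpler facts: each inner level is reached only if the fact above it already holds.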

Rules can be understood by chemists

Standard programs:
Activity = 0.45 LogP + 0.56667 Lumo +1.65 V
ILP rule:
In an active molecule:
Fragment A is 7Å from fragment B which is bonded
to fragment C which is bonded to fragment D
[Figure: fragments A, B, C and D, with A 7 Å from B]

Blind trial of hit discovery on GPCR-1
- Data from literature
[Figure: observed activity from the literature → INDDEx (in silico at Equinox) → 250 novel in silico hits → order 157 compounds (chemistry) → test at Cerep → 30 verified in vitro hits, new chemotypes]
Equinox outsourced wet chemistry and biology
GPCR-1: training set
[Figure: distribution of 686 training molecules collected from the public domain, split into actives and inactives]
GPCR Target 1: hits for optimisation
• 4.7M molecules in ZINC database
• 400,000 drug-like molecules
• 500 in silico hits
• 250 hits & new chemotypes
• 157 tested for inhibition
• 76 actives
• 39 tested for IC50
• 30 confirmed
• 30 chemotypes
GPCR-1: results of primary screening
[Figure: CB1 results, primary screening. Bar chart of number of hits against percent of specific binding, in bins >70%, 60%-70%, 50%-60%, 40%-50%, 30%-40% and <30%. Bar labels 19, 10, 9, 22 and 16 (true hits) and 81 (false hits).]
Number of in silico hits: 157 (10 µM concentration)
Number of actives: 76
Number of inactives: 81
Primary screen success rate = 48%
GPCR-1: new chemotypes
[Figure: CB1 results, new chemotypes. Distribution of hits based on their diversity (Tanimoto coefficients): number of hits against Tanimoto coefficient, in bins <0.60, 0.60-0.70 and 0.70-0.75. Bar labels 14, 8 and 8.]
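
The Tanimoto coefficient used to measure diversity is the size of the intersection of two fingerprints divided by the size of their union. A minimal sketch follows, assuming each fingerprint is represented as a set of "on" bit positions. The fingerprint values and the helper `max_similarity` are hypothetical, and the slides do not say which fingerprint type was used.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient |A ∩ B| / |A ∪ B| of two bit-set fingerprints."""
    union = fp_a | fp_b
    if not union:
        return 0.0  # convention assumed here for two empty fingerprints
    return len(fp_a & fp_b) / len(union)


def max_similarity(candidate: set, references: list) -> float:
    """Highest Tanimoto coefficient between a candidate and any reference."""
    return max(tanimoto(candidate, ref) for ref in references)


# Made-up fingerprints: sets of "on" bit positions.
hit = {1, 2, 3, 4}
training_actives = [{3, 4, 5, 6}, {1, 2, 3, 9}]

print(tanimoto(hit, training_actives[0]))
print(max_similarity(hit, training_actives))
```

A low maximum similarity to every known active is one way to argue that a hit represents a new chemotype, which is the reading suggested by the bins on the slide.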
Equinox hit discovery on GPCR-2
- Data from BioPrint (Cerep)
[Figure: observed activity from BioPrint → INDDEx (in silico at Equinox) → 250 novel in silico hits → order 94 compounds (chemistry) → test at Cerep → 28 verified in vitro hits]
Equinox outsources wet chemistry and biology
Confirmed hit rate of in silico predictions on secondary screen: c. 35%

                                                       Target 1       Target 2
In silico hits                                         157            94
Primary screen hits (>30% binding at 10 µM)            76             42
No. of compounds tested for IC50                       39             28
IC50 results (<12 µM)                                  30             28
Estimated secondary hits if all primary hits tested    40             42
Estimated hit rate                                     38/157 = 24%   42/94 = 45%
  (estimated secondary hits / in silico hits)
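
The rates in the table can be reproduced from its counts. The sketch below is illustrative: the class `ScreenFunnel` and its field names are hypothetical, and reading "c. 35%" as the mean of the two estimated rates is an assumption, since the slide does not state how that figure was derived.

```python
from dataclasses import dataclass


@dataclass
class ScreenFunnel:
    name: str
    in_silico_hits: int
    primary_hits: int
    est_secondary_hits: int

    @property
    def primary_rate(self) -> float:
        """Primary screen hits as a percentage of in silico hits."""
        return 100.0 * self.primary_hits / self.in_silico_hits

    @property
    def estimated_rate(self) -> float:
        """Estimated secondary hits as a percentage of in silico hits."""
        return 100.0 * self.est_secondary_hits / self.in_silico_hits


# Counts from the slide. For Target 1 the slide's ratio uses 38 as the
# numerator although its table row lists 40; the ratio's own numerator
# is used here.
targets = [
    ScreenFunnel("Target 1", in_silico_hits=157, primary_hits=76, est_secondary_hits=38),
    ScreenFunnel("Target 2", in_silico_hits=94, primary_hits=42, est_secondary_hits=42),
]

for t in targets:
    print(f"{t.name}: primary {t.primary_rate:.0f}%, estimated {t.estimated_rate:.0f}%")

# Assumption: the headline figure is the unweighted mean over the two targets.
mean_rate = sum(t.estimated_rate for t in targets) / len(targets)
print(f"Mean estimated hit rate: {mean_rate:.1f}%")
```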
Comparative hit rates

Company / approach   Target                        Hit rate         Technology
INDDEx               GPCR 1 & 2 + unknown target   35%              Ligand-based
Structure-based      Multiple targets              Average < 2%     Docking into 3D structure
High throughput      Multiple targets              Average 0.001%   Experimental screening
Concluding remarks
• If a protein structure is available, one can initiate an in silico screening approach to find hits.
– Success rate generally < 2%
– X-ray structure determination requires mg quantities of material
– Structure can be predicted if sequence identity > 50%
• If structure-activity data are available, in silico methods can yield far better hit rates, c. 35%
• In silico methods complement high throughput screening and can find different hits
In silico small molecule discovery
• Michael Sternberg, Ata Amini, Paul Freemont & Michael
Sternberg
• Imperial College London
– www.sbg.bio.ic.ac.uk & www.doc.ic.ac.uk/~shm
– www.equinoxpharma.com