Presentation - Chemical Information BULLETIN

Privileged Substructures Revisited:
Target Community-Selective Scaffolds
Jürgen Bajorath
Life Science Informatics
University of Bonn
Privileged Substructures
• First postulated by Evans et al. in 1988, based on the observation that many cholecystokinin antagonists contained conserved substructures not frequently seen in other active compounds
• Since then, the search for target class-privileged chemotypes has continued in medicinal chemistry
• Generally accepted definition:
- Recurrent fragments in ligands of a given target family
- Selective at the family level, but not for individual targets
Evans BE et al. J. Med. Chem. 1988, 31, 2235-2246
Privileged Substructures
• The existence of truly target family-privileged substructures has remained controversial
• Intrinsic limitation: the search for privileged substructures has been based on frequency-of-occurrence analysis of preselected substructures
• Conclusion often drawn: a substructure might occur with high frequency among ligands of a particular target family but also act on other families
Privileged Substructures
Are target family-privileged substructures truly privileged?
Target Family Set                  # Compounds   # Substructures
GPCR class A                       21,620        1,190
Ligand gated ion channels          3,792         297
Nuclear hormone receptors (NHRs)   2,176         121
Protein kinases                    1,079         101
Serine proteases                   3,015         323
Schnur DM et al. J. Med. Chem. 2006, 49, 2000-2009
Privileged Substructures
Are target family-privileged substructures truly privileged?
Substructure sets (rows) vs. ligand sets (columns)
Target Family Substructure Sets    GPCR   Ion channels   NHRs   Protein kinases   Serine proteases   Random cpd sets
GPCR class A                       -      26%            10%    11%               17%                46%
Ligand gated ion channels          47%    -              15%    19%               92%                99%
Nuclear hormone receptors (NHRs)   40%    30%            -      17%               15%                45%
Protein kinases                    48%    34%            16%    -                 20%                57%
Serine proteases                   25%    11%            7%     91%               -                  37%
Schnur DM et al. J. Med. Chem. 2006, 49, 2000-2009
Changing the Analysis Concept
• Do molecular scaffolds exist that occur exclusively in ligands of individual target families?
(Figure: target families, e.g. Peptidases, GPCRs, Kinases, ...)
- Bemis & Murcko framework (scaffold)
- Large-scale distribution in target families
• Departing from frequency-of-occurrence analysis of preselected substructures
• Systematic compound data mining taking all available activity annotations into account
Hierarchical Scaffolds
Compound = (1) R-groups + Framework
Framework = (2) Ring Systems + (3) Linkers
Bemis GW and Murcko MA. J. Med. Chem. 1996, 39, 2887-2893
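
The decomposition above can be illustrated with a minimal sketch, assuming a purely topological view of a molecule: atoms are integer indices, bonds are index pairs, hydrogens are implicit, and terminal atoms are pruned repeatedly until only ring systems and linkers remain. The function names and the toy molecule are illustrative assumptions, not part of the original analysis, and details such as atom and bond types are ignored.

```python
def adjacency_from_bonds(bonds):
    """Build an undirected adjacency map from (atom, atom) index pairs."""
    adjacency = {}
    for a, b in bonds:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def framework_atoms(adjacency):
    """Return the atoms left after repeatedly removing terminal atoms.

    What remains are ring atoms plus the linker atoms connecting rings;
    an acyclic molecule yields an empty set.
    """
    graph = {atom: set(nbrs) for atom, nbrs in adjacency.items()}
    terminal = [atom for atom, nbrs in graph.items() if len(nbrs) <= 1]
    while terminal:
        atom = terminal.pop()
        if atom not in graph:  # already removed in an earlier step
            continue
        for nbr in graph.pop(atom):
            graph[nbr].discard(atom)
            if len(graph[nbr]) <= 1:
                terminal.append(nbr)
    return set(graph)


# Toy molecule (hypothetical): two six-membered rings joined by a
# one-atom linker, with a single substituent atom (index 13) on ring B.
ring_a = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
linker = [(0, 6), (6, 7)]
ring_b = [(7, 8), (8, 9), (9, 10), (10, 11), (11, 12), (12, 7)]
substituent = [(10, 13)]

molecule = adjacency_from_bonds(ring_a + linker + ring_b + substituent)
framework = framework_atoms(molecule)  # substituent atom 13 is pruned
chain = framework_atoms(adjacency_from_bonds([(0, 1), (1, 2)]))  # no rings
```

Pruning terminal atoms keeps the linker atom because it lies on a path between two rings, which is the property that separates linkers from R-groups in this simplified view.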
Public Data Source - BindingDB
• BindingDB database:
- Public repository of activity information of small molecules
- ~31,000 compound entries with ~57,000 activity annotations
- 17,745 compounds active against human targets extracted
Analysis Strategy - Compound Sets
• Target pair sets:
- Active compounds are organized into target pair sets
- A set contains all compounds active against two individual targets (i.e., compounds might belong to multiple sets)
• BindingDB target pair sets:
- Sets obtained for 520 pairs of targets that share >= 5 compounds
- 6,343 compounds active against 259 human targets
• PubChem confirmatory bioassays:
- Only 3 relevant human target pairs meet the >= 5 compound criterion
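
The construction of target pair sets can be sketched as follows. This is an illustration under stated assumptions, not the original workflow: activity annotations are taken to be simple (compound, target) tuples, and the function name, identifiers, and toy data are hypothetical. Only the >= 5 shared-compound criterion comes from the slide.

```python
from collections import defaultdict
from itertools import combinations


def target_pair_sets(activities, min_shared=5):
    """Group compounds by every pair of targets they are active against.

    activities: iterable of (compound_id, target_id) tuples.
    Returns {(target_1, target_2): set_of_compounds} for pairs that
    share at least `min_shared` compounds.
    """
    targets_by_compound = defaultdict(set)
    for compound, target in activities:
        targets_by_compound[compound].add(target)

    pair_sets = defaultdict(set)
    for compound, targets in targets_by_compound.items():
        # A compound active against n targets lands in n*(n-1)/2 pair sets.
        for pair in combinations(sorted(targets), 2):
            pair_sets[pair].add(compound)

    return {pair: cpds for pair, cpds in pair_sets.items()
            if len(cpds) >= min_shared}


# Toy data: five compounds hit both FXa and Thrombin; one also hits Trypsin.
toy_activities = [(f"c{i}", t) for i in range(1, 6) for t in ("FXa", "Thrombin")]
toy_activities.append(("c1", "Trypsin"))

pairs = target_pair_sets(toy_activities)
```

In this toy case only the FXa/Thrombin pair survives the filter, because the two pairs involving Trypsin each share a single compound.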
Compound-Based Target Network
• 520 target pairs are visualized in a network representation
- Nodes: targets
- Edges: target pair sets
- Edge width: number of shared compounds
• Densely connected communities
- 18 communities
- >= 4 targets
- Different target families
(Figure: target network with communities labeled 1-18)
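
As a rough illustration of the network representation, the sketch below turns target pair sets into a weighted adjacency map, with the number of shared compounds as the edge weight. The data layout and names are assumptions made for this example. How the communities were identified is not stated on the slide, so no community detection step is shown.

```python
from collections import defaultdict


def build_target_network(pair_sets):
    """Nodes are targets; each edge carries the number of shared compounds.

    pair_sets: {(target_1, target_2): set_of_compounds}
    Returns {target: {neighbor: shared_compound_count}}.
    """
    network = defaultdict(dict)
    for (t1, t2), compounds in pair_sets.items():
        weight = len(compounds)
        network[t1][t2] = weight
        network[t2][t1] = weight
    return dict(network)


# Toy pair sets (hypothetical compound identifiers).
toy_pair_sets = {
    ("FXa", "Thrombin"): {"c1", "c2", "c3", "c4", "c5"},
    ("Thrombin", "Trypsin"): {"c3", "c4", "c5", "c6", "c7", "c8"},
}

network = build_target_network(toy_pair_sets)
thrombin_edges = network["Thrombin"]
```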
Community-Selective Scaffolds
• 520 human target pair sets (6,343 BDB compounds; 259 targets); 18 target communities
• 206 community-selective scaffolds:
- Act exclusively in a single community
- With 5 - 45 compounds/scaffold (av. ~12)
- Yielding 147 distinct carbon skeletons (topological diversity)
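
The selection criterion can be sketched in a few lines, assuming each record links a scaffold, one of its compounds, and the community in which that compound is active. The record layout, the function name, and the use of 5 compounds as a minimum (taken here from the lower end of the reported 5 - 45 range) are assumptions for illustration.

```python
from collections import defaultdict


def community_selective_scaffolds(records, min_compounds=5):
    """Return {scaffold: community} for scaffolds seen in exactly one community.

    records: iterable of (scaffold, compound, community) tuples.
    """
    communities = defaultdict(set)
    compounds = defaultdict(set)
    for scaffold, compound, community in records:
        communities[scaffold].add(community)
        compounds[scaffold].add(compound)

    return {
        scaffold: next(iter(found))
        for scaffold, found in communities.items()
        if len(found) == 1 and len(compounds[scaffold]) >= min_compounds
    }


# Toy records: S1 is confined to community 3, S2 also appears in
# community 7, and S3 has too few compounds.
toy_records = (
    [("S1", f"a{i}", 3) for i in range(5)]
    + [("S2", f"b{i}", 3) for i in range(5)]
    + [("S2", "b9", 7)]
    + [("S3", f"d{i}", 1) for i in range(2)]
)

selective = community_selective_scaffolds(toy_records)
```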
Adding Selectivity Information
• For each compound active against a target pair, its target selectivity (TS) is calculated as:
TS = pKi(A) - pKi(B)
• Compound |TS| values range from 0 to 6.86
- 0: equal potency, no selectivity
- 6.86: potency difference of nearly 7 orders of magnitude, i.e., highly selective for one target over the other
• Selectivity profiles of scaffolds
- Community-based
- Target-based
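
A short worked example of the TS calculation, assuming Ki values are given in molar units and using the standard definition pKi = -log10(Ki). The function names and the example potencies are illustrative.

```python
import math


def pki(ki_molar):
    """Convert a Ki value in mol/L to pKi."""
    return -math.log10(ki_molar)


def target_selectivity(ki_a_molar, ki_b_molar):
    """TS = pKi(A) - pKi(B); positive means more potent against target A."""
    return pki(ki_a_molar) - pki(ki_b_molar)


def fold_difference(ts):
    """Potency ratio implied by a TS value (one log unit = 10-fold)."""
    return 10 ** abs(ts)


# Hypothetical compound: Ki = 1 nM against target A, 1 uM against target B.
ts = target_selectivity(1e-9, 1e-6)   # about 3 log units
fold = fold_difference(ts)            # about 1000-fold
max_fold = fold_difference(6.86)      # the largest |TS| reported above
```

Read this way, the maximum |TS| of 6.86 corresponds to a potency ratio of roughly 7 million, consistent with "nearly 7 orders of magnitude".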
Selectivity Profiles
• Community-based selectivity profile:
- For each scaffold found in a given community
  • All corresponding compounds active against any target pair in this community are pooled
  • Median of their absolute TS values is determined (median |TS|)
• Target-based selectivity profile:
- For each scaffold active against a given target
  • All corresponding compounds active against this target are pooled
  • Selectivity against any other target is calculated
  • Median of their TS values is determined (median TS)
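
Both profiles reduce to pooling TS values by key and taking a median. The sketch below assumes TS values have already been computed per compound and target pair; the record layouts and function names are hypothetical.

```python
from collections import defaultdict
from statistics import median


def community_profile(records):
    """Median |TS| per (scaffold, community).

    records: iterable of (scaffold, community, ts) tuples, one per
    compound and target pair within the community.
    """
    pooled = defaultdict(list)
    for scaffold, community, ts in records:
        pooled[(scaffold, community)].append(abs(ts))
    return {key: median(values) for key, values in pooled.items()}


def target_profile(records):
    """Median signed TS per (scaffold, target).

    records: iterable of (scaffold, target, other_target, ts) tuples,
    where ts = pKi(target) - pKi(other_target).
    """
    pooled = defaultdict(list)
    for scaffold, target, _other_target, ts in records:
        pooled[(scaffold, target)].append(ts)
    return {key: median(values) for key, values in pooled.items()}


# Toy values for one scaffold in community 3.
by_community = community_profile(
    [("S1", 3, 2.5), ("S1", 3, -0.5), ("S1", 3, 1.0)]
)
by_target = target_profile(
    [
        ("S1", "FXa", "Thrombin", 2.0),
        ("S1", "FXa", "Trypsin", 3.0),
        ("S1", "Thrombin", "FXa", -2.0),
    ]
)
```

The community profile uses absolute values because it asks only whether a scaffold yields selective compounds at all, while the target profile keeps the sign to show which target is favored.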
Community Selectivity of Scaffolds
• Scaffold / Community heat map:
- Columns: target communities
- Rows: scaffolds
- Color spectrum: median |TS|
  • Red: scaffold yields many compounds with different potency against individual targets
  • Yellow: scaffold does not yield selective compounds
• Non-selective scaffolds
- Occur in multiple communities
• Community-selective scaffolds
- Occur exclusively in one community
Target Selectivity of Scaffolds
• Scaffold / Target heat map:
- Columns: targets in a community
- Rows: scaffolds
- Cell: the scaffold represents >= 5 compounds active against the target
- Color spectrum: median TS
  • Red (positive): more selective for the target over others in the community
  • Yellow (negative): more selective for other members of the community
Target Selectivity of Scaffolds
Community 3: 16 serine proteases
• Different scaffolds display the same selectivity profile
- e.g. Factor Xa/Thrombin
• Scaffolds with no apparent target selectivity
• Number of scaffolds per target varies
- Factor Xa: 17; Thrombin: 18
- Tryptase: 0; Hepsin: 0
Target Selectivity Ranking
• Community-selective scaffolds are ranked according to median |TS|
- 37 scaffolds with at least half of their compounds having >= 100-fold potency differences against >= 2 community targets
- 111 scaffolds with target-selective tendency
(Figure: scaffolds ranked by median |TS|; axis ticks at 0, 1, 2 and 5.2)
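
The ranking step can be sketched as a sort on median |TS|. Because a 100-fold potency difference equals 2 log units, a median |TS| of at least 2 means that at least half of the pooled values reach 100-fold. The scaffold names and values below are hypothetical, and the cutoff behind the "target-selective tendency" group is not given on the slide, so it is not modeled.

```python
import math


def rank_by_median_ts(median_abs_ts):
    """Sort scaffolds by median |TS|, most selective first.

    median_abs_ts: {scaffold: median |TS|}
    Returns a list of (scaffold, median |TS|) tuples.
    """
    return sorted(median_abs_ts.items(), key=lambda item: item[1], reverse=True)


def at_least_fold(median_abs_ts, min_fold=100):
    """Scaffolds whose median |TS| corresponds to >= min_fold potency ratio."""
    threshold = math.log10(min_fold)  # 100-fold -> 2 log units
    return [name for name, value in rank_by_median_ts(median_abs_ts)
            if value >= threshold]


# Hypothetical median |TS| values.
toy_medians = {"S1": 4.03, "S2": 1.10, "S3": 0.20, "S4": 2.50}

ranking = [name for name, _ in rank_by_median_ts(toy_medians)]
hundred_fold = at_least_fold(toy_medians)
```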
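Community-Selective Scaffolds
(Figure: example community-selective scaffolds with target heat maps)
- Rank 98: median |TS| 1.10
- Rank 3: median |TS| 4.03
- Targets shown: DPP4, DPP8; CA1, CA2, CA3, CA4, CA5A, CA5B, CA6, CA7, CA9, CA12, CA14
- Color spectrum: median TS
  • Red: high potential to yield target-selective compounds
  • Yellow: low potential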
Selectivity Searching (MDDR)
Targets shown: Thrombin, FXa
Highly selective for FXa over other serine proteases
Selectivity Searching
Targets shown: Caspase 7, Caspase 3
Inhibit both caspase 3 and 7 with nM potency; ~200-fold selective over caspases 1, 6, 8
Extending the Analysis: ChemblDB
• Recent public domain database: ChemblDB
- ~500,000 compounds with activity information
- 32,848 compounds with high-confidence annotations active against 671 human targets
• High-confidence activity annotations:
- Target confidence level: 9
- Interaction type: D(irect)
ftp://ftp.ebi.ac.uk/pub/databases/chembl/latest/
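
The high-confidence filter amounts to two equality checks per annotation. The sketch below assumes annotations have been exported as dictionaries; the key names (`confidence`, `interaction_type`) and the toy records are hypothetical and do not reflect the actual database schema.

```python
def high_confidence(annotations, confidence_level=9, interaction_type="D"):
    """Keep annotations with the given target confidence and interaction type.

    annotations: iterable of dicts with (assumed) keys
    'compound', 'target', 'confidence', and 'interaction_type'.
    """
    return [
        record for record in annotations
        if record["confidence"] == confidence_level
        and record["interaction_type"] == interaction_type
    ]


# Toy annotations (hypothetical identifiers and values).
toy_annotations = [
    {"compound": "c1", "target": "FXa", "confidence": 9, "interaction_type": "D"},
    {"compound": "c2", "target": "FXa", "confidence": 8, "interaction_type": "D"},
    {"compound": "c3", "target": "Thrombin", "confidence": 9, "interaction_type": "H"},
]

kept = high_confidence(toy_annotations)
kept_compounds = [record["compound"] for record in kept]
```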
ChemblDB vs. BindingDB
• Comparison at different levels
- Active compounds (human targets)
- Scaffolds
- Network
- Community-selective scaffolds
- Topologically distinct scaffolds
Compounds: ChemblDB 32,848; BDB 17,745; shared 3,589
Scaffolds: ChemblDB 12,902; BDB 6,291; shared 1,409
ChemblDB vs. BindingDB
• Comparison at the network level
(Figure: BDB and CDB target networks with shared and unique targets marked; labeled regions: GPCRs, tyrosine kinases)
ChemblDB vs. BindingDB
• Comparison at the level of community-selective and topologically distinct scaffolds
Community-selective scaffolds: ChemblDB 311; BDB 206; shared 34
Topologically distinct scaffolds: ChemblDB 227; BDB 147; shared 85
Community-Selective Scaffolds
• Distribution in drugs?
- DrugBank: 1,247 approved drugs with 726 unique scaffolds
- Only 11 overlap with the 206 community-selective BDB scaffolds
- Community-selective scaffolds are currently underrepresented in drugs; opportunities for further chemical exploration
Conclusions
• The existence of target class-privileged substructures has remained controversial over the years
• From putative privileged substructures to confirmed target community-selective scaffolds through systematic data mining
• Community-selective scaffolds are abundant and topologically diverse
• A subset of community-selective scaffolds displays a notable tendency to produce compounds with different target selectivity
• BDB and CDB contain complementary target and scaffold information
Acknowledgments
Ye Hu
Anne Mai Wassermann
Eugen Lounkine