Title: Plate-Based Diversity Subset Screening: An Efficient Paradigm

SUPPLEMENTARY INFORMATION for
Title: Plate-Based Diversity Subset Screening: An Efficient Paradigm for High
Throughput Screening of a Large Screening File
Journal: Molecular Diversity
Authors: Andrew S. Bell, Joseph Bradley, Jeremy R. Everett, Michelle Knight, Jens Loesel,
John Mathias, David McLoughlin, James Mills, Robert E. Sharp, Christine Williams, Terence
P. Wood
Corresponding Authors
Jeremy R Everett: j.r.everett@greenwich.ac.uk
Jens Loesel: Jens.Loesel@ecotoxchem.co.uk
BCUT cell occupancy | Total number of BCUT cells | Total number of compounds | Number of cells covered by PBDS | Number of cells double-covered by PBDS | % of cells covered by PBDS | % of cells double-covered by PBDS
1 | 6,968 | 6,968 | 1,301 | N/A | 19% | N/A
2 | 4,379 | 8,757 | 1,508 | 163 | 34% | 4%
3 | 2,482 | 7,447 | 1,149 | 263 | 46% | 11%
4 | 1,994 | 7,975 | 1,105 | 377 | 55% | 19%
5 | 1,514 | 7,570 | 960 | 413 | 63% | 27%
6 | 1,185 | 7,110 | 828 | 407 | 70% | 34%
7 | 948 | 6,636 | 697 | 407 | 74% | 43%
8 | 882 | 7,056 | 698 | 387 | 79% | 44%
9 | 740 | 6,660 | 609 | 390 | 82% | 53%
Totals | 21,092 | 66,180 | | | |
Supplementary Table 1: the profile of low occupancy cells, with 1 to 9 compounds per cell, in the Pfizer
screening file at the time of the construction of PBDS and the coverage of those cells by PBDS. N/A = not
applicable, as single occupancy cells cannot be double-covered. Even though these low occupancy cells were
not targeted by the PBDS design algorithm, they are enriched relative to random selection.
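The percentage columns of Supplementary Table 1 follow from the cell counts: each is the number of covered (or double-covered) cells divided by the total number of cells at that occupancy. The sketch below recomputes them from the counts transcribed from the table. The names `TABLE_1`, `percent` and `coverage_percentages` are illustrative only, and rounding to the nearest whole percent is an assumption about how the reported values were produced.

```python
# Rows transcribed from Supplementary Table 1:
# (cell occupancy, total cells, cells covered by PBDS, cells double-covered by PBDS)
# None marks the N/A entry: single-occupancy cells cannot be double-covered.
TABLE_1 = [
    (1, 6968, 1301, None),
    (2, 4379, 1508, 163),
    (3, 2482, 1149, 263),
    (4, 1994, 1105, 377),
    (5, 1514, 960, 413),
    (6, 1185, 828, 407),
    (7, 948, 697, 407),
    (8, 882, 698, 387),
    (9, 740, 609, 390),
]


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer (assumed rounding)."""
    return round(100 * part / whole)


def coverage_percentages(rows):
    """Return (occupancy, % covered, % double-covered or None) for each table row."""
    result = []
    for occupancy, cells, covered, double_covered in rows:
        pct_covered = percent(covered, cells)
        pct_double = None if double_covered is None else percent(double_covered, cells)
        result.append((occupancy, pct_covered, pct_double))
    return result


for occupancy, pct_covered, pct_double in coverage_percentages(TABLE_1):
    print(occupancy, pct_covered, pct_double)
```

Summing the second column of `TABLE_1` gives the 21,092 cells reported in the Totals row.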
Plate-Based Diversity Screening (PBDS) Supplementary Information C
9-Feb-16
Page 1 of 6
HTS Campaign | Method of Series Definition | Number of series identified from primary HTS data | Mean Series Size | Minimum Series Size | Maximum Series Size | Median Series Size | Number of Series contained in PBDS | Relative Recall % (PBDS/HTS)
Beta2 | BCI Cluster | 468 | 9.0 | 5 | 57 | 7 | 273 | 58%
Beta2 | Murcko | 123 | 13.2 | 5 | 193 | 8 | 71 | 58%
Beta2 | BCUT | 213 | 10.1 | 5 | 51 | 7 | 128 | 60%
D3 | BCI Cluster | 499 | 9.4 | 5 | 35 | 8 | 291 | 58%
D3 | Murcko | 162 | 11.2 | 5 | 126 | 7 | 83 | 51%
D3 | BCUT | 240 | 10.2 | 5 | 63 | 8 | 133 | 55%
MC3 | BCI Cluster | 444 | 9.8 | 5 | 40 | 8 | 253 | 57%
MC3 | Murcko | 86 | 10.7 | 5 | 74 | 8 | 38 | 44%
MC3 | BCUT | 211 | 9.7 | 5 | 47 | 7 | 117 | 56%
HIV RT | BCI Cluster | 455 | 8.6 | 5 | 41 | 7 | 181 | 40%
HIV RT | Murcko | 124 | 10.8 | 5 | 96 | 7 | 52 | 42%
HIV RT | BCUT | 209 | 10.8 | 5 | 205 | 7 | 89 | 43%
V1A | BCI Cluster | 1426 | 10.4 | 5 | 56 | 9 | 614 | 43%
V1A | Murcko | 484 | 11.9 | 5 | 247 | 7 | 201 | 42%
Supplementary Table 2: the distribution of the number of primary hit series found by full file singleton HTS in
five representative campaigns and the properties of those series, together with an analysis of the number and %
of those series that would have been found if the Plate-Based Diversity Subset, PBDS had been used instead of
the full Pfizer screening file. Three different methods of series definition were used: BCI fingerprints, Murcko
scaffolds and BCUTs. A minimum series size of 5 compounds was defined in advance. The five targets were:
the human beta-2 adrenergic receptor (Beta2), the human D3 dopamine receptor (D3), the human melanocortin
3 receptor (MC3), the HIV reverse transcriptase enzyme (HIV RT) and the human V1a vasopressin receptor
(V1A).
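The Relative Recall % (PBDS/HTS) column of Supplementary Table 2 is the number of series contained in PBDS divided by the number of series identified from the primary HTS data. The sketch below recomputes it from the counts transcribed from the table. The names `TABLE_2` and `relative_recall` are illustrative and not taken from the paper. Recomputed values agree with the reported ones to within one percentage point; the exact rounding convention used by the authors is not stated.

```python
# Rows transcribed from Supplementary Table 2:
# (campaign, series definition, series in full-file HTS, series in PBDS, reported recall %)
TABLE_2 = [
    ("Beta2", "BCI Cluster", 468, 273, 58),
    ("Beta2", "Murcko", 123, 71, 58),
    ("Beta2", "BCUT", 213, 128, 60),
    ("D3", "BCI Cluster", 499, 291, 58),
    ("D3", "Murcko", 162, 83, 51),
    ("D3", "BCUT", 240, 133, 55),
    ("MC3", "BCI Cluster", 444, 253, 57),
    ("MC3", "Murcko", 86, 38, 44),
    ("MC3", "BCUT", 211, 117, 56),
    ("HIV RT", "BCI Cluster", 455, 181, 40),
    ("HIV RT", "Murcko", 124, 52, 42),
    ("HIV RT", "BCUT", 209, 89, 43),
    ("V1A", "BCI Cluster", 1426, 614, 43),
    ("V1A", "Murcko", 484, 201, 42),
]


def relative_recall(series_in_pbds, series_in_hts):
    """Return the % of full-file HTS series that are also represented in PBDS."""
    return 100.0 * series_in_pbds / series_in_hts


for campaign, method, n_hts, n_pbds, reported in TABLE_2:
    recomputed = relative_recall(n_pbds, n_hts)
    print(f"{campaign:7s} {method:12s} recomputed {recomputed:5.1f}%  reported {reported}%")
```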
Supplementary Figure 1: a) the % single-coverage of the target BCUT space achieved against the number of
screening plates selected in 17 sequential iterations of the plate selection process. As the process progresses
from the random starting point (magenta) to the final, 17th iteration (light blue), the single-coverage of the target
BCUT space improves significantly, such that more BCUT space is covered by fewer plates. b) the
corresponding % double-coverage of the target BCUT space achieved in the same optimisation. The process
converged by iteration 17.
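The two quantities plotted in Supplementary Figure 1 can be computed for any candidate set of plates. The sketch below is a minimal illustration, not the authors' plate selection algorithm. It assumes that a target BCUT cell is single-covered when the selected plates contain at least one compound from it and double-covered when they contain at least two. The function name `bcut_coverage` and the input layout (a mapping from plate identifier to the BCUT cell of each compound on that plate) are hypothetical.

```python
from collections import Counter


def bcut_coverage(selected_plates, plate_cells, target_cells):
    """Return (% single-coverage, % double-coverage) of the target BCUT cells.

    selected_plates: iterable of plate identifiers chosen so far
    plate_cells:     dict mapping plate id -> list of BCUT cell ids, one per compound
    target_cells:    set of BCUT cell ids that the subset is meant to cover
    Assumption: single-covered means >= 1 selected compound in the cell,
    double-covered means >= 2 selected compounds in the cell.
    """
    counts = Counter()
    for plate in selected_plates:
        # Count only compounds that fall in a target cell.
        counts.update(cell for cell in plate_cells[plate] if cell in target_cells)
    n_target = len(target_cells)
    single = sum(1 for cell in target_cells if counts[cell] >= 1)
    double = sum(1 for cell in target_cells if counts[cell] >= 2)
    return 100.0 * single / n_target, 100.0 * double / n_target


# Toy example with two plates and four target cells (made-up data).
plates = {"P1": ["a", "a", "b"], "P2": ["b", "c"]}
targets = {"a", "b", "c", "d"}
print(bcut_coverage(["P1"], plates, targets))        # (50.0, 25.0)
print(bcut_coverage(["P1", "P2"], plates, targets))  # (75.0, 50.0)
```

Evaluating this function after each plate is added gives one curve of the kind shown in panels a) and b); how plates are ranked between iterations is not specified in this supplement.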
Supplementary Figure 2: the distribution of hydrogen bond donors for compounds in the Plate-Based Diversity
Subset, PBDS.
Supplementary Figure 3: the distribution of the number of hydrogen bond acceptors (sum of nitrogen and
oxygen atoms) in each molecule of the Plate-Based Diversity Subset, PBDS.
Supplementary Figure 4: the distribution of molecular weight for compounds in the Plate-Based Diversity
Subset, PBDS.
Supplementary Figure 5: the distribution of the number of rotatable bonds across the compounds in the Plate-Based Diversity Subset, PBDS.
Supplementary Figure 6: the distribution of the calculated (Ertl method) topological polar surface area in Å²
across the compounds in the Plate-Based Diversity Subset, PBDS.
[Figure 7, panels a) and b). Y-axis: % of series retrieved (0 to 100). X-axis: number of compounds in iteration (0 to 40,000). Legend: BCI cluster, Murcko, BCUT, Hit rate.]
Supplementary Figure 7: a) the additional % of primary series retrieved for the PBDS vs the full file HTS for the
beta-2 adrenergic receptor assay (with three different series definitions), plotted against the number of
compounds added in a single iteration. The compounds in the iteration were in descending order according to
their score in a Bayesian model of probability of beta-2 activity. The Hit rate line (light blue) plots the % of
compounds found in the PBDS plus the single iteration relative to the full-file primary screen. b) the
corresponding additional % of primary series and hit rate retrieved for the GDRS vs the full file HTS for the
same assay.
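A curve of the kind described for Supplementary Figure 7 can be traced by adding compounds in descending order of model score and recording the % of series retrieved after each addition. The sketch below is a hedged illustration only. It assumes that a series counts as retrieved once at least one of its members has been screened, which the caption does not state, and the function name `cumulative_series_retrieved` and its inputs are hypothetical.

```python
def cumulative_series_retrieved(scored_compounds, series_already_found, total_series):
    """Return the % of series retrieved after each compound is added.

    scored_compounds:     list of (model score, series id) for compounds in the iteration
    series_already_found: series ids already represented in the starting subset
    total_series:         number of series found by the full-file screen
    Assumption: a series is retrieved once any one member has been screened.
    """
    found = set(series_already_found)
    curve = []
    # Add compounds from highest to lowest model score.
    for _score, series_id in sorted(scored_compounds, key=lambda item: item[0], reverse=True):
        found.add(series_id)
        curve.append(100.0 * len(found) / total_series)
    return curve


# Toy example (made-up scores and series labels).
iteration = [(0.9, "s1"), (0.2, "s3"), (0.8, "s1"), (0.5, "s2")]
print(cumulative_series_retrieved(iteration, {"s0"}, total_series=4))
# [50.0, 50.0, 75.0, 100.0]
```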