Overview of 3D Searching:
A Powerful Tool for Computer-Assisted
Molecular Design
R. S. Pearlman1 and O. F. Güner2
1College of Pharmacy, University of Texas, Austin, TX
2Turquoise Consulting, San Diego, CA
Copyright © 2006 Turquoise Consulting. All Rights Reserved
Introductory Remarks
Drug action is a 2-phase process; drug discovery
must address both phases:
 Transport of drug from site of administration (e.g., GI-tract if
oral, blood or muscle if injected) to the “biophase” where the
receptor is located
 Does the compound have appropriate physical chemical
properties?
 Is it drug-like?
 Is it bioavailable?
 Interaction of drug with receptor
 Does the compound have the right size, shape, and
substructural features (e.g., pharmacophore) to enable
favorable interaction with receptor?
 Does it show sufficient “affinity” for receptor and/or display
intrinsic “activity”?
Outline
What is 3D searching?
Why perform 3D searching?
Essential components required
Brief example
What is 3D Searching?
Searching within large databases of 3D chemical structures for those
compounds which satisfy both the chemical and geometric
requirements specified in the 3D search query
The search typically reflects the chemical and geometric requirements
for a ligand to interact favorably with a particular bio-receptor
 That is, the search query usually reflects “the pharmacophore”
3D Searching review articles
 VanDrie, J. H. “3D Database Searching in Drug Discovery,”
http://www.netsci.org/Science/Cheminform/feature06.html
 Güner, O. F. and Henry, D. R. “Three-dimensional Structure Searching,” in
The Encyclopedia of Computational Chemistry; Schleyer, P. v. R.; Allinger,
N. L.; Clark, T.; Gasteiger, J.; Kollman, P. A.; Schaefer III, H. F.; Schreiner, P.
R. (Eds.): John Wiley & Sons: Chichester, 1998, vol 5, pp 2988-3003.
 Kurogi, Y. and Güner, O. F. “Pharmacophore Modeling and Three-dimensional
Database Searching for Drug Design Using Catalyst,” Curr.
Med. Chem. 2001, 8, 1035-1055.
Why Perform 3D Searching?
Ideas for extending existing leads
Ideas for new leads
Retrieving “active” compounds from commercial databases
“We use 3D searching to break other people’s patents.”
(anonymous --- AAPS, 1989)
“Cover” planned patents
 Examples of pharmacophores in patents:
 WO 98/04913 – Biogen patent on VLA-4 inhibitors
 WO 98/46630 – Peptide therapeutics on Hepatitis C NS3 inhibitors
 US 2002/0013372 – Pfizer on CYP 2D6 inhibitors
Used to validate pharmacophore models
Used in “reverse” to predict other activities
Three “Levels” of Computer-Assisted
Drug Design
No knowledge of receptor structure
 Classical QSAR
 Modern QSAR
 Recursive partitioning
Limited knowledge of receptor structure
 3D searching/screening based upon structural complementarity (e.g.,
Catalyst, ISIS/3D, Unity, Phase)
 Ligand-based pharmacophores (e.g., DISCOtech, GASP, HipHop)
 Pharmacophore-based QSAR, (e.g., CoMFA, HypoGen)
Complete knowledge of receptor structure
 “Structure-based drug design”
 Docking (e.g., Glide, C2.LigandFit, DOCK, FlexX, GOLD)
 De Novo design (e.g., LeapFrog, Ludi)
 Receptor-based pharmacophores (e.g., C2.SBF, Catalyst, Unity)
 Screening based on computational estimates of drug-receptor “interaction
energy”
Searching Software – 2D vs 3D
2D Substructure (and similarity) searching
User specifies atoms and how they are connected
 Users can’t discard undesired “hits” based on 3D
geometry of those hits (no 3D information allowed)
 Therefore, user must discard undesired hits based on
connectivity and, thereby, pre-determines a large fraction
of sub-structural information about hits
3D Searching
User specifies atom-types and their relative
position in 3D-space
 User does not specify how atoms are connected
 User does not pre-determine “chemistry” of hits
2D vs 3D Searching
2D search: uses 2D
connectivity to constrain
positions of chemical features
3D search: uses 3D geometry
to constrain positions of
chemical features
An unconstrained search for all compounds containing >C=O, –OH,
and –Cl would return many spurious hits (“false positives”)
Essential Components Required
for 3D Searching
Large number of interesting 3D structures
Database management software for storage and
retrieval
Software to perform search based upon 3D criteria
Rational 3D search criteria
Methods for Acquiring 3D
Structures
Experimental determination (e.g., X-Ray, NMR)
 CSD (CCDC), ca. 100,000 organic compounds
 Most are not pharmacologically relevant
MM or MO geometry optimization
 Requires 3D (not 2D) initial structures
License commercial database of 3D structures
Convert corporate 2D database to 3D structures and hunt for
“buried treasure”
 Corporate structures, virtual libraries, combinatorial libraries
Comments on 3D Databases
The value of searching software is limited by the
value of databases being searched
Size matters (not too big, not too small)
Diversity (sometimes low in corporate databases)
“Richness”
 “Relevance” of structures
 Non-structural information
 Availability of compounds
Stereochemistry
 Needs to be dealt with either within the database, search
query, or search algorithm
Software for Generating 3D
Databases
CONCORD
 Pearlman, R. S. “Rapid Generation of High Quality Approximate 3D
Molecular Structures,” Chem. Des. Aut. News, 1987, 2, 1-7.
CORINA
 Hiller, C.; Gasteiger, J. “Ein Automatisierter Molekülbaukasten,” in
Software-Entwicklung in der Chemie, vol. 1; Gasteiger, J. Ed.;
Springer, 1987, Berlin, pp 53-66
WIZARD
 Dolata, D. P.; Leach, A. R.; Prout, K., “Wizard: AI in conformational
analysis,” J. Comput.-Aided Mol. Des., 1987, 1, 73-85.
AIMB
 Wipke, W. T.; Hahn, M. A., “AIMB: Analogy and intelligence in
model building. System description and performance
characteristics,” Tetrahedron Comput. Meth., 1988, 1, 141.
CONCORD – General
Capabilities
Converts CONnection table to 3D CoORDinates
 2D CT contains information about connectivity per se
 2.5D Contains additional information about stereochemistry
Handles almost all “drug-sized” compounds
Handles input/output in a wide variety of ways
Very fast
Good to excellent structures
Limitations
 No inorganics or metallo-organics
 Single low energy conformation
 Not intended for macrocycles, polymers, or other highly flexible
structures for which 3D structure is determined by extrinsic rather
than intrinsic forces
CONCORD Algorithm
“Expert-system” approach with MM cleanup
Uses rule-based “chemical intuition” when applicable (most
acyclic substructures)
Uses pseudo-molecular mechanics approach when “intuition” not
applicable (most cyclic substructures)
 A novel strain function is minimized
 Strain is a function defined such that minimization is performed over
a single, composite variable
Initial structure is checked for close-contacts; dihedrals causing
close-contacts (or all acyclic dihedrals) are then relaxed by ultrafast MM optimization
 Carried out in torsion space, using analytical gradients, and with
substantial topological speed-ups
Commercially Available 3D
Databases
ACD, CMC, MDDR – MDL Information Systems, Inc.
Pomona-90C – Daylight Information Systems, Inc.
CAST-3D, CAS Registry File – Chemical Abstracts Service
TRIAD – UC Berkeley
CHDC, NCI – Tripos Inc.
CAP, Maybridge, NCI, WDI – Accelrys Inc.
NCI – Nat’l Cancer Institute
Plus Corporate databases and chemical supply
houses
Database Management Software
(DBMS)
General database management software
 There are many examples… all chemistry ignorant (data
“name” oriented)
Chemical database management software
 Accelrys, CAS, Daylight, MDL, Tripos, Oracle*
 Unique, structure-related storage key
 Searchable by structure, as well as name, etc.
 Searchable by 2D substructure keys
 Searchable by 3D substructure keys
 Integrated queries (including biological, chemical data, etc.)
 Some include 3D shape based searches
 Some interface with other modeling and analysis software tools
DBMS -- Keys
“short-cuts”
Bit strings (a.k.a. “fingerprints”) – is a particular
feature present? Yes or no?
2D Substructural keys
3D object/distance keys
 Object: atom, lone-pair, ring-centroid, projected point, etc.
 Distance (rigid): single distance bins
 Distance (flexible): ranges of bins
 3D shape-based keys
 3, 4-point pharmacophore keys
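The key idea above can be illustrated with a minimal sketch. It assumes a single bit string per conformation in which each bit stands for one inter-object distance bin; real systems also key on the object types involved, which is omitted here. The bin width, the number of bins, and all function names are hypothetical choices made for illustration, not taken from any of the packages named in these slides.

```python
from itertools import combinations
from math import dist

BIN_WIDTH = 1.0  # angstroms per distance bin (assumed value)
N_BINS = 20      # longer distances share the last bin (assumed value)

def bin_index(distance):
    """Map a distance to its bin number."""
    return min(int(distance // BIN_WIDTH), N_BINS - 1)

def rigid_key(points):
    """Set one bit for every distance bin observed in a single conformation."""
    key = 0
    for a, b in combinations(points, 2):
        key |= 1 << bin_index(dist(a, b))
    return key

def range_mask(d_min, d_max):
    """Set every bit from the bin of d_min through the bin of d_max."""
    mask = 0
    for i in range(bin_index(d_min), bin_index(d_max) + 1):
        mask |= 1 << i
    return mask

def may_match(compound_key, query_ranges):
    """Screen-out test: every query range must share a bin with the key."""
    return all(compound_key & range_mask(lo, hi) for lo, hi in query_ranges)

# Three objects (for example an atom, a lone pair, and a ring centroid)
points = [(0.0, 0.0, 0.0), (3.5, 0.0, 0.0), (0.0, 6.2, 0.0)]
key = rigid_key(points)
print(may_match(key, [(3.0, 4.0), (6.0, 7.5)]))  # both ranges hit an occupied bin
print(may_match(key, [(4.0, 5.9)]))              # no occupied bin in this range
```

A failed bit test proves the conformation cannot satisfy the query, so it can be discarded without any geometric fitting. A passed test only means the compound is still a candidate.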
2D Substructural Keys
Which pre-defined 2D substructures are present in
this compound?
3D Object/distance Keys
Which inter-object distances are present within this
conformation?
3D Object/distance Keys
[Flexible Search with UNITY and ISIS/3D]
Which inter-object distances could be achieved by this
compound?
Max/min distance ranges
Flexible Search with Catalyst
[Flowchart: crefs from the database go through key-based screening with a loose-tolerance query (crefs for screen), then fitting with a loose-tolerance query (crefs for fit). Crefs for flexible fit go through fit optimization and energy minimization with a threshold check: bad results loop back, good results break out as a flexible hit. The process is parallelized. The hit list contains flexible + rigid hits.]
History and Evolution of 3D DBMS
Software
1974 MOLPAT – Gund, Wipke, Langridge - Princeton
1982 DOCK – Kuntz et al. – UC San Francisco
1987-88 – Fast 3D Builders
 CONCORD, CORINA, WIZARD, AIMB
1988 Caveat – Bartlett – UC Berkeley
1988 3D Search – Sheridan et al. Lederle Labs
1989 Aladdin – Van Drie, Martin – Abbott Labs
1989 MACCS-3D – Henry et al., MDL
1990 ChemDBS-3D – Davies et al., Accelrys
1991 UNITY – Hurst et al., Tripos
1992 ISIS/3D - Henry et al., MDL
1992/3 Catalyst – Van Drie, Kahn - Accelrys
MACCS-3D
Screen from MACCS-3D displaying a hit retrieved from MDDR-3D based on a CNS active drugs pharmacophore from: Lloyd,
E.J. and Andrews, P.R. J. Med. Chem. 1986, 29, 453.
ISIS/3D
Screen from ISIS/3D displaying a dopamine antagonist
pharmacophore proposed by Martin, Y.C. Tetrahedron Comput.
Meth. 1990, 3, 15-25; and a hit retrieved from MDDR-3D.
Unity
A UNITY query based on a set of muscarinic M3 receptor antagonists
(Marriott, D.P., Dougall, I.G., Meghani, P., Liu, Y., and Flower, D.R. J. Med.
Chem. 1999, 42, 3210) developed using DISCOtech and refined via Tripos’
Pharmacophore Model Analysis tools.
Catalyst
Screen from Catalyst displaying a hit retrieved from Maybridge database
based on an angiotensin II blockers pharmacophore developed by Peter
Sprague. The conformation of the hit with the highest score is shown
overlaid with the original query.
Phase
Query Development
2-Fold objective
 Find leads to active compounds
 Don’t find leads to inactive compounds
Iterative process
 Initial query
 Validate
 Modify or improve
Build small, development database (training set)
 Include actives and “relevant” inactives
 Avoids ambiguities caused by hits of unknown activity
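The validation step of this iterative process can be sketched as a simple tally against the development database. The function name, the dictionary keys, and the compound identifiers below are hypothetical; the sketch only assumes that every compound in the training set is labeled either active or inactive, so each hit can be classified without ambiguity.

```python
def validate_query(hits, actives, inactives):
    """Summarize how a query performs on a labeled development database."""
    hits = set(hits)
    actives = set(actives)
    inactives = set(inactives)
    return {
        # Fraction of known actives the query retrieves (objective 1)
        "actives_found": len(hits & actives) / len(actives),
        # Fraction of known inactives the query retrieves (objective 2)
        "inactives_found": len(hits & inactives) / len(inactives),
        # Actives the query misses: candidates for loosening the query
        "missed_actives": sorted(actives - hits),
        # Inactives the query retrieves: candidates for tightening it
        "retrieved_inactives": sorted(hits & inactives),
    }

actives = ["A1", "A2", "A3", "A4"]
inactives = ["I1", "I2", "I3", "I4", "I5"]
report = validate_query(["A1", "A2", "A3", "I2"], actives, inactives)
print(report)
```

A query is then modified and re-run until the first fraction is high and the second is low on this small set, before it is applied to a large database.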
Searching Software -- Queries
Non-structural criteria
Chemical criteria: atom types or 2D substructures
(fragments)
Geometric criteria: 3D constraints between objects
 Representation of pharmacophore
Shape criteria
 Inclusion volumes, exclusion volumes
Searching Software -- Queries
Atom-types:
 Element
 Charge, hybridization, connectivity, ring, etc.
Objects
 Atoms, pharmacophore features (e.g., hydrophobe)
 Points, vectors, planes, spheres, etc.
Constraints
 Distances
 Angles, dihedrals
 RMS deviations
 Inclusion, exclusion volumes
Importance of “Forbidden Regions”
Example: rigid search,
same atomic constraints
 No forbidden regions: 2,016
hits
 2 “sides of box”: 1,377 hits
 4 “sides of box”: 383 hits
 5 “sides of box”: 46 hits
Forbidden regions are even
more important for flexible
searching
Explore and report inactivity
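The effect of adding "sides of the box" can be sketched as follows. This is an illustration only: it assumes each forbidden region is the half-space beyond a planar wall, and that the hit's atoms are already expressed in the coordinate frame of the query. The wall positions, candidate coordinates, and function names are all hypothetical and do not reproduce the search reported above.

```python
def beyond_wall(atom, wall_point, outward_normal):
    """True if the atom lies on the forbidden side of a planar wall."""
    offset = sum((a - p) * n for a, p, n in zip(atom, wall_point, outward_normal))
    return offset > 0.0

def passes_forbidden_regions(atoms, walls):
    """A candidate survives only if no atom is beyond any wall."""
    return not any(beyond_wall(atom, p, n) for atom in atoms for p, n in walls)

def surviving(candidates, walls):
    """Names of candidates that avoid every forbidden region."""
    return sorted(name for name, atoms in candidates.items()
                  if passes_forbidden_regions(atoms, walls))

# Two walls of a box: x > 5 is forbidden, and y > 5 is forbidden
walls = [((5.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
         ((0.0, 5.0, 0.0), (0.0, 1.0, 0.0))]

candidates = {
    "hit_a": [(0.0, 0.0, 0.0), (4.0, 4.0, 0.0)],
    "hit_b": [(0.0, 0.0, 0.0), (6.0, 1.0, 0.0)],
    "hit_c": [(0.0, 0.0, 0.0), (1.0, 6.0, 0.0)],
}

for n_walls in range(len(walls) + 1):
    print(n_walls, surviving(candidates, walls[:n_walls]))
```

Each extra wall can only remove candidates, which mirrors the falling hit counts listed above for the same atomic constraints.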
Searching Software – Search
Phases
Key-based screening (rapid screen-out phase)
 The objective is to quickly eliminate all those compounds
that cannot possibly satisfy the query
 Use keyed structural information
 Consider one-constraint at a time
Atom-by-atom mapping (slower geometric search)
 The objective is to actually verify that the compound satisfies
the query
 Consider all constraints simultaneously
 Conformational flexibility issue needs to be addressed
 Stereochemistry
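The difference between the two phases can be shown with a small sketch. It assumes a hypothetical three-feature query with distance ranges in angstroms and a conformer given as a list of (feature type, position) pairs. The screen here works directly on coordinates rather than on precomputed keys, and the mapping uses plain backtracking; neither is claimed to be the algorithm of any particular package.

```python
from math import dist

# Hypothetical query: feature types plus allowed distance ranges (angstroms)
QUERY_FEATURES = ["donor", "acceptor", "hydrophobe"]
QUERY_DISTANCES = {(0, 1): (2.5, 3.5), (0, 2): (4.0, 6.0), (1, 2): (5.0, 7.0)}

def quick_screen(conformer):
    """Phase 1: test one constraint at a time; any unmet constraint rejects."""
    for (i, j), (lo, hi) in QUERY_DISTANCES.items():
        ok = any(
            lo <= dist(pa, pb) <= hi
            for a, (ta, pa) in enumerate(conformer)
            for b, (tb, pb) in enumerate(conformer)
            if a != b and ta == QUERY_FEATURES[i] and tb == QUERY_FEATURES[j]
        )
        if not ok:
            return False
    return True

def find_mapping(conformer, assigned=()):
    """Phase 2: backtracking assignment that meets all constraints at once."""
    k = len(assigned)
    if k == len(QUERY_FEATURES):
        return assigned
    for idx, (ftype, pos) in enumerate(conformer):
        if ftype != QUERY_FEATURES[k] or idx in assigned:
            continue
        consistent = all(
            QUERY_DISTANCES[(q, k)][0]
            <= dist(conformer[assigned[q]][1], pos)
            <= QUERY_DISTANCES[(q, k)][1]
            for q in range(k)
        )
        if consistent:
            result = find_mapping(conformer, assigned + (idx,))
            if result is not None:
                return result
    return None

def search(conformer):
    """Run the cheap screen first and the mapping only on survivors."""
    return find_mapping(conformer) if quick_screen(conformer) else None

good = [("donor", (0.0, 3.0, 0.0)), ("acceptor", (0.0, 0.0, 0.0)),
        ("hydrophobe", (5.0, 1.0, 0.0))]
# Passes the screen (each range is met by some pair) but no single
# assignment of features meets all three ranges together.
decoy = [("donor", (-3.0, 0.0, 0.0)), ("donor", (6.0, 5.0, 0.0)),
         ("acceptor", (0.0, 0.0, 0.0)), ("hydrophobe", (6.0, 0.0, 0.0))]

print(search(good))
print(quick_screen(decoy), search(decoy))
```

The decoy shows why the second phase is needed: checking constraints one at a time cannot tell whether the same set of objects satisfies all of them simultaneously.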
3D Searching Process
[Flowchart: query input and databases/spreadsheets feed key-based screening (2D, 3D, 4D, 1D), which yields a database subset; atom-by-atom mapping of that subset yields hits, which are reported as hit list output.]
Conformational Flexibility
The issue:
 If the query reflects putative bound conformation which is
different than the low energy conformation, then search over
database of low energy conformations might miss some
interesting hits.
Approaches to the issue:
 Handle the conformational flexibility within the database, by
storing multiple conformations of each compound
 Handle the conformational flexibility within the searching
query via flexible queries
 Handle the conformational flexibility within the search
process, via on-the-fly conformational exploration
Addressing Conformational
Flexibility – in the Database
Store multiple conformations of each compound
 Too many to store (3³ = 27, 12⁴ = 20,736)
 Too many to search
 Still no guarantee that bound conformation is amongst those
stored
 Example reference:
 Murrall, N. W.; Davies, E. K. “Conformational Freedom in 3-D
Databases,” J. Chem. Inf. Comput. Sci., 1990, 30, 312-316.
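The storage problem is a matter of simple exponentiation. The sketch below assumes the figures above count a systematic torsion grid, with the base read as the number of settings per rotatable bond and the exponent as the number of rotatable bonds. That reading and the function name are assumptions made for illustration.

```python
def conformer_count(settings_per_bond, rotatable_bonds):
    """Number of grid conformations when every bond takes every setting."""
    return settings_per_bond ** rotatable_bonds

# Coarse grid: 3 settings on each of 3 rotatable bonds
print(conformer_count(3, 3))

# Finer grid: 12 settings (30-degree steps) on each of 4 rotatable bonds
print(conformer_count(12, 4))

# Growth for a fixed 30-degree step as the molecule gets more flexible
for bonds in range(1, 7):
    print(bonds, conformer_count(12, bonds))
```

Multiplying such a count by the number of compounds in a database shows why exhaustive storage is impractical and why diverse subsets of conformations are stored instead.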
Addressing Conformational
Flexibility – in Search Query
Build conformational flexibility into query
 Specify ranges for geometric constraints
 Increasing the range increases false positives
 Decreasing the range increases false negatives
 Specify “hinge” points
 Differentiate parts of the query dealing with flexible regions
 Generality of query may be compromised
 Example references:
 Güner, O. F.; Henry, D. R.; Pearlman, R. S. “Use of Flexible Queries for
Searching Conformationally Flexible Molecules in Databases of Three-Dimensional
Structures,” J. Chem. Inf. Comput. Sci. 1992, 32, 101-109.
 Güner, O. F.; Henry, D. R.; Moock, T. E.; Pearlman, R. S. “Flexible Queries
in 3D Searching. 2. Techniques in 3D Query Formulation,” Tetrahedron
Comp. Meth. 1990, 3(6C), 557-563.
Addressing Conformational
Flexibility – during Search
Explore conformational flexibility at search-time
 Rigid: does this conformation match query?
 Flexible: could this compound match query?
 2-phase process:
 First rapid screen based upon max/min distance keys
 Then, slower conformational search
 Ensures that no hits are missed (however, it is important to note the
local-minima problem)
 Example references:
 Moock, T. E.; Henry, D. R.; Ozkabak, A. G.; Alamgir, M. “Conformational
Searching in ISIS/3D Databases,” J. Chem. Inf. Comput. Sci., 1994, 34,
184-189.
 Hurst, T. “Flexible 3D Searching: the Directed Tweak Technique,” J.
Chem. Inf. Comput. Sci., 1994, 34, 190-196.
Conformational Coverage in 3D
Databases
A. Smellie, S.L. Teig, and P. Towbin, "Poling: Promoting Conformational
Coverage", J. Comp. Chem., 1995, 16, 171-187.
A. Smellie, S.D. Kahn, and S. Teig, "An Analysis of Conformational
Coverage 1. Validation and Estimation of Coverage", J. Chem. Inf.
Comput. Sci., 1995, 35, 285-294.
A. Smellie, S.D. Kahn, and S. Teig, "An Analysis of Conformational
Coverage 2. Applications of Conformational Models" , J. Chem. Inf.
Comput. Sci., 1995, 35, 295-304.
The most commonly used approach is to store multiple diverse
conformations in the database and perform a flexible search
• ISIS/3D – performs a random kick before giving up on a conformation
• UNITY – performs user defined number of kicks
• Catalyst – uses “poling” algorithm to store multiple conformations (on
average, more than 30 conformations per compound)
Rational 3D Search Criteria:
Pharmacophore Perception
Relatively easy if receptor structure is known
 Otherwise, based on analysis of actives and inactives
Relatively easy if compounds are rigid
 Otherwise difficult and expensive
E.g., Active analog approach:
 Constrained systematic search, considering intersection of
accessible conformation-space of all compounds in the training set.
The concept of pharmacophores and detailed examples of
pharmacophore development will be covered in the next lecture
ACE Inhibitors
ACE Inhibitors Pharmacophore
Model
Mayer, D.; Naylor, C. B.; Motoc, I.;
Marshall, G. R. “A unique
geometry of the active site of
angiotensin-converting enzyme
consistent with structure-activity
studies,” J. Comput.-Aided Mol.
Des. 1987, 1(1), 3-16.
Object-1: Zn-ligand (sulfhydryl or carboxylate oxygen)
Object-2: H-bond acceptor (N, O, or F)
Object-3: anion (–CS⁻, –COO⁻, –SO₄²⁻, or –PO₄³⁻)
Object-4: indicates direction of lone-pair on object-2
Object-5: “central” atom in anion labeled object-3
Note Regarding Geometric
Search Criteria
Conformation space of potential ligands is, generally,
multi-dimensional space of high volume
Combination of apparently broad criteria on each of
the several axes results in greatly reduced volume of
conformation space to be explored
Rubik's cube example
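The volume argument can be made concrete with a short calculation. The sketch assumes that the criteria act on independent axes and that each criterion accepts a fixed fraction of its axis, so the accepted fraction of the whole space is the product of the per-axis fractions. Both assumptions, and the function name, are simplifications introduced here for illustration.

```python
from math import prod

def accepted_fraction(fraction_per_axis):
    """Fraction of the space left when each axis is limited independently."""
    return prod(fraction_per_axis)

# A 3x3x3 cube: keeping one third of each of three axes leaves one small
# cube out of 27.
print(accepted_fraction([1 / 3, 1 / 3, 1 / 3]))

# Five criteria that each look broad (half of the axis is accepted)
# still leave only 1/32 of the space.
print(accepted_fraction([0.5] * 5))
```

This is why several individually generous distance ranges can still define a selective query.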
Search for ACE Inhibitors
Search performed at Lederle Laboratories by
 Sheridan, R. P.; Nilakantan, R.; Rusinko, A. III; Bauman, N.;
Haraki, K. S.; Venkataraghavan, R., “3DSEARCH: A system
for three-dimensional substructure searching,” J. Chem. Inf.
Comput. Sci., 1989, 29, 255-260.
Found 96 “hits” in their corporate database of
223,988 structures
Required ca. 7 VAX-8650 CPU minutes
[would require ca. 2 SGI R10k seconds]
Summary
3D Searching works but requires a team effort:
 Laboratory synthesis and testing (and/or HTS)
 Molecular modeling for query refinement
 Tight interface between modeling and searching software
 Hit list analysis, prioritization, post processing
3D Searching can spark chemists’ imagination
The more information provided by chemists, the more
information returned by 3D search