Advanced Structural Search Using ChemAxon Tools

Slide 1
Structural Search Using ChemAxon Tools
Szabolcs Csepregi
JChem version 5.3, April 2010
Structural Search Using ChemAxon Tools — JChem version 5.3
Slide 2
Structural Search Using ChemAxon Tools
Interfaces
Search types and options
Query features, Stereo searching
Special search types: reaction, R-group search, Chemical Terms filters
Searching against Combinatorial Markush structures
Fingerprint screening
Performance
Applications of structural search: R-group decomposition, Standardizer, Reactor, Pmapper, Fragmenter
Future plans
All examples were generated by Marvin
Slide 3
Structural search interfaces
• Example web GUIs:
– JSP (Java Server Pages)
– AJAX example: JavaScript and JChem Web Services
• Command line: jcsearch
• Java and .NET API:
– MolSearch class: in memory
– JChemSearch class: in database
• Cartridge: Oracle SQL
• Instant JChem
• JChem Web Services
• JChem For Excel
Slide 4
Search types in JChem
• Atom-by-atom search (ABAS), or structural search. Structural search types (query and result examples not recoverable):
– Substructure
– Superstructure
– Full structure
– Duplicate
• MC(E)S: maximum common (edge) substructure
• Similarity search:
– Different descriptors
– Different metrics
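The atom-by-atom search types listed above can be illustrated with a toy backtracking matcher. This is a minimal sketch under simplifying assumptions, not JChem's algorithm: molecules are plain graphs of heavy-atom symbols, bond orders, hydrogens and stereo are ignored, and the names `Mol`, `find_substructure` and `is_full_match` are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple


class Mol:
    """Toy molecular graph: atom symbols plus a symmetric adjacency map."""

    def __init__(self, atoms: List[str], bonds: List[Tuple[int, int]]):
        self.atoms = atoms
        self.adj: Dict[int, Set[int]] = {i: set() for i in range(len(atoms))}
        for a, b in bonds:
            self.adj[a].add(b)
            self.adj[b].add(a)

    def bond_count(self) -> int:
        return sum(len(n) for n in self.adj.values()) // 2


def find_substructure(query: Mol, target: Mol) -> Optional[Dict[int, int]]:
    """Return one query-atom -> target-atom mapping, or None if there is no match."""
    mapping: Dict[int, int] = {}
    used: Set[int] = set()

    def extend(q: int) -> bool:
        if q == len(query.atoms):
            return True  # every query atom has been placed
        for t in range(len(target.atoms)):
            if t in used or query.atoms[q] != target.atoms[t]:
                continue
            # Each already-mapped query neighbour must also be bonded in the target.
            if all(mapping[n] in target.adj[t] for n in query.adj[q] if n in mapping):
                mapping[q] = t
                used.add(t)
                if extend(q + 1):
                    return True
                del mapping[q]  # backtrack
                used.discard(t)
        return False

    return dict(mapping) if extend(0) else None


def is_full_match(query: Mol, target: Mol) -> bool:
    """Full structure search: a substructure match with nothing left over."""
    same_size = (len(query.atoms) == len(target.atoms)
                 and query.bond_count() == target.bond_count())
    return same_size and find_substructure(query, target) is not None


# Ethanol heavy atoms C-C-O, and a C-O query fragment.
ethanol = Mol(["C", "C", "O"], [(0, 1), (1, 2)])
c_o = Mol(["C", "O"], [(0, 1)])
match = find_substructure(c_o, ethanol)
```

In this sketch a superstructure search is the same call with the roles swapped, `find_substructure(target, query)`.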
Slide 5
Search options
Some selected structure search options:
• Stereo on/off/diastereomers
• Ignore charge/isotope/radical/valence/polymers, etc.
• Vague bond matching options
• Chemical Terms filter
• Tautomer search (even in substructure search)
• Inverse hit list
• Maximum search time / number of hits
• Combine with non-structure conditions
• Ordering of results
• Similarity type / metric
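To make the similarity metric, result ordering and maximum-hits options concrete, here is a small sketch. It assumes fingerprints are available as sets of "on" bit positions; the function names and the example table are hypothetical and do not reflect the JChem API. Tanimoto and Dice are two widely used metrics, shown only as examples of a metric choice.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Fingerprint = Set[int]  # positions of the bits that are set


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    # Two empty fingerprints are treated as identical here (a convention).
    return len(a & b) / union if union else 1.0


def dice(a: Fingerprint, b: Fingerprint) -> float:
    """Twice the shared bits divided by the total number of set bits."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0


def similarity_search(
    query: Fingerprint,
    table: Dict[str, Fingerprint],
    threshold: float,
    metric: Callable[[Fingerprint, Fingerprint], float] = tanimoto,
    max_hits: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, metric(query, fp)) for name, fp in table.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    hits.sort(key=lambda hit: hit[1], reverse=True)  # ordering of results
    return hits if max_hits is None else hits[:max_hits]


# Hypothetical fingerprints, for illustration only.
table = {"a": {1, 2, 3}, "b": {2, 3, 4}, "c": {7, 8}}
hits = similarity_search({1, 2, 3}, table, threshold=0.4)
```

Passing `metric=dice` or `max_hits=1` changes the scoring and truncation without touching the rest of the search.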
Slide 6
Hit coloring and alignment
Slide 7
Query features 1. Atomic features
• Query atom types:
– any (A, AH)
– hetero (Q, QH)
– list, not list
– metal (M, MH)
– halogen (X, XH)
– periodic table groups (G1-18)
• Pseudo atoms, e.g. “Resin”
• Explicit lone pairs (these also match implied lone pairs)
• Charge, isotope, radical
• Link nodes (repeatable)
Slide 8
Query features 2. Query properties
Symbol         Description
H<n>           Total hydrogen count
a              Aromatic
A              Aliphatic
R<n>           Ring count in SSSR
r<n>           Ring size in SSSR
v<n>           Valence
X<n>           Connectivity
D<n>           Degree
h<n>           Implicit H count
rb<n>, rb*     Ring bond count
s<n>, s*       Substitution count
u              Unsaturated atom
*: as drawn
Slide 9
Query features 3. Atomic SMARTS features
• SMARTS atoms
• Additional query properties:
Symbol         Description
& ; , !        Logical operators
$(<smarts>)    Recursive SMARTS
+0, -0         Zero charge
• Example: carbonyl C, but not amide
Slide 10
Query features 4. Homology atoms
• Can be used:
– In queries against molecule and reaction tables.
– In Markush structures
• Built-in and user-defined groups
Slide 11
Query features 5. Bond features & components
• Query bond types: any, single or double, single or aromatic, double or aromatic
• Bond topology: chain/ring
• SMARTS bonds:
Symbol         Description
- = #          Single, double, triple
:              Aromatic
& , ; !        Logical operators
@              Ring bond
/ \ /? \?      Directional bond (cis/trans)
• Component level grouping:
Symbol         Description
(C.C)          Same component
(C).(C)        Different component
C.C            No component restrictions
Slide 12
Coordination compounds
Atom-to-atom (dative) and multicenter coordinate bonds.
Alternative representations (figure labels: position variation bond; structures not recoverable)
Slide 13
Hydrogens
• H representations:
– Explicit
– Implicit
– Query H count: total (H<n>), implicit (h<n>)
• Considered in ABAS: table of explicit H, implicit H and query H count for query and target (cell values not recoverable)
• Example: query and target structures (not recoverable)
Slide 14
Stereo searching 1. Double bonds
• Levels of check:
– All
– Only marked double bonds (MDL: stereo care flag)
– None
• Double bond depictions and their meaning (depictions not recoverable):
– Cis
– Trans
– Cis or trans (unknown)
– Not trans
– Not cis
Slide 15
Stereo searching 2. Tetrahedral chirality
• Stereo bond types:
– Up
– Down
– Up or down
• Relative stereo configuration
• Chiral flag model
• Enhanced stereo representation: AND<n>, OR<n>, ABS groups
Slide 16
Groups integration (query & target)
Both sides are treated similarly by the search:
• Abbreviations (super-atom S-groups)
• Multiple groups
Other S-groups supported: component, mixture, formulation, many polymer brackets
Slide 17
Reaction search
• Reactants, agents, products
• Transformation recognition (mapping)
• Stereospecific reactions (inversion, retention)
• Reactant grouping
• Reacting center
Slide 18
R-group search
• Scaffold, R-group definitions
• Monovalent, divalent R-groups
• R-logic:
– Occurrence
– If-then
– Rest H
Slide 19
Undefined R-atoms
- No substitution elsewhere
retrieves:
Slide 20
Polymer storage and search
• Comprehensive representation
– Source based and structure based
– Copolymer types, mixtures, ladder-type polymers, etc.
– Phase shifting
– End groups: specific, undefined, etc.
• Flexible
– Attached data search
– Wide range of polymer search options
Slide 21
Chemical Terms filter
• Chemically aware filtering for structure and similarity searches
• Elements of the Chemical Terms language:
– structure matching functions (describing functional groups, reaction sites, similarity, etc.)
– property calculations (partial charge distribution, pKa, logP, HB donors, acceptors, topological descriptors, etc.)
– arithmetic and logical operators
Examples
Lipinski rule of 5:
(mass() <= 500) && (logP() <= 5) && (donorCount() <= 5) && (acceptorCount() <= 10);
Veber filter:
(rotatableBondCount() <= 10) && (PSA() <= 140);
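The two filter expressions above can be read as plain predicates over calculated properties. The sketch below mirrors them in Python, assuming the property values have already been computed elsewhere. It is not a Chemical Terms evaluator: the dictionary keys simply reuse the function names from the expressions, and the compound names and values are made up for illustration.

```python
from typing import Callable, Dict, List

Props = Dict[str, float]  # precomputed property values for one structure


def lipinski(p: Props) -> bool:
    """Same thresholds as the Lipinski expression above."""
    return (p["mass"] <= 500 and p["logP"] <= 5
            and p["donorCount"] <= 5 and p["acceptorCount"] <= 10)


def veber(p: Props) -> bool:
    """Same thresholds as the Veber expression above."""
    return p["rotatableBondCount"] <= 10 and p["PSA"] <= 140


def apply_filter(hits: Dict[str, Props], keep: Callable[[Props], bool]) -> List[str]:
    """Keep only the search hits whose properties satisfy the filter."""
    return [name for name, props in hits.items() if keep(props)]


# Hypothetical search hits with made-up property values.
search_hits = {
    "cmpd-A": {"mass": 320.4, "logP": 2.1, "donorCount": 2, "acceptorCount": 5,
               "rotatableBondCount": 4, "PSA": 78.0},
    "cmpd-B": {"mass": 612.7, "logP": 6.3, "donorCount": 1, "acceptorCount": 8,
               "rotatableBondCount": 14, "PSA": 95.0},
}
drug_like = apply_filter(search_hits, lambda p: lipinski(p) and veber(p))
```

Combining the predicates with `and` corresponds to joining the two expressions with `&&`.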
Slide 22
Markush structures
Markush structure registration and search
• Markush features
― R-groups
― Atom lists, bond lists
― Position variation bond
― Link nodes and repeating units
― Homology groups
• Compatible enumeration plugin
Slide 23
Fingerprint screening in the database
• JChem database searches use fingerprint technology for fast search results.
• Screening rapidly* filters out most non-hits; usually more than 99% of them are rejected.
• Supported fingerprint types:
– Chemical hashed fingerprints
– User-defined additional structural keys
Search flow: query + JChem table → fingerprint screening → screened out / need to be searched → atom-by-atom search → results (hits for the query)
* Average screening time in a cached table of 3 million structures: ~0.1 s
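The screen-then-verify flow described above can be sketched as follows. This is an illustration of the general idea of hashed fingerprint screening, not JChem's fingerprint generator: structural features are given as plain strings, the 64-bit length and the CRC32 hash are arbitrary choices, and `verify` stands in for the atom-by-atom search. All names are hypothetical.

```python
import zlib
from typing import Callable, Dict, Iterable, List, Set, Tuple

FP_BITS = 64  # illustrative fingerprint length


def hashed_fingerprint(features: Iterable[str], bits: int = FP_BITS) -> int:
    """Hash each structural feature to one bit of an integer bitmask."""
    fp = 0
    for feature in features:
        fp |= 1 << (zlib.crc32(feature.encode("utf-8")) % bits)
    return fp


def passes_screen(query_fp: int, target_fp: int) -> bool:
    """A target can only contain the query if it has every bit the query has."""
    return query_fp & target_fp == query_fp


def screened_search(
    query_features: Set[str],
    table: Dict[str, Set[str]],
    verify: Callable[[str], bool],
) -> Tuple[List[str], int]:
    """Return (hits, number screened out). Only candidates reach `verify`."""
    query_fp = hashed_fingerprint(query_features)
    # A real table would store precomputed fingerprints instead of rehashing.
    candidates = [name for name, feats in table.items()
                  if passes_screen(query_fp, hashed_fingerprint(feats))]
    hits = [name for name in candidates if verify(name)]
    return hits, len(table) - len(candidates)


# Hypothetical feature sets standing in for real structures.
table = {
    "t1": {"C-C", "C-O"},
    "t2": {"C-N"},
    "t3": {"C-O", "C=O", "C-C"},
}
query = {"C-O"}
hits, screened_out = screened_search(query, table, lambda name: query <= table[name])
```

Hash collisions can let a non-hit through the screen, which is why the exact search still runs on the candidates, but a true hit is never screened out because its fingerprint contains every bit of the query's.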
Slide 24
Application: R-group decomposition
JChem can identify the ligands of a given scaffold at specified substitution positions.
Figure: query (scaffold) + library → R-group decomposition → result (structures not recoverable)
Slide 25
Further applications of structural search in JChem
• Transformations: Standardizer & Reactor
– Converting the covalent form of alcoholates to the ionic form
– Enamine-imine tautomerism
• Identification of pharmacophoric groups: Pmapper
– nitro
– amidine
• Identification of bond cleavage: Fragmenter
– ether cut
Slide 26
Performance
Substructure searching in 19.5 million structures (PubChem)
JChem Base 5.2.2, Intel Quad Q6600 2.4 GHz, 8 GB RAM; Oracle 10.2.0.3
(query structures not recoverable)
Number of hits     Search time
2                  0.91 s
93                 0.98 s
6,001              1.30 s
146,256            5.66 s
Compound registration:
Number of compounds    Elapsed time (duplicates not checked)    Elapsed time (duplicates checked)
10,000                 21 s                                     26 s
100,000                2 min 4 s                                2 min 34 s
200,000                4 min 24 s                               5 min 13 s
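The registration timings above are easier to compare as throughput and as relative duplicate-check overhead. The short calculation below only uses the figures from the table; the helper names are hypothetical.

```python
from typing import List, Tuple

# (compounds, seconds without duplicate check, seconds with duplicate check),
# taken from the compound registration table above.
registration: List[Tuple[int, int, int]] = [
    (10_000, 21, 26),
    (100_000, 2 * 60 + 4, 2 * 60 + 34),
    (200_000, 4 * 60 + 24, 5 * 60 + 13),
]


def throughput(compounds: int, seconds: int) -> float:
    """Registered compounds per second."""
    return compounds / seconds


def overhead_percent(plain: int, checked: int) -> float:
    """Extra elapsed time caused by duplicate checking, in percent."""
    return 100 * (checked - plain) / plain


summary = [
    (n, round(throughput(n, plain)), round(overhead_percent(plain, checked), 1))
    for n, plain, checked in registration
]
for n, rate, overhead in summary:
    print(f"{n:>7} compounds: ~{rate} compounds/s, duplicate check adds {overhead}%")
```

On these figures the duplicate check adds roughly a fifth to a quarter to the elapsed time at each of the three batch sizes.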
Slide 27
Future plans
• R-group decomposition GUI in client applications
• Visualization of similarity search results using MCS
• Diastereomer search
• Markush search enhancements (homology variation conditions, maximum common substructure, etc.)
Slide 28
Summary
• The JChem suite contains a broad range of chemical search facilities, including Markush structure analysis.
• Structural search is a useful tool for many applications.
Slide 29
References
• JChem Query Guide: http://www.chemaxon.com/jchem/doc/user/Query.html
• Chemical Terms reference: http://www.chemaxon.com/jchem/marvin/help/chemicalterms/ChemicalTerms.html
• JChem Base JSP demo page: http://www.chemaxon.com/jchem/examples/db_search/index.jsp
• jcsearch command line tool: http://www.chemaxon.com/jchem/doc/user/Jcsearch.html
• API documentation (chemaxon.sss.search.MolSearch, chemaxon.jchem.db.JChemSearch): http://www.chemaxon.com/jchem/doc/api/index.html
• JChem Base: http://www.chemaxon.com/product/jc_base.html
• JChem Cartridge: http://www.chemaxon.com/product/jc_cart.html
• Instant JChem: http://www.chemaxon.com/product/ijc.html
• JChem for Excel: http://www.chemaxon.com/products/jchem-for-excel/
Slide 30
Thank you for your attention
Máramaros köz 3/a
Budapest, 1037
Hungary
info@chemaxon.com
www.chemaxon.com