Slide 1
Structural Search Using ChemAxon Tools
Szabolcs Csepregi
JChem version 5.3, April 2010

Slide 2
Structural Search Using ChemAxon Tools
• Interfaces
• Search types and options
• Query features, stereo searching
• Special search types: reaction, R-group search, Chemical Terms filters
• Searching against combinatorial Markush structures
• Fingerprint screening
• Performance
• Applications of structural search: R-group decomposition, Standardizer, Reactor, Pmapper, Fragmenter
• Future plans
All examples were generated by Marvin.

Slide 3
Structural search interfaces
• Example web GUIs:
  – JSP (Java Server Pages)
  – AJAX example: JavaScript and JChem Web Services
• Command line: jcsearch
• Java and .NET API:
  – MolSearch class: in memory
  – JChemSearch class: in database
• Cartridge: Oracle SQL
• Instant JChem
• JChem Web Services
• JChem for Excel

Slide 4
Search types in JChem
• Atom-by-atom search (ABAS), or structural search:
  – Substructure
  – Superstructure
  – Full structure
  – Duplicate
  – MC(E)S: maximum common (edge) substructure
  [Table on the slide: structural search type, query, result; the query and result depictions are not reproduced]
• Similarity search:
  – Different descriptors
  – Different metrics

Slide 5
Search options
Some selected structure search options:
• Stereo on/off/diastereomers
• Ignore charge/isotope/radical/valence/polymers, etc.
• Vague bond matching options
• Chemical Terms filter
• Tautomer search (even in substructure search)
• Inverse hit list
• Maximum search time / number of hits
• Combine with non-structure conditions
• Ordering of results
• Similarity type / metric

Slide 6
Hit coloring and alignment

Slide 7
Query features 1.
Atomic features
• Query atom types:
  – any (A, AH)
  – hetero (Q, QH)
  – list, not list
  – metal (M, MH)
  – halogen (X, XH)
  – periodic table groups (G1-18)
• Pseudo atoms, e.g. “Resin”
• Explicit lone pairs (match to implied lone pairs as well)
• Charge, isotope, radical
• Link nodes (repeatable):

Slide 8
Query features 2. Query properties
Symbol        Description
H<n>          Total hydrogen count
a             Aromatic
A             Aliphatic
R<n>          Ring count in SSSR
r<n>          Ring size in SSSR
v<n>          Valence
X<n>          Connectivity
D<n>          Degree
h<n>          Implicit H count
rb<n>, rb*    Ring bond count (*: as drawn)
s<n>, s*      Substitution count (*: as drawn)
u             Unsaturated atom

Slide 9
Query features 3. Atomic SMARTS features
• SMARTS atoms:
• Additional query properties:
Symbol        Description
& ; , !       Logical operators
$(<smarts>)   Recursive SMARTS
+0, -0        Zero charge
• Example: Carbonyl C, but not amide

Slide 10
Query features 4. Homology atoms
• Can be used:
  – In queries against molecule and reaction tables
  – In Markush structures
• Built-in and user-defined groups

Slide 11
Query features 5. Bond features & components
• Query bond types: any, single or double, single or aromatic, double or aromatic
• Bond topology: chain/ring
• SMARTS bonds
Symbol        Description
- = #         Single, double, triple
:             Aromatic
& , ; !       Logical operators
@             Ring bond
/ \ /? \?     Directional bond (cis/trans)
• Component level grouping
Symbol        Description
(C.C)         Same component
(C).(C)       Different component
C.C           No component restrictions

Slide 12
Coordination compounds
Atom-to-atom (dative) and multicenter coordinate bonds.
Alternative representations:
[Figure label: Position variation bond]

Slide 13
Hydrogens
• H representations:
  – Explicit
  – Implicit
  – Query H count
• Considered in ABAS (table columns: Explicit H, Implicit H, Query H count):
  – Query: – ; total (H<n>)
  – Target: – ; implicit (h<n>)
• Example: Target, Query [depictions not reproduced]

Slide 14
Stereo searching 1. Double bonds
• Levels of check:
  – All
  – Only marked double bonds (MDL: stereo care flag)
  – None
• Depiction and meaning [depictions not reproduced]:
  – Cis
  – Trans
  – Cis or trans (unknown)
  – Not trans
  – Not cis

Slide 15
Stereo searching 2. Tetrahedral chirality
• Stereo bond types: up, down, up or down
• Relative stereo configuration
• Chiral flag model
• Enhanced stereo representation: AND<n>, OR<n>, ABS groups

Slide 16
Groups integration (query & target)
Both sides are treated similarly by the search:
• Abbreviations (super-atom S-groups):
• Multiple groups:
Other S-groups supported: component, mixture, formulation, many polymer brackets:

Slide 17
Reaction search
• Reactants, agents, products
• Transformation recognition (mapping)
• Stereospecific reactions (inversion, retention)
• Reactant grouping
• Reacting center

Slide 18
R-group search
• Scaffold, R-group definitions
• Monovalent, divalent R-groups
• R-logic
  – Occurrence
  – If-then
  – Rest H

Slide 19
Undefined R-atoms
– No substitution elsewhere
retrieves:

Slide 20
Polymer storage and search
• Comprehensive representation
  – Source based and structure based
  – Copolymer types, mixtures, ladder-type polymers, etc.
  – Phase shifting
  – End groups: specific, undefined, etc.
• Flexible
  – Attached data search
  – Wide range of polymer search options

Slide 21
Chemical Terms filter
• Chemically aware filtering for structure and similarity searches
• Elements of the Chemical Terms language
  – Structure matching functions (describing functional groups, reaction sites, similarity, etc.)
  – Property calculations (partial charge distribution, pKa, logP, HB donors, acceptors, topological descriptors, etc.)
  – Arithmetic and logic operators
Examples
Lipinski rule of 5:
  (mass() <= 500) && (logP() <= 5) && (donorCount() <= 5) && (acceptorCount() <= 10);
Veber filter:
  (rotatableBondCount() <= 10) && (PSA() <= 140);

Slide 22
Markush structures
Markush structure registration and search
• Markush features
  – R-groups
  – Atom lists, bond lists
  – Position variation bond
  – Link nodes and repeating units
  – Homology groups
• Compatible enumeration plugin

Slide 23
Fingerprint screening in the database
• JChem database searches use fingerprint technology to speed up searching.
• Screening rapidly* filters out most non-hits; usually more than 99% of them are rejected before atom-by-atom search.
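The screening idea in the two bullets above can be illustrated with a toy example. This is only a sketch under stated assumptions, not JChem's implementation: the feature strings, the SHA-1 based bit assignment, the 512-bit length, and the names `toy_fingerprint`, `passes_screen`, and `screen` are all hypothetical. What it shows is the general principle of hashed-fingerprint screening for substructure search: a target can only contain the query if every bit set in the query fingerprint is also set in the target fingerprint.

```python
import hashlib

FP_BITS = 512  # fingerprint length; an arbitrary value chosen for this sketch


def toy_fingerprint(features):
    """Hash each feature string to one bit position and OR the bits together."""
    fp = 0
    for feature in features:
        digest = hashlib.sha1(feature.encode("utf-8")).digest()
        fp |= 1 << (int.from_bytes(digest[:4], "big") % FP_BITS)
    return fp


def passes_screen(query_fp, target_fp):
    """Substructure screen: every bit set in the query must be set in the target."""
    return (query_fp & target_fp) == query_fp


def screen(query_fp, table):
    """Return ids of rows that survive screening and still need atom-by-atom search."""
    return [row_id for row_id, target_fp in table.items()
            if passes_screen(query_fp, target_fp)]


# Hypothetical feature sets standing in for structural patterns of three molecules.
table = {
    "mol_1": toy_fingerprint({"C-C", "C=O"}),
    "mol_2": toy_fingerprint({"C-C"}),
    "mol_3": toy_fingerprint({"C-C", "C=O", "C-O"}),
}
candidates = screen(toy_fingerprint({"C-C", "C=O"}), table)
print(candidates)
```

Because different features can hash to the same bit, a screen of this kind may let some non-hits through, but it never rejects a true hit. That is why the surviving candidates are still passed to the atom-by-atom search.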
[Diagram: query and JChem table feed fingerprint screening; structures are either screened out or need to be searched; atom-by-atom search of the remainder gives the results, i.e. the search hits for the query]
Supported fingerprint types:
  – Chemical hashed fingerprints
  – User-defined additional structural keys
* Average screening time in a 3-million cached table: ~0.1 s

Slide 24
Application: R-group decomposition
JChem is able to identify the ligands of a given scaffold at specified substitution positions:
[Figure: query (scaffold) and library go into R-group decomposition, which produces the result]

Slide 25
Further applications of structural search in JChem
• Transformations: Standardizer & Reactor
  – Converting covalent form of alcoholates to ionic form:
  – Enamine-amine tautomerism:
• Identification of pharmacophoric groups: Pmapper
  – nitro:
  – amidine:
• Identification of bond cleavage: Fragmenter
  – ether cut:

Slide 26
Performance
Test environment: JChem Base 5.2.2, Intel Quad Q6600 2.4 GHz, 8 GB RAM; Oracle 10.2.0.3

Substructure searching in 19.5 million structures (PubChem); query depictions not reproduced:
Number of hits    Search time
2                 0.91 s
93                0.98 s
6,001             1.30 s
146,256           5.66 s

Compound registration (elapsed time):
Number of compounds    Duplicates not checked    Duplicates checked
10,000                 21 s                      26 s
100,000                2 min 4 s                 2 min 34 s
200,000                4 min 24 s                5 min 13 s

Slide 27
Future plans
• R-group decomposition GUI in client applications
• Visualization of similarity search results using MCS
• Diastereomer search
• Markush search enhancements (homology variation conditions, maximum common substructure, etc.)

Slide 28
Summary
• The JChem suite contains a broad range of chemical search facilities, including Markush structure analysis.
• Structural search is a useful tool for many applications.
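As a worked illustration of the Chemical Terms filters on Slide 21, the sketch below re-expresses the Lipinski and Veber conditions as plain Python predicates. It does not call any ChemAxon API: the `Properties` class, its field names, and the property values are hypothetical stand-ins for values that a real calculation would supply. Only the thresholds are taken from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Properties:
    """Precomputed values; field names loosely mirror the Chemical Terms functions."""
    mass: float
    logp: float
    donor_count: int
    acceptor_count: int
    rotatable_bond_count: int
    psa: float


def lipinski(p):
    """Lipinski rule of 5, with the thresholds quoted on Slide 21."""
    return (p.mass <= 500 and p.logp <= 5
            and p.donor_count <= 5 and p.acceptor_count <= 10)


def veber(p):
    """Veber filter, with the thresholds quoted on Slide 21."""
    return p.rotatable_bond_count <= 10 and p.psa <= 140


# Made-up property values for illustration only, not measured data.
compounds = {
    "small": Properties(180.0, 1.2, 1, 4, 3, 63.6),
    "large": Properties(720.0, 6.1, 7, 12, 14, 190.0),
}
hits = [name for name, p in compounds.items() if lipinski(p) and veber(p)]
print(hits)
```

In a real search the filter would be evaluated per candidate structure alongside the structural match, so that only hits satisfying both the query and the property conditions are returned.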

Slide 29
References
• JChem Query Guide: http://www.chemaxon.com/jchem/doc/user/Query.html
• Chemical Terms reference: http://www.chemaxon.com/jchem/marvin/help/chemicalterms/ChemicalTerms.html
• JChem Base JSP demo page: http://www.chemaxon.com/jchem/examples/db_search/index.jsp
• jcsearch command line tool: http://www.chemaxon.com/jchem/doc/user/Jcsearch.html
• API documentation: http://www.chemaxon.com/jchem/doc/api/index.html (chemaxon.sss.search.MolSearch, chemaxon.jchem.db.JChemSearch)
• JChem Base: http://www.chemaxon.com/product/jc_base.html
• JChem Cartridge: http://www.chemaxon.com/product/jc_cart.html
• Instant JChem: http://www.chemaxon.com/product/ijc.html
• JChem for Excel: http://www.chemaxon.com/products/jchem-for-excel/

Slide 30
Thank you for your attention
Máramaros köz 3/a
Budapest, 1037 Hungary
info@chemaxon.com
www.chemaxon.com