BioAssay Ontology (BAO)
Stephan Schürer, PhD
sschurer@med.miami.edu
ICBO, Buffalo, July 30 2011
[Title slide word cloud of assay, cheminformatics, and ontology terms, e.g. PubChem, controlled vocabulary, high-throughput screening (HTS), semantic web, RDF, OWL]

Background for BioAssay Ontology
• High-throughput screening
  • One of the most important approaches to find novel entry points for drug discovery programs
  • Historically in pharmaceutical companies
  • Since ~2005, massive NIH effort (MLI) to make HTS accessible to public sector research
  • PubChem is the major repository of HTS data
  • More recently: EU-OpenScreen project

Motivation for BioAssay Ontology
• Large public screening data sets
  • PubChem, ChEMBL, PDSP, ChemBank, BindingDB
• Lack of standardized assay annotations
• No standardized endpoint names or formats
• Data is rarely re-used(!)
  • Common queries cannot be asked
  • Analysis across different data sets is difficult
  • Integration with other databases is difficult
• No knowledge model for assays and screening results

Queries the Ontology should be able to answer
• Identify inhibitors of kinases in biochemical assays.
• Identify compounds active in multiple luciferase reporter gene assays.
• Identify compounds active in cell viability assays and organize by cell lines and assay types.
• Identify active compounds in assays related to pathway X.
• …

Leverage the aggregated corpus of publicly available HTS data to infer molecular mechanisms of action (MMOA) of small molecule perturbagens in biological model systems.
Schürer et al. “BioAssay Ontology Annotations Facilitate Cross-Analysis of Diverse High-throughput Screening Data Sets” J Biomol Screen 2011 (16), 415-426.

BAO Products and Resources
• BAOSearch Software (beta): http://baosearch.ccs.miami.edu
  • Query, explore, download BAO-annotated PubChem content
  • Some semantic search capabilities
• Project Website and Wiki with relevant materials and documentation:
  • http://www.bioassayontology.org/
  • http://www.bioassayontology.org/wiki

Questions / Discussion points
• Application / user focus vs. “universal” ontologies
  • Efficiency vs. “realism” of representations
  • Rapid application development
• Orthogonal ontologies vs. ontology mapping
• Universal “realism” vs. domain- or application-specific representations
  • Chemical bond: 2D structure graph, 3D rule-based, molecular mechanics, semi-empirical, ab initio QM
  • Disease
  • Virtual world

Questions / Discussion points
• Collaborative ontology development
  • Collaborative vs. individual effort
  • Control over development and focus / application focus
  • Rapid application development
  • Quality
• Aligning BAO to upper level ontology (BFO)
  • Benefits vs. required resources
  • Do upper level ontologies matter for specialized applications?

Questions / Discussion points
• Aligning BAO with OBI
  • Some level of overlap
  • OBI: process-oriented (models the investigation)
  • BAO: built for categorization and analysis of HTS data
  • BAO model becomes more complex if based on OBI
• How do we do it in practice?
  • Add missing assays to OBI and MIREOT them back?
  • Quick term templates (QTT)?
  • Define our relations as short-cut relationships (using RO)?
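As a rough illustration of the slide "Queries the Ontology should be able to answer", the sketch below expresses three of those example queries as plain filters over annotated screening results. It is not BAO's data model or the BAOSearch API: the `AnnotatedResult` record, its field names, and the annotation values ("biochemical", "reporter gene", "luciferase", "viability", "kinase") are hypothetical stand-ins for standardized assay annotations, and "active against a kinase target in a biochemical assay" is used here as a simple proxy for "kinase inhibitor".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedResult:
    """One compound outcome in one annotated assay (hypothetical schema)."""
    compound: str
    assay_id: str
    assay_format: str   # e.g. "biochemical" or "cell-based"
    assay_design: str   # e.g. "reporter gene", "viability", "enzyme activity"
    detection: str      # e.g. "luciferase", "fluorescence"
    target_class: str   # e.g. "kinase"; "" when not applicable
    cell_line: str      # "" for biochemical assays
    active: bool


def kinase_actives_in_biochemical_assays(results):
    """Compounds active against a kinase target in a biochemical assay."""
    return {
        r.compound
        for r in results
        if r.active
        and r.assay_format == "biochemical"
        and r.target_class == "kinase"
    }


def actives_in_multiple_luciferase_reporter_assays(results, min_assays=2):
    """Compounds active in at least `min_assays` distinct luciferase reporter gene assays."""
    assays_by_compound = defaultdict(set)
    for r in results:
        if r.active and r.assay_design == "reporter gene" and r.detection == "luciferase":
            assays_by_compound[r.compound].add(r.assay_id)
    return {c for c, assays in assays_by_compound.items() if len(assays) >= min_assays}


def viability_actives_by_cell_line(results):
    """Map each cell line to the set of compounds active in its viability assays."""
    grouped = defaultdict(set)
    for r in results:
        if r.active and r.assay_design == "viability":
            grouped[r.cell_line].add(r.compound)
    return dict(grouped)
```

The point of the sketch is that each query reduces to a one-line filter once every assay carries the same annotation fields; without standardized annotations, the same questions require reading free-text assay descriptions one by one.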
Additional slides

BAO-facilitated Example for Analysis (Luciferase Assays)
Details in: Schürer et al. “BioAssay Ontology Annotations Facilitate Cross-Analysis of Diverse High-throughput Screening Data Sets” J Biomol Screen 2011 (16), 415-426.

Most promiscuous reporter gene compounds
[Figure: Assay Count by screening type (Panel Assay, Single Conc, Conc-response, Other); Compounds vs. Assay_PCIdxCorrelation on a Promiscuity Index scale; annotated groups: Luciferase Enzyme Inhibitors, Generally cytotoxic]

PCIdx = N(Active) / N(Tested)

Examples: Cytotoxic Series
Cluster        Reporter PCIdx   Reporter Active   Viability PCIdx   Viability Active
Daunorubicin   0.56             58                0.64              27
Emetine        0.48             23                0.45              10
(not named)    0.41             29                0.57              13

Examples: Luciferase Inhibitor Series
Cluster Size   Reporter PCIdx   Reporter Active   EnzActivity PCIdx   EnzActivity
6              0.61             101               0.58                15
4              0.38             52                0.61                11
5              0.46             77                0.58                14
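The promiscuity index on the slide above, PCIdx = N(Active) / N(Tested), can be sketched in a few lines. This is a minimal illustration, not the procedure used in the cited paper: the input layout (tuples of compound, assay ID, active flag) and the function name are assumptions, and each compound/assay pair is assumed to count once, as active if any of its records is active.

```python
from collections import defaultdict


def promiscuity_index(outcomes):
    """Return {compound: N(active assays) / N(tested assays)}.

    `outcomes` is an iterable of (compound, assay_id, is_active) tuples.
    Assumption: a compound/assay pair counts once, and counts as active
    if any record for that pair is active.
    """
    tested = defaultdict(set)
    active = defaultdict(set)
    for compound, assay_id, is_active in outcomes:
        tested[compound].add(assay_id)
        if is_active:
            active[compound].add(assay_id)
    # Every compound in `tested` has at least one assay, so no division by zero.
    return {c: len(active[c]) / len(tested[c]) for c in tested}


# Hypothetical outcomes, not data from the slides.
example_outcomes = [
    ("cpd-1", "reporter-1", True),
    ("cpd-1", "reporter-2", True),
    ("cpd-1", "reporter-3", False),
    ("cpd-1", "reporter-4", True),
    ("cpd-2", "reporter-1", False),
    ("cpd-2", "reporter-2", False),
]
example_index = promiscuity_index(example_outcomes)
```

Restricting `outcomes` to one annotated assay category before calling the function (for example only reporter gene assays, or only viability assays) would give per-category values comparable in spirit to the "Reporter PCIdx" and "Viability PCIdx" columns, which is where consistent assay annotations become necessary.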
BAO Project: Three major components
1) Development of the Bioassay Ontology
2) Annotation of assays and assay results (content curation)
3) Development of software tools

BAO design to describe assays
[Figure: BAO class diagram. Classes shown: bioassay, bioassay specification, format, perturbagen, meta target, detection technology, assay design, endpoint, measure group. Relations legible in the diagram: has_format, has_perturbagen / is_perturbagen_of (I), has_target, has_specification, has_measure_group / is_measure_group_of (I); a further relation label beginning "has_detectio" is truncated.]
Legend: Ontology class; Asserted relation, (I) is inverse; More subclasses; Inferred relation; Primitive class

Application of BAO: BAO Search Software
http://baosearch.ccs.miami.edu/baosearch/

BAO: Concept Search

Biochemical Assays with IC50 < 1 mM

Chemical structure search

BAO Products and Resources
• BioAssay Ontology (NCBO bioportal and project site):
  • http://bioportal.bioontology.org/ontologies/45410
  • http://www.bioassayontology.org/visualize/
• Terminology / annotations for biochemical assays:
  • http://www.bioassayontology.org/ > Assay Annotation Template
• Over 1000 BAO-annotated assays from PubChem (available in BAOSearch)

Acknowledgements
• Ubbo Visser
• Vance Lemmon
• Mitsunori Ogihara
• Nick Tsinoremas

• Saminda Abeyruwan
• Uma Vempati
• Magdalena Przydzial
• Kunie Sakurai
• Robin Smith
• Yuanyuan Jia
• Caty Chung

• Chris Mader
• Amar Koleti
• Nakul Datar
• Sreeharsha Venkatapuram

• Felimon Gayanilo
• Mark Southern

http://bioassayontology.org
sschurer@med.miami.edu