Ontologies Come of Age with the iKUP browser
www.kupkb.org
Simon Jupp
Bio-health informatics group, University of Manchester

The problem domain
Kidney and Urinary Pathways: Kidney, Ureter, Bladder
• Need to understand how they work for prevention
• Multiple diseases
• Need to learn new ways to detect them
• Dialysis and transplantation

The problem domain
Hundreds of studies have been conducted by the kidney research community
• On different species: human, mouse
• On different materials: urine, tissue
• On different biological levels: gene, protein, cell

Where does the data go?
• Bespoke kidney laboratory databases
• Research papers
• Generalist databases
Scattered, hidden in figures, coming in different formats. Most of the data is lost!

Capturing what is known in the form of a nano-publication
What has been observed, where and when?
• Experimental factors, e.g. Experiment X showed gene TGFB1 over-expressed in location Kidney under condition model of diabetic nephropathy

Ontologies provide the schema
• Disease ontology
• Animal model
• Cell type ontology
• Mouse anatomy ontology

Filling in the gaps
We needed to connect these reference ontologies. By connecting them we build our own application ontology.
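As a rough illustration of what connecting reference ontologies can mean in practice, the sketch below tags a few kidney terms from this talk with the reference ontology they come from (MAO, CTO, GO) and adds part-of and participates-in links between them. The specific edges, the `TERM_SOURCE` and `LINKS` names, and the helper functions are assumptions made for illustration; they are not the actual KUPO axioms, and a real application ontology would be written in OWL and classified by a reasoner.

```python
from collections import defaultdict

# Which reference ontology each term is drawn from.
# Assumption: this assignment is illustrative, not taken from KUPO itself.
TERM_SOURCE = {
    "Kidney Cortex": "MAO",
    "Renal proximal tubule": "MAO",
    "Proximal tubule epithelial cell": "CTO",
    "Renal sodium absorption": "GO",
}

# Links between terms. Links that cross ontologies are what an
# application ontology contributes on top of the reference ontologies.
# Assumption: these edges are a plausible reading of the talk's example.
LINKS = [
    ("Renal proximal tubule", "part-of", "Kidney Cortex"),
    ("Proximal tubule epithelial cell", "part-of", "Renal proximal tubule"),
    ("Proximal tubule epithelial cell", "participates-in", "Renal sodium absorption"),
]


def build_index(links):
    """Index links by (subject, predicate) for fast traversal."""
    index = defaultdict(set)
    for subject, predicate, obj in links:
        index[(subject, predicate)].add(obj)
    return index


def transitive_targets(index, start, predicate):
    """Return every term reachable from start by following predicate one or more times."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in index.get((node, predicate), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def cross_ontology_links(links, term_source):
    """Return only the links whose two ends come from different reference ontologies."""
    return [(s, p, o) for s, p, o in links if term_source[s] != term_source[o]]


index = build_index(LINKS)
containers = transitive_targets(index, "Proximal tubule epithelial cell", "part-of")
bridges = cross_ontology_links(LINKS, TERM_SOURCE)

print(sorted(containers))
print(len(bridges))
```

Treating part-of as transitive is what lets a cell type be found when searching by a larger anatomical structure; the hand-written traversal here only stands in for what an OWL reasoner would derive.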
[Diagram: the application ontology links Gene, Biological processes (GO), Anatomy (MAO) and Cells (CTO) through part-of and participates-in relations.]

[Diagram: worked example, with edges marked as Assertion or Inference. Terms shown: Kidney Cortex, Renal proximal tubule, Proximal convoluted tubule, Proximal straight tubule, Proximal tubule epithelial cell, Proximal convoluted tubule epithelial cell, Proximal straight tubule epithelial cell, Renal sodium absorption, Renal sodium ion absorption. Relations shown: part-of, participates-in, subClassOf.]

Separation of concerns
Knowledge
• All Eukaryotic Cells are either nucleated or anucleate; some cells are multinucleate
Ontologically
• ‘Eukaryotic Cells’ has_nucleation some ‘Nucleation’
• ‘Nucleation’ subClassOf {mononucleate, binucleate, polynucleate, anucleate}
Differentia and real examples, as entered in Populous:

‘Eukaryotic Cells’        ‘Nucleation’
Mononuclear phagocyte     mononucleate
Flight Muscle cell        multinucleate
Red Blood cell            anucleate

Ontologies by stealth
The domain experts are the experts, so get them to build it
• Cells (CTO)
• Anatomy (MAO)
• Biological processes (GO)
http://www.populous.org.uk
Populous generates simple Excel-based templates

Ontologies by stealth
• Convert from Excel to OWL
• Classify
• Validate in Protégé
• Creation of a specialized Kidney and Urinary Pathway Ontology (KUPO)

Describing/Collecting experimental data
Gathering good meta-data AND data, again by stealth, using RightField

Mashing it all together
• Kidney and Urinary Pathway Ontology: ~1800 classes (~40,000 after imports closure)
• Experimental data: 195 KUP experiments/databases integrated
• Excel to RDF/OWL, OWL reasoning
• KUP Knowledge Base: RDF triple store (Sesame + OWLIM), ~50M triples
• Bio2RDF, Linked data

The iKUP browser
An open-source, collaborative and easy-to-use interface

The iKUP browser
• Anatomy search
• Disease search

Doing some biology
1. A biological question
Can calreticulin be associated with the development of human kidney disease?
2. No answer with classical tools
Searching PubMed and Google does not return any relevant result!
3. Querying the KUPKB
4. Validation in the wet-lab
5. Publish an innovative result
KUPKB in silico result confirmed. Accepted for publication in the FASEB J!

Summary
• The KUPKB RDF store is a mashup of biological knowledge relating to the KUP domain
• Ontologies provide the schema and a consistent data annotation mechanism
• We expose this knowledge base through a simple web interface that real biologists can use
• It is a testament to the tools and APIs that such applications are now being delivered at relatively low cost

Thank you for listening…
www.kupkb.org

Some rough stats…
• 195 KUP experiments integrated
• KUPKB RDF store: ~35M triples
• KUP Ontology: ~1800 classes, ~40,000 after imports closure

Architecture
• Sesame and BigOWLIM for the RDF store
• Web site developed with Google Web Toolkit
• OWL API and HermiT reasoner for classification and faceted browsing
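To make the nano-publication idea from earlier in the talk more concrete, here is a minimal in-memory sketch of how one observation (the TGFB1 example) might be broken into subject-predicate-object triples and then queried. It only stands in for the real stack: the KUPKB stores RDF in Sesame and BigOWLIM, whereas the `Observation` class, the predicate names, and the `find_observations` helper below are hypothetical names chosen for illustration and do not reflect the actual KUPKB schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One experimental finding: what was observed, where, and under which condition."""
    experiment: str
    gene: str
    expression: str
    location: str
    condition: str


def to_triples(obs):
    """Flatten an observation into (subject, predicate, object) triples.

    Assumption: the node identifier and predicate names are made up for
    this sketch; a real store would use ontology term IRIs instead.
    """
    node = f"obs:{obs.experiment}:{obs.gene}"
    return [
        (node, "observed_in", obs.experiment),
        (node, "gene", obs.gene),
        (node, "expression", obs.expression),
        (node, "location", obs.location),
        (node, "condition", obs.condition),
    ]


def find_observations(triples, **criteria):
    """Return the observation nodes that match every predicate=value pair given."""
    by_node = {}
    for subject, predicate, obj in triples:
        by_node.setdefault(subject, {})[predicate] = obj
    return sorted(
        node
        for node, props in by_node.items()
        if all(props.get(key) == value for key, value in criteria.items())
    )


# The example observation quoted in the talk.
example = Observation(
    experiment="Experiment X",
    gene="TGFB1",
    expression="over-expressed",
    location="Kidney",
    condition="model of diabetic nephropathy",
)

store = to_triples(example)
hits = find_observations(store, gene="TGFB1", location="Kidney")
misses = find_observations(store, gene="TGFB1", location="Bladder")

print(hits)
print(misses)
```

In a real triple store the same lookup would be a SPARQL query, and annotating `location` and `condition` with ontology terms is what would let a search for a broader anatomical or disease term also return this observation.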