Ontologies Come of Age with the iKUP browser

www.kupkb.org
Simon Jupp
Bio-health informatics group
University of Manchester
The problem domain
Kidney and Urinary Pathways
Kidney
Ureter
Bladder
Need to understand how they work for prevention
Multiple diseases
Need to learn new ways to detect them
Dialysis and transplantation
The problem domain
Hundreds of studies have been conducted by the kidney research community
• On different species: human, mouse
• On different materials: urine, tissue
• On different biological levels: gene, protein, cell
Where does the data go?
Bespoke kidney laboratory databases
Research Papers
Generalist databases
Scattered, hidden in figures, coming in different formats
Most of the data is lost!
Capturing what is known in the form of a nano-publication
What has been observed, where and when?
Experimental factors
e.g. Experiment X showed gene TGFB1 over-expressed in location Kidney
under condition model of diabetic nephropathy
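A minimal sketch of how one such observation might be flattened into subject/predicate/object statements. The namespace `http://example.org/kup/` and every predicate name below are hypothetical, chosen only for illustration; they are not the KUPKB schema.

```python
from dataclasses import dataclass

# Hypothetical namespace, for illustration only (not the KUPKB vocabulary).
EX = "http://example.org/kup/"


@dataclass(frozen=True)
class Observation:
    experiment: str   # which experiment reported the finding
    gene: str         # what was observed
    expression: str   # how it behaved
    location: str     # where it was observed
    condition: str    # under which experimental condition

    def to_triples(self):
        """Flatten the observation into (subject, predicate, object) triples."""
        obs = f"{EX}observation/{self.experiment}/{self.gene}"
        return [
            (obs, EX + "observedIn", EX + "experiment/" + self.experiment),
            (obs, EX + "aboutGene", self.gene),
            (obs, EX + "expression", self.expression),
            (obs, EX + "location", self.location),
            (obs, EX + "condition", self.condition),
        ]


example = Observation(
    experiment="ExperimentX",
    gene="TGFB1",
    expression="over-expressed",
    location="Kidney",
    condition="model of diabetic nephropathy",
)

for triple in example.to_triples():
    print(triple)
```

In a real knowledge base the plain strings for location and condition would be replaced by ontology term identifiers, which is what makes the statements comparable across experiments.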
Ontologies provide the schema
Disease ontology
Animal model
Cell type ontology
Mouse anatomy ontology
Filling in the gaps
We needed to connect these reference ontologies.
By connecting we build our own application ontology.
Reference ontologies: Biological processes (GO), Anatomy (MAO), Cells (CTO)
Relations: part-of, participates-in, subClassOf
[Diagram: asserted and inferred links between Gene, Kidney Cortex, Renal proximal tubule, Proximal convoluted tubule, Proximal straight tubule, Proximal tubule epithelial cell, Proximal convoluted tubule epithelial cell, Proximal straight tubule epithelial cell, Renal sodium absorption and Renal sodium ion absorption]
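The difference between an assertion and an inference can be illustrated with a toy transitive closure over part-of links. The edges below are assumptions made for this sketch, not the axioms of the application ontology, and the talk's pipeline uses an OWL reasoner rather than hand-written code like this.

```python
from collections import defaultdict

# Asserted part-of edges (child, parent). Illustrative assumptions only.
ASSERTED_PART_OF = [
    ("Proximal convoluted tubule", "Renal proximal tubule"),
    ("Proximal straight tubule", "Renal proximal tubule"),
    ("Renal proximal tubule", "Kidney Cortex"),
]


def infer_part_of(asserted):
    """Return the transitive closure of the asserted part-of edges."""
    parents = defaultdict(set)
    for child, parent in asserted:
        parents[child].add(parent)

    def ancestors(node, seen):
        # Walk upwards, collecting every structure the node is part of.
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                ancestors(parent, seen)
        return seen

    closure = set()
    for child in list(parents):
        for ancestor in ancestors(child, set()):
            closure.add((child, ancestor))
    return closure


# Anything in the closure that was not asserted counts as an inference.
inferred = infer_part_of(ASSERTED_PART_OF) - set(ASSERTED_PART_OF)
for child, ancestor in sorted(inferred):
    print(f"{child} part-of {ancestor}  (inferred)")
```

Only the three edges in `ASSERTED_PART_OF` are written by hand; the links from the two tubule segments to the cortex fall out of the closure.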
Separation of concerns
Knowledge
All eukaryotic cells are either nucleated or anucleate; some cells are multinucleate
Ontologically
‘Eukaryotic Cells’ has_nucleation some ‘Nucleation’
‘Nucleation’ subClassOf {mononucleate, binucleate, polynucleate, anucleate}
Differentia
Real examples (Populous)
‘Eukaryotic Cells’          ‘Nucleation’
Mononuclear phagocyte       mononucleate
Flight Muscle cell          multinucleate
Red Blood cell              anucleate
Ontologies by stealth
The domain experts are the experts, so get them to build it
Cells (CTO)
Anatomy (MAO)
Biological processes (GO)
http://www.populous.org.uk
Populous generates simple Excel-based templates
Ontologies by stealth
Convert from Excel → OWL → Classify → Validate in Protégé
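The spreadsheet-to-OWL step can be pictured as applying one axiom pattern to every row of a template. The sketch below assumes a two-column template exported as CSV and emits Manchester-syntax-style strings; the column names and the `row_to_axiom` helper are hypothetical and do not reflect how Populous itself is implemented.

```python
import csv
import io

# Stand-in for a spreadsheet template exported as CSV (hypothetical column names).
TEMPLATE_CSV = """cell,nucleation
Mononuclear phagocyte,mononucleate
Flight Muscle cell,multinucleate
Red Blood cell,anucleate
"""


def row_to_axiom(cell, nucleation):
    """Apply the pattern: <cell> SubClassOf 'Eukaryotic Cells' and has_nucleation some <nucleation>."""
    return (
        f"Class: '{cell}'\n"
        f"    SubClassOf: 'Eukaryotic Cells', has_nucleation some {nucleation}"
    )


rows = list(csv.DictReader(io.StringIO(TEMPLATE_CSV)))
axioms = [row_to_axiom(row["cell"], row["nucleation"]) for row in rows]
print("\n\n".join(axioms))
```

The domain expert only ever fills in the two columns; the pattern that turns a row into an axiom is written once by the ontology engineer.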
Creation of a specialized Kidney and Urinary Pathway Ontology (KUPO)
Describing/Collecting experimental data
Gathering good metadata AND data, again by stealth, using RightField
Mashing it all together
Kidney and Urinary Pathway Ontology: ~1800 classes (~40,000 after imports closure)
Experimental data: 195 KUP experiments/databases integrated
Excel 2 RDF/OWL
OWL reasoning
Bio2RDF Linked data
KUP Knowledge Base: RDF triple store (Sesame + OWLIM), ~50M triples
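Once everything sits in one triple store, a question becomes a matter of matching triple patterns. The toy in-memory store below reuses the TGFB1 example from earlier in the talk; the identifiers and predicate names are hypothetical, and a real deployment would run SPARQL against the triple store rather than Python list comprehensions.

```python
# Toy store based on the TGFB1 example; predicate names are hypothetical.
TRIPLES = [
    ("obs1", "gene", "TGFB1"),
    ("obs1", "expression", "over-expressed"),
    ("obs1", "location", "Kidney"),
    ("obs1", "condition", "model of diabetic nephropathy"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def genes_in_location(triples, location):
    """Join two patterns: observations at a location, then their genes."""
    observations = {s for s, _, _ in match(triples, p="location", o=location)}
    return sorted({o for s, _, o in match(triples, p="gene") if s in observations})


print(genes_in_location(TRIPLES, "Kidney"))
```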
The iKUP browser
An open-source, collaborative and easy-to-use interface
The iKUP browser
Anatomy search
Disease search
Doing some biology
1. A biological question
Can calreticulin be associated with the development of human kidney disease?
2. No answer with classical tools
Searches in PubMed and Google do not return any relevant results!
3. Querying the KUPKB
4. Validation in the wet lab
KUPKB in silico result confirmed.
5. Publish an innovative result
Accepted for publication in the FASEB J!
Summary

The KUPKB RDF store is a mashup of biological knowledge relating to the
KUP domain

Ontologies provide the schema and a consistent data annotation mechanism

We expose this knowledge base through a simple web interface that real
biologists can use

It is a testament to the tools and APIs that such applications are now being
delivered at relatively low cost
Thank you for listening…
www.kupkb.org
Some rough stats…
• 195 KUP experiments integrated
• KUPKB RDF store ~35M triples
• KUP Ontology ~1800 classes, ~40,000 after imports closure
Architecture
• Sesame and BigOWLIM for the RDF store
• Web site developed with Google Web Toolkit
• OWL API and HermiT reasoner for classification and faceted browsing