A Semantic Grid Browser for the Life Sciences Applied to the Study of Infectious Diseases
Knowledge Representation for the Web
Simon Jupp
Bio-Health Informatics Group
University of Manchester, UK
SWAT4LS Edinburgh 2008
Introduction
Library Science Application for a Semantic Web
Dynamic addition of semantic links to the existing web
Improve document retrieval, indexing and navigation
This is not science, but a science-enabling tool
Application requires some domain knowledge
What semantics does the background knowledge need to support navigation?
COHSE (Conceptual Open Hypermedia SErvice)
Document navigation system for the web
Original COHSE used OWL ontologies for background knowledge
Ontology structure used to drive navigation around web documents
SeaLife project: extending COHSE with a focus on life sciences use cases
Do OWL ontologies meet all the requirements for SeaLife?
COHSE Architecture
Knowledge requirements for a navigation system
What should the background knowledge provide?
1. Rich lexical support for adding appropriate metadata to documents on the web
2. Semantics for representing relationships between terms, in particular to generalise or specialise a term
3. A simple data structure flexible enough to incorporate a wide range of new or existing knowledge organisation systems (KOS)
OWL ontologies for navigation systems
1. Rich lexical support for adding metadata/labels to documents on the web
Annotation space is flexible, but there are few built-in standards for rich label support
2. Semantics for representing relationships between terms, in particular to generalise or specialise a term
Yes, but strict (class hierarchy provides a natural navigational structure)
3. A common data structure that is flexible enough to represent a wide range of new or existing knowledge bases
Conversions of existing knowledge bases to OWL are common, but they don’t always respect OWL semantics
Issues with using OWL
OWL classes describe sets of instances and the conditions for set membership
Modelling terminologies at the class (T-box) level is difficult, partly due to the universal nature of the statements you make in OWL
Tuberculosis is a Lung Disease
Cells have nuclei
Bacteria may cause pneumonia
Most existing terminologies aren’t built with the strict semantics of OWL in mind
We want to model the relationships between terms, not describe the instances
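One way to see the "universal nature" of the three example statements is to write them as class axioms and read off their first-order meaning. This is only a sketch: the class names and the properties hasPart and causes are assumptions made for illustration, not taken from any particular ontology.

```latex
\begin{align*}
\text{Tuberculosis} \sqsubseteq \text{LungDisease}
  &\quad\equiv\quad \forall x\,\bigl(\text{Tuberculosis}(x) \rightarrow \text{LungDisease}(x)\bigr) \\
\text{Cell} \sqsubseteq \exists\,\text{hasPart}.\text{Nucleus}
  &\quad\equiv\quad \forall x\,\bigl(\text{Cell}(x) \rightarrow \exists y\,(\text{hasPart}(x,y) \wedge \text{Nucleus}(y))\bigr) \\
\text{Bacteria} \sqsubseteq \exists\,\text{causes}.\text{Pneumonia}
  &\quad\equiv\quad \forall x\,\bigl(\text{Bacteria}(x) \rightarrow \exists y\,(\text{causes}(x,y) \wedge \text{Pneumonia}(y))\bigr)
\end{align*}
```

Each axiom quantifies over every instance of the class on the left. Under this reading the third axiom says that every bacterium causes some pneumonia, which is much stronger than "may cause".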
Tuberculosis in OWL
[Diagram: Tuberculosis at the centre, linked to the terms Infectious Disease, TB, Bacteria, BCG vaccine, Isoniazid, Chest X-ray, Coughing, Lung and Mycobacterium bovis. Link labels: Is a, abbreviation, Caused by, vaccine, drug, Diagnosis/detection, Symptom, Affects, Similar to]
These are all useful links to terms relating to tuberculosis
Modelling these relationships between classes in an OWL ontology is extremely difficult and is not necessary for building a navigation hierarchy
“Something to do with”
The semantics of “something to do with” are all we need to build a connected graph of related terms that can aid document navigation
We simply don’t need to consider the philosophical and logical aspects of modelling domain terminology in an OWL T-box
It is easier to merge and align two semantically different resources when we weaken the semantics
Simple terminologies serve as useful intermediates for future OWL ontologies
Modelling terminologies at the instance level
Use OWL vocabulary to build a schema for representing terminologies at the instance (A-box) level
Define a set of data properties to capture richer lexical support for terminologies
Define a set of object properties to capture the kinds of relationships between terms that we want to express in navigation systems
Generalise, specialise, related terms
Exploit RDF/OWL machinery, e.g. property characteristics for simple inferences
Extending/constraining the schema, transitive, symmetric, property chains
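The "simple inferences" from property characteristics can be sketched in a few lines of plain Python. This is a minimal illustration, not a reasoner: the term names and the two property names are hypothetical, and treating "broader" as transitive and "related" as symmetric is an assumption about how such a schema might be declared.

```python
# Hypothetical property names for an instance-level terminology schema.
BROADER = "broader"   # assumed to be declared transitive
RELATED = "related"   # assumed to be declared symmetric

# Hypothetical asserted links between terms (instances, not classes).
triples = {
    ("Tuberculosis", BROADER, "InfectiousDisease"),
    ("InfectiousDisease", BROADER, "Disease"),
    ("Tuberculosis", RELATED, "Lung"),
}

def infer(triples):
    """Return the asserted triples plus those entailed by symmetry and transitivity."""
    inferred = set(triples)
    # Symmetric property: (a related b) entails (b related a).
    for s, p, o in triples:
        if p == RELATED:
            inferred.add((o, p, s))
    # Transitive property: repeat until no new broader links appear.
    changed = True
    while changed:
        changed = False
        broader = [(s, o) for s, p, o in inferred if p == BROADER]
        for a, b in broader:
            for c, d in broader:
                if b == c and (a, BROADER, d) not in inferred:
                    inferred.add((a, BROADER, d))
                    changed = True
    return inferred

for triple in sorted(infer(triples)):
    print(triple)
```

With these three asserted links, the sketch adds the reverse "related" link from Lung to Tuberculosis and the indirect "broader" link from Tuberculosis to Disease.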
Simple Knowledge Organisation System (SKOS)
W3C standard for representing Knowledge Organisation Systems (KOS) such as thesauri, classification schemes, subject heading systems, taxonomies, dictionaries etc. on the web
Uses RDF/OWL to define the schema; models terminologies at the instance level
Rich support for labelling and documenting concept metadata
Preferred label, alternate labels, hidden labels, definitions, examples, scope notes
Semantic relationships for building concept hierarchies
Has Broader, Has Narrower, Related, Close Match, Exact Match
http://www.w3.org/2004/02/skos/
Tuberculosis in SKOS
[Diagram: Tuberculosis at the centre, linked by SKOS properties to the terms Infectious Disease, TB, Bacteria, BCG vaccine, Chest X-ray, Isoniazid, Coughing, Lung and Mycobacterium bovis. Link labels: skos:altLabel (once), skos:broader (twice), skos:narrower (three times), skos:related (three times)]
Not a replacement for OWL, just an additional representation for terminologies
A syntax for building navigational hierarchies
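As a rough illustration of what such a representation looks like on the page, the sketch below writes a few statements about Tuberculosis as Turtle. It uses only string formatting, not an RDF library. The `ex:` namespace is hypothetical, and which SKOS property connects Tuberculosis to which term is partly an assumption: only the TB label and the Infectious Disease link are reasonably clear from the slides.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/terms#"  # hypothetical namespace for illustration

# (subject, SKOS property, value); a value is a concept name or a quoted literal.
statements = [
    ("Tuberculosis", "prefLabel", '"Tuberculosis"@en'),
    ("Tuberculosis", "altLabel", '"TB"@en'),
    ("Tuberculosis", "broader", "InfectiousDisease"),
    ("Tuberculosis", "related", "Lung"),      # assumed pairing
    ("Tuberculosis", "related", "Coughing"),  # assumed pairing
]

def to_turtle(statements):
    """Serialise the statements as Turtle, one block per subject concept."""
    lines = [f"@prefix skos: <{SKOS}> .", f"@prefix ex: <{EX}> .", ""]
    subjects = []
    for s, _, _ in statements:
        if s not in subjects:
            subjects.append(s)
    for subject in subjects:
        lines.append(f"ex:{subject} a skos:Concept ;")
        pairs = [(p, o) for s, p, o in statements if s == subject]
        for i, (p, o) in enumerate(pairs):
            # Literals are already quoted; anything else is a concept in ex:.
            value = o if o.startswith('"') else f"ex:{o}"
            end = " ." if i == len(pairs) - 1 else " ;"
            lines.append(f"    skos:{p} {value}{end}")
    return "\n".join(lines)

print(to_turtle(statements))
```

The concepts are individuals of type skos:Concept, so the links are plain statements between terms rather than class axioms.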
SKOS in COHSE
COHSE knowledge base now supports both SKOS and OWL representations
Rapidly develop SKOS vocabularies to support navigation
SKOS provides a standard syntax to represent a wide range of terminologies that don’t readily convert into an OWL ontology
e.g. MeSH, Online Medical Dictionary, Bio thesaurus
The weaker semantics of SKOS are “enough” for some applications, such as navigation systems
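To make "enough for navigation" concrete, here is a minimal sketch of how a navigation service might use SKOS-style labels and links: find concept labels in a page's text, then offer the broader, narrower and related concepts as link targets. It is not COHSE's actual implementation or API; the vocabulary, the function name and the term pairings are all hypothetical.

```python
import re

# Hypothetical vocabulary: each concept has labels and links to other concepts.
vocabulary = {
    "Tuberculosis": {
        "labels": ["tuberculosis", "TB"],
        "broader": ["Infectious Disease"],
        "narrower": [],
        "related": ["Lung", "Coughing"],  # assumed pairings
    },
    "Lung": {
        "labels": ["lung", "lungs"],
        "broader": [],
        "narrower": [],
        "related": ["Tuberculosis"],
    },
}

def find_link_targets(text, vocabulary):
    """Map each concept mentioned in the text to the concepts a reader could navigate to."""
    targets = {}
    for concept, entry in vocabulary.items():
        for label in entry["labels"]:
            # Whole-word, case-insensitive match on any of the concept's labels.
            if re.search(r"\b" + re.escape(label) + r"\b", text, re.IGNORECASE):
                targets[concept] = {
                    rel: entry[rel] for rel in ("broader", "narrower", "related")
                }
                break
    return targets

links = find_link_targets("New TB cases were reported.", vocabulary)
print(links)
```

Nothing here relies on class-level semantics: the alternate label does the lexical matching, and the three link types are sufficient to generalise, specialise or move sideways.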
Conversions to SKOS
Reuse! Publish existing life science ontologies, thesauri and dictionaries on the web using SKOS, for applications that only require SKOS semantics
OWL, OBO, UMLS etc.: large coverage of terminologies that could be represented in SKOS
-Protein covalent bond
-Protein domain
-UniProt taxonomy
-Sequence types and features
-Genetic Context
-Pathway ontology
-Event (INOH pathway ontology)
-Systems Biology
-Protein-protein interaction
-Mosquito gross anatomy
-Mouse adult gross anatomy
-Mouse gross anatomy and development
-C. elegans gross anatomy
-Arabidopsis gross anatomy
-Cereal plant gross anatomy
-Drosophila gross anatomy
-Dictyostelium discoideum anatomy
-Fungal gross anatomy FAO
-Plant structure
-Maize gross anatomy
-Medaka fish anatomy and development
-Zebrafish anatomy and development
-BRENDA tissue / enzyme source
Proteins
Sequence
Pathways
Anatomy
Phenotype
Gene products
Development
Transcript
Plasmodium life cycle
Cell type
-Molecule role
-Molecular Function
-Biological process
-Cellular component
-eVOC (Expressed Sequence Annotation for Humans)
-Arabidopsis development
-Cereal plant development
-Plant growth and developmental stage
-C. elegans development
-Drosophila development
-Human developmental anatomy, abstract version
-Human developmental anatomy, timed version
-NCI Thesaurus
-Mouse pathology
-Human disease
-Cereal plant trait
-PATO (attribute and value)
-Mammalian phenotype
-Habronattus courtship
-Loggerhead nesting
-Animal natural history and life history
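A conversion of this kind can be quite small. The sketch below maps one OBO [Term] stanza to SKOS-style triples, assuming a deliberately loose mapping: name to skos:prefLabel, synonym to skos:altLabel, is_a to skos:broader. The stanza, its identifiers and the function name are made up for illustration, and real OBO files have more tags and edge cases than this handles.

```python
# A made-up OBO stanza; the EX: identifiers are hypothetical.
OBO_STANZA = """\
[Term]
id: EX:0000002
name: tuberculosis
synonym: "TB" EXACT []
is_a: EX:0000001 ! infectious disease
"""

def obo_term_to_skos(stanza):
    """Turn one OBO [Term] stanza into (subject, SKOS property, value) triples."""
    tags = []
    for line in stanza.splitlines():
        if ": " in line and not line.startswith("["):
            tag, value = line.split(": ", 1)
            # Drop the trailing "! comment" that OBO allows after a value.
            tags.append((tag, value.split(" ! ")[0].strip()))
    term_id = next(value for tag, value in tags if tag == "id")
    triples = []
    for tag, value in tags:
        if tag == "name":
            triples.append((term_id, "skos:prefLabel", value))
        elif tag == "synonym":
            # The synonym text sits between the first pair of double quotes.
            triples.append((term_id, "skos:altLabel", value.split('"')[1]))
        elif tag == "is_a":
            triples.append((term_id, "skos:broader", value))
    return triples

for triple in obo_term_to_skos(OBO_STANZA):
    print(triple)
```

Mapping is_a to skos:broader drops the subclass semantics on purpose, which is the trade-off the slides argue is acceptable for navigation.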
NeLI use case
National Electronic Library of Infection
Built a new SKOS vocabulary to improve search/navigation around their website (powered by COHSE; evaluation ongoing)
Reuse existing vocabularies that have been converted to SKOS to fill in the gaps (MeSH, OBO Disease Ontology)
SKOS relations from neli:Polio_Virus
SKOS relation    SKOS Concept
skos:altLabel    Brunhilde Virus
skos:broader     Spinal cord disease
skos:narrower    Postpoliomyelitis Syndrome
skos:broader     Microorganism
skos:broader     Enterovirus
Sources: MeSH, Disease Ontology, SNOMED
Conclusion
We need a large knowledge artefact that supports navigation between related web resources
Rapid generation and reuse of existing terminologies (cheap)
Loosening the semantics of our model enables this with an acceptable trade-off
SKOS is a suitable data model to represent our background knowledge
We have an implementation in COHSE and use cases from the life sciences that are good showcases for Semantic Web technologies
Conclusion
SKOS is new; we are still exploring what it can do for the life sciences and the Semantic Web
e.g. ontology indexing, resource for text mining…
Not a replacement for OWL, but a cost-effective alternative for certain tasks
This is not a criticism of OWL, rather a criticism of the misuse of OWL
Thank you.
SKOSEd Thesaurus editor for the Semantic Web
http://code.google.com/p/skoseditor/
Acknowledgments
Manchester: Robert Stevens, Sean Bechhofer, Yeliz Yesilada
Other: COHSE developers, SeaLife project, NeLI, Patty Kostkova