Why Ontologies? - UC San Diego Health Sciences

Language (Formalisms) For
Ontology Building
Neda Alipanah
22 October 2012
Content
Why Ontologies?
◦ Machine-Processable Knowledge
◦ Knowledge Exchange
◦ Big Data
Relevant Technologies
◦ Layered Architecture
◦ Building Tools and Visualization
Ontology Application
◦ Information Integration
◦ Web Database Management
◦ Web Services
Why Ontologies?
1. Machine-readable and understandable processing of data
2. Consistent Knowledge Presentation for Enterprise application integration (Knowledge Exchange)
3. Nodes and links that essentially form a very large database with specific rules
Why Ontologies?
1. Machine-readable and understandable processing of data
John Smith is an Assistant Professor of Computer Science at the University of X. He teaches several courses, including Courses A, B, and C.
[Figure: graph with nodes John Smith, Assistant Professor, University X]
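One way to picture what "machine-readable" means here is to store the sentence as subject-predicate-object statements that a program can query. The sketch below is only an illustration: the class name and the predicate names (`hasPosition`, `worksAt`, `hasField`, `teaches`) are assumptions, not taken from the slides.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch: all identifiers here are assumptions, not from the slides. */
class SentenceAsTriples {
    /** One statement: subject, predicate, object. */
    static final class Triple {
        final String subject, predicate, object;
        Triple(String s, String p, String o) { subject = s; predicate = p; object = o; }
    }

    /** The facts stated by the example sentence about John Smith. */
    static List<Triple> facts() {
        List<Triple> t = new ArrayList<>();
        t.add(new Triple("JohnSmith", "hasPosition", "AssistantProfessor"));
        t.add(new Triple("JohnSmith", "worksAt", "UniversityX"));
        t.add(new Triple("JohnSmith", "hasField", "ComputerScience"));
        for (String course : new String[] {"CourseA", "CourseB", "CourseC"}) {
            t.add(new Triple("JohnSmith", "teaches", course));
        }
        return t;
    }

    /** Returns every object linked to the subject by the given predicate. */
    static List<String> objectsOf(List<Triple> triples, String subject, String predicate) {
        List<String> out = new ArrayList<>();
        for (Triple t : triples) {
            if (t.subject.equals(subject) && t.predicate.equals(predicate)) {
                out.add(t.object);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Which courses does John Smith teach?
        System.out.println(objectsOf(facts(), "JohnSmith", "teaches"));
    }
}
```

Once the sentence is in this form, a question such as "what does John Smith teach?" becomes a simple lookup rather than a text-parsing problem.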
Why Ontologies?
2. Consistent Knowledge Presentation for Enterprise application integration (Knowledge Exchange)
[Figure: linked concepts Disease, Patient, Symptoms, Address]
Why Ontologies?
3. Nodes and links that essentially form a very large database with specific rules.
Databases capture the data and relations (entity relationships) but not the semantics and rules.
[Figure: linked concepts Disease, Patient, Symptoms, Address]
Concept 1 is the reverse of Concept 2.
Concept 2 is a subclass of Concept 3.
Concept 100 has an isA relation with Concept 2000 and is the reverse of Concept 500.
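Rules like "Concept 2 is a subclass of Concept 3" let a program derive facts that were never stored explicitly. The following is a minimal sketch of that idea for subclass links only; the class name and the extra `Concept3 -> Concept4` link are hypothetical and added just to show a two-step derivation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch: names and the Concept4 link are assumptions. */
class SubclassRules {
    /** Direct "is subclass of" links, stored as child -> parent. */
    static Map<String, String> exampleHierarchy() {
        Map<String, String> parentOf = new HashMap<>();
        parentOf.put("Concept2", "Concept3"); // stated on the slide
        parentOf.put("Concept3", "Concept4"); // hypothetical extra link
        return parentOf;
    }

    /** True if 'concept' reaches 'ancestor' by following subclass links upward. */
    static boolean isSubclassOf(Map<String, String> parentOf, String concept, String ancestor) {
        Set<String> seen = new HashSet<>(); // stops the walk if the input has a cycle
        String current = parentOf.get(concept);
        while (current != null && seen.add(current)) {
            if (current.equals(ancestor)) {
                return true;
            }
            current = parentOf.get(current);
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> h = exampleHierarchy();
        System.out.println(isSubclassOf(h, "Concept2", "Concept4")); // derived, not stored
        System.out.println(isSubclassOf(h, "Concept4", "Concept2"));
    }
}
```

A plain relational schema would hold the two stored links but would not, by itself, conclude that Concept 2 is also a kind of Concept 4.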
Technologies - Layered Architecture
Tim Berners-Lee Architecture (layers, top to bottom):
Logic, Proof and Trust
Rules/Query
RDF/Ontologies
XML/XML Schemas
URI/UNICODE
Alongside all layers: Trust, Privacy, Other Services
Technologies- Layered Architecture
URI (Uniform Resource Identifiers):
◦ Simple and extensible means for identifying a resource
◦ Universal Resource Identifiers in WWW
◦ Examples:
http://www.nih.gov/
http://www.ncbi.nlm.nih.gov/gap
http://www.ucsd.edu
http://www.semanticweb/JohnSmith
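A URI can be split into its parts with the standard `java.net.URI` class. The sketch below is illustrative: the class and method names are assumptions, the first URI comes from the list above, and the second is the `OntTeaching.owl#Product1` identifier used later in this deck, chosen because it has a fragment.

```java
import java.net.URI;

/** Illustrative sketch: class and method names are assumptions. */
class UriParts {
    /** Returns "scheme | host | path | fragment" for the given URI text. */
    static String describe(String text) {
        URI uri = URI.create(text);
        return uri.getScheme() + " | " + uri.getHost() + " | "
                + uri.getPath() + " | " + uri.getFragment();
    }

    public static void main(String[] args) {
        System.out.println(describe("http://www.ncbi.nlm.nih.gov/gap"));
        System.out.println(describe(
                "http://www.semanticweb.org/ontologies/2012/9/OntTeaching.owl#Product1"));
    }
}
```

In ontology files the part before `#` usually names the ontology document and the fragment after it names one class, property, or individual inside it.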
What is XML about?
XML = eXtensible Markup Language, by the W3C (World Wide Web Consortium)
Transport and Store Data (Structured Knowledge)
Key to XML is Document Type Definitions (DTDs)
◦ Define the role of each element of text in a formal model
Compound Documents (Multiple files)
XML Example
[Figure: tree of an XML asset report. Node labels: Asset report; Year: 2002; Name: U. Of X; Dept; BioInformatics; Assets; Patents; Patent (ID, Author, title); Equipment; Other assets; Funds; Name: Expenses; Contracts; Grants; news]
XML File Example
<Professor credID="9" subID="16" CIssuer="2">
<name> Alice Brown </name>
<university> University of X </university>
<department> CS </department>
<research-group> BioInformatics </research-group>
</Professor>
<Secretary credID="12" subID="4" CIssuer="2">
<name> John James </name>
<university> University of X </university>
<department> BioInformatics </department>
<level> Senior </level>
</Secretary>
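Because XML is structured, a program can read individual fields without custom text parsing. The sketch below reads the Professor record with the DOM parser that ships with Java, assuming the record is well-formed (straight quotes, matching closing tags). The class and helper names are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Illustrative sketch: class and helper names are assumptions. */
class ParseProfessor {
    /** The Professor record from the slide, written as well-formed XML. */
    static final String XML =
            "<Professor credID=\"9\" subID=\"16\" CIssuer=\"2\">"
            + "<name> Alice Brown </name>"
            + "<university> University of X </university>"
            + "<department> CS </department>"
            + "<research-group> BioInformatics </research-group>"
            + "</Professor>";

    /** Parses the text and returns the root element. */
    static Element parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    /** Returns the trimmed text of the first child element with the given tag. */
    static String childText(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        Element professor = parse(XML);
        System.out.println(professor.getTagName() + " credID=" + professor.getAttribute("credID"));
        System.out.println(childText(professor, "name") + ", " + childText(professor, "department"));
    }
}
```

The parser rejects input with a missing quote or a mismatched closing tag, which is why the well-formedness of the record matters.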
Technologies- Layered Architecture
Tim Berners-Lee Architecture (layers, top to bottom):
Logic, Proof and Trust
Rules/Query
RDF/OWL Ontologies
XML/XML Schemas
URI/UNICODE
Alongside all layers: Trust, Privacy, Other Services
RDF
RDF = Resource Description Framework
Adds semantics with the use of ontologies, XML syntax
RDF Concepts
◦ Basic Model: Resources, Properties and Statements
◦ Container Model: Bag, Sequence and Alternative
RDF

RDF/RDFS Elements
◦ Class (School, Department, Person): rdfs:subClassOf
◦ Properties (Works): rdfs:subPropertyOf
◦ Domain and Range of Property: rdfs:domain (School), rdfs:range (Person)
[Figure: graph with nodes School, Department, Person and edge labels Works, SubClass]
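Declaring a domain and range lets software infer the types of the things a property connects. The sketch below hand-codes that single rule for the `Works` property with domain School and range Person, as on the slide. The individuals `UniversityX` and `JohnSmith` and all Java names are assumptions, and a real RDFS reasoner covers many more rules than this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative sketch of one RDFS rule: names are assumptions. */
class DomainRangeInference {
    /** property -> rdfs:domain, as declared on the slide. */
    static final Map<String, String> DOMAIN = Map.of("Works", "School");
    /** property -> rdfs:range, as declared on the slide. */
    static final Map<String, String> RANGE = Map.of("Works", "Person");

    /** Returns the rdf:type facts implied by one statement (subject, property, object). */
    static List<String> inferTypes(String subject, String property, String object) {
        List<String> inferred = new ArrayList<>();
        if (DOMAIN.containsKey(property)) {
            inferred.add(subject + " rdf:type " + DOMAIN.get(property));
        }
        if (RANGE.containsKey(property)) {
            inferred.add(object + " rdf:type " + RANGE.get(property));
        }
        return inferred;
    }

    public static void main(String[] args) {
        // Hypothetical statement: UniversityX Works JohnSmith
        System.out.println(inferTypes("UniversityX", "Works", "JohnSmith"));
    }
}
```

Note that domain and range act as inference rules rather than validation checks: the statement is not rejected if the subject was never declared a School, it is simply concluded to be one.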
RDF vs. XML Views
"An iPhone is a Product that has a price of $200"
<product>
<title>iPhone</title>
<price>$200</price>
</product>
<product title="iPhone">
<price>$200</price>
</product>
XML Views
<owl:Class rdf:about="&OntTeaching;Product"/>
<owl:NamedIndividual rdf:about="&OntTeaching;Product1">
<rdf:type rdf:resource="&OntTeaching;Product"/>
<rdfs:label rdf:datatype="&xsd;Name">iPhone</rdfs:label>
<price rdf:datatype="&xsd;decimal">200</price>
</owl:NamedIndividual>
OntTeaching:Product1 rdf:type OntTeaching:Product
OntTeaching:Product1 OntTeaching:title "iPhone"
OntTeaching:Product1 price "200"
RDF View
OWL Web Ontology Language

OWL: Semantic Markup Language for Publishing/Sharing Ontologies
◦ Enumeration on Classes
<owl:Class>
<owl:oneOf rdf:parseType="Collection">
<owl:Thing rdf:about="#Europe"/>
<owl:Thing rdf:about="#Africa"/>
<owl:Thing rdf:about="#NorthAmerica"/>
<owl:Thing rdf:about="#SouthAmerica"/>
<owl:Thing rdf:about="#Australia"/>
<owl:Thing rdf:about="#Antarctica"/>
</owl:oneOf>
</owl:Class>
OWL Web Ontology Language

OWL: Value Constraints
◦ owl:allValuesFrom
◦ owl:someValuesFrom
◦ owl:hasValue
<owl:Restriction>
<owl:onProperty rdf:resource="#hasParent" />
<owl:someValuesFrom rdf:resource="#Physician" />
</owl:Restriction>
<owl:Restriction>
<owl:onProperty rdf:resource="#hasParent" />
<owl:hasValue rdf:resource="#Clinton" />
</owl:Restriction>
OWL Web Ontology Language

OWL: Cardinality constraints
◦ owl:maxCardinality
◦ owl:minCardinality
◦ owl:cardinality
<owl:Restriction>
<owl:onProperty rdf:resource="#hasParent" />
<owl:cardinality rdf:datatype="&xsd;nonNegativeInteger">2</owl:cardinality>
</owl:Restriction>
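To make the meaning of `owl:cardinality` concrete, the sketch below counts the distinct `hasParent` values recorded for an individual and compares the count with 2. This is a simplified closed-world check written for illustration only: an OWL reasoner works under the open-world assumption, so a missing value is not treated as a violation there. The individuals (`Ann`, `Bob`, `Carol`, `Dan`, `Eve`) and all Java names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

/** Illustrative closed-world sketch: names and data are assumptions. */
class CardinalityCheck {
    /** Hypothetical hasParent assertions: individual -> recorded parents. */
    static Map<String, List<String>> exampleParents() {
        Map<String, List<String>> hasParent = new HashMap<>();
        hasParent.put("Ann", Arrays.asList("Bob", "Carol"));
        hasParent.put("Dan", Arrays.asList("Eve"));
        return hasParent;
    }

    /** True if the individual has exactly n distinct recorded values. */
    static boolean hasExactly(Map<String, List<String>> values, String individual, int n) {
        List<String> recorded = values.getOrDefault(individual, Collections.emptyList());
        return new HashSet<>(recorded).size() == n;
    }

    public static void main(String[] args) {
        Map<String, List<String>> hasParent = exampleParents();
        System.out.println("Ann: " + hasExactly(hasParent, "Ann", 2));
        System.out.println("Dan: " + hasExactly(hasParent, "Dan", 2));
    }
}
```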
OWL Web Ontology Language

OWL: Intersection, union and complement
◦ owl:intersectionOf
◦ owl:unionOf
◦ owl:complementOf
Not Meat
<owl:Class>
<owl:complementOf>
<owl:Class rdf:about="#Meat"/>
</owl:complementOf>
</owl:Class>
<owl:Class>
<owl:unionOf rdf:parseType="Collection">
<owl:Class>
<owl:oneOf rdf:parseType="Collection">
<owl:Thing rdf:about="#Tosca" />
<owl:Thing rdf:about="#Salome" />
</owl:oneOf>
</owl:Class>
<owl:Class>
<owl:oneOf rdf:parseType="Collection">
<owl:Thing rdf:about="#Turandot" />
<owl:Thing rdf:about="#Tosca" />
</owl:oneOf>
</owl:Class>
</owl:unionOf>
</owl:Class>
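The union of the two enumerated classes above can be read as ordinary set union over their members. The sketch below shows that reading with Java sets, plus a complement taken against a small finite universe. The food items in the complement example and all Java names are assumptions, and the finite universe is a simplification, since an OWL complement ranges over everything that is not in the class.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

/** Illustrative sketch of class expressions as set operations: names are assumptions. */
class ClassExpressions {
    /** Members of either class (owl:unionOf). */
    static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> result = new LinkedHashSet<>(a);
        result.addAll(b);
        return result;
    }

    /** Members of the universe that are not in the class (owl:complementOf, finite case). */
    static Set<String> complement(Set<String> universe, Set<String> a) {
        Set<String> result = new LinkedHashSet<>(universe);
        result.removeAll(a);
        return result;
    }

    static Set<String> setOf(String... items) {
        return new LinkedHashSet<>(Arrays.asList(items));
    }

    public static void main(String[] args) {
        // The two enumerations from the unionOf example.
        System.out.println(union(setOf("Tosca", "Salome"), setOf("Turandot", "Tosca")));
        // Hypothetical universe for "Not Meat".
        System.out.println(complement(setOf("Beef", "Tofu", "Rice"), setOf("Beef")));
    }
}
```

Tosca appears in both enumerations but only once in the union, so the resulting class has three members.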
OWL Web Ontology Language

OWL: Equivalent Class, Disjoint Class
<owl:Class rdf:about="#Man">
<owl:disjointWith rdf:resource="#Woman"/>
</owl:Class>
<owl:Class rdf:about="#DaPonteOperaOfMozart">
<owl:equivalentClass>
<owl:Class>
<owl:intersectionOf rdf:parseType="Collection">
<owl:Restriction>
<owl:onProperty rdf:resource="#hasComposer"/>
<owl:hasValue rdf:resource="#Wolfgang_Amadeus_Mozart"/>
</owl:Restriction>
<owl:Restriction>
<owl:onProperty rdf:resource="#hasLibrettist"/>
<owl:hasValue rdf:resource="#Lorenzo_Da_Ponte"/>
</owl:Restriction>
</owl:intersectionOf>
</owl:Class>
</owl:equivalentClass>
</owl:Class>
How to Build OWL/RDF files?
Do we need to remember all the OWL language syntax?
How can we make it easy to use and remember?

How to Build RDF/OWL files?

Different Building and Visualization Tools
◦ Protégé, http://protege.stanford.edu/
◦ Gruff, http://www.franz.com/agraph/gruff/
(Download version 3.3)

Using Programming Languages
◦ Java and Jena API
◦ http://jena.apache.org/
◦ http://jena.sourceforge.net/tutorial/RDF_API/
Protégé Tool - Open Source Ontology Editor
◦ Class Creation
◦ Property (Object/Data properties)
◦ Individual Creation
Gruff Tool- A Grapher-Based TripleStore Browser for AllegroGraph
1. What is a Triple Store?
[Figure: graph of Product1 with edges type → Product, title → iPhone, price → 200]
<owl:Class rdf:about="&OntTeaching;Product"/>
<owl:NamedIndividual rdf:about="&OntTeaching;Product1">
<rdf:type rdf:resource="&OntTeaching;Product"/>
<rdfs:label rdf:datatype="&xsd;Name">iPhone</rdfs:label>
<price rdf:datatype="&xsd;decimal">200</price>
</owl:NamedIndividual>
Subject Predicate Object
OntTeaching:Product1 rdf:type OntTeaching:Product
OntTeaching:Product1 OntTeaching:title "iPhone"
OntTeaching:Product1 price "200"
Gruff Tool- A Grapher-Based TripleStore Browser for AllegroGraph
1. Create a New Triple Store
2. Choose a Path for the Ontology
3. Load Ontology
4. Present the Ontology Triples
5. Query the Triples
Programming with Ontologies
Java + Jena API
◦ Collection of tools and Java libraries for developing linked-data apps, tools and servers
◦ Stores information as RDF triples in directed graphs
◦ An ontology API for handling OWL and RDFS ontologies
◦ A rule-based inference engine for reasoning with RDF and OWL data sources
◦ Efficient storage of triples on disk
◦ A query engine compliant with the latest SPARQL specification
Ontology Building using Jena
Ontology: iPhone.owl
[Figure: graph of Product1 with edges type → Product, title → iPhone, price → 200]
Triples (Subject, Predicate, Object):
http://www.semanticweb.org/ontologies/2012/9/OntTeaching.owl#Product1 http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.semanticweb.org/ontologies/2012/9/OntTeaching.owl#Product
Product1 http://www.semanticweb.org/ontologies/2012/9/OntTeaching.owl#title "iPhone"
Product1 http://www.semanticweb.org/ontologies/iPhone.Owl#price "200"
Ontology Building using Jena
The code to create this graph, or model, is simple:
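// Jena 3 and later; releases before 3.0 used com.hp.hpl.jena.rdf.model.*
import org.apache.jena.rdf.model.*;

// some definitions
static String productURI = "http://www.semanticweb.org/ontologies/Product";
static String ns = "http://www.semanticweb.org/ontologies/2012/9/OntTeaching.owl#";
// create an empty Model
Model model = ModelFactory.createDefaultModel();
// create the resource
Resource product = model.createResource(productURI);
// create the properties used below
Property title = model.createProperty(ns + "title");
Property price = model.createProperty(ns + "price");
// add the properties
product.addProperty(title, "iPhone");
product.addProperty(price, "200");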
Jena

How to read an ontology?
// list the statements in the Model
StmtIterator iter = model.listStatements();
// print out the subject, predicate and object of each statement
while (iter.hasNext()) {
    Statement stmt = iter.nextStatement();     // get next statement
    Resource subject = stmt.getSubject();      // get the subject
    Property predicate = stmt.getPredicate();  // get the predicate
    RDFNode object = stmt.getObject();         // get the object
    System.out.print(subject.toString());
    System.out.print(" " + predicate.toString() + " ");
    if (object instanceof Resource) {
        System.out.print(object.toString());
    } else {
        // object is a literal
        System.out.print(" \"" + object.toString() + "\"");
    }
    System.out.println(" .");
}
SPARQL Query

Query on Triples with Exact Pattern Matching (Subject of query is Product1)
SELECT ?b ?c WHERE {
<http://www.semanticweb.org/ontologies/2012/9/OntTeaching.owl#Product1> ?b ?c
}
[Figure: query result]
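The query fixes the subject and leaves the predicate (`?b`) and object (`?c`) as variables, so it returns one row per statement about Product1. The sketch below imitates that matching over an in-memory list of the three Product1 triples listed earlier in the deck. It is not a SPARQL engine: the class name, the method names, and the extra `Product2` triple are assumptions added to show that non-matching subjects are filtered out.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of subject-pattern matching: names are assumptions. */
class SubjectQuery {
    static final String NS = "http://www.semanticweb.org/ontologies/2012/9/OntTeaching.owl#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    /** Each entry is {subject, predicate, object}. */
    static List<String[]> triples() {
        List<String[]> t = new ArrayList<>();
        t.add(new String[] {NS + "Product1", RDF_TYPE, NS + "Product"});
        t.add(new String[] {NS + "Product1", NS + "title", "iPhone"});
        t.add(new String[] {NS + "Product1",
                "http://www.semanticweb.org/ontologies/iPhone.Owl#price", "200"});
        t.add(new String[] {NS + "Product2", RDF_TYPE, NS + "Product"}); // hypothetical
        return t;
    }

    /** Returns {predicate, object} rows for every triple with the given subject. */
    static List<String[]> select(List<String[]> triples, String subject) {
        List<String[]> rows = new ArrayList<>();
        for (String[] t : triples) {
            if (t[0].equals(subject)) {
                rows.add(new String[] {t[1], t[2]});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : select(triples(), NS + "Product1")) {
            System.out.println("?b = " + row[0] + "   ?c = " + row[1]);
        }
    }
}
```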
Ontology Applications
The database of Genotypes and Phenotypes (dbGaP) archives the results of different Genome-Wide Association Studies (GWAS).
• Phenotype variables are not harmonized across studies.
• Redundant phenotype identifiers exist for the same phenotype.
• dbGaP lacks semantic relations among its variables.
• Search on phenotypes is inefficient and inaccurate.
• The goal is to standardize dbGaP information to allow accurate, reusable and quick retrieval of information.
Ontology Applications

Several Available dbGaP Studies
phs000284.v1.pht001901.v1.CFS_CARe_Sample.data_dict_2011_02_07
id="phv00122015", Description="Age at time of Study", name="age", version="1", Logical Max="65", Logical Minimum="18", unit="Years", type="decimal"
id="phv00122058", Description="Age of patient at the time of Study", name="age", version="1", Logical Max="90", Logical Minimum="20", unit="Years", type="decimal"
phs000284.v1.pht001903.v1.CFS_CARe_ECG.data_dict_2011_02_07
Ontology Applications

Building Information Model (Ontology)
[Figure: information model with nodes Individual and Age and the variables id="phv00122058" and id="phv00122015"]
Ontology Applications

Information Retrieval and Ranking
Query={Age of Subject}
Phenotypes:
Study | Phenotype Variable
phs000284.v1.pht001901.v1.CFS_CARe_Sample.data_dict_2011_02_07 | id="phv00122058"
phs000284.v1.pht001903.v1.CFS_CARe_ECG.data_dict_2011_02_07 | id="phv00122015"
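The retrieval step can be pictured as two lookups: map the query to an ontology concept, then return every phenotype variable attached to that concept. The sketch below is a deliberately naive illustration of that idea and not the system described in the slides. The word-matching rule, the `Height` concept, the `phvHYPOTHETICAL` identifier, and all Java names are assumptions; only the two `phv` identifiers for age come from the deck.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch of concept-based lookup: names and matching rule are assumptions. */
class PhenotypeSearch {
    /** Variable id -> ontology concept. TreeMap keeps ids in sorted order. */
    static Map<String, String> conceptOfVariable() {
        Map<String, String> m = new TreeMap<>();
        m.put("phv00122015", "Age");        // "Age at time of Study"
        m.put("phv00122058", "Age");        // "Age of patient at the time of Study"
        m.put("phvHYPOTHETICAL", "Height"); // made-up entry to show filtering
        return m;
    }

    /** Naive mapping from free text to a concept: looks for the whole word "age". */
    static String conceptOfQuery(String query) {
        for (String word : query.toLowerCase().split("[^a-z]+")) {
            if (word.equals("age")) {
                return "Age";
            }
        }
        return null;
    }

    /** Returns the ids of all variables whose concept matches the query. */
    static List<String> variablesFor(String query) {
        String concept = conceptOfQuery(query);
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> e : conceptOfVariable().entrySet()) {
            if (e.getValue().equals(concept)) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(variablesFor("Age of Subject"));
    }
}
```

Both age variables are returned even though their descriptions are worded differently, because each is linked to the same concept rather than matched by its text.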
Conclusion
Benefits of Structured Data (XML, OWL)
Tools to Create and Visualize Ontologies
Jena API for Building Ontologies
SPARQL Queries on Ontologies
Applications that Use Ontologies

Contacts
Neda Alipanah
Division of Biomedical Informatics
9500 Gilman Dr., Bldg 2 #0203E
Email: nalipanah@ucsd.edu
