Language (Formalisms) for Ontology Building
Neda Alipanah
22 October 2012

Content
• Why Ontologies?
• Machine-Processable Knowledge
• Knowledge Exchange
• Big Data
• Relevant Technologies
• Layered Architecture
• Building Tools and Visualization
• Ontology Application
• Information Integration
• Web Database Management
• Web Services

Why Ontologies?
1. Machine-readable and machine-understandable processing of data
2. Consistent knowledge representation for enterprise application integration (Knowledge Exchange)
3. Nodes and links that essentially form a very large database with specific rules

Why Ontologies?
1. Machine-readable and machine-understandable processing of data
John Smith is an Assistant Professor of Computer Science at University of X. He teaches several courses, including Courses A, B and C.
[Figure labels: Assistant Professor, John Smith, University X]

Why Ontologies?
2. Consistent knowledge representation for enterprise application integration (Knowledge Exchange)
[Figure labels: Disease, Patient, Symptoms, Address]

Why Ontologies?
3. Nodes and links that essentially form a very large database with specific rules
Databases capture the data and relations (entity relationships) but not the semantics and rules.
[Figure labels: Disease, Patient, Symptoms, Address]
Concept 1 is the reverse of Concept 2. Concept 2 is a subclass of Concept 3. Concept 100 has an isA relation with Concept 2000 and is the reverse of Concept 500.

Content
• Why Ontologies?
• Machine-Processable Knowledge
• Knowledge Exchange
• Big Data
• Relevant Technologies
• Layered Architecture
• Building Tools and Visualization
• Ontology Application
• Information Integration
• Web Database Management
• Web Services

Technologies - Layered Architecture
Tim Berners-Lee Architecture
[Figure: layer stack, labels in source order: Logic, Proof and Trust; Rules/Query; RDF/Ontologies; XML/XML Schemas; URI/UNICODE. Vertical labels: TRUST, PRIVACY. Also shown: Other Services]

Technologies - Layered Architecture
URI (Uniform Resource Identifiers):
◦ Simple and extensible means for identifying a resource
◦ Universal Resource Identifiers in the WWW
◦ Examples:
  http://www.nih.gov/
  http://www.ncbi.nlm.nih.gov/gap
  http://www.ucsd.edu
  http://www.semantic web/JohnSmith

What Is XML About?
• XML = eXtensible Markup Language, by the W3C (World Wide Web Consortium)
• Transports and stores data (structured knowledge)
• Key to XML is Document Type Definitions (DTDs)
  ◦ A DTD defines the role of each element of text in a formal model
• Compound documents (multiple files)

XML Example
[Figure: tree diagram of an asset report. Labels in source order: Year: 2002; Asset report; Assets; Dept; Patents; Name: U. of X; Equipment; Other assets; Funds; Patent news; Name:; Expenses; BioInformatics; Contracts; Grants; ID; Author; title]

XML File Example
<Professor credID="9" subID="16" CIssuer="2">
  <name>Alice Brown</name>
  <university>University of X</university>
  <department>CS</department>
  <research-group>BioInformatics</research-group>
</Professor>
<Secretary credID="12" subID="4" CIssuer="2">
  <name>John James</name>
  <university>University of X</university>
  <department>BioInformatics</department>
  <level>Senior</level>
</Secretary>

Technologies - Layered Architecture
Tim Berners-Lee Architecture
[Figure: layer stack, labels in source order: Logic, Proof and Trust; Rules/Query; RDF/OWL Ontologies; XML/XML Schemas; URI/UNICODE. Vertical labels: TRUST, PRIVACY. Also shown: Other Services]

RDF
• RDF = Resource Description Framework
• Adds semantics with the use of ontologies, XML syntax
• RDF Concepts
  ◦ Basic Model: Resources, Properties and Statements
  ◦ Container Model: Bag, Sequence and Alternative

RDF
RDF/RDFS Elements
◦ Class (School, Department, Person): rdfs:subClassOf
◦ Properties (Works): rdfs:subPropertyOf
◦ Domain and Range of Property: rdfs:domain (School), rdfs:range (Person)
[Figure labels: School, Works, SubClass, Department, Person]

RDF vs.
XML Views
"An iPhone is a Product that has a price of $200"

<product>
  <title>iPhone</title>
  <price>$200</price>
</product>

<product title="iPhone">
  <price>$200</price>
</product>

XML Views
<owl:Class rdf:about="&OntTeaching;Product"/>
<owl:NamedIndividual rdf:about="&OntTeaching;Product1">
  <rdf:type rdf:resource="&OntTeaching;Product"/>
  <rdfs:label rdf:datatype="&xsd;Name">iPhone</rdfs:label>
  <price rdf:datatype="&xsd;decimal">200</price>
</owl:NamedIndividual>

RDF View
OntTeaching:Product1  rdf:type  OntTeaching:Product
OntTeaching:Product1  OntTeaching:title  "iPhone"
OntTeaching:Product1  price  "200"

OWL: Web Ontology Language
OWL: Semantic Markup Language for Publishing/Sharing Ontologies
Enumeration on Classes
<owl:Class>
  <owl:oneOf rdf:parseType="Collection">
    <owl:Thing rdf:about="#Europe"/>
    <owl:Thing rdf:about="#Africa"/>
    <owl:Thing rdf:about="#NorthAmerica"/>
    <owl:Thing rdf:about="#SouthAmerica"/>
    <owl:Thing rdf:about="#Australia"/>
    <owl:Thing rdf:about="#Antarctica"/>
  </owl:oneOf>
</owl:Class>

OWL: Web Ontology Language
OWL: Value Constraints
◦ owl:allValuesFrom
◦ owl:someValuesFrom
◦ owl:hasValue
<owl:Restriction>
  <owl:onProperty rdf:resource="#hasParent" />
  <owl:someValuesFrom rdf:resource="#Physician" />
</owl:Restriction>

<owl:Restriction>
  <owl:onProperty rdf:resource="#hasParent" />
  <owl:hasValue rdf:resource="#Clinton" />
</owl:Restriction>

OWL: Web Ontology Language
OWL: Cardinality Constraints
◦ owl:maxCardinality
◦ owl:minCardinality
◦ owl:cardinality
<owl:Restriction>
  <owl:onProperty rdf:resource="#hasParent" />
  <owl:cardinality rdf:datatype="&xsd;nonNegativeInteger">2</owl:cardinality>
</owl:Restriction>

OWL: Web Ontology Language
OWL: Intersection, Union and Complement
◦ owl:intersectionOf
◦ owl:unionOf
◦ owl:complementOf
Not Meat:
<owl:Class>
  <owl:complementOf>
    <owl:Class rdf:about="#Meat"/>
  </owl:complementOf>
</owl:Class>

<owl:Class>
  <owl:unionOf rdf:parseType="Collection">
    <owl:Class>
      <owl:oneOf rdf:parseType="Collection">
        <owl:Thing rdf:about="#Tosca" />
        <owl:Thing rdf:about="#Salome" />
      </owl:oneOf>
    </owl:Class>
    <owl:Class>
      <owl:oneOf rdf:parseType="Collection">
        <owl:Thing rdf:about="#Turandot" />
        <owl:Thing rdf:about="#Tosca" />
      </owl:oneOf>
    </owl:Class>
  </owl:unionOf>
</owl:Class>

OWL: Web Ontology Language
OWL: Equivalent Class, Disjoint Class
<owl:Class rdf:about="#Man">
  <owl:disjointWith rdf:resource="#Woman"/>
</owl:Class>

<owl:Class rdf:about="#DaPonteOperaOfMozart">
  <owl:equivalentClass>
    <owl:Class>
      <owl:intersectionOf rdf:parseType="Collection">
        <owl:Restriction>
          <owl:onProperty rdf:resource="#hasComposer"/>
          <owl:hasValue rdf:resource="#Wolfgang_Amadeus_Mozart"/>
        </owl:Restriction>
        <owl:Restriction>
          <owl:onProperty rdf:resource="#hasLibrettist"/>
          <owl:hasValue rdf:resource="#Lorenzo_Da_Ponte"/>
        </owl:Restriction>
      </owl:intersectionOf>
    </owl:Class>
  </owl:equivalentClass>
</owl:Class>

How to Build OWL/RDF Files?
• Do we need to remember all of the OWL language syntax?
• How can we make it easy to use and remember?

Content
• Why Ontologies?
• Machine-Processable Knowledge
• Knowledge Exchange
• Big Data
• Relevant Technologies
• Layered Architecture
• Building Tools and Visualization
• Ontology Application
• Information Integration
• Web Database Management
• Web Services

How to Build RDF/OWL Files?
• Different Building and Visualization Tools
  ◦ Protégé, http://protege.stanford.edu/
  ◦ Gruff, http://www.franz.com/agraph/gruff/ (Download version 3.3)
• Using Programming Languages
  ◦ Java and Jena API
  ◦ http://jena.apache.org/
  ◦ http://jena.sourceforge.net/tutorial/RDF_API/

Protégé Tool - Open Source Ontology Editor
Class Creation

Protégé Tool - Open Source Ontology Editor
Property (Object/Data Properties)

Protégé Tool - Open Source Ontology Editor
Individual Creation

Gruff Tool - A Grapher-Based Triple-Store Browser for AllegroGraph
1. What Is a Triple Store?
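A triple store keeps statements of the form (subject, predicate, object) and answers pattern queries over them. The following is only a minimal in-memory sketch of that idea under stated assumptions: the class TripleStoreSketch and its methods add, match and iPhoneExample are hypothetical names invented for illustration, plain strings stand in for URIs and literals, and nothing here is the AllegroGraph, Gruff or Jena API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, illustration-only triple store; not the AllegroGraph, Gruff or Jena API.
final class TripleStoreSketch {

    // One statement: subject, predicate and object, kept as plain strings for brevity.
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    private final List<Triple> triples = new ArrayList<>();

    void add(String subject, String predicate, String object) {
        triples.add(new Triple(subject, predicate, object));
    }

    // Returns every stored triple that fits the pattern; a null argument matches anything.
    List<Triple> match(String subject, String predicate, String object) {
        List<Triple> result = new ArrayList<>();
        for (Triple t : triples) {
            boolean subjectOk = subject == null || subject.equals(t.subject);
            boolean predicateOk = predicate == null || predicate.equals(t.predicate);
            boolean objectOk = object == null || object.equals(t.object);
            if (subjectOk && predicateOk && objectOk) {
                result.add(t);
            }
        }
        return result;
    }

    // Loads the three Product1 statements used in the iPhone example of this deck.
    static TripleStoreSketch iPhoneExample() {
        TripleStoreSketch store = new TripleStoreSketch();
        store.add("OntTeaching:Product1", "rdf:type", "OntTeaching:Product");
        store.add("OntTeaching:Product1", "OntTeaching:title", "iPhone");
        store.add("OntTeaching:Product1", "price", "200");
        return store;
    }

    public static void main(String[] args) {
        // Print every statement whose subject is Product1.
        for (Triple t : iPhoneExample().match("OntTeaching:Product1", null, null)) {
            System.out.println(t.subject + " " + t.predicate + " " + t.object);
        }
    }
}
```

Passing null for the predicate and object plays the role of unbound query variables: fixing only the subject returns all statements about Product1. A real triple store adds persistent storage, indexes and a query language on top of this basic pattern matching.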
[Figure: graph of Product1 with edges type (to Product), title (to iPhone) and price (to 200)]

<owl:Class rdf:about="&OntTeaching;Product"/>
<owl:NamedIndividual rdf:about="&OntTeaching;Product1">
  <rdf:type rdf:resource="&OntTeaching;Product"/>
  <rdfs:label rdf:datatype="&xsd;Name">iPhone</rdfs:label>
  <price rdf:datatype="&xsd;decimal">200</price>
</owl:NamedIndividual>

Subject | Predicate | Object
OntTeaching:Product1 | rdf:type | OntTeaching:Product
OntTeaching:Product1 | OntTeaching:title | "iPhone"
OntTeaching:Product1 | price | "200"

Gruff Tool - A Grapher-Based Triple-Store Browser for AllegroGraph
1. Create a New Triple Store
2. Choose a Path for the Ontology
3. Load Ontology
4. Present the Ontology Triples
5. Query the Triples

Programming with Ontologies
Java + Jena API
• Collection of tools and Java libraries for developing linked-data apps, tools and servers
• Stores information as RDF triples in directed graphs
• An ontology API for handling OWL and RDFS ontologies
• A rule-based inference engine for reasoning with RDF and OWL data sources
• Efficient storage of triples on disk
• A query engine compliant with the latest SPARQL specification

Ontology Building using Jena
Ontology: iPhone.owl
[Figure: graph of Product1 with edges type (to Product), title (to iPhone) and price (to 200)]
Triples:
Subject | Predicate | Object
http://www.semanticweb.org/ontologies/2012/9/OntTeaching.owl#Product1 | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://www.semanticweb.org/ontologies/2012/9/OntTeaching.owl#Product
Product1 | http://www.semanticweb.org/ontologies/2012/9/OntTeaching.owl#title | "iPhone"
Product1 | http://www.semanticweb.org/ontologies/iPhone.Owl#price | "200"

Ontology Building using Jena
The code to create this graph, or model, is simple:

// Jena 3 and later; Jena 2.x used com.hp.hpl.jena.rdf.model.*
import org.apache.jena.rdf.model.*;

// some definitions
static String productURI = "http://www.semanticweb.org/ontologies/Product";
// create an empty Model
Model model = ModelFactory.createDefaultModel();
// create the resource
Resource product = model.createResource(productURI);
// create the properties (URIs taken from the triples listed for iPhone.owl)
Property title = model.createProperty("http://www.semanticweb.org/ontologies/2012/9/OntTeaching.owl#title");
Property price = model.createProperty("http://www.semanticweb.org/ontologies/iPhone.Owl#price");
// add the properties
product.addProperty(title, "iPhone");
product.addProperty(price, "200");

Jena: How to Read an Ontology?

// StmtIterator, Statement and RDFNode come from the same rdf.model package
// list the statements in the Model
StmtIterator iter = model.listStatements();
// print out the subject, predicate and object of each statement
while (iter.hasNext()) {
    Statement stmt = iter.nextStatement();    // get next statement
    Resource subject = stmt.getSubject();     // get the subject
    Property predicate = stmt.getPredicate(); // get the predicate
    RDFNode object = stmt.getObject();        // get the object
    System.out.print(subject.toString());
    System.out.print(" " + predicate.toString() + " ");
    if (object instanceof Resource) {
        System.out.print(object.toString());
    } else {
        // object is a literal
        System.out.print(" \"" + object.toString() + "\"");
    }
    System.out.println(" .");
}

SPARQL Query
Query on triples with exact pattern matching (the subject of the query is Product1):

SELECT ?b ?c
WHERE {
  <http://www.semanticweb.org/ontologies/2012/9/OntTeaching.owl#Product1> ?b ?c
}

[Figure: query result]

Content
• Why Ontologies?
• Machine-Processable Knowledge
• Knowledge Exchange
• Big Data
• Relevant Technologies
• Layered Architecture
• Building Tools and Visualization
• Ontology Application
• Information Integration
• Web Database Management
• Web Services

Ontology Applications
The database of Genotypes and Phenotypes (dbGaP) archives the results of different Genome-Wide Association Studies (GWAS).
• Phenotype variables are not harmonized across studies.
• Redundant phenotype identifiers exist for the same phenotype.
• dbGaP lacks semantic relations among its variables.
• Search on phenotypes is inefficient and inaccurate.
• The goal is to standardize dbGaP information to allow accurate, reusable and quick retrieval of information.

Ontology Applications
Several Available dbGaP Studies
phs000284.v1.pht001901.v1.CFS_CARe_Sample.data_dict_2011_02_07
id="phv00122015", Description="Age at time of Study", name="age", version="1", Logical Max="65", Logical Minimum="18", unit="Years", type="decimal"
id="phv00122058", Description="Age of patient at the time of Study", name="age", version="1", Logical Max="90", Logical Minimum="20", unit="Years", type="decimal"
phs000284.v1.pht001903.v1.CFS_CARe_ECG.data_dict_2011_02_07

Ontology Applications
Building Information Model (Ontology)
[Figure labels: Individual, Age, id="phv00122058", id="phv00122015"]

Ontology Applications
Information Retrieval and Ranking Phenotypes
Query = {Age of Subject}
Study | Phenotype Variable
phs000284.v1.pht001901.v1.CFS_CARe_Sample.data_dict_2011_02_07 | id="phv00122058"
phs000284.v1.pht001903.v1.CFS_CARe_ECG.data_dict_2011_02_07 | id="phv00122015"

Conclusion
• Benefits of Structured Data (XML, OWL)
• Tools to Create and Visualize Ontologies
• Jena API for Building Ontologies
• SPARQL Queries on Ontologies
• Applications That Use Ontologies

Contacts
Neda Alipanah
Division of Biomedical Informatics
9500 Gilman Dr., Bldg 2 #0203E
Email: nalipanah@ucsd.edu