RDF, Ontologies and Meta-Data Workshop at UK e-Science Centre, 8th June 2006

OGSA-DAI-RDF & Its Ontology Interfaces
Isao Kojima and Masahiro Kimoto
Data Grid Team, Grid Technology Research Center, AIST, Japan
http://www.dbgrid.org/ http://www.gtrc.aist.go.jp/

Summary
• Implementation of OGSA-DAI-RDF Prototype: Web (Grid) Service based access to RDF databases
  – Adds an RDF database type to the OGSA-DAI database middleware
  – RDF handling Activities including SPARQL query processing
    • Supports Jena and Sesame
    • Supports RDFS/OWL
• Ontology Interfaces implementation
  – Based on the Ontogrid's WS-DAIO initial (old) proposal specs
  – Schema & Instance Handling Interfaces
  – Supports Jena
  – Supports RDFS/OWL
• Current Status and Future Directions

Approach
• OGSA-based: Web Service Interfaces
  – Use of Globus Toolkit/WS-RF
  – Can be combined with other Grid tools (GRAM, MDS, ...)
• Query Language (SPARQL)
  – W3C Standard Query Language for RDF
  – Multi-Platform approach
    • To support Jena, Sesame, ... (Oracle?)
• OGSA-DAI:
  – Activity Framework
    • To combine RDF databases with other data resources
      – XML DBs, Relational DBs (and Files)
    • To achieve RDF database workflow programming
      – Data Conversion, Data Transfer (Web Services)

OGSA-DAI
• OGSA-DAI: http://www.ogsadai.org.uk/
  – UK e-Science R&D project for grid DB middleware
• Service-based Database Query and Integration
  – RDB: SQL (Oracle, PostgreSQL, MySQL, DB2, ...)
  – XML-DB: XPath and XQuery (Xindice, eXist, ...)
  – Integration tool OGSA-DQP for RDBs
• Supports Grid Middleware
  – WS-RF (Globus Toolkit 4)
  – WS-I (OMII)

OGSA-DAI-RDF
• OGSA-DAI Activities: set of primitives for data processing
  – Relational, XML, File Transfer, Data Conversion/Compression (not available in some versions)
• RDF Activities
  – OGSA-DAI-RDF: set of RDF processing Activities

Implemented Activities (platform independent: all Activities support Jena and Sesame)
1. SPARQL QueryStatement Activity
  – W3C Standard SPARQL for Jena/Sesame
2. RDF Resource (triple) Management Activity
  – Create/Delete RDF triples, statements
3. RDF Collection (model) Management Activity
  – Create/Delete/List RDF databases (model, repository)
4. RDF BulkLoad Activity
  – Bulk loading functions of RDF triples
Activities 1 to 4 follow a Query (SPARQL) & Instance Level (without ontology) approach, to achieve a common interface pattern with Relational and XML DBs.
5. Ontology Interfaces (based on the Ontogrid's WS-DAIO initial (old) specs)
  – Schema level Activity
  – Instance level Activity

OGSA-DAI-RDF demo(1): Multi-Platform Environment
• SPARQL GUI Query Interface for Jena/Sesame
  – Modified OGSA-DAI DataBrowser
[Figure labels: OGSA-DAI Data Service (WS-RF and WS-I); Sesame; Jena; SPARQL Statement; Result is SPARQL/XML Format]

OGSA-DAI Activity Framework
• RDF Activities can be combined with other OGSA-DAI Activities
  – Data Compression, Conversion and Transfer
  – Other database Resources (SQL-DB, XML-DB)
• Activity Programming
  – Support of Workflow Programming
  – Simple Distributed Processing
Site1:
1. The SPARQL query result is converted with the XSLTransform Activity
2. The converted data is sent to Site2 by the deliverTo Activity
Site2:
1. Data is received with the deliverFrom Activity
2. Triples are inserted into the temporary model by the ResourceManagement Activity
3. Another query is run against this temporary model

Demo(2): Distributed SPARQL Processing (to show the power of the OGSA-DAI Activity Framework)
• Parallel & synchronized data processing over 3 different sites can be programmed with this Framework

Currently Supported Platforms
• Jena and Sesame
  – RDF(S) and OWL for Jena
• Middleware -> depends on the underlying platforms
  – Operational Semantics
  – Functional Limitation
• Supported Implementation Type
  – On DBMS (RDB as default)
  – On Memory (Jena)
  – On File (Sesame)
Can be specified in the OGSA-DAI resource config file: Dai.data.resource.type=RDF

Function Summary
(For each function: Future WS-DAI-RDF/Ont, OGSA-DAI-RDF, Notes and Problems)
• Data
  – Future WS-DAI-RDF/Ont: RDF and RDFS
  – OGSA-DAI-RDF: RDF, RDFS and OWL
  – Notes and Problems: a single architecture to cover OWL
• Ontology Interface
  – Future WS-DAI-RDF/Ont: WS-DAI-Ont
  – OGSA-DAI-RDF: Schema Activity, Instance Activity
  – Notes and Problems: must follow current WS-DAI-RDF/Ont specs
• Query Language
  – Future WS-DAI-RDF/Ont: SPARQL
  – OGSA-DAI-RDF: SPARQL QueryStatement Activity
  – Notes and Problems: must have some compatibility with the W3C SPARQL protocol
• Result Set
  – Future WS-DAI-RDF/Ont: SPARQL XML Result Format
  – OGSA-DAI-RDF: SPARQL XML Result Format
• Updates
  – OGSA-DAI-RDF: supports basic Update Activity
• Graph Management
  – OGSA-DAI-RDF: Graph (Collection) Management Activity
• Bulk Operation
  – OGSA-DAI-RDF: BulkLoad Activity
• Factory Operations
  – Notes and Problems: must cope with OGSA-DAI resource configuration

Implemented Ontology Interface Activities
This is only a prototype based on the obsolete specs
• To find out how useful these kinds of interfaces are
• To get feedback for the future ontology interface discussion
• To explore the integrated architecture with Query Interface and Ontology Interface
• To be a candidate for the second GGF reference implementation

• SchemaActivity (basic ontology handling functions)
  – AddConcept, RemoveConcept
  – GetConcept, GetConcepts, HasConcept
  – AddSubclassOfRelation, RemoveSubclassOfRelation
  – GetRelatedConcepts (Parent/Child, Ancestor/Descendant, Sibling, ...)
• InstanceActivity (basic instance handling functions: Add/Remove, Get/Has, ... Individual)
  – AddIndividual, RemoveIndividual
  – GetIndividual, GetIndividuals, HasIndividual
  – AddInstanceOfRelation, RemoveInstanceOfRelation
  – GetIndividualParents
PerformDocument

Example: NCI Cancer Ontology
Import of the NCI cancer ontology using the Activity Framework
1. Create a Model on RDB using the RDF Collection (model) Management Activity
2. Get the data from the NCI site using the OGSA-DAI deliverFrom Activity
3. Uncompress the file using the OGSA-DAI gZip Activity
4. Load the data with our RDF BulkLoad Activity

Access Example of the Ontology
NCI Cancer Ontology: get the ancestor of the class “Chemotherapy”

<?xml version="1.0" encoding="UTF-8"?>
<!-- (c) International Business Machines Corporation, 2002 - 2005.-->
<!-- (c) University of Edinburgh, 2002 - 2005.-->
<!-- See OGSA-DAI-Licence.txt for licencing information.-->
<perform xmlns="http://ogsadai.org.uk/namespaces/2005/10/types">
  <ontologySchemaManagement name="myActivity">
    <getRelatedClasses fromData="rdb" className="http://www.mindswap.org/2003 cology.owl#Chemotherapy" lineageRelation="ancestor">
      <entry>cancerOntology</entry>
    </getRelatedClasses>
    <outputStream name="myActivityOutput"/>
  </ontologySchemaManagement>
</perform>

Execution Examples
1. GetClass
2. HasClass
3. GetRelatedClasses (ancestor)
4. areClassesRelated (Parent?)

Planned Applications
Grid Database Integration with Ontology
• Clinical/Bioinformatics Integration (with U-Tokyo)
• AIST Geogrid Project (based on GEON)

Feedback and Problems
Specifications
• Our system aims to cover OWL and RDF(S)
  – We think it is very convenient to cover them in a single framework.
• First GGF DAIS specs for RDF will target RDF(S)
  – WS-DAI-RDF/Ont architecture should have some extensibility/compatibility for future OWL
  – Need to synchronize R&D and the spec discussion
• Must have some compatibility with the W3C SPARQL Protocol
• Security and Authorization: no idea at present

Implementations
• Jena:
  – Simple wrapping of the Jena Ontology API
• Sesame:
  – Each interface message is converted to the query representation
    • On top of the current OGSA-DAI-RDF query module
  – Our current approach is based on SPARQL, not SeRQL
    • Aims to be yet another implementation of WS-DAI-RDF/Ont
    • Restricted by the Sesame SPARQL implementation
    • Should wait for future stable releases

Current Implementation & Roadmap
• Current Prototype OGSA-DAI-RDF Activities
  – Query Interface = full function with basic updates
  – Ontology Interface = basic functions of WS-DAIO
• Demo server environment is already set up
  – If you have OGSA-DAI 2.1 and download the configuration XML, you can access our demo site using SPARQL
  – Current config XSDs will be downloadable soon.
• June 2006: 1st Release (without ontology), Version 0.9
  – Ready to be released now.
  – Re-designing the interface: possible inconsistency with future releases.
  – Almost no English documentation
• Sep. 2006 or Nov. 2006: 2nd Release, Version 1.0 with Stable Spec (with ontology interface????)

Future Directions
• Distributed & Scalability
  – Distributed Query Processing?
  – Combine with P2P Query Processing? (presented yesterday by A. Matono)
  – We think single-site performance is already a matter for the DB vendors. Multi-site performance and optimization are our concern.
• Applications (a related topic will be presented today by S. Mirza)
  – Resource Discovery & Management
  – Service Management for the Semantic SOA
  – Database Integration (Geosciences, Clinical-bioinformatics)
• Collaboration
  – DAIS-RDF/Ont Spec Discussions / Ontogrid Project (F2F will be tomorrow)
  – Semantic Web Group within AIST
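
Illustration: converting an interface message to a query
The Implementations slide says that, for Sesame, each ontology interface message is converted to a query representation. The sketch below shows roughly what such a conversion could look like for a getRelatedClasses request. It is not the OGSA-DAI-RDF code. The function name, the lineageRelation values other than "ancestor", and the choice of rdfs:subClassOf as the lineage property are assumptions made for illustration. The "+" operator is a SPARQL 1.1 property path, which postdates this talk, so an implementation of that period would have needed another way to follow the hierarchy.

```python
# Hypothetical sketch: turn a getRelatedClasses-style request into SPARQL text.
# Nothing here is taken from the OGSA-DAI-RDF sources.

RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Assumed mapping from a lineageRelation value to a graph pattern.
# "+" (one or more steps) is a SPARQL 1.1 property path.
PATTERNS = {
    "parent": "<{cls}> <{p}> ?related .",
    "child": "?related <{p}> <{cls}> .",
    "ancestor": "<{cls}> <{p}>+ ?related .",
    "descendant": "?related <{p}>+ <{cls}> .",
}

# Characters that may not appear inside a SPARQL IRI reference (<...>).
FORBIDDEN_IN_IRI = set('<>"{}|^`\\')


def related_classes_query(class_uri: str, lineage_relation: str) -> str:
    """Build a SELECT query returning classes related to class_uri."""
    if lineage_relation not in PATTERNS:
        raise ValueError(f"unsupported lineageRelation: {lineage_relation!r}")
    # Reject anything that could break out of the <...> IRI reference.
    if not class_uri or any(
        ch in FORBIDDEN_IN_IRI or ord(ch) <= 0x20 for ch in class_uri
    ):
        raise ValueError(f"not a usable IRI: {class_uri!r}")
    body = PATTERNS[lineage_relation].format(cls=class_uri, p=RDFS_SUBCLASS_OF)
    return f"SELECT DISTINCT ?related WHERE {{ {body} }}"


if __name__ == "__main__":
    # example.org is a placeholder namespace, not the NCI ontology URI.
    print(related_classes_query("http://example.org/onto#Chemotherapy", "ancestor"))
```

The resulting string would then be handed to whatever SPARQL query module sits underneath, which is the layering the Sesame bullet describes.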