OGSA-DAI-RDF & Its Ontology Interfaces Isao Kojima and Masahiro Kimoto

RDF, Ontologies and Meta-Data Workshop at UK e-Science Centre
8th June, 2006
Data Grid Team,
Grid Technology Research Center
http://www.dbgrid.org/
http://www.gtrc.aist.go.jp/
AIST, Japan
Summary
Implementation of OGSA-DAI RDF Prototype
Web (Grid) Service based access to RDF databases
– Adds an RDF database type to the OGSA-DAI database middleware
– RDF handling Activities, including SPARQL query processing
• Supports Jena and Sesame
• Supports RDFS/OWL
– Ontology Interfaces implementation
• Based on Ontogrid’s initial (now old) WS-DAIO proposal specs
– Schema & Instance Handling Interfaces
– Supports Jena
– Supports RDFS/OWL
– Current Status and Future Directions
Approach
• OGSA-based: Web Service Interfaces
– Use of Globus Toolkit/WS-RF
– Can be combined with other Grid tools (GRAM, MDS, ...)
• Query Language (SPARQL)
– W3C Standard Query Language for RDF
– Multi-Platform approach
• To support Jena, Sesame, ... (Oracle?)
• OGSA-DAI:
– Activity Framework
• To combine RDF databases with other data resources
– XML DBs, Relational DBs (and Files)
• To achieve RDF database workflow programming
– Data Conversion, Data Transfer (Web Services)
OGSA-DAI
• OGSA-DAI: http://www.ogsadai.org.uk/
– UK e-Science R&D project for grid DB middleware
• Service-based Database Query and Integration
– RDB: SQL (Oracle, PostgreSQL, MySQL, DB2, ...)
– XML-DB: XPath and XQuery (Xindice, eXist, ...)
– Integration tool OGSA-DQP for RDBs
• Supports Grid Middleware
– WS-RF(Globus Toolkit 4)
– WS-I(OMII)
OGSA-DAI-RDF
OGSA-DAI Activities: Set of Primitives for Data Processing
Relational, XML, File Transfer, Data Conversion/Compression (some not available in some versions)
OGSA-DAI-RDF: Set of RDF Processing Activities (RDF Activities)
Implemented Activities
(Platform Independent = all Activities support Jena and Sesame)
1. SPARQL QueryStatement Activity
– W3C Standard SPARQL for Jena/Sesame
2. RDF Resource (triple) Management Activity
– Create/Delete RDF triples, statements
3. RDF Collection (model) Management Activity
– Create/Delete/List RDF databases (model, repository)
4. RDF BulkLoad Activity
– Bulk loading functions of RDF triples
– To achieve a common Interface Pattern with Relational and XML DBs
Activities 1-4: Query (SPARQL) & Instance Level (without ontology) Approach
5. Ontology Interfaces
(based on Ontogrid’s initial (now old) WS-DAIO specs)
- Schema level Activity
- Instance level Activity
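The division of labour among the first four Activities can be illustrated with a toy in-memory store. This is only a sketch: the class `TripleStore`, its method names, and the `ex:` sample data are hypothetical and are not part of OGSA-DAI-RDF, and a simple triple-pattern match stands in for SPARQL.

```python
class TripleStore:
    """Toy stand-in for an RDF database holding several named models."""

    def __init__(self):
        self.models = {}  # model name -> set of (s, p, o) triples

    # Collection (model) management: create / delete / list
    def create_model(self, name):
        self.models.setdefault(name, set())

    def delete_model(self, name):
        self.models.pop(name, None)

    def list_models(self):
        return sorted(self.models)

    # Resource (triple) management: create / delete single statements
    def add_triple(self, model, s, p, o):
        self.models[model].add((s, p, o))

    def remove_triple(self, model, s, p, o):
        self.models[model].discard((s, p, o))

    # Bulk load: insert many triples in one call, return the model size
    def bulk_load(self, model, triples):
        self.models[model].update(triples)
        return len(self.models[model])

    # Query: triple-pattern match standing in for SPARQL (None = wildcard)
    def match(self, model, s=None, p=None, o=None):
        return sorted(
            t for t in self.models[model]
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


store = TripleStore()
store.create_model("demo")
store.bulk_load("demo", [
    ("ex:a", "rdf:type", "ex:Drug"),
    ("ex:b", "rdf:type", "ex:Drug"),
])
store.add_triple("demo", "ex:a", "ex:treats", "ex:c")
drugs = store.match("demo", p="rdf:type", o="ex:Drug")
```

Keeping model management, triple management, bulk load and query as separate operations mirrors the common interface pattern the slide mentions for relational and XML databases.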
OGSA-DAI-RDF demo(1): Multi-Platform Environment
SPARQL GUI Query Interface for Jena/Sesame
– Modified OGSA-DAI DataBrowser
[Figure: a SPARQL statement is sent to the OGSA-DAI Data Service (WS-RF and WS-I), which runs over Sesame and Jena; the result is returned in SPARQL/XML format.]
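A client receiving a result in the W3C SPARQL Query Results XML Format could read it along the following lines. This is a minimal sketch: the `SAMPLE` document and the `parse_results` helper are illustrative assumptions, not output captured from the demo service.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format
SPARQL_NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hypothetical result document used only for illustration
SAMPLE = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="s"/><variable name="label"/></head>
  <results>
    <result>
      <binding name="s"><uri>http://example.org/a</uri></binding>
      <binding name="label"><literal>Alpha</literal></binding>
    </result>
    <result>
      <binding name="s"><uri>http://example.org/b</uri></binding>
    </result>
  </results>
</sparql>"""


def parse_results(xml_text):
    """Return (variable names, list of row dicts); unbound variables are None."""
    root = ET.fromstring(xml_text)
    variables = [v.get("name")
                 for v in root.findall("sr:head/sr:variable", SPARQL_NS)]
    rows = []
    for result in root.findall("sr:results/sr:result", SPARQL_NS):
        row = dict.fromkeys(variables)
        for binding in result.findall("sr:binding", SPARQL_NS):
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name")] = value.text
        rows.append(row)
    return variables, rows


variables, rows = parse_results(SAMPLE)
```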
OGSA-DAI Activity Framework
• RDF Activities can be combined with other OGSA-DAI Activities
– Data Compression,Conversion and Transfer
– Other database Resources (SQL-DB, XML-DB)
• Activity Programming
– Support of Workflow Programming
– Simple Distributed Processing
Site1:
1. The SPARQL query result is converted with the XSLTransform Activity
2. The converted data is sent to Site2 by the deliverTo Activity
Site2:
1. Data is received with the deliverFrom Activity
2. Triples are inserted into the temporary model by the ResourceManagement Activity
3. Another query is run against this temporary model
Demo(2): Distributed SPARQL Processing (to show the power of the OGSA-DAI Activity Framework)
• Parallel & synchronized data processing over 3 different sites can be programmed with this Framework
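The two-site flow described above (query, convert, deliver, receive, insert into a temporary model, query again) can be mimicked in a single process. This is a sketch under stated assumptions: the functions `query` and `transform`, the `Queue` standing in for the network, and the `ex:` data are all hypothetical and do not use the OGSA-DAI API.

```python
from queue import Queue

# Hypothetical data held at Site1
site1_model = {
    ("ex:aspirin", "ex:treats", "ex:headache"),
    ("ex:aspirin", "rdf:type", "ex:Drug"),
    ("ex:ibuprofen", "ex:treats", "ex:fever"),
}


def query(model, predicate):
    """Stand-in for a SPARQL query: all (s, o) pairs for one predicate."""
    return sorted((s, o) for s, p, o in model if p == predicate)


def transform(rows, new_predicate):
    """Stand-in for the transform step: result rows become triples."""
    return [(s, new_predicate, o) for s, o in rows]


channel = Queue()  # stand-in for delivery between the two sites

# Site1: query, convert the result, deliver it
rows = query(site1_model, "ex:treats")
channel.put(transform(rows, "ex:indicatedFor"))

# Site2: receive, insert into a temporary model, query again
received = channel.get()
temporary_model = set()
temporary_model.update(received)
answer = query(temporary_model, "ex:indicatedFor")
```

Each step consumes only the output of the previous one, which is what lets such steps be chained as Activities within one request.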
Currently Supported Platforms
• Jena and Sesame
– RDF(S) and OWL for Jena
• Middleware: depends on the underlying platforms
– Operational Semantics
– Functional Limitation
• Supported Implementation Type
– On DBMS(RDB as default)
– On Memory (Jena)
– On File (Sesame)
Can be specified in the OGSA-DAI resource config file
Dai.data.resource.type=RDF
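Function Summary
Columns: Functions / Future WS-DAI-RDF/Ont / OGSA-DAI-RDF / Notes and Problems
• Data
– Future WS-DAI-RDF/Ont: RDF and RDFS
– OGSA-DAI-RDF: RDF, RDFS and OWL
– Notes and Problems: a single architecture to cover OWL
• Ontology Interface
– Future WS-DAI-RDF/Ont: WS-DAI-Ont
– OGSA-DAI-RDF: Schema Activity, Instance Activity
– Notes and Problems: must follow the current WS-DAI-RDF/Ont specs
• Query Language
– Future WS-DAI-RDF/Ont: SPARQL
– OGSA-DAI-RDF: SPARQL QueryStatement Activity
– Notes and Problems: must have some compatibility with the W3C SPARQL protocol
• Result Set
– Future WS-DAI-RDF/Ont: SPARQL XML Result Format
– OGSA-DAI-RDF: SPARQL XML Result Format
• Updates
– OGSA-DAI-RDF: supports basic Update Activity
• Graph Management
– OGSA-DAI-RDF: Graph (Collection) Management Activity
• Bulk Operation
– OGSA-DAI-RDF: BulkLoad Activity
• Factory Operations
– Notes and Problems: must cope with OGSA-DAI resource configuration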
Implemented Ontology Interface Activities
This is only a prototype based on the obsolete specs
• To find out how useful this kind of interface is
• To get feedback for the future ontology interface discussion
• To explore an integrated architecture combining the Query Interface and the Ontology Interface
• To be a candidate for the second GGF reference implementation
• SchemaActivity
– AddConcept, RemoveConcept, GetConcept, GetConcepts, HasConcept (Basic Ontology Handling Functions)
– AddSubclassOfRelation, RemoveSubclassOfRelation
– GetRelatedConcepts (Parent/Child, Ancestor/Descendant, Sibling, ...)
• InstanceActivity
– AddIndividual, RemoveIndividual, GetIndividual, GetIndividuals, HasIndividual (Basic Instance Handling Functions: Add/Remove, Get/Has, ... Individual)
– AddInstanceOfRelation, RemoveInstanceOfRelation
– GetIndividualParents
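The behaviour of a few of these schema and instance operations can be sketched with a small in-memory class. The Python method names only echo the operation names listed above; the class `Ontology`, its data structures, and the sample hierarchy are hypothetical, and the prototype itself is described elsewhere in these slides as wrapping the Jena Ontology API rather than working this way.

```python
class Ontology:
    """Toy model of schema-level and instance-level operations."""

    def __init__(self):
        self.parents = {}      # concept -> set of direct superclasses
        self.instance_of = {}  # individual -> set of concepts

    # Schema level
    def add_concept(self, concept):
        self.parents.setdefault(concept, set())

    def has_concept(self, concept):
        return concept in self.parents

    def add_subclass_of_relation(self, child, parent):
        self.add_concept(child)
        self.add_concept(parent)
        self.parents[child].add(parent)

    def get_related_concepts(self, concept, relation):
        if relation == "parent":
            return sorted(self.parents[concept])
        if relation == "child":
            return sorted(c for c, ps in self.parents.items() if concept in ps)
        if relation == "ancestor":
            # Walk the superclass links transitively
            seen, stack = set(), list(self.parents[concept])
            while stack:
                current = stack.pop()
                if current not in seen:
                    seen.add(current)
                    stack.extend(self.parents[current])
            return sorted(seen)
        raise ValueError(f"unsupported relation: {relation}")

    # Instance level
    def add_instance_of_relation(self, individual, concept):
        self.add_concept(concept)
        self.instance_of.setdefault(individual, set()).add(concept)

    def get_individual_parents(self, individual):
        return sorted(self.instance_of.get(individual, set()))


onto = Ontology()
onto.add_subclass_of_relation("Chemotherapy", "Therapy")   # hypothetical hierarchy
onto.add_subclass_of_relation("Therapy", "Activity")
onto.add_instance_of_relation("treatment42", "Chemotherapy")
ancestors = onto.get_related_concepts("Chemotherapy", "ancestor")
```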
PerformDocument Example: NCI Cancer Ontology
Import of NCI cancer ontology using the Activity Framework
1. Create a Model on RDB using RDF Collection(model)Management Activity
2. Get the data from the NCI site using the OGSA-DAI deliverFromActivity
3. Uncompress the file using the OGSA-DAI gZip Activity
4. Load the data with our RDF BulkLoad Activity
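The four import steps above can be imitated locally as follows. This is a sketch only: `fetch` fakes the download with in-memory gzip data, the tiny N-Triples parser is used purely to keep the example short (the slides do not say which serialization the real import reads), and every name and URI under `example.org` is hypothetical.

```python
import gzip

RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def fetch():
    """Stand-in for step 2: returns gzip-compressed N-Triples bytes."""
    data = (
        f"<http://example.org/Chemotherapy> <{RDFS}subClassOf> "
        "<http://example.org/Therapy> .\n"
        f"<http://example.org/Chemotherapy> <{RDFS}label> "
        '"Chemotherapy" .\n'
    )
    return gzip.compress(data.encode("utf-8"))


def parse_ntriples(text):
    """Very small parser for simple one-line N-Triples statements."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Drop the terminating dot, then split into subject, predicate, object
        s, p, o = line.rstrip(".").rstrip().split(None, 2)
        triples.append((s, p, o))
    return triples


models = {}
models["cancerOntology"] = set()                        # 1. create a model
compressed = fetch()                                    # 2. get the data
text = gzip.decompress(compressed).decode("utf-8")      # 3. uncompress
models["cancerOntology"].update(parse_ntriples(text))   # 4. bulk load
```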
Access Example of the Ontology
<?xml version="1.0" encoding="UTF-8"?>
<!-- (c) International Business Machines Corporation, 2002 - 2005.-->
<!-- (c) University of Edinburgh, 2002 - 2005.-->
<!-- See OGSA-DAI-Licence.txt for licencing information.-->
<perform
xmlns="http://ogsadai.org.uk/namespaces/2005/10/types">
<ontologySchemaManagement name="myActivity">
<getRelatedClasses fromData="rdb" className="http://www.mindswap.org/2003
cology.owl#Chemotherapy" lineageRelation="ancestor">
<entry>cancerOntology</entry>
</getRelatedClasses>
<outputStream name="myActivityOutput"/>
</ontologySchemaManagement>
</perform>
Callouts on the slide:
– NCI Cancer Ontology (the cancerOntology entry)
– Get the ancestor of the class “Chemotherapy” (the getRelatedClasses element)
Execution Examples
1. GetClass
2. HasClass
3. GetRelatedClasses (ancestor)
4. areClassesRelated (Parent?)
Planned Applications
Grid Database Integration with Ontology
Clinical/Bioinformatics Integration
(with U-Tokyo)
AIST Geogrid Project (based on GEON)
Feedback and Problems
Specifications
• Our system aims to cover OWL and RDF(S)
– We think it is very convenient to cover them in a single framework.
• First GGF DAIS specs for RDF will target RDF(S)
– The WS-DAI-RDF/Ont architecture should have some extensibility/compatibility for future OWL
– Need to synchronize R&D and the spec discussion
• Must have some compatibility with the W3C SPARQL Protocol
• Security and Authorization: no approach decided yet
Implementations
• Jena:
– Simple wrapping of the Jena Ontology API
• Sesame:
– Each interface message is converted to the query representation
• On top of the current OGSA-DAI-RDF query module
– Our current approach is based on SPARQL, not SeRQL
• Aims to be yet another implementation of WS-DAI-RDF/Ont
• Restricted by the Sesame SPARQL implementation
• Should wait for future stable releases
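How an interface message might be turned into a query representation can be sketched as a simple string translation. The operation names come from the execution examples in these slides, but the function `to_sparql` and the choice of graph patterns are assumptions made for illustration, not the prototype's actual mapping. An "ancestor" request is deliberately left out: without inference or repeated queries it cannot be written as one basic triple pattern like those below.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"


def to_sparql(operation, class_uri, relation=None):
    """Translate a (hypothetical) interface message into a SPARQL string."""
    if operation == "hasClass":
        return f"ASK {{ <{class_uri}> <{RDF}type> <{OWL}Class> }}"
    if operation == "getRelatedClasses" and relation == "parent":
        return f"SELECT ?c WHERE {{ <{class_uri}> <{RDFS}subClassOf> ?c }}"
    if operation == "getRelatedClasses" and relation == "child":
        return f"SELECT ?c WHERE {{ ?c <{RDFS}subClassOf> <{class_uri}> }}"
    raise ValueError(f"no translation for {operation} / {relation}")


parent_query = to_sparql(
    "getRelatedClasses", "http://example.org/onto#Chemotherapy", "parent")
```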
Current Implementation & Roadmap
• Current Prototype OGSA-DAI-RDF Activities
Query Interface=full function with basic updates
Ontology interface= basic functions of WS-DAIO
• Demo Server environment is already set-up
– If you have OGSA-DAI 2.1 and download the configuration XML, you can access our demo site using SPARQL
– The current config XSDs will be downloadable soon.
• June 2006: 1st Release (without ontology), Version 0.9
Ready to be released now.
The interface is being redesigned: possible inconsistency with future releases.
Almost no English documentation
• Sep. 2006 or Nov. 2006
2nd Release, Version 1.0, with stable spec (with ontology interface?)
Future Directions
• Distributed & Scalability
– Distributed Query Processing?
– Combine with P2P Query Processing?
(presented yesterday by A.Matono)
We think single-site performance is already a matter for the DB vendors.
Multi-site performance and optimization are our concern.
• Applications
(related topic will be presented today by S.Mirza)
– Resource Discovery & Management
– Service Management for the Semantic SOA.
– Database Integration (Geosciences, Clinical-bioinformatics)
• Collaboration
– DAIS-RDF/Ont Spec Discussions / Ontogrid Project
(F2F will be tomorrow)
– Semantic Web Group within AIST