Introduction: - School of Information Technology

ITK 478 Advanced database
Position Paper
Relational.OWL Vs Ontology-based Framework for Integrating Databases into the Semantic Web
Submitted By
K.Venkat
ITK 478 Advanced database – Position Paper
Relational.OWL Vs Ontology-based Framework for Integrating Databases into the Semantic Web
1. Introduction:
The Semantic Web is a web technology that allows data to be shared among different data sources. It
integrates data from those sources, which gives applications more flexibility [6]. The Web Ontology
Language (OWL) and the Resource Description Framework (RDF) are the two technologies used for
representing Semantic Web data; both are recommended by the World Wide Web Consortium (W3C). OWL
is used for defining ontologies that describe the concepts shared by a domain or community, and RDF
is a standard for describing information so that data from different sources can be merged [3]. Most of the
data accessed from these sources is stored in relational databases, so there is a need to transform
relational data into semantic data. Directly mapping the results of queries executed on a relational data
source to semantic data is not reliable, because the relationships defined in a relational schema differ
from those expressed in semantic data.
Relational.OWL is a technique that represents a relational schema in an ontology language so that it can be
used by Semantic Web applications. The converted schema is queried through SPARQL, a query language
for RDF data that can be applied across different data sources. With this approach, different relational data
sources can be represented as ontologies automatically and queried using SPARQL. Section 2 briefly
discusses the use of SPARQL and Relational.OWL for representing relational databases.
The alternative to this approach is an ontology-based framework, in which Web-PDDL, a first-order ontology
language, is used for representing the mappings, structure and semantics of the relational data, and the
OntoGrate engine is used for querying [1]. The workings of OntoGrate are discussed in Section 3.
Based on the strengths and weaknesses of the two approaches, a position is taken with respect to
performance and capability.
2. Relational.Owl and SPARQL:
In this approach, the data from the relational database is first represented in an ontology language that is
understood by semantic applications. Relational.OWL is used for representing the relational schema in the
ontology language. It defines classes such as table, column and database, and the data from any data
source can be represented as instances of these classes. The relationships between the classes are derived
from the original database schema. Data represented in another ontology language or in RDF can be
mapped to this new schema using constructs such as owl:equivalentClass or owl:equivalentProperty. Using
Relational.OWL, the data from different data sources is thus transferred to a common ontology language.
The data can then be queried using languages such as SPARQL, RDQL and RQL [2]. Figure 1 shows an
example of a Relational.OWL ontology.
Figure 1: Relational.owl ontology and Schema representation [7] [2]
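The idea of describing a schema as instances of table, column and database classes can be illustrated with a minimal Python sketch. It is not the Relational.OWL implementation from [2]: it assumes an in-memory SQLite database, emits plain (subject, predicate, object) tuples instead of real RDF, and uses a shortened dbs: prefix. The names dbs:hasTable, dbs:hasColumn and dbs:isIdentifiedBy are chosen to resemble the Relational.OWL vocabulary, dbs:sqlType is hypothetical, and linking a table directly to its key column is a simplification.

```python
import sqlite3

# Assumed, shortened prefix for the schema vocabulary (illustration only).
DBS = "dbs:"


def schema_to_triples(conn, db_name):
    """Describe every table and column of a SQLite database as (s, p, o) tuples."""
    triples = [(db_name, "rdf:type", DBS + "Database")]
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        triples.append((table, "rdf:type", DBS + "Table"))
        triples.append((db_name, DBS + "hasTable", table))
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, col, col_type, _notnull, _default, pk in conn.execute(
            f"PRAGMA table_info({table})"
        ):
            col_id = f"{table}.{col}"
            triples.append((col_id, "rdf:type", DBS + "Column"))
            triples.append((table, DBS + "hasColumn", col_id))
            # dbs:sqlType is a hypothetical property used only in this sketch.
            triples.append((col_id, DBS + "sqlType", col_type))
            if pk:
                # Simplification: the table points directly at its key column.
                triples.append((table, DBS + "isIdentifiedBy", col_id))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COUNTRY (COUNTRYID INTEGER PRIMARY KEY, NAME TEXT)")
conn.execute("CREATE TABLE ADDRESS (ADDRESSID INTEGER PRIMARY KEY, COUNTRYID INTEGER)")

schema_triples = schema_to_triples(conn, "exampleDb")
for triple in schema_triples:
    print(triple)
```

Because the tuples are generated from the database catalogue rather than written by hand, the same function can be pointed at any other SQLite schema, which mirrors the claim that the representation can be produced automatically.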
SPARQL supports most of the basic operations that can be performed using SQL. The general syntax of a
SPARQL query is shown below [5].
SELECT ?title
WHERE
{
<http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title
}
The output of the above query is the title of book1, bound to the variable ?title.
The two URIs in the pattern refer to the book and its title property, which correspond to the book table and
the title column in the relational database, and ?title is the variable shown in the output. A join operation
in SPARQL is shown below [2]. The PREFIX keyword can be used to declare a namespace for the tables,
columns and relations.
PREFIX rdf: [...]
PREFIX db: [...]
CONSTRUCT {?a ?b ?c;
              ?e ?f}
WHERE {{?a ?b ?c;
           rdf:type db:COUNTRY} .
       {?d ?e ?f;
           rdf:type db:ADDRESS} .
       {?a db:COUNTRY.COUNTRYID ?x} .
       {?d db:ADDRESS.COUNTRYID ?x}}
With the use of SPARQL and Relational.OWL, relational data can be used in Semantic Web applications.
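How the shared variable ?x produces a join can be sketched in a few lines of Python. This is a toy matcher over an in-memory list of triples, not a SPARQL engine: it handles only plain triple patterns, and the sample data (country1, address1, address2 and their values) is invented for illustration. The predicate names reuse the db:COUNTRY and db:ADDRESS terms from the query above.

```python
def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on failure."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, triples):
    """Return every variable binding that satisfies all patterns together."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# Invented sample data in the style of the COUNTRY/ADDRESS example.
triples = [
    ("country1", "rdf:type", "db:COUNTRY"),
    ("country1", "db:COUNTRY.COUNTRYID", "49"),
    ("address1", "rdf:type", "db:ADDRESS"),
    ("address1", "db:ADDRESS.COUNTRYID", "49"),
    ("address2", "rdf:type", "db:ADDRESS"),
    ("address2", "db:ADDRESS.COUNTRYID", "1"),
]

joined = query(
    [
        ("?a", "rdf:type", "db:COUNTRY"),
        ("?d", "rdf:type", "db:ADDRESS"),
        ("?a", "db:COUNTRY.COUNTRYID", "?x"),
        ("?d", "db:ADDRESS.COUNTRYID", "?x"),
    ],
    triples,
)
print(joined)
```

Only address1 survives, because it is the only address whose COUNTRYID can be bound to the same ?x as country1; no explicit join keyword is involved.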
3. Ontology-based framework:
In this approach the data is represented as an ontology in Web-PDDL and passed to OntoGrate for
querying and translation. The data from the relational tables is converted to the ontology language using
Web-PDDL; inheritance, namespaces, types, predicates, axioms, functions and facts are used to represent
the data schema in the ontology. For example, inheritance is used to capture aggregation, relations and
data types in the relational schema, and axioms capture the relationships between the tables. The resulting
ontology is given to the OntoGrate engine, which integrates data represented in Web-PDDL.
The data is converted into the ontology language, relationships are defined between the two data sources,
and then the required data is queried. The Web-PDDL representation is first converted to OWL using
PDDOWL and is then used by the Semantic Web application. The same process is used for an already
existing ontology, except that the translation of the data into the ontology language is skipped; a mapping
technique is still used for relating the data. The Semantic Web application can query the database using the
OWL-QL query language. Such a query is converted to PDDSQL using PDDOWL, and the PDDSQL is in
turn converted by the PDDSQL translators into SQL queries that can be run against the database. The
OntoGrate architecture consists of the following blocks [1].
Integration of schemas and ontology: this module converts the schema into the ontology, based on the
standards for translation. Complex database conversion is done both theoretically and automatically.
Matching generation: this module helps in matching corresponding data across the given sources, using the
ontology information of the data schema. Matching is made more precise with the help of the learning
mapping from the knowledge module and the mining large data sets module, which find candidate
mappings; these modules are repeated if necessary.
Learning mapping from the knowledge module: this module helps the user define specific associations or
cases of the data, so that the system can understand the relation.
Mining large data sets: this module helps the user associate the data through association rule mining,
where the similarities between the databases are taken into consideration for defining the
relationship.
User interface: this module lets the user give input to all the above modules.
Inference engine: this module applies all the mapping rules defined above when querying the data.
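The general idea of answering an ontology-level query by rewriting it into SQL through mapping rules can be sketched as follows. This is a deliberately simplified stand-in and does not reproduce OWL-QL, PDDSQL, PDDOWL or the OntoGrate engine: the onto: property names, the MAPPING table and the to_sql helper are all hypothetical, and the sketch only handles a single-table lookup against an in-memory SQLite database.

```python
import sqlite3

# Hypothetical mapping rules: ontology property -> (table, column).
# In the framework described above such rules would come from the mapping
# modules; here they are written by hand for illustration.
MAPPING = {
    "onto:countryName": ("COUNTRY", "NAME"),
    "onto:countryId": ("COUNTRY", "COUNTRYID"),
}


def to_sql(select_prop, where_prop):
    """Rewrite 'select_prop of things whose where_prop equals a value' as SQL."""
    sel_table, sel_col = MAPPING[select_prop]
    where_table, where_col = MAPPING[where_prop]
    if sel_table != where_table:
        raise ValueError("this sketch only handles single-table queries")
    # Identifiers come from the fixed MAPPING above; the value is a bound parameter.
    return f"SELECT {sel_col} FROM {sel_table} WHERE {where_col} = ?"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COUNTRY (COUNTRYID INTEGER PRIMARY KEY, NAME TEXT)")
conn.executemany("INSERT INTO COUNTRY VALUES (?, ?)", [(1, "USA"), (49, "Germany")])

sql = to_sql("onto:countryName", "onto:countryId")
rows = conn.execute(sql, (49,)).fetchall()
print(sql)
print(rows)
```

The application only ever names ontology properties; the mapping decides which table and column are touched, so a change in the database schema is absorbed by editing the mapping rather than the query.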
4. Comparing Relational.OWL & SPARQL and the Ontology-based framework:
Integration: The first approach, Relational.OWL, simply transfers a particular database source into an
ontology language using the classes defined by Relational.OWL; it does not concentrate on mapping the
relations between two data sources. The ontology-based framework also takes care of mapping the data
sources, with the help of a domain expert.
New query language: The ontology-based framework uses SQL for querying the database. It converts
OWL-QL, an ontology-based query language, into SQL queries using the PDDOWL and PDDSQL translators.
SPARQL is still in its early stages, and plenty of research on it is still going on.
The SPARQL query language used in the first approach is relatively new and does not fully support all the
functions of SQL, such as grouping and subqueries [2].
SPARQL is semi-structured and does not require explicit joins, because related data can be connected
through the relationships represented in the RDF data. It also does not support hierarchical queries directly [8].
SPARQL works on RDF (Resource Description Framework) data and does not need a target source for
mapping the data. OntoGrate needs a target data source to map to, and this mapping can change when either
database is updated, so the mapping has to be updated manually every time an update is made [2].
OntoGrate is an integrated system with a user interface through which the user enters the mapping information.
The Relational.OWL classes defined for one data source may not fit all data sources; a different set of
classes may have to be defined for each data source, which decreases performance.
Complex subclasses in a data source make the Relational.OWL representation complex and hard to define.
Integrating XML documents into the Semantic Web is not supported by OntoGrate.
Relational.OWL and SPARQL do not need any data transfer, as the data can be queried directly using SPARQL.
OntoGrate does need a transfer, but it takes around 3 seconds for 1000 records of data [1], which
is relatively fast.
Position:
An approach for transferring relational data into a schema that can be used and queried by the Semantic Web
needs to be easy to use, perform well, and support a wide variety of data sources.
The ontology-based framework, the OntoGrate engine, supports a wide variety of data sources. It can transfer
100,000 data records in one minute and supports integration of data sources well when compared to the
other approach [1]. The OntoGrate system has a good user interface and is an integrated system with
built-in transfer support. It can take data input in any form except XML. More importantly, it supports the
normal SQL query language, which is traditional and the most widely used. The SPARQL query language is
relatively new; although it has some advantages, it is better to go for OntoGrate, which uses SQL and is an
integrated framework that has been tested experimentally. Even though a target data source and manual
mapping updates are needed, the OntoGrate engine is the better choice. OntoGrate should therefore be
preferred over the Relational.OWL and SPARQL approach for getting data from the relational database and
performing operations between the two data sources.
References:
1. Dejing Dou, Paea LePendu, Shiwoong Kim and Peishen Qi, “Integrating Databases into the
Semantic Web through an Ontology-based Framework”, Proceedings of the 22nd International
Conference on Data Engineering Workshops (ICDEW'06), p. 54, 2006.
2. Cristian Pérez de Laborda, Stefan Conrad, “Bringing Relational Data into the Semantic Web
using SPARQL and Relational.OWL”, Proceedings of the 22nd International Conference on Data
Engineering Workshops (ICDEW'06), p. 55, 2006.
3. Janet Daly, “World Wide Web Consortium Issues RDF and OWL Recommendations”,
http://www.w3.org/2004/01/sws-pressrelease
4. Eric Prud'hommeaux, Andy Seaborne, “SPARQL Query Language for RDF”,
http://www.w3.org/TR/rdf-sparql-query/
5. Eric Prud'hommeaux, Andy Seaborne, “SPARQL Query Language for RDF”, W3C Working
Draft 21 July 2005, http://www.w3.org/TR/2005/WD-rdf-sparql-query-20050721/#QueryForms
6. “Semantic Web”, http://www.w3.org/2001/sw/, Date accessed: 2007-09-24.
7. Csongor Nyulas, Martin O’Connor and Samson Tu, “Data Master – a Plug-in for Importing
Schemas and Data from Relational Databases into Protege”,
http://protege.stanford.edu/conference/2007/presentations/10.01_Nyulas.pdf, Date accessed: 2007-09-24.
8. Lee Feigenbaum, “SPARQL FAQ”, http://thefigtrees.net/lee/sw/sparql-faq, Date accessed: 2007-09-24.