ITK 478 Advanced Database – Position Paper
Relational.OWL vs. Ontology-based Framework for Integrating Databases into the Semantic Web
Submitted by K.Venkat

1. Introduction:

The Semantic Web is a web technology that allows data to be shared among different data sources. It integrates data from those sources, which gives applications more flexibility [6]. The Web Ontology Language (OWL) and the Resource Description Framework (RDF) are the two technologies used for representing Semantic Web data; both are recommended by the World Wide Web Consortium (W3C). OWL is used for storing information that combines domains or communities, and RDF is a standard for storing information that merges different data sources [3].

Most of the data accessed from these sources is stored in relational databases, so relational data needs to be transformed into semantic data. Directly mapping query results from a relational source onto semantic data is not reliable, because the relationships defined in a relational schema differ from those expressed in semantic data.

Relational.OWL is a technique that represents the relational schema in an ontology language that the Semantic Web can use. The converted schema is queried with SPARQL, a query language that can be used across different data sources. With this approach, different relational data sources can be represented as ontologies automatically and queried using SPARQL. Section 2 briefly discusses using SPARQL and Relational.OWL to represent relational databases.

The alternative to the above approach is an ontology-based framework, in which Web-PDDL, a first-order ontology language, represents the mapping, structure and semantics of the relational data, and the OntoGrate engine is used for querying [1].
The workings of OntoGrate are discussed in more detail in Section 3. After weighing the strengths and weaknesses of both approaches, a position is taken based on performance and capability.

2. Relational.OWL and SPARQL:

In this approach, the data from the relational database is first represented in an ontology language that semantic applications understand. Relational.OWL is used for representing the relational schema in an ontology language. It defines classes such as table, column and database, and the data from any data source can be represented as instances of these classes. The relationships between the classes are derived from the original database schema. Data represented in another ontology language or in RDF can be mapped to this new schema using constructs such as owl:equivalentClass or owl:equivalentProperty. Using Relational.OWL, the data from different data sources is thus transferred to a common ontology language. The data can then be queried using languages such as SPARQL, RDQL and RQL [2]. Figure 1 shows an example of a Relational.OWL ontology.

Figure 1: Relational.OWL ontology and schema representation [7] [2]

SPARQL performs most of the basic operations that can be performed using SQL. The general syntax of a SPARQL query is shown below [5].

    SELECT ?title
    WHERE { <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title }

The output of the above query is the value bound to ?title. The IRIs in the pattern refer to the book and its title, which correspond to the book table and the title column in the relational database, and ?title is the variable shown in the output.

A join operation in SPARQL is shown below [2]. The PREFIX keyword assigns a namespace prefix that can be used for the tables, columns and relations.

    PREFIX rdf: [...]
    PREFIX db: [...]
    CONSTRUCT {?a ?b ?c; ?e ?f}
    WHERE {{?a ?b ?c; rdf:type db:COUNTRY} .
           {?d ?e ?f; rdf:type db:ADDRESS} .
           {?a db:COUNTRY.COUNTRYID ?x} .
           {?d db:ADDRESS.COUNTRYID ?x}}

With SPARQL and Relational.OWL, relational data can be used in Semantic Web applications.

3. Ontology-based framework:

In this approach the data is represented as an ontology in Web-PDDL and handed to OntoGrate for querying and translation. The data from the relational tables is converted to an ontology language using Web-PDDL, in which inheritance, namespaces, types, predicates, axioms, functions and facts are used to represent the data schema. For example, inheritance captures aggregation, relations and data types in the relational schema, and axioms capture the relationships between the tables.

The resulting ontology is given to the OntoGrate engine, which integrates data represented in Web-PDDL: the data is converted into the ontology language, the relationship between the two data sources is defined, and then the required data is queried. The Web-PDDL is first converted to OWL using PDDOWL and then used by the Semantic Web application. The same process applies to an already existing ontology, except that the translation of data to the ontology language is skipped; a mapping technique is still used for relating the data. The Semantic Web application can query the database using the OWL-QL query language. This query is converted to PDDSQL using PDDOWL, and then into SQL queries that run against the database with the help of the PDDSQL translators.

The OntoGrate architecture consists of the following blocks [1].

Integration of schemas and ontology: this module converts the schema into the ontology, based on the standards for translation. Complex databases are converted both theoretically and automatically.

Matching generation: this module helps match corresponding data across sources using the ontology information of the data schema.
Matching is made more precise with the help of the "learning mapping from the knowledge" and "mining large data sets" modules, which find candidate mappings; these modules are repeated if necessary.

Learning mapping from the knowledge module: helps the user describe a specific association or case in the data so that the system can understand the relation.

Mining large data sets: this module helps the user associate the data through association rule mining, where the similarities between the databases are taken into consideration for defining the relationship.

User interface: lets the user provide input for all the above modules.

Inference engine: this module applies all the mapping rules defined above when querying the data.

4. Comparing Relational.OWL & SPARQL and the ontology-based framework:

Integration: The first approach, Relational.OWL, simply transfers a particular database source into an ontology language using the defined Relational.OWL classes; it does not concentrate on mapping the relations between two data sources. The ontology-based framework also takes care of mapping the data sources, with the help of a domain expert.

New query language: The ontology-based framework uses SQL for querying the database. It converts queries written in OWL-QL, an ontology-based query language, into SQL using the PDDOWL and PDDSQL translators. SPARQL, the query language used in the first approach, is relatively new and still the subject of ongoing research; it does not fully support SQL features such as grouping and subqueries [2]. SPARQL is designed for semi-structured data and does not require explicit joins, because related data can be reached through the relationships represented in RDF. It also does not support hierarchical queries directly [8]. SPARQL works on RDF (Resource Description Framework) data.
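The point that SPARQL expresses a join through shared variables rather than an explicit join clause can be illustrated with a small sketch. The Python code below is not SPARQL and not part of Relational.OWL; it is a minimal pattern matcher over hand-written triples, assuming a simplified triple store held in a list. The COUNTRY and ADDRESS tables and the COUNTRYID column come from the join query in Section 2, while the NAME and CITY columns, the row identifiers and all values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical rows of the COUNTRY and ADDRESS tables, flattened to triples.
TRIPLES: List[Triple] = [
    ("country1", "rdf:type", "db:COUNTRY"),
    ("country1", "db:COUNTRY.COUNTRYID", "49"),
    ("country1", "db:COUNTRY.NAME", "Germany"),
    ("address1", "rdf:type", "db:ADDRESS"),
    ("address1", "db:ADDRESS.COUNTRYID", "49"),
    ("address1", "db:ADDRESS.CITY", "Duesseldorf"),
    ("address2", "rdf:type", "db:ADDRESS"),
    ("address2", "db:ADDRESS.COUNTRYID", "1"),
    ("address2", "db:ADDRESS.CITY", "Normal"),
]


def unify(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def match(patterns: List[Triple], triples: List[Triple]) -> List[Binding]:
    """Return every variable binding that satisfies all patterns together."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        extended_bindings: List[Binding] = []
        for binding in bindings:
            for triple in triples:
                extended = unify(pattern, triple, binding)
                if extended is not None:
                    extended_bindings.append(extended)
        bindings = extended_bindings
    return bindings


# The shared variable ?id plays the role of the SQL join condition.
PATTERNS: List[Triple] = [
    ("?country", "db:COUNTRY.COUNTRYID", "?id"),
    ("?address", "db:ADDRESS.COUNTRYID", "?id"),
    ("?country", "db:COUNTRY.NAME", "?name"),
    ("?address", "db:ADDRESS.CITY", "?city"),
]

rows = [(b["?name"], b["?city"]) for b in match(PATTERNS, TRIPLES)]
print(rows)
```

Only the address whose COUNTRYID equals that of a country survives the matching, which is the effect the ?x variable has in the CONSTRUCT query of Section 2. A real SPARQL engine evaluates patterns far more efficiently; the nested loops here are only meant to make the binding behaviour visible.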
The first approach does not need a target data source to map the data. With OntoGrate, a target data source is needed for the mapping, and the mapping can change whenever either database is updated, so it has to be updated manually after every change [2]. OntoGrate is an integrated system with a user interface through which the user enters the mapping information.

The Relational.OWL representation created for a data source, i.e. the set of classes defined, may not be the same for all data sources; a different set of classes may have to be defined for each source, which decreases performance. Complex subclasses in a data source make the Relational.OWL representation complex and hard to define. OntoGrate does not support integrating XML documents into the Semantic Web. Relational.OWL and SPARQL do not need any data transfer, since the data can be queried directly using SPARQL. OntoGrate does need a transfer, but it takes around 3 seconds for 1000 records of data [1], which is relatively fast.

Position: Transferring data from a relational database into a schema that can be used and queried by the Semantic Web needs to be easy to use, perform well, and support a wide variety of data sources. The ontology-based framework with the OntoGrate engine supports a wide variety of data sources. It can transfer 100,000 data records in one minute and supports integration of data sources well compared to the other approach [1]. The OntoGrate system has a good user interface and is an integrated system with built-in transfer support. It accepts input data in any form except XML. More importantly, it supports standard SQL, which is traditional and the most widely used query language. SPARQL is relatively new; although it has some advantages, OntoGrate is the better choice because it uses SQL and is an integrated framework that has been evaluated experimentally. The need for a target data source and for manual mapping updates does not outweigh these advantages.
OntoGrate can therefore be used ahead of the Relational.OWL and SPARQL approach for getting data from relational databases and performing operations between two data sources.

References:

1. Dejing Dou, Paea LePendu, Shiwoong Kim and Peishen Qi, “Integrating Databases into the Semantic Web through an Ontology-based Framework”, Proceedings of the 22nd International Conference on Data Engineering Workshops (ICDEW'06), p. 54, 2006.
2. Cristian Pérez de Laborda, Stefan Conrad, “Bringing Relational Data into the Semantic Web using SPARQL and Relational.OWL”, Proceedings of the 22nd International Conference on Data Engineering Workshops (ICDEW'06), p. 55, 2006.
3. Janet Daly, “World Wide Web Consortium Issues RDF and OWL Recommendations”, http://www.w3.org/2004/01/sws-pressrelease.
4. Eric Prud'hommeaux, Andy Seaborne, “SPARQL Query Language for RDF”, http://www.w3.org/TR/rdf-sparql-query/.
5. Eric Prud'hommeaux, Andy Seaborne, “SPARQL Query Language for RDF”, W3C Working Draft 21 July 2005, http://www.w3.org/TR/2005/WD-rdf-sparql-query-20050721/#QueryForms.
6. “Semantic Web”, http://www.w3.org/2001/sw/, Date accessed: 2007-09-24.
7. Csongor Nyulas, Martin O’Connor and Samson Tu, “Data Master – a Plug-in for Importing Schemas and Data from Relational Databases into Protege”, http://protege.stanford.edu/conference/2007/presentations/10.01_Nyulas.pdf, Date accessed: 2007-09-24.
8. Lee Feigenbaum, “SPARQL FAQ”, http://thefigtrees.net/lee/sw/sparql-faq, Date accessed: 2007-09-24.