SEMANTIC DEFINITION AND MATCHING FOR NATIONAL SPATIAL DATA INFRASTRUCTURE

Gülten Kara, Deniztan Ulutaş, Çetin Cömert
Karadeniz Technical University, Trabzon, Turkey

In Turkey, the establishment of a National Spatial Data Infrastructure (NSDI) is on the agenda. The technologies currently used for the technological infrastructure of any SDI are “Syntactic Web” technologies. However, it is foreseen that in the near future these will be replaced by “Semantic Web” (SW) technologies. This motivated the present work, which aims to develop a methodology for the semantic definition of the data of NSDI participants.

One of the primary requirements of the SW is the semantic definition of data and services. Semantic definition means that the syntactic definition of data and services is represented in one of the Semantic Web languages and that this definition is associated with an upper ontology. Several projects have been carried out and several studies published in the related literature. The FinnONTO (National Semantic Web Ontology Project in Finland) project¹ started in 2003 and is scheduled for completion in 2012. Its goal is to lay a foundation for a national metadata, ontology, ontology service, and linked data framework in Finland. The SWING (Semantic Web Services Interoperability for Geospatial Decision Making) project² ran from 2006 to 2009. Its objective was to provide an open, easy-to-use Semantic Web Services (SWS) framework of suitable ontologies and inference tools for the annotation, discovery, composition, and invocation of geospatial web services. The ACE-GIS (Adaptable and Composable E-commerce and Geographic Information Services) project (URL-1) started in June 2002 and was successfully completed in October 2004. It provided better and more efficient tools for the development, deployment, discovery and composition of distributed web services, with special emphasis on the key combination of geographical information and e-commerce services.
Schade (2009) presented an approach for achieving computer-tractable translation of geospatial data. Klien (2008) proposed a method that automatically supports the semantic annotation process by evaluating the validity of existing annotations and suggesting possible new ones. Lemmens (2006) proposed a semantic interoperability framework that serves the semantic definition of spatial web services, using the NEN3610³, TOP10NL⁴, RiskMap and Travel ontologies. Dolbear et al. (2005) address the problem of integrating a data ontology, which exactly describes the database schema of the British national mapping agency, with an application ontology. The problem here is how the semantic definition of data and services is to be made. In our previous work (Kara and Cömert, 2011), we proposed a methodology for the semantic definition of the data of NSDI participants.

¹ FinnONTO Project, http://www.seco.tkk.fi/projects/finnonto/
² SWING Project, http://138.232.65.156/swing/index.html
³ NEN3610, http://www.geonovum.nl/content/geonovum-0
⁴ TOP10NL, http://www.kadaster.nl/window.html?inhoud=/top10nl/

The second requirement of the SW is the semantic annotation of data and services. A semantic annotation is a formal statement that establishes a link between concepts in an ontology and features in a data source. This task requires upper ontologies. Since an upper ontology describes very general concepts, it has to be extended. Several ontology extension studies are available in the related literature. Probst (2007) extended DOLCE⁵ to make explicit the meaning of symbols that denote observation and measurement results. Klien (2008) extended DOLCE to provide a classification schema of geographic objects. Novalija and Mladenić (2010) proposed a methodology for text-driven, semi-automatic ontology extension using Cyc ontology content and ontology structure information. OntoPlus (URL-2) is a text-driven methodology for extending ontologies, using content, structure and co-occurrence information.
The OntoPlus methodology can extend large, multi-domain ontologies, and is implemented as an interface for extending the Cyc ontology using glossary files. The first problem here is to determine which upper ontology will be used. How upper ontologies are evaluated is beyond the scope of this paper. We selected DOLCE because it is smaller than the alternatives (e.g. SUMO⁶, Cyc⁷). The second problem is to identify which concept of the upper ontology is to be extended and which ontology extension procedure is to be followed. We extend DOLCE by re-using part of Klien’s taxonomy and commit ourselves to the subcategories of “ManMadeStructure”. We will continue to extend the upper ontology for the GCM-Road Ontology and the INSPIRE-TN Ontology.

The other requirement of the SW is semantic matching. If schemas are viewed as graph structures, “Semantic Matching” can be understood as semantically comparing the concepts at the nodes of two graphs in order to determine the similarities between them. There are various works on semantic matching in the literature. In our schema matching scenario, we used the S-Match software (Giunchiglia et al., 2004). S-Match takes two schemas and returns the semantic relations between their nodes, using the WordNet (URL-3) lexical database as an external resource. Schema matching techniques can be classified into three groups: schema-based, instance-based, and external-resource-based. S-Match belongs to both the schema-based and the external-resource-based classes. In our schema matching scenario, we converted the Road Schema of the General Command of Mapping⁸ (GCM) and the Transport Network (TN) Schema of INSPIRE (Infrastructure for Spatial Information in Europe) to a semantic web language using SWT, yielding the GCM-Road Ontology and the INSPIRE-TN Ontology. We then performed semantic matching between the GCM-Road Ontology and the INSPIRE-TN Ontology with S-Match. In our future work, we plan to increase the accuracy and the number of match results.
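The matching step described above (two schemas in, semantic relations between their nodes out) can be illustrated with a minimal Python sketch. It is not the S-Match algorithm: it only looks up node labels in a small hand-written lexicon that stands in for WordNet. The lexicon entries, the node labels, the function names and the relation symbols ("=", "<", ">") are all assumptions made for this illustration and are not taken from the GCM or INSPIRE schemas.

```python
# Toy stand-in for an external lexical resource such as WordNet.
# All entries are hypothetical and exist only for this illustration.
SYNONYMS = {
    "road": {"route"},
    "route": {"road"},
}
HYPERNYMS = {  # term -> its directly more general term
    "motorway": "road",
    "road": "transport link",
    "railway": "transport link",
}


def ancestors(term):
    """Return every more general term reachable from `term`."""
    found = []
    while term in HYPERNYMS:
        term = HYPERNYMS[term]
        found.append(term)
    return found


def relation(a, b):
    """Return '=', '<' (a less general), '>' (a more general) or '?'."""
    a, b = a.lower(), b.lower()
    if a == b or b in SYNONYMS.get(a, set()):
        return "="
    if b in ancestors(a):
        return "<"
    if a in ancestors(b):
        return ">"
    return "?"  # the lexicon gives no evidence for a relation


def match(schema_a, schema_b):
    """Compare every node of schema_a with every node of schema_b."""
    results = []
    for a in schema_a:
        for b in schema_b:
            rel = relation(a, b)
            if rel != "?":
                results.append((a, rel, b))
    return results


if __name__ == "__main__":
    # Hypothetical node labels, flattened to plain lists for brevity.
    for a, rel, b in match(["Motorway", "Railway"], ["Road", "Route"]):
        print(a, rel, b)
```

In this sketch the quality of the result depends entirely on the coverage of the lexicon: a label that is missing from it can never be related to anything, which is why a richer background resource directly affects the number of matches found.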
To improve the matching results, we plan to create a resource similar to GeoWordNet (URL-4) that contains Turkish spatial concepts, attributes and relations. As far as we know, no such resource is available in the Turkish literature. We will also work on matching the relations of schema entities and continue our work on transformation between schemas.

⁵ DOLCE, http://www.loa.istc.cnr.it/DOLCE.html
⁶ SUMO, http://www.ontologyportal.org/
⁷ Cyc, http://www.cyc.com/
⁸ GCM is the National Mapping Agency for 1/25 000 and smaller scale maps in Turkey.

REFERENCES

Dolbear, C., Goodwin, J., Mizen, H., Ritchie, J., 2005. Semantic interoperability between topographic data and a flood defence ontology, Ordnance Survey Technical Report I001.
Giunchiglia, F., Shvaiko, P., Yatskevich, M., 2004. S-MATCH: An Algorithm and An Implementation of Semantic Matching, Technical Report # DIT-04-015, February 2004, Trento, Italy.
Kara, G., Cömert, C., 2011. Semantic Data Definition for National Spatial Data Infrastructure, Congress of Geographic Information Systems, 31 October - 04 November 2011, Antalya Culture Center, Antalya.
Klien, E., 2008. Semantic Annotation of Geographic Information, PhD Thesis, Institute for Geoinformatics, University of Münster, Münster, Germany.
Lemmens, R. L. G., 2006. Semantic interoperability of distributed geo-services, Netherlands Geodetic Commission NCG, Publications on Geodesy, New Series 63, ISBN: 90-6132-298-7.
Novalija, I., Mladenić, D., 2010. Ontology Extension Towards Analysis of Business News. Informatica (Slovenia) 34(4): 517-522.
Probst, F., 2007. Semantic Reference Systems for Observations and Measurements, PhD Thesis, Institute for Geoinformatics, University of Münster, Münster, Germany.
Schade, S., 2009. Ontology-Driven Translation of Geospatial Data, PhD Thesis, Institute for Geoinformatics, University of Münster, Münster, Germany.
URL-1, ACE-GIS Project, http://plone.itc.nl/agile_old/Conference/greece2004/papers/P11_Poveda.pdf, 11.10.2010.
URL-2, OntoPlus, http://www.youtube.com/watch?v=9h9iZYGQ9P4, 20.12.2011.
URL-3, WordNet, http://wordnet.princeton.edu/, 22.12.2011.
URL-4, GeoWordNet, http://s-match.org/background-knowledge-datasets.html, 23.12.2011.