EUROREGIONALMAP, A PRAGMATIC SOLUTION TO CREATE AN ESDI AT MEDIUM SCALE THROUGH THE HARMONIZATION OF NATIONAL TOPOGRAPHIC DATA COLLECTIONS

Nathalie Delattre
Project Manager
Geographic National Institute of Belgium

ABSTRACT

The paper presents the data harmonisation process used to create an ESDI from the national data collections of 6 mapping agencies at scale 1:250 000: the harmonisation approach, how the specifications were set up, the harmonisation procedures, the experience gained on the main points of data harmonisation, and the future of the project, which aims to cover the whole of Europe.

KEYWORDS: Specifications, Data Harmonisation, National Mapping Agencies, Proceedings, European

INTRODUCTION

General topographic and other geographic reference data are the starting point for many GIS based applications. In Europe, large quantities of reference data exist, as national, regional and local public authorities regularly collect them. However, for historical reasons these data collections vary considerably from one country to another, both in content and in many other technical aspects, and complications abound whenever reference data from different sources have to be combined. No universal solution exists to overcome this situation: standardization is typically a practical problem. But there are also no good technical reasons to maintain these national differences. In addition, there is an increasing demand for ‘cross border’ geographic information to support good governance and private business. This is why the National Mapping Agencies (NMAs) cooperating within the EuroGeographics association have started this project to show the way towards a pragmatic harmonization of existing data collections. The approach is called pragmatic because it starts with data collections at medium scale for a European level and gives priority to those aspects of harmonization that are both significantly useful and realistically achievable under cost constraints.
It is furthermore pragmatic because it starts with only seven European countries (Germany, France, Denmark, Belgium, Luxembourg, Ireland and Northern Ireland), so that the experience gained in the process will make it easier for all the other EuroGeographics NMAs to join in the harmonization effort and extend it to the whole of Europe.

SETTING UP EUROPEAN SPECIFICATIONS

Figure 1: setting up European Specifications

Product Specification

The data quality objective of the specifications is to reach a high-quality representation of the real world at a medium spatial resolution (1:250 000) with a complete set of geographical and topographical data, structured in a coherent way according to a standard model, cross-border and harmonized at international boundaries. These data are intended as multi-purpose reference data, suitable for visualization as well as for spatial analysis.

On the practical side, the resulting specifications reflect a negotiated compromise between the partners: 'ideal' specifications for users on the one hand, and 'practical' specifications that the project partners could implement at an acceptable cost on the other hand. This is at the heart of the 'pragmatic approach' towards harmonization of existing national datasets. The data availability and data structure have therefore been somewhat restricted to what the mapping agencies could currently provide in a short-term production, while keeping the main user requirements as a guideline.

One approach was to classify information as mandatory or optional. The criteria for deciding whether information would be mandatory were the common data that the mapping agencies could provide and the importance of the information for the users. This determined the CORE data content of EuroRegionalMap.
This does not mean that the specifications are not ambitious: right from the start the project consortium agreed that, for multi-purpose GIS use, topological connectivity and geometrical consistency of features between the different themes were mandatory, above all for the themes most commonly used in spatial analysis, such as transport, administrative boundaries and the water network.

Product Description

EuroRegionalMap can be described as follows:

Data Type: Vector
Resolution: 1:250 000
Standard: Feature Attribute Coding Catalogue of DIGEST (see note 1)
Data Model: based on DIGEST
Geodetic Datum: ETRS89 (European Terrestrial Reference System 1989)
Coordinate System: Geodetic coordinates (Decimal Degrees)
Themes:
    Administrative boundaries
    Water network
    Transport network
    Settlement
    Miscellaneous topographic elements
    Vegetation and Soil
    Named location (geographical names)
Positional Accuracy: 125 m
Data density level and selection criteria: EuroRegionalMap data are collected at a density of detail that approximates the medium scale product range from 1:200 000 to 1:300 000. Portrayal criteria mentioned in the data dictionary are general guidelines.
Naming convention: Using Latin alphabet ISO 8859-1
Spatial analysis requirements:
    Full connectivity of the water network and the transport network
    Geometrical consistency of the administrative boundaries with the water network

Note 1. DIGEST: Digital Geographic Information Exchange Standard, a NATO standard (ISO compliant) developed by the Digital Geographic Information Working Group (DGIWG) to support efficient exchange of digital geographic information among nations, data producers and data users.

Figure 2: extract of EuroRegionalMap on Luxembourg, Germany and France (scale 1:250 000)

DATA HARMONISATION WORKFLOW

The mapping agencies have been in charge of the production of EuroRegionalMap for their national territories using their existing national databases.
The effort of re-engineering their national data according to the ERM specifications has mainly focused on completing the data with new information, integrating them according to the ERM data model and upgrading the topology in order to get full connectivity of the transport and water networks. The producers were also responsible for conformance with the specifications, using their own quality control procedures. To help them, technical guidance and validation tools have been delivered.

The producers were also responsible for the harmonization of their respective data on both sides of the international boundaries. The first step was to determine a fixed international boundary acceptable to both neighbours. The second step was to ensure the continuity and consistency of the cross-border data.

The technical management team has been in charge of the quality acceptance of the data and the final assembly of the national components into a pan-European dataset. The first step was the validation phase, which looked not only at conformance to the specifications but also at the consistency and coherence of the data. The technical management team wrote validation reports that gave correction guidance for unacceptable errors and asked for clarification justifying any deviation from the specifications. This validation phase was also a period of discussion with the producer, punctuated by correction phases and validation controls, in order to reach the best acceptable data quality. A final approval of the data closed the validation phase.

In the next step, the technical management team performed the final assembly phase, which consists of an accurate edge matching of the national data at the international boundaries. The final product is tiled by country.
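As an illustration only, the edge-matching idea in the assembly phase can be sketched as follows. This is not the project's tooling: the function name `match_edges`, the sample coordinates, the use of planar coordinates in metres (the product itself is stored in decimal degrees) and the choice of the 125 m positional accuracy as matching tolerance are all assumptions made for the example. The pairing is a simple greedy nearest-neighbour search, which may not give the best overall pairing when several end points are close together.

```python
from math import hypot

def match_edges(ends_a, ends_b, tolerance):
    """Greedily pair boundary end points of dataset A with those of dataset B.

    Each point in ends_a is paired with the nearest still-unused point in
    ends_b that lies within `tolerance`. Returns a tuple
    (pairs, unmatched_a, unmatched_b).
    """
    remaining = list(ends_b)
    pairs, unmatched_a = [], []
    for pa in ends_a:
        best, best_dist = None, tolerance
        for pb in remaining:
            d = hypot(pa[0] - pb[0], pa[1] - pb[1])
            if d <= best_dist:
                best, best_dist = pb, d
        if best is None:
            unmatched_a.append(pa)      # no counterpart across the boundary
        else:
            remaining.remove(best)      # each point is used at most once
            pairs.append((pa, best))
    return pairs, unmatched_a, remaining

# Hypothetical road end points (metres) on either side of a boundary.
country_a = [(1000.0, 500.0), (2000.0, 800.0)]
country_b = [(1030.0, 510.0), (5000.0, 900.0)]
pairs, only_a, only_b = match_edges(country_a, country_b, tolerance=125.0)
```

The unmatched lists are the interesting output in practice: they point at features that stop at the boundary on one side only and therefore need to be discussed between the neighbouring producers.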
Figure 3: Data Harmonisation Workflow from 6 Mapping Agencies

GAINED EXPERIENCES

Data Topology

A first known point in data harmonization is that the data model characteristics of many databases maintained by NMAs are still very much determined by their cartographic origin: topological consistency between all area and line features, as well as linear connectivity of networked line features, are not strict requirements for producing graphically consistent maps, but they are needed for almost every useful implementation of GIS based applications. Therefore the pragmatic approach adopted by the project did not accept any compromise on data model aspects, and the consortium partners upgraded their existing databases wherever necessary to obtain full topological consistency and connectivity.

Because of their cartographic origin, national geo-spatial data can also be relatively poor in attribute information (alphanumeric data linked to geo-spatial data). One case in point is the names of topographic features, which are usually placed on the map as independent text features and have to be transformed into feature attributes.

The effort to upgrade a mainly cartographic dataset into a useful GIS database, with topological consistency and linear connectivity as well as connected attribute values for names and other spatially related information (e.g. census data), is generally quite labor intensive and forms the major part of the re-engineering effort, because most of the time it relies on interactive data processing requiring human intervention (as opposed to purely automatic data processing). We estimate that the enhancement in content and structure of the national data (where it already existed at a scale close to 1:250 000, thereby excluding the task of generalization from larger scales) represented 90% of the workload, as compared with the tasks of selection and conversion of data into the EuroRegionalMap data model.
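To make the notion of linear connectivity concrete, the following minimal sketch counts the connected parts of a line network with a union-find structure. It is an assumption-laden illustration, not a validation tool from the project: the function name `count_components` and the sample segments are hypothetical, each line feature is reduced to its two end nodes, and two features are treated as connected only when they share an identical end coordinate.

```python
def count_components(segments):
    """Count connected components of a network given as (start, end) node pairs."""
    parent = {}

    def find(node):
        # Register unseen nodes as their own root, then walk up to the root.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for start, end in segments:
        root_a, root_b = find(start), find(end)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two components

    return len({find(node) for node in list(parent)})

# Hypothetical road segments: the third one is isolated from the first two.
roads = [((0, 0), (1, 0)), ((1, 0), (2, 1)), ((5, 5), (6, 5))]
n_parts = count_components(roads)
n_parts_connected = count_components(roads[:2])
```

Under these assumptions a fully connected network yields exactly one component; any higher count indicates isolated pieces, for example a dangling road drawn for cartographic purposes whose end node was never snapped to the junction.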
Data Content

A second point of harmonization lies in the heterogeneous content of the various national databases, and on this point the project was indeed pragmatic and made a distinction between mandatory features and attributes on the one hand, and optional features and attributes on the other hand. The consortium partners upgraded their existing databases wherever necessary to obtain a homogeneous result for all mandatory features and attributes, but they did not spend resources on obtaining optional features and attributes that were not already present in their national database. This certainly disappoints cross-border users when they find that interesting information available for one country cannot necessarily be compared with what is available for another country. Still, most users who tested our prototype also agreed that the content areas affected by heterogeneity were not the ones of central importance to them.

Selection Criteria

A third, more specific point of harmonization is the choice of selection criteria corresponding to the scale or resolution of EuroRegionalMap: which real world features should be portrayed in a 1:250 000 database and which can be excluded from it. It has been almost impossible to define strict selection criteria suitable for all the countries. By its very nature this matter can only be resolved pragmatically by taking into account the real world situation of the surrounding terrain (within one country or across borders), and the project did indeed leave it to the consortium partners to make the choices they deemed appropriate for the terrain of their country (e.g. when selecting the watercourse network or the located places), choices that had in fact already been made prior to the project and were reflected in the different existing national databases.
In the specifications, basic selection criteria have nevertheless been defined in order to keep an overall consistency between countries and between themes; they are open enough to be compatible with most of the national selection criteria (e.g. for mine and quarry selection: mines and quarries larger than 40 ha or considered as landmarks). Most consortium partners chose to remain faithful to their original selection criteria, although in the vicinity of international boundaries some effort was made to obtain consistent cross-border data, mainly for the networks.

In the end, the homogeneity of selection between countries has been judged satisfactory and acceptable on the mapping side, but it may be somewhat unsatisfactory for users who want to know exactly what they can expect to find in the database: no clear answer can be given other than that it depends on what the NMA of a given country traditionally chose to portray on its maps. As the matter is closely related to the problem of generalization, and as most NMAs are in the process of harmonizing their national reference data across different scales by means of clearly defined generalization procedures, one can only hope that these efforts are sufficiently coordinated to produce a result that is acceptable to pan-European and cross-border users.

Common International Boundaries

A fourth specific point of harmonization is the delimitation of common international boundaries. In national databases the international boundaries have normally been adjusted by each NMA individually to the topographic objects included in its national database and can therefore differ in positional accuracy and spatial resolution between neighbouring countries.
The solution adopted to determine common international boundaries for EuroRegionalMap was to give priority to positional accuracy relative to the topographic situation over absolute positional accuracy: of the two national boundaries, the one closest to the spatial resolution and accuracy of EuroRegionalMap was chosen, and it was then adjusted to the topographic situation of both countries by moving geometrical segments and vertices. This implied that the NMAs of neighbouring countries came to an agreement on a common boundary that ideally could not only be used for EuroRegionalMap, but could also be integrated definitively into their national databases (replacing the boundary they had before). Reaching a compromise has sometimes been very time consuming, especially when the international boundary had to be topologically consistent with the water network and when spatial resolution and accuracy were very different between national databases.

Graphic Resolution

A fifth, very specific point of harmonization has to do with the 'graphic' resolution of the vector data corresponding to the scale or resolution of EuroRegionalMap. Although this problem is not pragmatic by nature (there simply is no point in presenting vectors that are of a higher resolution than what the accuracy of the data supports), the 'graphic' resolution with which the various national databases were initially digitized (often for reasons related to cartographic purposes) was not always consistent with the objective accuracy of the data, and these variations of 'graphic' resolution between the different national databases are indeed transferred to EuroRegionalMap.
It would of course have been possible to use processing power to 'downgrade' exaggerated 'graphic' resolution of vector data wherever necessary (although 'downgrading' could be trickier than it appears at first sight due to topological constraints), but in many instances this would have conflicted with another guiding principle of EuroRegionalMap, namely that EuroRegionalMap data should never become 'disconnected' from the national databases. We want EuroRegionalMap to be the result of nationally maintained databases converging towards a common European model; if NMAs have strong reasons, unrelated to EuroRegionalMap, to keep the 'graphic' resolution of their national databases and therefore refuse to converge towards a common 'graphic' resolution, then the point of convergence has been reached and some heterogeneity must be tolerated as the price to be paid for ease of maintenance.

Integration with the National Database

On the last point, namely the convergence of national databases towards a common European model, it may be premature to make a judgment. The idea is of course simple: NMAs may have good reasons to map their country in a particular way that is specific to their country, but if they want to be able to produce and maintain a database that conforms to a common European specification and satisfies cross-border and pan-European user requirements, they had better incorporate this common specification into the basic workflow designed to maintain their national databases (for both content and data model characteristics), rather than struggle with the harmonization of their national databases afterwards. For some consortium partners that actually maintained vector databases at the scale of 1:250 000 before the project, this is indeed what happened: they enhanced these national production databases in such a way that from now on they include whatever is necessary to extract EuroRegionalMap.
Another distinct possibility for the future is the generalization of the data at this scale from larger scale databases. Because the design of these larger scale databases has not so far been affected by the project, the question of convergence may arise again in the future, when these NMAs will have to make sure that their larger scale databases contain whatever is necessary to derive EuroRegionalMap through generalization. For the consortium partners that do not maintain national vector databases at the scale of 1:250 000 and that used generalization and remodelling of larger scale data to produce EuroRegionalMap, it is understandable that they did not want to make any decisions on redesigning their larger scale databases on the sole basis of EuroRegionalMap as a demonstration project. They therefore concentrated on devising ad hoc procedures for producing EuroRegionalMap, which they hope they can enhance and re-use for the maintenance of EuroRegionalMap.

Conclusion

By and large, we think that the consortium was successful in producing a reference database that fulfils basic GIS requirements through harmonization and upgrading of existing national databases. The harmonization project has been a stimulus for NMAs to enhance their national databases and to implement a more GIS oriented data model, and ideally European NMAs should follow this common approach of harmonization each time they take major decisions to improve their national data.

THE FUTURE

Efficient and interoperable maintenance, with regular delivery of updates at low cost, is a key requirement for the success of EuroRegionalMap, both for the mapping agencies, given their commitment to the project, and for the user market, which expects regularly updated data covering the whole of Europe.
Mapping agencies stressed the fact that the EuroRegionalMap specifications must be compatible with the European specifications for large-scale topographic data being developed by the EuroSpec project (see note 2). In that respect the EuroRegionalMap specifications could be considered as a European standard for small-scale topographic data that is part of the EuroSpec vision. On the other hand, effective maintenance of EuroRegionalMap with regular delivery of updates cannot really be envisaged unless mapping agencies adopt the EuroRegionalMap specifications as their national ones and the European components become an integrated part of their national databases.

In the longer term, the ideal vision for harmonization would be to gather in a single reference database all the components of the EuroGeographics small-scale products (SABE, EuroRegionalMap, EuroGlobalMap), maintained at a scale resolution of 1:250 000. This will imply a harmonization of the specifications between the different products. Indeed, the objective must be first to incorporate SABE into EuroRegionalMap as the administrative boundary theme, avoiding the duplication of different administrative boundary datasets, and then to derive as many EuroGlobalMap (1:1 000 000) data as possible from EuroRegionalMap, which would remain the main data source.

Note 2. EuroSpec: a major programme within EuroGeographics that addresses the issue of interoperability by focusing on common or interoperable specifications for reference data and by creating the conditions for efficient access to and use of distributed data.

The maintenance of this EuroRegionalMap+ master database would rely on “decentralized” maintenance, with EuroRegionalMap+ becoming a 'virtual database'. National mapping agencies will produce and harmonize the national data, using “online tools” to assure data completeness, fitness to the common specification and the necessary quality.
This vision for the longer term is agreed, although the 'architecture' for achieving it has still to be determined.

Figure 4: Our Vision: decentralised maintenance

BIBLIOGRAPHY

Delattre, N. (2003). eContent EDC-11031, EuroRegioMap, Deliverable 5.1: Product Specifications, Report.
Malliet, M. and Delattre, N. (2004). eContent EDC-11031, EuroRegioMap, Deliverable 1.8: Final Report 2003, Report.