Flexible Standardization: Making Interoperability Accessible to Agencies with Limited Resources

Nadine Schuurman
ABSTRACT: Semantic standardization is an integral part of sharing data for GIS and spatial analysis.
It is part of a broader rubric of interoperability or the ability to share geographic information across
multiple platforms and contexts. GIScience researchers have made considerable progress towards
understanding and addressing the multiple challenges involved in achieving interoperability. For local
government agencies interested in sharing spatial data, however, current interoperability approaches
based on object-oriented data models represent idealistic solutions to problems of semantic heterogeneity that often exceed the level of sophistication and funding available. Such agencies are waiting for the
market to decide how interoperability should be resolved. In order to assist in this transition, this
paper presents a rule-based Visual Basic application to standardize the semantics of simple spatial
entities using several classification systems. We use the example of well-log data, and argue that this
approach enables agencies to share and structure data effectively in an interim period during which
market and research standards for semantic interoperability are being determined. It contributes to
a geospatial data infrastructure, while allowing agencies to share spatial data in a manner consistent
with their level of expertise and existing data structures.
KEYWORDS: Standardization, interoperability, semantic heterogeneity, classification, Visual Basic
Introduction

Interoperability and standardization of spatial
data are recognized as fundamentally important goals by international, national, and
provincial data and mapping agencies (Masser
1999; Salgé 1997; Salgé 1999; Albrecht 1999a).
GIS researchers and computer scientists have
conducted considerable basic research in this area
over the past decade and have contributed to a
more sophisticated understanding of schematic,
syntactic, and semantic aspects of interoperability (Stock and Pullar 1999; Vckovski 1999; Bishr
1998). These three aspects of interoperability
dominate logic and software-related research, as distinct from networking- and hardware-related issues. For the purposes of this paper, Bishr’s
(1997; 1998) distinctions between schematic, syntactic, and semantic interoperability are used.
Schematic heterogeneity (or differences) arises
from the use of a divergent classification scheme. For
example, spatial entities (such as a wheat field)
described in one database as objects might be
described as attributes (type of crop) in a different
schema. Schematic differences can also arise from
diverse methods for aggregating data or from
databases with different attributes representing
the same entity, missing attributes, or entities with implicit attributes. Syntactic heterogeneity, on the other hand, arises from the use of different data models and varied ways of addressing entities and attributes (Bittner and Edwards 2001; Brodaric and Hastings 2002). Syntactic differences are more easily reconciled than semantic heterogeneity, because database formats can be changed without affecting the integral meaning or interpretation of data. Semantic heterogeneity is so difficult to reconcile because it arises from differences in language and meaning (Vckovski et al. 1999). Facts and spatial entities have different descriptions depending on the discipline describing them and on their context (Hill 1997). In GIS, the close relationship between language and data must be reconciled by interoperability: semantic interoperability requires reconciling the meaning of data terms that vary between domains.

Nadine Schuurman is Assistant Professor, Department of Geography, Simon Fraser University, 8888 University Drive, Burnaby BC, Canada V5A 1S6. Tel: (604) 291-3320. E-mail: <schuurman@sfu.ca>. URL: <www.sfu.ca/gis/schuurman>.
Research on syntactic, schematic, and semantic
interoperability is fundamental to GIScience, but
it does not directly address a range of social and
institutional problems associated with implementation. Foremost is the reluctance of government
agencies and businesses to embrace innovative, but
unproven, interoperability solutions. The test of time
and widespread adoption is the yardstick by which
such agencies judge interoperability. Off-the-shelf
software is not yet designed to handle semantic
heterogeneity among databases. When joining data
tables from different sources, for example, there is
no way to identify fields that refer to the same entity
Cartography and Geographic Information Science, Vol. 29, No. 4, 2002, pp. 343-353
by different names. “Park” and “recreational area”
or “logging road” and “limited access road” might
be equivalent, but would remain separate categories
unless flagged by an operator with extensive local
knowledge. Many agencies struggle with geospatial
data that refer to the same class of geographical
entities, but are not semantically compatible. Data
sharing challenges that face such agencies could be
addressed by interim measures that allow semantic
and schematic standardization between data sets.
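As a concrete illustration of such an interim measure, a lightweight synonym table maintained by an operator with local knowledge can flag semantically equivalent categories when data tables from different sources are joined. The sketch below is written in Python for brevity (the programs described later in this paper are Visual Basic), and the category names and mappings are invented examples, not drawn from any agency's actual schema.

```python
# Minimal sketch: flag semantically equivalent categories across two
# data sets using an operator-supplied synonym table. The mapping is
# hypothetical; in practice it would be compiled by someone with
# extensive local knowledge of both schemas.
CANONICAL = {
    "park": "recreational area",
    "recreational area": "recreational area",
    "logging road": "limited access road",
    "limited access road": "limited access road",
}

def canonical(term: str) -> str:
    """Return the agreed-upon category for a term, or the term itself."""
    t = term.strip().lower()
    return CANONICAL.get(t, t)

def equivalent(term_a: str, term_b: str) -> bool:
    """True if two differently named categories refer to the same entity class."""
    return canonical(term_a) == canonical(term_b)

print(equivalent("Park", "recreational area"))  # True
print(equivalent("logging road", "highway"))    # False
```

A table of this kind captures exactly the knowledge that off-the-shelf join operations lack, without requiring either agency to restructure its database.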
In response to a demonstrated need to facilitate
semantic standardization among agencies, this paper
describes Visual Basic programs developed to support
sharing of groundwater data. Though designed in
this instance for well-log data, these programs can
be modified to suit a variety of natural resource and
environmental data commonly used by government
agencies. The programs use an “eager” or advance approach to standardization that anticipates data-sharing needs (Widom 1996), and in the process contribute a pragmatic solution to semantic standardization between agencies. The approach also contributes to the development of the Canadian spatial data infrastructure by yielding standardized data for use in environmental analyses.
The Visual Basic programs do not take advantage
of more sophisticated solutions promoted in the
GIScience literature, such as creating domain-specific
standardized vocabularies or federated data-sharing
environments. The programs do, however, facilitate
the use of extant geospatial data stored in DBMS and
GIS software that are in situ, allowing resource-poor
agencies to wait for more fundamental science to be
developed and implemented in off-the-shelf software.
A system of flexible, accessible standardization will
promote de facto standardization of semantics among
multiple agencies, and, in the process, provide the
basis for a future multi-domain vocabulary that can
be integrated into more comprehensive interoperability solutions. Given the present period of financial restraint, especially among provincial agencies in Canada, this approach is a step toward the development of a populated geospatial infrastructure that is interoperability-ready.
Interoperability and Standardization in GIScience
Interoperability and standardization have sometimes been used interchangeably, though they
should be differentiated. Bishr’s (1998) vision of
interoperability includes six levels: network protocols; hardware and operating systems; spatial data
files; DBMS; data models; and domain semantics.
Other researchers argue that interoperability can
be decomposed into problems of communication
between systems, syntax, structure or schematics,
and semantics (Sheth 1999). There is general
agreement, however, that interoperability is a
broad rubric, and semantic standardization is
one component. Semantic standardization has
also been recognized as one of the most difficult
aspects of interoperability to resolve (Vckovski
1999; Kottman 1999; Bishr et al. 1999). This
paper focuses primarily on finding ways to encourage and enhance semantic standardization, while
accepting that different schematics associated with
data models and structures affect semantics (Bishr
1998).
There are several strands of interoperability and
standardization research in GIS that are relevant to
the problem of semantic standardization. Federated
data sharing environments have been suggested as
one means of facilitating semantic standardization
(Devogele et al. 1998; Sheth 1999; Abel et al. 1998).
This approach is attractive in that it eclipses systems
integration. It does not require agencies or members
to restructure their data using common schematics and semantics. Instead, an automated mediator
brokers information between agencies (Devogele et
al. 1998). In one version of this model, schematics
are used to allow correspondences between objects
from multiple databases (Laurini 1998; Devogele
et al. 1998). Sheth (1999) proposes information-brokering architectures in which a mediator
shares data among federated databases, supporting scalable information and multiple ontologies.
Alternatively, Abel et al. (1998) propose a virtual
GIS based on object-oriented database systems that
use an extended SQL to communicate data-sharing
requests. By achieving this minimal level of standardization based on data models, automated data
sharing among multiple environments is enhanced.
Object-oriented data models are the basis for many
interoperability solutions based on the “federated
database” model, though few government agencies
use object-oriented data descriptions at this time.
A variation of this tactic of extracting and matching
similar semantic terms has been explored in the literature at the intersection of computer science and GIS
by searching for semantic similarities between data
sets. There are several ways of finding such similarities, but each approach shares the fundamental
goal of identifying entities with similar descriptions
in different databases without top-down mediation.
Kashyap and Sheth (1996), for example, use queries
to find objects based on context, rather than specific
database information. They provide the example of
relating the authorship of publications to university
employees, and then finding all employee publications that use the word “abortion” in their title. They
are able to find these implicit correspondences by
first defining the context (university-based academics
who publish), based on attributes, implicit assumptions of the attributes, and variables in the database
(Kashyap and Sheth 1996).
Multiple contexts can be defined for any database,
and for searches conducted within them. This is a
linguistic approach that uses the principle of semantic
proximity and object context to compare the semantics of possible linkages. Schematic heterogeneity
(such as an attribute in one schema being an entity
in another) can then be resolved in the interests of
standardization (Kashyap and Sheth 1996). Another
approach is to develop rules for membership in
a particular category (Stock and Pullar 1999). For
example, Stock and Pullar (1999) created a set of
rules that allowed them to create linkages between
land parcels described differently in two database
schemas. A related method uses automated taxonomic
reasoning but requires extra time and vigilance in
initial data structuring (Mark et al. 1995). The value
of these solutions lies in the integration of linguistics and cognition with the computing formalisms
necessary for implementation in a GIS environment.
The work of these researchers is unique in that it
carries cognitive and linguistic insights from the
social sciences into computer science. Their solutions, however, remain difficult to implement using
archival data stored in a relational format.
The Open GIS Consortium (OGC) takes a more
pragmatic approach to interoperability by using
existing technologies that can be harnessed to
facilitate data sharing through component architecture that works with multiple platforms (Cuthbert
1999; Voisard and Schweppe 1998; Kottman 1999).
The advantage of designing interoperability around
distributed computing platforms is that components can be designed to work between existing
software environments (Ungerer and Goodchild
2002; Hardie 1998). The basis for interoperability,
in this environment, is finding common geometry
between components of object-oriented databases
(Cuthbert 1999). A drawback is that many existing
applications are built around RDBMS, which makes
it difficult to extract features and compare their
geometry or semantics in a way that is compatible
with component architecture. As GIS development embraces object-oriented database design
and programming, it will be easier to move toward
component architecture.
System architecture is not the sole or defining
constituent of interoperability. There is increasing
recognition among GIScientists that interoperabil-
ity must account for social, institutional, and user
exigencies. Harvey and Chrisman (1998) brought
social aspects of interoperability into purview with
the documentation of the multiple ways that a geographical term such as “wetland” can be interpreted,
depending on the agenda of the user. Harvey and
Chrisman introduced the concept of boundary
objects to describe spatial entities that are defined
and named differently or are semantically heterogeneous. A documented historical unwillingness among
agencies to share data for technological and organizational reasons might potentially be overcome by
identifying boundary objects, or shared stakes, as
a preliminary step towards standardization. This
requires, however, the creation of an institutional
infrastructure that supports spatial data sharing
(Nedovic-Budic and Pinto 1999). Institutions need
to become more receptive to data sharing and
interoperability and support organizational channels for interoperability. The needs of the individuals
who populate institutions and who are responsible
for interoperability, must be addressed simultaneously. Continued development of interoperability
techniques for sharing spatial data will not benefit
users unless the techniques are accompanied by
usability engineering research (Slocum et al. 2001).
Such research addresses non-algorithmic elements
of software, especially user interfaces. It necessarily
intersects with systems research, particularly with
respect to schemata and GUI interfaces (Slocum et
al. 2001), and will ultimately determine the character
and success of interoperability.
Academic interoperability research has contributed
to increased awareness of the problems of integrating
semantically heterogeneous data and developed a
number of technical solutions, but is based on the
assumptions that:
• Practitioners charged with collecting data are
trained to consider abstract issues related to
classification and schematics that bear on standardization; and
• Agencies have the resources to keep abreast of,
and adopt, new integration technologies and
theory.
Inter-organizational cooperation depends on
multiple levels of standardization, including data
models, references, categorization of spatial data,
choice of attribute layers, data collection procedures,
data quality, accuracy, metadata, output requirements, data transfer protocols, and data dictionaries (Nedovic-Budic and Pinto 1999). Many agencies,
however, develop data-sharing arrangements based
on exigency, and these frequently take the form of
an exchange of disks (Glen 2000). Indeed, there is a
disjuncture between optimal semantic standardization practices and the on-the-ground reality. This
distance between theory and practice is to be expected
given the traditional trickle-down between academic research and software development, a delay that may exceed a decade.
Government agencies frequently wait for new
developments to be integrated into off-the-shelf
software in order to avoid early adoption and the attendant danger of rapid obsolescence. The example
of the Ministry of Environment, Land and Parks
(MELP) in British Columbia is instructive. One of
the most sophisticated object-oriented data storage
systems, Spatial Archive and Interchange Format
(SAIF), was developed by Henry Kucera at MELP.
SAIF supports data sharing, and it is a potentially
valuable method of data integration (Albrecht 1999b).
Ironically, MELP remains committed to RDBMS,
and the capabilities of SAIF stand in stark contrast
to the Ministry’s water well data (considered below),
which are stored somewhat haphazardly in a relational database.
The reluctance of provincial agencies to adopt
technologies that allow greater ontological flexibility
and interoperability is, however, easily understood.
A study of interoperability at MELP indicated that
the agency is not prepared to embrace interoperability until it is integrated into ESRI software
(Glen 2000). Another study commissioned by the
Geological Survey of Canada (GSC) found that
Canadian Geoscience agencies are reluctant to
adopt a “made-in-Canada” data model or format,
preferring to use data formats supported by the
Federal Geographic Data Committee (FGDC) or
other internationally recognized bodies (Weston et
al. 2000). Many provincial agencies, moreover, are
still committed to internal standards that have been
developed over a long period of time and reflect
local needs. These agencies are reluctant to adopt
a non-proprietary data model in the interests of
interoperability, often due to previous experience
with aborted government initiatives in which funding was dropped (Weston et al. 2000).
Interoperability research is occurring in parallel with increased emphasis on the development
of spatial data infrastructures (SDI). In Canada,
Geoconnections was established in 1999 by the federal
government, along with industry partners, to oversee
the development of the Canadian Geospatial Data
Infrastructure (CGDI) (http://www.geoconnections.org/redir/en.cfm?file=/english/about/about_ataglance_content.html).
Geoconnections has achieved better organization
and dissemination of available spatial data but it is
constrained by a history of cost-recovery policies
that have slowed the development of the Canadian
Geospatial Data Infrastructure. In Canada, data collected by public agencies remain the property of
the Queen (O’Donnell and Penton 1997). This is in
direct contrast to the United States where federal
government data are frequently available in the public
domain. The effects of the constrained Canadian
policy are shared by researchers, educators, and public
agencies, all of whom frequently have to improvise
with respect to spatial data (Klinkenberg 2003).
This problem is shared by a number of countries,
including Australia and New Zealand (Mooney and
Grant 1997; O’Donnell and Penton 1997; Robertson
and Gartner 1997). In the absence of federal data,
many agencies have collected their own data with
a proliferation of local approaches to data sharing.
Ironically, this approach fits well within the dictum,
advocated by international standards organizations,
that data collaboration remains local while being
extendible to broader integration frameworks. Indeed,
some researchers have suggested that, in order to
meet the needs of local users, data infrastructures
should be constructed primarily at a local level so
as to draw on the knowledge of data collectors and
users (Harvey et al. 1999a; Harvey et al. 1999b).
The application reported in this paper is local in
its data-sharing approach, while creating the basis
for a broader spatial data framework.
Semantic Heterogeneity: Low Government Data Compatibility
In the Canadian province of British Columbia,
several ministries have separately collected and
maintained environmental and forest resource
data sets that describe the same geographical
entities using different schematics and semantics.
Despite detailed common procedures for feature
code identification, digitizing thresholds, and
naming conventions in the province (see http://srmwww.gov.bc.ca/gis/imwgstd.doc), there is considerable heterogeneity between schematics and
semantics. The storage of roads provides a concrete example. The British Columbia Ministry of
Forests’ Forest Cover data set contains more roads
than the set from the Ministry of Sustainable
Resource Management (SRM) despite an equivalent scale of 1:20,000 (Figure 1).
The road data are not only incongruent; the same roads are defined differently in the two sets, even though the road segments represent the same feature (Figure 2).
Within the Forest Cover dataset the feature code
returned for Figure 2a was DD31700000, but for the TRIM (SRM) dataset it was DA25150000.

Figure 1a. Ministry of Forests road data.
Figure 1b. Sustainable resource management road data for the same area as Figure 1a.
Figure 2a. MOF road segment.
Figure 2b. SRM road segment.
The catalogue provided by the Information
Systems Branch of British Columbia Ministry of
Environment, Land and Parks contains the following definitions:
DA25150000: A specially prepared route on land for the movement of vehicles (other than railway vehicles) from place to place. These are typically resource roads, either industrial or recreational. Includes road access to log landings.

DD31700000: A narrow path or route not wide enough for the passage of a four-wheeled vehicle but suitable for hiking or cycling. Park paths and boardwalks are considered trails.
Although the roads described in both maps frequently refer to the same feature, their semantic and physical descriptions vary depending on context and interpretation. There is a clear
disjuncture between categories and instances,
which is most likely the result of both broad interpretation of road definitions by operators, and
different institutional cultures between the ministries of forests and environment.
These distinctions in geometric and textual descriptions are relatively minor, but they can
undermine digital data sharing in the decentralized data collection environment found at many
levels of government. Solving this type of semantic heterogeneity would considerably enhance the
development of a provincial data infrastructure,
while enabling more comprehensive interoperability measures to be developed consistent
with federated data collection environments, as
described in GIScience literature (Albrecht 1999b;
Devogele et al. 1998; Sheth 1999; Laurini 1998).
There is a demonstrated need to resolve semantic heterogeneity. A study conducted at the Ministry
of Environment, Lands, and Parks indicated that
Figure 3. Water well data for the Vancouver area of BC in LD Builder after common errors have been rectified. The data are
characterized by thousands of different lithological terms with little consistency.
Ministry officials were overwhelmed by the various
initiatives for standardization developed by multiple agencies, including the United States NSDI,
GeoConnections Canada, and Geomatics Canada.
After a preliminary survey of options, the Ministry decided to avoid interoperability initiatives, either data- or software-related, unless they were incorporated into ArcInfo® (Glen 2000). This strategy was
recently reiterated by Evert Kenk, Acting Director
of Integration and Planning for the Ministry of
Sustainable Resource Management (Kenk, 2002,
personal communication). In order to accommodate this cautious policy, the standardization
approach described in this paper is based on
non-proprietary software and scripting languages.
It is particularly suited to governmental agencies
with limited resources, which may feel inundated
by the seemingly limitless hardware and software
reorganization involved with interoperability. A
classic case of data heterogeneity is found in the
British Columbia Ministry of Water, Land, and Air
Protection (previously Environment, Land, and
Parks) groundwater section, and will be used to
describe local interoperability solutions advanced
in this paper.
A Standardization Program for Well-log Data in British Columbia
Unlike the United States, which maintains some federal influence over groundwater data (see http://www.epa.gov/safewater/data/draft), water is a
provincial resource in Canada. As a result, each
province maintains separate control over data
and manages groundwater with varying degrees of
expertise and efficiency. In British Columbia, like
in many provinces, groundwater data are derived
almost exclusively from well-logs submitted by private drillers. Some provinces, including Ontario
and Newfoundland, require that drillers submit
well-logs with lithology and yield to the provincial
government. Other provinces (including British
Columbia) rely on the good will of drillers to
acquire well data. Well-log data are not necessarily a panacea for groundwater management.
Not all drillers are educated about the geological
subsurface, and soil deposits and their underlying
rock formations are commonly misrepresented. In
Ontario, drillers entered “clay” for 40 percent of
the subsurface layers. By contrast, log data from continuous boreholes used for ground-truthing revealed that clay constituted two percent of the material in the area (Russell et al. 1998). While such pervasive mistakes do limit analysis of the subsurface, the data remain useful, as drillers are generally able to distinguish unconsolidated subsurface materials from bedrock.

Groundwater managers use well-log data to develop a better understanding of subsurface aquifers, aquitards (impediments to the flow of groundwater), and groundwater flow direction. The chief impediment to this goal is the absence of a standardized classification system within or between provinces.¹ Internal standardization is necessary to develop groundwater models for the individual provinces, while standardization across provinces using a national framework would allow comparative research on modelling and visualization of aquifers as well as provide the basis for the development of a national groundwater policy.

In British Columbia, the Ministry of Water, Land and Air Protection collects well-log data on an ad hoc basis and serves the data over the world wide web. This convenience is undermined only by the separation of UTM coordinate data and well-log attribute data into two separate databases. Moreover, the lithology field was, until 2001, limited to 30 characters. This led to a spillover of narrative descriptions into multiple fields characterized by depths of zero meters when, in fact, they referred to the previous strata. Despite these problems, these data are the sole basis for groundwater modelling in a province in which 20 percent of the population relies on groundwater for drinking water. In order to standardize these data, two Visual Basic programs were developed, which process the data to the point where they can be used to model aquifers in a GIS environment. The first program, called Lithological Database Builder (LD Builder), merges the two databases using unique well identifiers, and then corrects the five most common errors, including the lithological description problem described above.

Lithological Database Builder is a data preparation tool for the second program that was developed to standardize semantic data. Standardization is necessary both to use the data locally and to compare techniques for visualization and analysis across Canadian provinces. Figure 3 shows data prepared by the LD Builder in heterogeneous form. Note that in 23 lines of description there are only two repeated descriptions of subsurface material, which indicates a high degree of heterogeneity in the raw data.

British Columbia is typical of Canadian provinces in that it includes varied topography with attendant dissimilarities between subsurface environments. In the interest of determining the extent of data heterogeneity, three distinct geological areas were identified: the Cariboo; the Kootenays; and Galiano Island. These regions had, respectively, 2726, 2350, and 859 wells and 5195, 3860, and 1179 unique lithological descriptions.² This extreme data heterogeneity, summarized in Table 1, speaks to the need for both standardization and classification before these data can be used.

Region            Number of Wells   Lithological Units   Unique Lithological Descriptors
Cariboo           2 726             15 113               5 195
Kootenays         2 350             8 717                3 860
Galiano Island    859               3 773                1 179

Table 1. Degree of data heterogeneity for three geologically distinct regions of British Columbia.

¹ Newfoundland is a rare province that has developed a standardized classification system for subsurface lithologies.
² Most wells have several lithological units associated with different depths.

Flexible Standardization for Spatial Data (FSSD), the second program (shown in Figure 4), offers a choice of standardization using three different classification systems. It is important to note that data standardization alone significantly reduces the number of terms by eliminating misspellings, variations in spelling, and multiple descriptors, but it does not solve the problem of data heterogeneity. Classification is necessary in order to create a sufficiently limited number of categories, such that the data can be used in GIS.

The first classification system is unique to British Columbia, and it is being developed for the project in conjunction with Dr. Diana Allen from the Department of Earth Sciences at Simon Fraser University and Mike Wei, Senior Groundwater Hydrologist for the province. The second classification system is based on one developed by local experts at the Geological Survey of Canada (GSC) for a database of 250,000+ boreholes in the Greater Toronto Area of Ontario. This system includes a twelve-category subsurface classification scheme for well-log data (Logan et al. 2001; Russell et al. 1996). This classification is better suited to the unconsolidated subsurface composition of the area than it is for hard rock, but provides a basis for comparing in
situ groundwater models with
those developed in Ontario
(Desbarats et al. 2001). The
third classification system is
based on a national groundwater framework developed in
1991 but not yet implemented.
FSSD is rule-based and therefore designed for non-complex
classification that does not rely
on contextual, interpreted
information (see Figure 5). An
advantage of this approach is
that it can be easily modified,
and extended to incorporate
other entities.
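The effect of the standardization step alone, collapsing misspellings and spelling variants to a canonical descriptor before any classification is attempted, can be sketched as follows. This is an illustrative Python fragment rather than the Ministry's Visual Basic code, and the variant spellings are invented for illustration, not taken from the BC well-log database.

```python
# Sketch of the standardization step: map spelling variants and
# misspellings to a canonical descriptor. The variant table is a
# hypothetical example; a real table would be compiled from the
# raw well-log vocabulary by a domain expert.
VARIANTS = {
    "gravelly": "gravel",
    "gravle": "gravel",
    "gray clay": "grey clay",
    "hardpan": "till",
}

def standardize(descriptor: str) -> str:
    """Collapse a raw lithological descriptor to its canonical form."""
    d = descriptor.strip().lower()
    return VARIANTS.get(d, d)

raw = ["Gravel", "gravle", "gravelly", "Grey Clay", "gray clay", "Till", "hardpan"]
unique_raw = {d.strip().lower() for d in raw}
unique_std = {standardize(d) for d in raw}
print(len(unique_raw), "->", len(unique_std))  # 7 -> 3
```

Standardization of this kind shrinks the vocabulary substantially, but, as noted above, a classification step is still needed to reduce the remaining terms to a small enough set of categories for use in GIS.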
In semi-automatic mode, the parsed lithological description is compared to rules for classification in one of the three systems, based on colour, description, texture, and the classification of contiguous layers. If the lithology matches more than one classification, the operator is prompted to choose between possible classifications (Figure 6). The semi-automatic mode draws on the expertise of the operator, who frequently understands the subsurface lithological composition of the area. The BC classification option provides a three-tiered profile of the lithology in order to serve different user groups. Level one distinguishes only between bedrock and surficial materials; level two distinguishes types of bedrock and surficial materials; and level three includes details about colour and texture of the materials as well as fractures in the geology. The BC classification system also interprets the grammar of the lithological description in order to assign priority to multiple descriptors. The first term in a multiple-term description is weighted more heavily than the second term. The exception is the description “sand and gravel,” which is concatenated to one term, reflecting the persistence of this combination in coastal British Columbia.

Figure 4. The splash screen for the flexible standardization program FSSD.

Figure 5. The first step in standardizing requires (i) choice of a database; and (ii) choice of mode (semi-automatic or manual).

Figure 6. Rules for standardizing lithologies using the Canada Geological Survey classification system for Ontario.

The program follows a semantic proximity approach to develop the rules for each of the classification systems. The advantage of this approach is that it allows
standardization of semantic terms based on expert
knowledge of the relationship between different
semantic descriptors (Kashyap and Sheth 1996).
Moreover, it enables data sharing among multiple
agencies that share close informal contact but do
not have the resources to develop more structured,
sophisticated data sharing architectures. One disadvantage of FSSD is that it relies on expert knowledge of the subsurface, both for the development of standardization rules and to direct the application in semi-automated mode. This shortcoming is somewhat offset by the skills of dominant users, who are frequently hydrogeologists.
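The grammar-priority behaviour of the BC classification described above (first-term weighting and the concatenated "sand and gravel" term) can be sketched as follows. The weight values, helper names, and parsing details here are hypothetical, chosen only to illustrate the idea; they do not reproduce FSSD's actual grammar rules.

```python
# Hypothetical sketch of FSSD-style grammar priority: the first descriptor in
# a multiple-term description outweighs the second, and "sand and gravel" is
# concatenated into a single term before weighting.

def parse_terms(description: str) -> list[str]:
    text = description.lower().strip()
    # Exception: "sand and gravel" is treated as one concatenated term,
    # reflecting the persistence of this combination in coastal BC.
    text = text.replace("sand and gravel", "sand-and-gravel")
    return [t for t in text.split() if t != "and"]

def weighted_terms(description: str) -> list[tuple[str, float]]:
    """Assign a higher (assumed) weight to the first descriptor."""
    terms = parse_terms(description)
    weights = [1.0, 0.5]  # assumed weighting scheme, for illustration only
    return [(t, weights[i] if i < len(weights) else 0.25)
            for i, t in enumerate(terms)]

print(weighted_terms("silt and clay"))    # [('silt', 1.0), ('clay', 0.5)]
print(weighted_terms("sand and gravel"))  # [('sand-and-gravel', 1.0)]
```

In a semi-automatic run, a description whose weighted terms matched more than one classification would trigger an operator prompt, as the program does when the lithology is ambiguous.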
Both LD Builder and FSSD are already in use
in three research projects and have been released
to the provincial Ministry of Water, Land, and
Air Protection. The Ministry is keen to use these
programs as it is under pressure to downsize, and
resources for groundwater data collection, classification, and modeling are restricted.
Conclusion
The flexible standardization approach to semantic heterogeneity presented here is an interim one. It will ideally be succeeded by interoperability solutions that exploit the full range of semantic and schematic diversity allowed by object-oriented data description. Initial institutional responses to interoperability options suggest that many government agencies will, indeed, adopt interoperability measures as they are written into non-proprietary software. This bodes well for the success of the Open GIS Consortium, which works closely with software developers. In the meantime, however, the Flexible Standardization program represents an approach to data sharing among agencies that may be reluctant to embrace untested but more comprehensive methods of semantic standardization. Moreover, this approach encourages the development of an integrated geospatial data infrastructure at the local level, an important goal of the Canadian Geomatics community and of geospatial data users elsewhere.
ACKNOWLEDGMENTS
This research was supported by Social Sciences and Humanities Research Council (SSHRC) grant # 401-2001-1497. I am also indebted to two knowledgeable anonymous referees who made valuable suggestions that improved the quality of this paper.
REFERENCES
Abel, D. J., B. C. Ooi, K.-L. Tan, and S. H. Tan. 1998.
Towards integrated geographical information processing.
International Journal of Geographical Information Science
12(4): 353-71.
Albrecht, J. 1999a. Geospatial information standards: A comparative study of approaches in the standardization of geospatial information. Computers & Geosciences 25: 9-24.
Albrecht, J. 1999b. Towards interoperable geo-information standards: A comparison of reference models for
geo-spatial information. Annals of Regional Science 33:
151-69.
Bishr, Y. A. 1997. Semantic aspects of interoperable GIS. Ph.D. dissertation, Spatial Information, International Institute for Aerospace Survey and Earth Sciences (ITC), Enschede, The Netherlands.
Bishr, Y. A. 1998. Overcoming the semantic and other
barriers to GIS interoperability. International Journal of
Geographical Information Science 12(4): 299-314.
Bishr, Y. A., H. Pundt, W. Kuhn, and M. Radwan. 1999. Probing the concept of information communities: A first step toward semantic interoperability. In: M. F. Goodchild, M. Egenhofer, R. Fegeas and C. Kottman (eds), Interoperating geographic information systems. Boston, Massachusetts: Kluwer Academic Press. pp. 55-69.
Bittner, T., and G. Edwards. 2001. Towards an ontology
for geomatics. Geomatica 55(4): 475-90.
Brodaric, B., and J. Hastings. 2002. An object model
for geologic map information. Presented at the Joint
International Symposium on Geospatial Theory,
Processing and Applications, Ottawa, Canada.
Cuthbert, A. 1999. OpenGIS: Tales from a small market town. In: A. Vckovski, K. E. Brassel and H. J. Schek (eds), Interoperating geographic information systems. Zurich, Switzerland: Springer. pp. 17-28.
Desbarats, A. J., M. J. Hinton, C. E. Logan, and D. R. Sharpe. 2001. Geostatistical mapping of leakance in a regional aquitard, Oak Ridges Moraine area, Ontario, Canada. Hydrogeology Journal 9: 79-96.
Devogele, T., C. Parent, and S. Spaccapietra. 1998. On
spatial database integration. International Journal of
Geographical Information Science 12(4): 335-52.
Glen, A. 2000. Towards interoperable geographic information systems at the British Columbia Ministry of
Environment, Lands, and Parks. Honours Degree,
Geography, Simon Fraser University, Burnaby, British
Columbia.
Hardie, A. 1998. Present state of Web-GIS. Cartography
27(2): 11-26.
Harvey, F., and N. Chrisman. 1998. Boundary objects and the social construction of GIS technology. Environment and Planning A 30: 1683-1694.
Harvey, F., B. Buttenfield, and S. C. Lambert. 1999a.
Integrating geodata infrastructures from the ground
up. Photogrammetric Engineering and Remote Sensing 11:
1287-91.
Harvey, F., W. Kuhn, H. Pundt, and Y. Bishr. 1999b. Semantic
interoperability: A central issue for sharing geographic
information. Annals of Regional Science 33: 213-32.
Hill, P. 1997. Lacan for beginners. New York, New York:
Writers and Readers Publishing Inc.
Kashyap, V., and A. Sheth. 1996. Semantic and schematic
similarities between database objects: A context-based
approach. The VLDB Journal 5: 276-304.
Kenk, 2002, personal communication.
Klinkenberg, B. 2003. The true cost of spatial data in
Canada. The Canadian Geographer 47 (forthcoming).
Kottman, C. 1999. The Open GIS Consortium and progress toward interoperability in GIS. In: M. Goodchild, M. Egenhofer, R. Fegeas and C. Kottman (eds), Interoperating geographic information systems. Boston, Massachusetts: Kluwer Academic Publishers. pp. 39-54.
Laurini, R. 1998. Spatial multi-database topological continuity and indexing: A step towards seamless GIS data interoperability. International Journal of Geographical Information Science 12(4): 373-402.
Logan, C., H. A. J. Russell, and D. R. Sharpe. 2001.
Regional three-dimensional stratigraphic modelling
of the Oak Ridges Moraine areas, southern Ontario.
Geological Survey of Canada Current Research 2001-D1:
Mark, W., J. Dukes-Schlossberg, and R. Kerber. 1995.
Ontological commitment and domain-specific architectures: Experience with Comet and Cosmos. In: N. J. I.
Mars (ed.), Towards very large knowledge bases. Amsterdam,
The Netherlands: IOS Press. pp. 33-41.
Masser, I. 1999. All shapes and sizes: The first generation of
national spatial data infrastructures. International Journal
of Geographical Information Science 13(1): 67-84.
Mooney, D. J., and D. M. Grant. 1997. The Australian National Spatial Data Infrastructure. In: D. Rhind (ed.), Framework for the world. Cambridge, U.K.: GeoInformation International. pp. 187-201.
Nedovic-Budic, Z., and J. K. Pinto. 1999. Interorganizational
GIS: Issues and prospects. Annals of Regional Science
33:183-95.
O’Donnell, J. H., and C. R. Penton. 1997. Canadian perspectives on the future of national mapping organizations.
In: D. Rhind (ed.), Framework for the world. Cambridge,
U.K.: GeoInformation International. pp. 214-25.
Robertson, W. A., and C. Gartner. 1997. The reform of national mapping organizations: The New Zealand experience. In: D. Rhind (ed.), Framework for the world. Cambridge, U.K.: GeoInformation International. pp. 247-64.
Russell, H.A.J., C. Logan, T.A. Brennand, D.R. Sharpe,
and M.J. Hinton. 1996. Regional geoscience database for the Oak Ridges Moraine project (southern
Ontario). In Current Research 1998-E: Geological Survey
of Canada.
Russell, H. A. J., T. A. Brennand, C. Logan, and D. R. Sharpe.
1998. Standardization and assessment of geological
descriptions from water well records, Greater Toronto
and Oak Ridges Moraine areas, southern Ontario. In:
Current Research 1998-E, Geological Survey of Canada.
pp. 89-102.
Salgé, F. 1997. International standards and the national
mapping organizations. In: D. Rhind (ed.), Framework
for the world. Cambridge, U.K.: GeoInformation
International. pp. 160-72.
Salgé, F. 1999. National and international data standards.
In: P. A. Longley, M. F. Goodchild, D. J. Maguire and D.
W. Rhind (eds), Geographical information systems: Principles,
techniques, management and applications. New York, New
York: John Wiley & Sons, Inc. pp. 693-706.
Sheth, A. P. 1999. Changing focus on interoperability in information systems: From system, syntax, structure to semantics. In: M. Goodchild, M. Egenhofer, R. Fegeas and C. Kottman (eds), Interoperating geographic information systems. Boston, Massachusetts: Kluwer Academic Publishers. pp. 5-29.
Slocum, T. A., C. Blok, B. Jiang, A. Koussoulakou, D. Montello, S. Fuhrman, and N. Hedley. 2001. Cognitive and usability issues in geovisualization. Cartography and Geographic Information Science 28(1): 61-75.
Stock, K., and D. Pullar. 1999. Identifying semantically similar elements in heterogeneous spatial databases using predicate logic expressions. In: A. Vckovski, K. E. Brassel and H. J. Schek (eds), Interoperating geographic information systems. Zurich, Switzerland: Springer. pp. 231-52.
Ungerer, M. J., and M.F. Goodchild. 2002. Integrating spatial
data analysis and GIS: A new implementation using the
Component Object Model (COM). International Journal
of Geographical Information Science 16(1): 41-53.
Vckovski, A. 1999. Interoperability and spatial information theory. In: M. Goodchild, M. Egenhofer, R. Fegeas and C. Kottman (eds), Interoperating geographic information systems. Boston, Massachusetts: Kluwer Academic Publishers. pp. 31-7.
Vckovski, A., K. E. Brassel, and H.-J. Schek (eds). 1999.
Interoperating geographic information systems. New York,
New York: Springer. vol. 1580.
Voisard, A., and H. Schweppe. 1998. Abstraction and decomposition in interoperable GIS. International Journal of Geographical Information Science 12(4): 315-33.
Weston, M., H. Kucera, and G. Kucera. 2000.
Recommendations for a geoscience model to support
an integrated, Internet-based geoscience knowledge
network for Canada. Geological Survey of Canada,
Canadian Geoscience Data Model Working Group,
Calgary, Canada.
Widom, J. 1996. Integrating heterogeneous databases: Lazy or eager? ACM Computing Surveys [cited November 16, 2001]. Available from http://delivery.acm.org/10.1145/250000/242344.