Draft paper for presentation at: Workshop on Spatial and Geographic Ontologies on 23rd September, 2003 (prior to COSIT'03)

Talking in COADs: A New Ontological Framework for Discussing Interoperability of Spatial Information Systems

Andrew G. Turk1 and David M. Mark2

1 School of Information Technology, Murdoch University, Perth, Western Australia 6150, Australia
Email: a.turk@murdoch.edu.au

2 Department of Geography, National Center for Geographic Information and Analysis, and Center for Cognitive Science, University at Buffalo, Buffalo, NY 14261, USA
Email: dmark@geog.buffalo.edu

Abstract: This paper introduces a new ontological framework based on a system of Conceptualizations of a Domain (COADs), which may aid the development of theories and techniques to facilitate the interoperability of Spatial Information Systems. The need for this framework, the basis of the COADs, and an example are discussed. It is hoped that this will aid discussion of the need for a way of dealing with terminology related to ontology that facilitates interdisciplinary collaboration and is not restricted to any particular philosophical tradition or worldview.

Keywords: Spatial Information Systems, interoperability, ontology, landscape categories, conceptualizations, language, ethnophysiography, geographic information systems.

Introduction: Spatial Information Systems (SIS) (including Land and Geographic Information Systems) store a wide variety of types of geospatial information at different scales. They play a vital role in both developed and developing nations in diverse fields, such as: urban and regional planning; environmental management; business analysis; military and security operations; property cadastres; epidemiological studies; and vehicle navigation. More complex SIS share the characteristic that their databases are constructed from a variety of sources, including other SIS.
Increasingly, there is also a need for SIS to interact with each other, often in an automated fashion, for analysis of interdisciplinary data sets (Gaffney et al., 1996), for projects in cross-border regions, etc. Hence, facilitating the interoperability of SIS is of great practical importance.

The study of SIS interoperability must be founded in the nature of physical reality, the way aspects of reality are selected for encoding in SIS, the taxonomies/categorizations (feature/attribute catalogues) and data format standards utilized, and the way information is stored in the databases (conceptual data models). Thus, questions of ontology and epistemology are central to the development of enhanced SIS interoperability (Kuhn, 2001; Mark et al., in press; Mark et al., 2003; Smith and Mark, 2003; Winter, 2001).

However, research and development in this field is difficult because it requires collaboration between a large number of disciplines, including information systems, computer science, geography, cognitive science, linguistics, and philosophy. When one takes seriously the need to recognize different worldviews (e.g. of Indigenous peoples) and the need for a participative information system development methodology (Remenyi et al., 1997; Turk and Trees, 2000; Wilson, 1998), then the discipline of ethnography needs to be added.

In order to develop improved theories and practical techniques of SIS interoperability, it is necessary for the interdisciplinary debate to operate within a reasonable agreement on the meaning of the fundamental terms "ontology" and "epistemology". However, these meanings are often disputed. Ontology, in its long-established philosophical sense, seeks to identify the constituents of reality.
However, philosophers from different traditions have alternative explanations of what constitutes reality and how it may be known and categorized. Are 'meanings' in the world or in people's heads? Is ontology about reality or concepts? Is it meaningful to talk about multiple ontologies or is there only the (big O) Ontology? Are ontologies strictly about (physical) reality (truth) or can they include 'mere' beliefs? Do ontologies consist of words or thoughts, or both? Where does 'ontology' finish and 'epistemology' start, or do they overlap? (Mark et al., 2003).

In its more recent information systems sense, an ontology is a logical theory that provides "an explicit, partial account of a conceptualization" (Guarino and Giaretta, 1995, p. 32). The ontology stipulates the taxonomy that forms the basis of a data dictionary used in building an information system.

Geographic entities and their categories may differ in kind from entities and categories in other domains. Geographic entities are not simply large versions of their counterparts at smaller scales: "geographic objects are not merely located in space, but are tied intrinsically to space in a manner that implies that they inherit from space many of its structural (mereological, topological, geometrical) properties" (Smith and Mark, 1998, p. 592).

For an ethnographer, it is important that the meanings adopted for the terms 'ontology' and 'epistemology' are coherent with the worldview of a particular speech community (cultural group) (Watson-Verran and Turnbull, 1995).

Ethnophysiography: The authors have been working on these issues in the context of ethnophysiography (Mark and Turk, 2003) as part of a study asking questions such as: Do all people, and all peoples, think about the landscape and its elements in more or less the same way? Or are there significant cross-cultural and cross-linguistic differences in the ways human beings perceive and cognize their environments at geographic or landscape scales?
How important is the nature of the particular landscape that provides the environment for a speech community, and especially the range of forms in that landscape, in the development of the category system and lexicon used by a speech community? How influential are the culture and lifestyle of the people, that is, the nature of human interaction with the landscape? How influential is the nature of the language itself, its grammar and lexicon? A more detailed treatment of ethnophysiography is provided in our companion paper for this workshop.

In our development of the basis of ethnophysiography, and especially in our interaction with other researchers in the field of SIS interoperability, it has become clear that scholarly communication is being inhibited by the lack of a commonly agreed terminological basis for the interdisciplinary discussion of ontology in the context of geospatial information. Redefining existing terms and convincing others to adopt such definitions seems futile. This need led us to the development of a new framework of 'Conceptualizations of a Domain (COADs)'. By "domain" we mean some particular aspects of reality which we are interested in studying (e.g. for incorporation into a SIS), such as the landforms within a particular geographic region.

We have deliberately defined a fairly large number of COADs, with somewhat complicated names, in an attempt to make each one relatively unambiguous, potentially allowing debates to focus on substance rather than on terminology. The COADs approach is suggested as a possible mechanism for facilitating more effective discussion of issues fundamental to the development of enhanced interoperability of SIS, in a social as well as a technical context.
COADs: In order to overcome the difficulties associated with alternative definitions of the term 'ontology', we suggest a new way of categorizing the ways of thinking about a domain of interest, via a set of COADs. These COADs can be considered within a hierarchy, related to the way they are developed and used within a speech community. It is hoped that this ontological framework will assist in the discussion of the way information is generated and stored by people as well as computers, and hence facilitate development of techniques of SIS interoperability.

We propose a framework which includes all conceptualizations (or definitions and sources of categories), as understood by people as part of common sense knowledge, and as held by scientists and knowledge workers of various types. We postulate various levels of existence or awareness or knowledge of the world, held by various agents, i.e. there are different ontologies and epistemologies. Categories (types or classes) might be in the world, in the mind, in the culture, in the language, or in an information system.

The framework of COADs has been developed to aid in discussion of how a speech community may categorize key aspects of their environment and how an information system developer may work with them in order to develop a SIS which faithfully captures their worldview. Examining the COADs may help to reveal gaps or redundancies in a knowledge schema (or a data dictionary). The initial set of COADs is as follows:

O-COAD: Categories in the World
The intent here is for this level of the hierarchy to capture the philosophical idea that there is an ontology of what actually exists in the world, completely free of any human thought processes: the categories inherent in what exists (with special reference, in the research context, to geographical landforms). It corresponds with what some authors have termed the (big O) Ontology.
Prime examples would be so-called natural kinds, if they exist (see Keil, 1989). Realist perspectives place a strong emphasis on O-COADs whereas some competing philosophical stances would deny their existence.

X-COAD: Categories in the Human Environment
These categorizations would arise directly and sub-consciously in humans through their perceptual interactions with physical reality (à la Gibson, 1979). They relate to the affordances provided by aspects of reality, in the context of the way human bodies and minds operate.

P-COAD: Categories in one Mind
Here we are considering the way any particular person categorizes the domain of interest: the categories they have developed through their lived experience, including their interaction with other individuals.

G-COAD: Shared Categories within a Group
Members of a group will share some common conceptualizations of any given domain of interest, through the commonality of their direct interaction with that domain and through their interactions with each other, especially their communication through spoken and/or written language.

U-COAD: Human categorization Universals
It is considered at least theoretically possible that the super-set of all groups of humans may share some categorizations of any particular domain of interest, so-called "universals". These may only exist at a fundamental level, as "primes" (Wierzbicka, 1996), or possibly not at all.

L-COAD: Categories Embodied in the Words of a Language
To the extent that members of a language group share conceptualizations of some domain, these may be embodied in words within their language.
The formation of the language can be expected to have been driven by aspects of the world experienced by its first speakers (as well as by chance), and the conceptualizations of later speakers will be determined, to some extent, by the characteristics of the language itself: its grammar and lexicon.

E-COAD: Categories resulting from an Ethnographic study
In order to understand the categories shared by a language group, an ethnographer must study the language itself and observe and record the ways that members of that group use words in their everyday communication and interaction with the domain of interest (e.g. the way they describe landscape features to each other as an aid to navigation through their 'country'). The results of such studies must be recorded in an explicit and formal manner if they are to be utilized effectively for inter-cultural information exchange or the development of a SIS.

IS-COAD: Categories entailed in the data dictionary of an Information System
If a SIS is to be most effective for a particular language group, it must embody (to the greatest extent possible) the categorizations of the domain of interest which are held in common by that group, as revealed through an ethnographic study (as part of a participative information system development methodology). This would be achieved through its data dictionary and accompanying documentation explaining the meaning of the terms used and the user activities that are supported by the system functionality.

Figure 1 - Some relationships among COADs

This initial set of COADs is in no sense claimed to be complete, or even necessarily the best formulation. Rather, it is offered as a first step in developing an understanding of how this approach may overcome some of the difficulties of interdisciplinary discussion of key aspects of ontology.
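
As one way of making the initial set listed above concrete, the following minimal Python sketch records the eight COADs as an enumeration and shows how a category might be tracked across them. It is an illustration only: the names `Coad`, `CategoryClaim` and `untraced`, and the default choice of which COADs a category "should" appear in, are assumptions made for this example and are not part of the framework itself.

```python
from dataclasses import dataclass
from enum import Enum


class Coad(Enum):
    """The initial set of COADs, keyed by the prefix used in the text."""
    O = "Categories in the World"
    X = "Categories in the Human Environment"
    P = "Categories in one Mind"
    G = "Shared Categories within a Group"
    U = "Human categorization Universals"
    L = "Categories Embodied in the Words of a Language"
    E = "Categories resulting from an Ethnographic study"
    IS = "Categories entailed in the data dictionary of an Information System"

    @property
    def label(self) -> str:
        """Name as written in the paper, e.g. 'IS-COAD'."""
        return f"{self.name}-COAD"


@dataclass(frozen=True)
class CategoryClaim:
    """Hypothetical record: a category asserted at one COAD level."""
    category: str
    coad: Coad
    holder: str  # who or what holds the category (person, group, system)


def untraced(claims, required=(Coad.L, Coad.E, Coad.IS)):
    """Map each category to the required COADs for which no claim exists.

    The default `required` tuple is an assumption for illustration only.
    Categories with claims at every required COAD are omitted.
    """
    seen = {}
    for claim in claims:
        seen.setdefault(claim.category, set()).add(claim.coad)
    gaps = {}
    for category, levels in seen.items():
        missing = [c for c in required if c not in levels]
        if missing:
            gaps[category] = missing
    return gaps


# "yinda" is a landscape term taken from the paper's later example.
claims = [
    CategoryClaim("yinda", Coad.L, "Yindjibarndi language"),
    CategoryClaim("yinda", Coad.E, "ethnographic study"),
]
for category, missing in untraced(claims).items():
    print(category, [m.label for m in missing])
```

Here the term has been recorded in the language and in an ethnographic study but not yet in a data dictionary, so the sketch reports the IS-COAD as a gap. A check of this kind is one simple reading of how examining the COADs might reveal gaps in a data dictionary.
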
Some interrelations between the COADs are depicted in Figure 1. G-COADs and U-COADs cannot be studied directly but need to be inferred from P-COADs and L-COADs, preferably via E-COADs.

The COADs refer to categorization of aspects of the domain of interest in terms of (inherent) properties and/or attributes (assigned by an agent, either because of, or irrespective of, one or more of their properties). There is a continuum of the extent to which things in the world categorize themselves, and how much we intervene to categorize them, because of limitations in their kindedness and/or the perceptibility of their kindedness and/or affordances. High 'natural' kindedness would be expected to produce non-collusive unanimity of categorization (perhaps achieving a U-COAD). Our process of applying categories is influenced by language and other aspects of culture. Words play an important role in concept formation, learning, recognition and communication.

The "tiers of ontology" proposed by Frank (2001) may be seen as an alternative approach to the use of the COADs framework. The tiers are as follows: Tier 0: human-independent reality; Tier 1: observations of physical world; Tier 2: objects with properties; Tier 3: social reality; Tier 4: subjective knowledge. We believe that the COADs framework is more comprehensive and explicit about where the conceptualizations come from and reside. It allows the process of SIS development to be more adequately tracked and described.

Example Use of COADs: The Federal Court handed down a decision on the 3rd July, 2003 recognizing the native title rights of the Ngarluma and Yindjibarndi peoples of Western Australia's west Pilbara region. "The Court found that the Ngarluma and Yindjibarndi peoples hold non-exclusive native title rights over parts of their claim area.
These rights relate to access; ritual and ceremony; camping, hunting and foraging; fishing in inter-tidal or river waters; taking of bush medicine, bush tucker, fauna, flora and water; taking of ochre; cooking and lighting of fires for cooking purposes, and the protecting and caring for sites and objects" (NNTT, 2003).

This recent decision is expected to lead to an enhanced joint management agreement (Walsh and Mitchell, 2002) covering land use and environmental issues for the Millstream National Park, including very significant sacred sites in the Jindawarrina area. Such an agreement would be facilitated by interoperable companion SIS (or complementary layers within one SIS), one using English terminology and the AUSLIG feature codes (AUSLIG, 2002) and the other using Yindjibarndi terms for landscape features (Mark and Turk, 2003).

In order to be able to discuss the development of such a (composite) cross-cultural SIS, it is necessary to refer to the requirements for feature categorizations in the data dictionaries (IS-COAD) in a formal manner. One requirement is that the SIS incorporates the way landscape is thought about in the Yindjibarndi worldview (G-COAD). This could utilize the landform terminology (L-COAD), informed by the current ethnographic research study (Mark and Turk, 2003) (E-COAD).

For instance, in discussing protocols for community consultation regarding clearing of non-native water weeds, it would be important to establish an appropriate mapping between the Yindjibarndi term "yinda" (permanent pool, complete with "warlu" spirit) and the AUSLIG feature codes for lake and waterhole (Mark and Turk, 2003). It may be possible to break down the (physical) characteristics of such landscape features into primitives (size, shape, height/depth, etc.) to establish the required terminological mapping, or the differences in conceptualizations (G-COADs) may be so complex as to require that each feature be individually (double) classified in the field.
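
The two options just described can be sketched in a few lines of Python. This is a hedged illustration rather than a proposed design: the `Feature` record, the `CANDIDATE_MAPPING` table and the `needs_review` function are hypothetical names, the strings "lake" and "waterhole" stand in for the AUSLIG feature codes (whose actual code values are not reproduced here), and the "watercourse" entry is an invented mismatch included only to exercise the check.

```python
from dataclasses import dataclass

# Candidate correspondences between a Yindjibarndi term and English feature
# categories. "lake" and "waterhole" are the category names from the text,
# used as placeholders rather than actual AUSLIG feature code values.
CANDIDATE_MAPPING = {
    "yinda": {"lake", "waterhole"},
}


@dataclass(frozen=True)
class Feature:
    """Hypothetical record for a landscape feature classified twice."""
    feature_id: str
    yindjibarndi_term: str   # classification in Yindjibarndi terms
    english_category: str    # classification in the English scheme


def needs_review(features, mapping=CANDIDATE_MAPPING):
    """Return features whose pair of classifications is outside the mapping."""
    return [
        f for f in features
        if f.english_category not in mapping.get(f.yindjibarndi_term, set())
    ]


features = [
    Feature("f1", "yinda", "waterhole"),
    Feature("f2", "yinda", "watercourse"),  # invented mismatch, illustration only
]
print([f.feature_id for f in needs_review(features)])
```

In this sketch the mapping table plays the role of a terminological mapping agreed in advance (however it is derived), while the per-feature records correspond to classifying each feature twice, once in each scheme. Where the table cannot be made reliable, the records themselves carry the correspondence, which is the second option: individual double classification in the field.
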
Although this latter alternative may seem labour-intensive, it would have the added advantage of being able to establish the proper name for each significant landscape feature (e.g. "yinda") and to ensure an appropriate level of Yindjibarndi confidence in, and "ownership" of, the resulting SIS. Especially if the SIS is being developed to facilitate cross-cultural negotiations, its development must involve social as well as technical processes.

Conclusions: This formulation of COADs is offered for discussion by participants in this workshop in order to test whether it has utility in facilitating research and development of techniques to enhance interoperability of SIS. It has been developed within a particular research project, focusing on what the authors have termed ethnophysiography (Mark and Turk, 2003). We hope that it will aid discussion of the need for a way of dealing with terminology related to ontology that facilitates interdisciplinary collaboration and is not restricted to any particular philosophical tradition or worldview.

Acknowledgment: This material is part of a project "Geographic Categories: An Ontological Investigation" supported by the U.S. National Science Foundation under Grant No. BCS-9975557. Support of the National Science Foundation is gratefully acknowledged.

References:

AUSLIG, 2002. Feature Codes used by the Gazetteer of Australia. [http://www.auslig.gov.au/mapping/names/featurecodes.htm (accessed Dec. 2002)].

Frank, A. U., 2001. Tiers of ontology and consistency constraints in geographical information systems. International Journal of Geographical Information Science, 15 (7), 667-678.

Gaffney, V., Stancic, Z., and Watson, H., 1996. Moving from catchments to cognition: Tentative steps towards a larger archaeological context for GIS. In: Aldenderfer, M. and Maschner, H. (eds), 1996.
Anthropology, Space, and Geographic Information Systems. Oxford University Press, New York, pp. 132-154.

Gibson, J. J., 1979. The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.

Guarino, N., and Giaretta, P., 1995. Ontologies and Knowledge Bases: Towards a Terminological Clarification. In: N. J. I. Mars (ed.), Towards Very Large Knowledge Bases, IOS Press.

Keil, F. C., 1989. Concepts, Kinds, and Cognitive Development. Cambridge, MA: The MIT Press.

Kuhn, W., 2001. Ontologies in support of activities in geographical space. International Journal of Geographical Information Science, 15 (7), 591-612.

Mark, D. M., Kuhn, W., Smith, B., and Turk, A. G., 2003. Ontology, Natural Language, and Information Systems: Implications of Cross-Linguistic Studies of Geographic Terms. In: M. Gould, R. Laurini, and S. Coulondre (eds.), AGILE 2003, 6th AGILE Conference on Geographic Information Science. Collection des sciences appliquees de l'INSA de Lyon, pp. 45-50, Presses Polytechniques et Universitaires Romandes, Lyon, France.

Mark, D. M., Smith, B., Egenhofer, M., and Hirtle, S. C., in press. Ontological Foundations for Geographic Information Science. In: McMaster, R. B., and Usery, L. (eds.), Research Challenges in Geographic Information Science (UCGIS Research Challenges), New York: John Wiley & Sons, accepted, in press.

Mark, D. M. and Turk, A. G., 2003. Landscape Categories in Yindjibarndi: Ontology, Environment, and Language. In the Proceedings of COSIT2003.

NNTT, 2003. Australian National Native Title Tribunal - report on recent decisions. [http://www.nntt.gov.au/media/1057275393_2456.html (accessed 4th Aug 2003)].

Remenyi, D., Sherwood-Smith, M. and White, T., 1997. Achieving Maximum Value From Information Systems: A Process Approach. Chapter 2. John Wiley and Sons, New York.

Smith, B., and Mark, D. M., 1998. Ontology and Geographic Kinds. In: T. K.
Poiker and N. Chrisman (eds.), Proceedings, 8th International Symposium on Spatial Data Handling (SDH'98), Vancouver: International Geographical Union, 1998, 308-320.

Smith, B., and Mark, D. M., 2003. Do Mountains Exist? Towards an Ontology of Landforms. Environment and Planning B, 30 (3), 411-427.

Turk, A. G. and Trees, K. A., 2000. Facilitating Community Processes Through Culturally Appropriate Informatics: An Australian Indigenous Community Information System Case Study. In: Gurstein, M. (ed.), Community Informatics: Enabling Communities with Information and Communication Technologies, Idea Group Publishing, 339-358.

Walsh, F. and Mitchell, P. (eds.), 2002. Planning for Country: Cross-cultural Approaches to Decision-making on Aboriginal Lands. Central Land Council, Alice Springs, Australia: Jukurrpa Books.

Watson-Verran, H. and Turnbull, D., 1995. Science and Other Indigenous Knowledge Systems. In: Jasanoff, S., Markle, G. E., Petersen, J. C. and Pinch, T. (eds.), Handbook of Science and Technology Studies. Sage Publications, pp. 115-139.

Wierzbicka, A., 1996. Semantics: Primes and Universals. Oxford, England: Oxford University Press.

Wilson, D., 1998. Ontological Pluralism and Information Systems Research. In: Proceedings of PAIS II, the Second Symposium and Workshop on Philosophical Aspects of Information Systems: Methodology, Theory, Practice and Critique, University of the West of England, Bristol, UK, July 27-29, 1998.

Winter, S., 2001. Ontology: buzzword or paradigm shift in GI science? International Journal of Geographical Information Science, 15 (7), 587-590.