A DYNAMIC ARCHITECTURE FOR DISTRIBUTING GEOGRAPHIC INFORMATION SERVICES ON THE INTERNET

by
TSOU, MING-HSIANG
B.S., National Taiwan University, 1991
M.A., State University of New York at Buffalo, 1996

A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirement for the degree of Doctor of Philosophy
Department of Geography
2001

This thesis entitled:
A Dynamic Architecture for Distributing Geographic Information Services on the Internet
for the Doctor of Philosophy degree by Tsou, Ming-Hsiang
has been approved for the Department of Geography by

____________________________
Barbara P. Buttenfield

___________________________
Gary L. Gaile

Date _____________________

The final copy of this thesis has been examined by the signators, and we find that both the content and the form meet acceptable presentation standards of scholarly work in the above mentioned discipline.

ABSTRACT

Tsou, Ming-Hsiang (Ph.D., Geography)
A Dynamic Architecture for Distributing Geographic Information Services on the Internet
Thesis directed by Associate Professor Barbara P. Buttenfield

The need for global access to and decentralized management of geographic information is pushing the GIS community to deploy a distributed GIService architecture on the Internet. Different from other types of information services, distributing geographic information on the Internet requires unique software frameworks and dynamic communication approaches. However, current GIS research mainly focuses on ad hoc technique-centered solutions without considering the uniqueness of geospatial information and the integration of heterogeneous GIServices. This research presents a dynamic architecture, where the architecture of GIServices is dynamically constructed by temporarily connecting or migrating data objects and GIS components across the networks.
The detailed design of the GIServices architecture is illustrated by the Unified Modeling Language and emphasizes a distributed computing perspective. The dynamic architecture of distributed GIServices is deployed by defining appropriate relationships for distributed GIS components and geospatial data objects, establishing an operational metadata scheme for geospatial data objects and GIS components, and proposing an agent-based mechanism for the integration of distributed GIServices. The results of this research will help the GIS community adopt a long-term, technology-independent strategy in developing distributed GIServices. They will clarify the operational relationships between client, server, geodata objects and GIS operations, and will justify the roles of metadata and software agents in distributed GIServices. By integrating GIS components and data objects dynamically across networks, computing resources may be utilized more efficiently on the Internet.

ACKNOWLEDGMENT

I would like to take this opportunity to thank my advisor, Dr. Barbara P. Buttenfield, who inspired me to start the research on distributed GIServices, gave me her valuable comments, and has guided me through the process of writing this dissertation. My thanks to Dr. Buttenfield are boundless. She helped me with academic, financial, and spiritual support, especially in the last year of my Ph.D. study, when my first child was born with medical problems. I would also like to thank Dr. Michael F. Goodchild, whose comments were very important to me in revising and improving this dissertation. Thanks to Dr. Clayton Lewis from the Department of Computer Science, who helped me revise my research focus and build a more feasible research framework. My thanks also go to Dr. Gary L. Gaile and Dr. Rene F. Reitsma, who gave me their valuable comments, support, and advice from a geographer's and a spatial scientist's perspective.
My thanks also go to my parents in Taiwan, who have always been there for me throughout all these years, encouraging me to explore the world, and making me become a geographer. And to my wife, Chun-Yi, who did everything she could to support me all the time. Her love and care are essential to the success of my Ph.D. study. Thanks to my lovely daughter, ShuAn, who taught me the meanings of patience and love. And finally, to all my friends, colleagues, and classmates in Boulder, thank you for your support and help throughout my doctoral study.

TABLE OF CONTENTS

CHAPTER

1. INTRODUCTION
   1.1 The Uniqueness of On-line Geographic Information
   1.2 Definitions of GIServices Terminology
   1.3 Problem Statement
       1.3.1 Management Perspective
       1.3.2 User Perspective
       1.3.3 Implementation Perspective
   1.4 Chapter Summary

2. OVERVIEW OF DISTRIBUTED COMPUTING
   2.1 The Development of Network Technology
   2.2 History of Distributed Systems
   2.3 History of Open Systems
   2.4 Distributed Component Frameworks
       2.4.1 Distributed Component Object Model (DCOM)
             2.4.1.1 DCOM Development History
             2.4.1.2 DCOM Architecture and Interfaces
             2.4.1.3 Advantages and Disadvantages
       2.4.2 Common Object Request Broker Architecture (CORBA)
             2.4.2.1 CORBA Development History
             2.4.2.2 CORBA Architecture and Interfaces
             2.4.2.3 Advantages and Disadvantages
       2.4.3 Java Platform
             2.4.3.1 Java Development History
             2.4.3.2 Java Language and Architecture
             2.4.3.3 Advantages and Disadvantages
   2.5 Chapter Summary

3. OVERVIEW OF DISTRIBUTED GISERVICES
   3.1 The History of GIServices
       3.1.1 The Xerox PARC Map Viewer
       3.1.2 GRASSLinks
       3.1.3 Alexandria Digital Library Project
   3.2 Standards for Distributed GIServices
       3.2.1 The OpenGIS Specification
             3.2.1.1 The OpenGIS Abstract Specification
             3.2.1.2 OpenGIS Implementation Specifications
             3.2.1.3 The OpenGIS Standard in Practice
       3.2.2 The ISO 15046 Standard and ISO/TC 211
             3.2.2.1 The Reference Model for ISO 15046 Standard
             3.2.2.2 The Geospatial Data Model of ISO 15046 Standard
             3.2.2.3 The ISO Standard in Practice
       3.2.3 Comparison between OGC and ISO/TC 211
   3.3 Metadata Development
       3.3.1 The ISO Standard for GIS Metadata
       3.3.2 Metadata Conformance
   3.4 Chapter Summary

4. RESEARCH DESIGN
   4.1 Dynamic Integration for Distributed Components and Data Objects
       4.1.1 The Design of Dynamic GIService Architecture
       4.1.2 The Network Strategies for Constructing Dynamic GIServices
             4.1.2.1 Two Scenarios for Distributed GIS Components Access
             4.1.2.2 Two Scenarios for Distributed Geodata Object Access
       4.1.3 Categorizing GIS Components by a Task-oriented Approach
   4.2 An Object-oriented, Operational Metadata Scheme
       4.2.1 The Design of Operational Metadata for Geodata Objects
       4.2.2 The Design of GIS Component Metadata
   4.3 An Agent-based Communication Mechanism
       4.3.1 The Roles of Software Agents
             4.3.1.1 Information finder/filter role
             4.3.1.2 Information interpreter role
             4.3.1.3 Decision maker role
       4.3.2 The Design of Software Agents
             4.3.2.1 Agent Mobility
             4.3.2.2 Agent Functionality
             4.3.2.3 Agent Security
             4.3.2.4 The Design of Agent Container
   4.4 An Integrated Architecture for Distributed GIServices: GIS Nodes
   4.5 A Walk-through Example for a Dynamic GIService Architecture
       4.5.1 Scenario Description
       4.5.2 GIS Operation Procedures
       4.5.3 The Algorithm for the Location-allocation Decision Making
   4.6 Chapter Summary

5. SOFTWARE EXAMPLES AND USER SCENARIOS
   5.1 Software Examples
       5.1.1 The Plug-ins for Web Browsers
       5.1.2 The OpenGIS Web Map Server Implementation Interface Specifications
   5.2 Scenario One: Travel Plan (On-line Mapping)
       5.2.1 Scenario Description
       5.2.2 Traditional GISystems Solution
       5.2.3 OpenGIS Solution
       5.2.4 Distributed GIService Solution
       5.2.5 The Deployment of the Dynamic GIService Architecture
             5.2.5.1 The Arrangement of Distributed GIS Components and Geodata Objects
             5.2.5.2 Required Operational Metadata Contents
             5.2.5.3 Required Agents' Responsibilities
       5.2.6 Discussion
   5.3 Scenario Two: Wal-Mart Site Selection (Spatial Analysis)
       5.3.1 Scenario Description
       5.3.2 Traditional GISystems Solution
       5.3.3 OpenGIS Solution
       5.3.4 Distributed GIServices Solution
       5.3.5 The Deployment of the Dynamic GIService Architecture
             5.3.5.1 The Arrangement of Distributed GIS Components and Geodata Objects
             5.3.5.2 Required Operational Metadata Contents
             5.3.5.3 Required Agents' Responsibilities
       5.3.6 Discussion
   5.4 Scenario Three: GPS Navigation (Cross-platform Application)
       5.4.1 Scenario Description
       5.4.2 Traditional GISystem Solution
       5.4.3 OpenGIS Solution
       5.4.4 Distributed GIService Solution
       5.4.5 The Deployment of the Dynamic GIService Architecture
             5.4.5.1 The Arrangement of Distributed GIS Components and Geodata Objects
             5.4.5.2 Required Operational Metadata Contents
             5.4.5.3 Required Agents' Responsibilities
       5.4.6 Discussion
   5.5 Chapter Summary

6. SUMMARY AND IMPLICATIONS
   6.1 Overview of the Research
   6.2 Implications
       6.2.1 Service-oriented Applications
       6.2.2 Value-added Information Processes
       6.2.3 The Exponential Growth of GIS Network Values
   6.3 Future Impact
       6.3.1 Future Impact on the GIS Industry
       6.3.2 Future Impact on Geographers
       6.3.3 Future Impact on the Public
             6.3.3.1 Positive Aspects
             6.3.3.2 Negative Impact
   6.4 Future Work
       6.4.1 The Possible Implementation Tools
       6.4.2 The Organization and Hierarchy of GIS Networks
       6.4.3 The Creation of Intelligent Agents
   6.5 The Alternative Futures
       6.5.1 The First Path: Centralized GISystems
       6.5.2 The Second Path: Private, Vendor-specialized GIServices
   6.6 Conclusion

BIBLIOGRAPHY

TABLES

Table 2-1. The major development stages of distributed systems.
Table 2-2. The development history of DCOM and its related technologies.
Table 3-1. The contents of the OpenGIS Abstract Specification.
Table 3-2. Areas of overlap between ISO/TC 211 and OGC.
Table 3-3. The process comparison of ISO/TC211.
Table 3-4. The process comparison of OpenGIS.
Table 3-5. Comparison between ISO/TC211 and OpenGIS.
Table 5-1. The Map Request Interfaces.
Table 5-2. The Feature Request Interfaces.
Table 5-3. The Capabilities Request Interfaces.

FIGURES

Figure 1-1. Three alternatives for GIS architecture.
Figure 2-1. The Gopher information client on a Telnet application window.
Figure 2-2. The Web page of the Geography Department, the University of Colorado.
Figure 2-3. An example of compound documents in Microsoft Word97.
Figure 2-4. The relationships between OLE, ActiveX, COM, and DCOM.
Figure 2-5. The architecture of DCOM.
Figure 2-6. The interface example in a map object under a DCOM framework.
Figure 2-7. OMA Reference Model interface categories (Vinoski, 1997).
Figure 2-8. The CORBA architecture (OMG, 1998).
Figure 2-9. The Java Platform architecture (Harmon and Watson, 1998, p. 70).
Figure 3-1. The Xerox Map Viewer.
Figure 3-2. GRASSLinks and its GIS operations.
Figure 3-3. The HTML/CGI version of the Alexandria Digital Library Project.
Figure 3-4. The Open GIS Technical Reference Model (OGC, topic 12, 1998).
Figure 3-5. The Geospatial Domain Services (OGC, topic 12, 1998).
Figure 3-6. Integration of geographic information and information technology in ISO 15046 Standard (ISO/TC211/WG 1, 1998a).
Figure 3-7. High-level view of the Domain Reference Model.
Figure 3-8. The Architectural Reference Model.
Figure 3-9. One example of metadata records in the Alexandria Digital Library project.
Figure 3-10. Details of ISO/TC 211 metadata relationships.
Figure 4-1. LEGO-like distributed GIS components.
Figure 4-2. The independent operations from software environments and computer platforms.
Figure 4-3. Dynamic construction of distributed GIServices by migrating and connecting geodata objects and GIS components.
Figure 4-4. Build GIServices "on-the-fly".
Figure 4-5. The dynamic architecture of distributed GIServices in UML.
Figure 4-6. Two types of data connection for geodata objects.
Figure 4-7. Two types of GIS components invocation for distributed GIServices.
Figure 4-8. Two scenarios for GIS component access: thin client and thick client.
Figure 4-9. Two scenarios for geodata access: data migration and remote data access.
Figure 4-10. The relationships between six GIS tasks and three actors.
Figure 4-11. Six representative GIS components in UML.
Figure 4-12. Three types of GIS component classification.
Figure 4-13. Two metadata schemes (relational and object-oriented).
Figure 4-14. The content of encapsulated metadata for geodata objects.
Figure 4-15. The contents and functions of GIS component metadata.
Figure 4-16. The metadata class relationship and hierarchy in UML.
Figure 4-17. The information finder/filter.
Figure 4-18. The information interpreter.
Figure 4-22. Collaborations among component agents, geodata agents, and machine agents.
Figure 4-23. The agent relationships and hierarchy in UML.
Figure 4-24. The design of agent containers.
Figure 4-25. A GIS node under a distributed GIServices framework.
Figure 4-26. The collaborations of GIS nodes in three network levels.
Figure 4-27. Searching for requested geodata objects and GIS components.
Figure 4-28. The decision-making of the relocation for GIS components and data objects.
Figure 4-29. The dynamic download of [Map Display] from GIS node#B.
Figure 4-30. The dynamic download of [Colorado Roads] from GIS node#C.
Figure 5-1. The MIME configuration on a Web server.
Figure 5-2. The ArcIMS Java Viewer installation.
Figure 5-3. The four processing stages in a Web Map Server.
Figure 5-4. The three types of client models for Web Map Servers.
Figure 5-5. The picture case.
Figure 5-6. The graphic element case.
Figure 5-7. The data case.
Figure 5-8. The dynamic architecture for Web Map Services.
Figure 5-9. The travel plan scenario.
Figure 5-10. The travel plan component.
Figure 5-11. The dynamic architecture of travel plans scenario.
Figure 5-12. The Wal-Mart site selection scenario.
Figure 5-13. The [procedure-A] layer in the Wal-Mart Site location.
Figure 5-14. The buffer procedure in the Wal-Mart Site location.
Figure 5-15. The shape fitting analysis for the Wal-Mart site selection.
Figure 5-16. The roaming [Procedure-A] operation in the Wal-Mart site selection.
Figure 5-17. The dynamic architecture of the Wal-Mart Site selection.
Figure 5-18. The new metadata generated by an overlay operation.
Figure 5-19. The GPS navigation scenario.
Figure 5-20. Compaq Palm-PC with Trimble CrossCheck AMPS Cellular mobile unit.
Figure 5-21. The dynamic architecture of GPS navigation.
Figure 6-1. The new values generated from the usage of GIS components.
Figure 6-2. The value-added process in distributed databases.
Figure 6-3. The life cycle of information value in traditional GISystems.
Figure 6-4. The life cycle of information value in distributed GIServices.
Figure 6-5. The life cycle of information uncertainty in distributed GIServices.
Figure 6-6. The exponential growth of distributed GIServices network.
Figure 6-7. The integrated GIService network for different users.

CHAPTER 1. INTRODUCTION

The development of Geographic Information Systems (GIS) is highly influenced by the progress of information technology (IT). The motivations for adopting new technologies are derived from the essential needs of GIS users and the GIS community. Due to the popular use of the Internet and the dramatic progress of telecommunications technology, the paradigm of GIS is shifting into a new direction, Geographic Information Services (GIServices).

Traditional Geographic Information Systems (GISystems) provide several capabilities to handle georeferenced data, including data input, storage, retrieval, management, manipulation, analysis, and output (Aronoff, 1989). However, the architecture of traditional GISystems is confined inside a single box. With a closed and centralized architecture, legacy GISystems are no longer appropriate for modern distributed, heterogeneous network environments. With the popularity of the Internet and Intranet technologies, centralized GISystems will be replaced by dynamic and distributed GIServices. "Information services include tools for data management, browsing, access, cleaning, processing, interpretation, presentation, and exchange" (Buttenfield, 1998, p.161). Legacy GISystems, as isolated islands, will disappear in the future due to their lack of interoperability, reusability, and flexibility.
GIServices focus on open, distributed, task-centered services, which will broaden the usage of geographic information into a wide range of on-line geospatial applications and services, including digital libraries (NSF, 1994), digital governments (NSF, 1998), on-line mapping, data clearinghouses, real-time spatial decision support tools, distance learning modules, and so on.

From an information service perspective, both GISystems and GIServices are value-adding processors that add meaningful value to data (Bracken and Webster, 1990). The main goal of information services is to provide users with information in the right form, which requires selection and abstraction (Shuey, 1989). In the GIS community, many research projects in academia and industry focus on the need to provide geographic information services to the public and researchers (Li, 1996; Zhang and Lin, 1996; Plewe, 1997; Buttenfield, 1997). For example, the recent development of digital libraries provides library services to dispersed populations (Goodchild, 1997), and the prototype of on-line GIS courses provides a virtual GIS classroom for distance learning (Buttenfield and Tsou, 1999). In general, the long-term goal of geographic information services is to facilitate the synergy of the GIS community by sharing geographical information, spatial analysis methods, and users' experiences and knowledge. On-line, distributed GIServices will encourage multidisciplinary cooperation between the GIS community and other communities, including library information science, computer science, telecommunications, education, civil engineering, etc.

To provide on-line geographic information services effectively in an open, distributed network environment, a new paradigm of GIS architecture must be established and adopted. The architecture of distributed GIServices should be platform-independent and application-independent.
It should provide flexible and distributed geographic information services on the Internet, without the constraints of computer hardware and operating systems. Figure 1-1 shows three alternatives for GIS architecture.

[Figure 1-1. Three alternatives for GIS architecture: Traditional GISystems, Client/Server GISystems, and Distributed GIServices.]

Traditional GISystems are closed, centralized systems, incorporating interfaces, programs and data. Each system is platform-dependent and application-dependent. Migrating traditional GISystems into different operating systems or platforms is difficult. Different GIS applications may require different GIS packages and architecture design. Every element is embedded inside traditional GISystems and cannot be separated from the rest of the architecture.

Client/Server GISystems are based on generic client/server architecture in network design. The client-side components are separated from server-side components (databases and programs). Client/Server architecture allows distributed clients to access a server remotely by using distributed computing techniques, such as Remote Procedure Calls (RPC), or by using database connectivity techniques, such as Open Database Connectivity (ODBC). The client-side components are usually platform-independent, requiring only an Internet browser to run. However, each client component can access only one specified server at a time. The software components on client machines and server machines are different and not interchangeable. Different geographic information servers come with different client-server connection frameworks, which cannot be shared.

Distributed GIServices are built upon a more advanced architecture. The goal of this research is to design an appropriate architecture for distributed GIServices.
The most significant difference is the adoption of distributed component technology, which can access and interact with multiple and heterogeneous systems and platforms, without the constraints of traditional client/server relationships. Under a distributed GIService architecture, there is no difference between a client and a server. Every GIS node embeds GIS programs and geodata, and can become a client or a server based on the task at hand. A client is defined as the requester of a service in a network. A server provides a service. A distributed GIService architecture permits dynamic combinations and linkages of geodata objects and GIS programs via networking.

The driving force in the transformation of GIS architecture is the availability of new technology, especially in network communication and computer programming. In order to provide a fully platform-independent scheme for Internet/Intranet applications, new techniques have been developed, such as the Java language and ActiveX controls. These tools can provide platform-independent applications via the Internet. Moreover, advanced network technologies, such as Common Object Request Broker Architecture (CORBA), Distributed Component Object Model (DCOM), and Java Beans with Remote Method Invocation (RMI), focus on the development of distributed component technology, which provides a comprehensive scheme for Distributed Computing Environments (DCE) (Orfali and Harkey, 1997). Distributed component technology will allow different clients to access heterogeneous servers dynamically, which is an essential feature of open and distributed GIServices. Hopefully, the future development of distributed component technology will provide a truly comprehensive distributed computing environment, where the network is the computer.

In the GIS community, many research projects have provided on-line geographic information services and applications.
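The idea that every GIS node can act as a client or a server depending on the task can be illustrated with a minimal Python sketch. It is not the implementation proposed in this dissertation: the class `GISNode`, its methods, and the sample component and data names are hypothetical, and in-process method calls stand in for real network communication.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A "GIS component" is modeled here as a plain function over a list of feature names.
Component = Callable[[List[str]], List[str]]


@dataclass
class GISNode:
    """Hypothetical GIS node that holds both GIS programs and geodata."""
    name: str
    components: Dict[str, Component] = field(default_factory=dict)
    geodata: Dict[str, List[str]] = field(default_factory=dict)

    # Server role: answer requests from other nodes.
    def provide_component(self, component_name: str) -> Optional[Component]:
        return self.components.get(component_name)

    def provide_geodata(self, data_name: str) -> Optional[List[str]]:
        return self.geodata.get(data_name)

    # Client role: request whatever is missing locally from peer nodes.
    def run_task(self, component_name: str, data_name: str,
                 peers: List["GISNode"]) -> List[str]:
        component = self.provide_component(component_name)
        data = self.provide_geodata(data_name)
        for peer in peers:
            if component is None:
                component = peer.provide_component(component_name)
            if data is None:
                data = peer.provide_geodata(data_name)
        if component is None or data is None:
            raise LookupError(
                f"{self.name} could not assemble {component_name} on {data_name}")
        return component(data)


def map_display(features: List[str]) -> List[str]:
    """Toy stand-in for a map display component."""
    return [f"draw {feature}" for feature in features]


node_a = GISNode("A")
node_b = GISNode("B", components={"map_display": map_display})
node_c = GISNode("C", geodata={"colorado_roads": ["I-25", "US-36"]})

# Node A acts as a client of both B and C.
map_for_a = node_a.run_task("map_display", "colorado_roads", peers=[node_b, node_c])
# Node B, a server in the call above, now acts as a client of C.
map_for_b = node_b.run_task("map_display", "colorado_roads", peers=[node_a, node_c])
```

In this sketch the same class plays both roles, so no node is permanently a client or a server, which is the property the text attributes to distributed GIServices.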
Most early popular on-line GIServices relied on Web browsers, HyperText Markup Language (HTML) pages, and Common Gateway Interface (CGI) programs. Examples include the Xerox Map Viewer (Putz, 1994) and GRASSLinks (Huse, 1995). Research projects, such as the Alexandria Digital Library Project (Buttenfield and Goodchild, 1996; Frew et al., 1998), adopted advanced Java technologies to explore more comprehensive services for on-line spatial queries, map browsing, and metadata indexing. On the other hand, few distributed component technologies have been adopted for on-line GIServices. Many projects and organizations focus on this issue, addressing the OpenGIS specification (Buehler and McKee, 1996; 1998), ISO/TC211 (Ostensen, 1995), component-oriented GIS (Li and Zhang, 1997), and Virtual Data Sets (Vckovski, 1998).

Geospatial information is unique and different from other types of information. However, current research in distributed GIServices and Internet mapping facilities mainly focuses on the standardization of data formats and ad hoc technique-centered solutions without considering the uniqueness of on-line geospatial information. Recently, the GIS community realized the potential problems of ad hoc technical solutions and has initiated a series of international conferences, such as the International Conference on Interoperating Geographic Information Systems (INTEROP) (Goodchild et al., 1999; Vckovski et al., 1999), to address the relevant issues on the uniqueness of on-line, distributed geographic information. The topics addressed at the INTEROP conferences included the current state of research in related disciplines concerning the technical, semantic, and organizational issues of GIS interoperation. Some research projects also included case studies of GIS interoperation, the management of on-line spatial databases, the interoperability of heterogeneous geospatial data formats, and evaluations of alternative approaches (Goodchild et al., 1999).
In adopting these new computer technologies, the GIS community needs to broaden the conceptual framework for the delivery of GIServices beyond database-centered approaches. Currently, many on-line GIS projects, such as the OpenGIS Specifications and the ISO Standards, emphasize standardized, interoperable data model design (Sondheim et al., 1999). What is not a focus, and should be, is distributed GIS processing and the dynamic integration of GIServices. In fact, GIS are both data-oriented and process-oriented. Without considering distributed GIS processing, data can be shared but processing remains centralized. Full interoperability without distributed, interchangeable GIS processes is impossible.

From a distributed GIS-processing perspective, the main problem in developing open and distributed GIServices is the lack of a high-level, dynamic architecture. Most current on-line geographic information services and research projects adopt an ad hoc, technology-centered approach to provide a Band-Aid-like solution. Once the technology changes, old on-line GISystems are difficult to migrate into a new framework. The legacy systems have to be abandoned if they are not compatible with new technologies. Without a high-level, upgradable architecture, distributed GIServices will not be adopted in the GIS community due to the short-term life cycle and rapid change in information technology. Therefore, the major goals of this dissertation are to design an upgradable architecture for GIServices, to provide an integrated framework for data interoperability and component (program) interoperability, and to facilitate the adoption of future network technologies for GIS applications.

This dissertation will establish a dynamic architecture for distributed GIServices from an operational, distributed processing perspective.
The term dynamic indicates that the architecture of GIServices is constructed temporarily by connecting or migrating data objects and GIS components across the networks. When users submit their GIS tasks or requests, the GIServices architecture will be re-constructed and the data and programs will be re-organized on the fly. The capability of dynamic construction will be achieved by using distributed components, an object-oriented metadata scheme, and agent-based communications.

The goal of this research is to facilitate the transition of Internet-based GIServices from an ad hoc, temporary solution to a long-term, logical, and sustainable strategy. The architecture of Internet-based, distributed GIServices is deployed by:

1. defining appropriate modular relationships for distributed GIS components and geospatial data objects with task-centered design and dynamic integration,
2. establishing an object-oriented, operational metadata scheme for both geospatial data objects and GIS components, and
3. designing an agent-based communication mechanism for the integration of distributed GIServices.

Before the deployment of the dynamic architecture, the following sections will discuss three important aspects regarding this research: the uniqueness of on-line geospatial information, the definition of GIServices terminology, and the problem statement.

1.1 The Uniqueness of On-line Geographic Information

Geographic information is one of the most complicated information types stored in computer systems. Currently, the GIS industry and many research projects focus on the development of on-line GIServices and distributed component frameworks (OGC, 1998; ISO/TC211, 2000). However, these research projects mainly focus on computing technology and network communications without considering the uniqueness of on-line geographic information.
Due to the uniqueness of geographic information, on-line distributed GIServices require a different solution from other types of information services, such as financial information services or medical information services. The following paragraphs will discuss the unique characteristics of geographic information, especially the challenge of how geographic information is represented and disseminated across the networks.

First of all, the contents of geographic information vary in resolution, scale, time, and domain. Thus, it is a challenge to integrate heterogeneous data formats or set up a standardized data transfer procedure for distributing geographic information across the networks. For example, a series of raster-based remotely sensed images with 40-meter resolution will require different protocols and transfer procedures compared to vector-based Digital Line Graphs (DLG) with double precision accuracy. Current GIS software solutions have difficulty in providing interoperable geospatial data sets and automatic data conversion/sharing tasks (Buehler and McKee, 1998). Geographers, with appropriate knowledge to deal with geographic information and spatial phenomena, need to formalize the different characteristics of geographic information and to help software engineers design a comprehensive GIServices architecture. With the help of geographers, the GIS industry may be able to provide more reasonable and feasible frameworks for on-line GIServices applications.

Another unique characteristic of geographic information is the power of GIS operation/overlay, which can process geographic information and generate new layers of information. For example, a road map will become more valuable for tourists if the data layer can be overlaid with points of interest (hotels, gas stations, parks, restaurants, etc.). Another example is the overlay of a population change map with available housing units to predict the potential needs for housing.
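The road map example can be made concrete with a small Python sketch of an overlay that derives a new layer from two existing ones. This is only an illustration under simplifying assumptions: coordinates are planar, the road is a polyline, and the function names, sample coordinates, and distance threshold are all hypothetical rather than taken from any GIS package.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Planar distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def overlay_points_on_road(road: List[Point],
                           points_of_interest: Dict[str, Point],
                           max_distance: float) -> List[dict]:
    """Build a new layer: points of interest lying within max_distance of the road."""
    new_layer = []
    for name, location in points_of_interest.items():
        nearest = min(point_segment_distance(location, a, b)
                      for a, b in zip(road, road[1:]))
        if nearest <= max_distance:
            new_layer.append({"name": name,
                              "location": location,
                              "distance_to_road": round(nearest, 3)})
    return new_layer


# Hypothetical input layers in arbitrary planar units.
road = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
points_of_interest = {"hotel": (2.0, 1.0),
                      "gas station": (10.5, 4.0),
                      "park": (5.0, 6.0)}

tourist_layer = overlay_points_on_road(road, points_of_interest, max_distance=1.5)
```

The resulting `tourist_layer` is a third data set that exists in neither input, which is the sense in which overlay generates new layers of information.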
These examples indicate that the value of geographic information will increase dramatically by providing GIS users with the capability of GIS operations and overlay procedures. However, current on-line GIS research mainly focuses on the display of geographic information without providing comprehensive on-line GIS operation tools. One of the major problems is the lack of appropriate mechanisms for exchanging or uploading GIS operations to servers. The current software architecture cannot provide GIS users with distributed GIS operations and modeling procedures (OGC, 1998). From a geographer's perspective, the study of on-line GIS should emphasize spatial analysis, modeling, and distributed GIS operations. The concepts of interoperable GIS programs, models, and analysis procedures need to be emphasized, with the participation of geographers, during the design process of distributed GIServices architecture. Although the idea of program interoperability has existed in computer science for a few decades, the development of GIS software rarely focuses on the actual implementation of interoperable GIS programs. It is clear to geographers that the design of distributed GIServices needs to provide a balance between data interoperability and program interoperability.

Finally, in order to achieve both data interoperability and program interoperability, the GIS community needs to revise the metadata scheme for geographic information and emphasize the operational meaning of metadata. Traditional GIS research only uses descriptive metadata for tracking data lineage or facilitating the correct use of data (Gardels, 1992). On the other hand, the metadata research in computer science emphasizes machine-readable metadata for storing, searching, and integrating software components (Orfali et al., 1996). The research of distributed GIServices should adopt both ideas and design an integrated metadata scheme for geospatial data and software components.
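As a rough illustration of what machine-readable, operational metadata for both data and components could look like, the following Python sketch lets a program decide whether a component can process a data object before any data is moved. It is an assumption-laden sketch, not the metadata scheme designed in this dissertation: the classes, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class GeodataMetadata:
    """Hypothetical operational metadata attached to a geodata object."""
    name: str
    data_format: str
    geometry_type: str


@dataclass(frozen=True)
class ComponentMetadata:
    """Hypothetical operational metadata attached to a GIS component."""
    name: str
    accepted_formats: FrozenSet[str]
    accepted_geometry: FrozenSet[str]


def check_compatibility(component: ComponentMetadata,
                        data: GeodataMetadata) -> List[str]:
    """Return a list of problems; an empty list means the pair can be combined."""
    problems = []
    if data.data_format not in component.accepted_formats:
        problems.append(
            f"{component.name} cannot read format '{data.data_format}'")
    if data.geometry_type not in component.accepted_geometry:
        problems.append(
            f"{component.name} cannot process '{data.geometry_type}' geometry")
    return problems


# Hypothetical sample records.
roads = GeodataMetadata("Colorado Roads", "shapefile", "line")
image = GeodataMetadata("Satellite Image", "geotiff", "raster")
buffer_component = ComponentMetadata(
    "Buffer",
    accepted_formats=frozenset({"shapefile"}),
    accepted_geometry=frozenset({"point", "line", "polygon"}),
)

roads_problems = check_compatibility(buffer_component, roads)
image_problems = check_compatibility(buffer_component, image)
```

Here the metadata is used operationally: it drives a decision made by software rather than only describing the data for a human reader.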
The integrated metadata scheme is one of the keys to the successful deployment of a distributed GIServices architecture. Because of the uniqueness of on-line geographic information, geographers are in the best position to identify the actual needs of distributed GIServices, including data interoperability, GIS operation interoperability, and the design of operational metadata. This research is carried out by the author as a geographer. This dissertation will identify the requirements of on-line geospatial information, which include the interoperability of heterogeneous geospatial data formats, the distribution of GIS processes and operations, and the integration of different GIServices.

1.2 Definitions of GIServices Terminology

This section gives selected definitions of GIServices terminology based on recent research and articles. The dissertation uses these definitions to present the concepts and architecture of distributed GIServices.

GIS is the abbreviation for geographic information systems. It is useful to view GIS as a process rather than a phenomenon (Buehler and McKee, 1996). In this research, GISystems will be used to indicate the system perspective of GIS, which focuses on software/hardware implementation and operations. GIServices will be used to indicate the service perspective of GIS, which focuses on the processes of information services and the various services for different GIS users and tasks.

USER TASK refers to who is going to use the system or services, and to do what. The user task set should provide reasonably complete coverage of the functionality of the system or services (Lewis and Rieman, 1993). This research will emphasize user tasks from a user perspective rather than from the perspective of the software designer or programmer.

SERVICE is the supplying or supplier of a utility that meets a public need (Random House Webster’s Dictionary, 1993). Services are intended to fulfill the requests of USER TASKS.
This research focuses on specific services, which are related to geographic information and spatial analysis. Some examples of GIServices include on-line mapping, digital libraries, and virtual GIS classrooms.

GEODATA OBJECTS are information items that identify the geographical location and characteristics of natural or man-made features and boundaries of the Earth (Buehler and McKee, 1996). In this study, geodata objects will be encapsulated in object-oriented structures (vector-based and raster-based), representing both natural resources and human activities.

DISTRIBUTED GIS COMPONENTS are ready-to-run, modularized GIS programs that are loaded dynamically into a network-based system to extend GIS functionality. For example, a GIS buffering component will provide an extended buffering function for the targeted GIS application. The term distributed components often refers to distributed objects (Orfali et al., 1996). However, in order to distinguish them from GEODATA OBJECTS, this dissertation will refer to GIS programs as distributed components. Distributed GIS components can be dynamically combined and remotely invoked to generate GIServices and accomplish different GIS tasks.

CLIENTS/SERVERS are software items, and could be true objects or legacy programs. Whether a software entity is a client or a server depends on the actual task (Knapik and Johnson, 1998). In distributed network environments, a client requests an information service from a server across a network. Conventionally, the software components in traditional clients and servers are quite different. This dissertation will focus on the operational roles of clients and servers, instead of on software and hardware comparisons.

TECHNOLOGY is the practical application of knowledge (Random House Webster’s Dictionary, 1993).
This dissertation uses the term technology to refer to the actual techniques, specifications, standards, languages, or protocols used in some domain of knowledge, such as geographic information science, distributed computing, or artificial intelligence.

INTERFACE is what brings two things together and allows them to communicate. Three types of interfaces are commonly used in the computer industry. Hardware interfaces are electrical devices that connect two or more pieces of equipment. For example, the serial ports on PCs are hardware interfaces used to connect external devices, such as a mouse or a modem, to the computer. Software interfaces are language specifications between programs which allow one program to call upon another for assistance in processing (Newton, 1996). For example, the Java language provides several software interfaces for communicating with database management software. User interfaces are the communication mechanisms between users and computer systems. For example, command lines and pull-down menus are different types of user interfaces. In the design of the dynamic GIServices architecture, this dissertation will focus mainly on software interfaces.

1.3 Problem Statement

Traditional GISystems have difficulty in delivering on-line, distributed geographic information services and in providing flexible, friendly GIS solutions for users. With the progress of software engineering and the increasing volume of available geospatial data sets, traditional GISystems with legacy database engines are becoming obsolete because they cannot communicate with other programs or access heterogeneous data over networks. Different GISystems have unique functions and data formats, which cannot be shared. The computer programs inside traditional GISystems are fixed and difficult to customize for network-oriented, distributed GIS tasks. Many users have problems designing their own GIS solutions because of the unfriendly, complicated programming environments and modeling tools.
The GIS industry cannot adopt state-of-the-art technologies in legacy GISystems because of their lack of software compatibility and networking capability. Thus, the architecture of legacy GISystems has limited the power of GIS operations through its lack of interoperability, reusability, and flexibility. What GIS users really need now is a distributed GIService architecture, which will provide a flexible and dynamic scheme for on-line geographic information services. The following sections describe the reasons for adopting a distributed GIService architecture for on-line geographic information services and the major problems in building a comprehensive, distributed GIServices environment. These discussions are organized from three different views: the management perspective, the user perspective, and the implementation perspective.

1.3.1 Management Perspective

From the management perspective, there are two main reasons for on-line geographic information services. The first reason is the globalization of geographic information access and distribution. Currently, federal agencies face the problem of how to make information available to the public and meet research needs effectively and efficiently. Traditionally, geographic information has been distributed via paper maps or off-line disks or tapes, which are costly and difficult to update. “We must put in place a global data and information system that makes environmental data, past and current, available to all who need it, in a form that they can use” (Eddy, 1993, p.6). In order to build such a global information system/service, the GIS community should provide on-line geographic information services on the Internet that are accessible to GIS users around the world. A global geographic information service will facilitate a wide range of geographic research in the scientific community.
A second reason for on-line geographic information services relates to the decentralization of geographic information management and updating. With the progress of data gathering techniques such as GPS, remote sensing, and satellite imagery, more and more GIS applications and projects deal with huge databases. Huge and bulky GIS databases cause serious data management problems in maintaining, updating, and exchanging geographic information. Federal agencies are looking for new ways to disseminate data more widely and effectively, primarily via the Internet (Jones, 1997). On-line geographic information services under a distributed architecture provide one possible solution. One advantage is that each data set can be maintained at the site best suited to it. For example, the certification and quality control of specific data sets can be granted only by specialized agencies, such as the U.S. Census Bureau for demographic information or the U.S. Geological Survey for topographic map data. Another advantage is increased reliability, because failure at one site will not mean failure of the entire geographic information service (Worboys, 1995). In general, establishing open and distributed GIServices will improve the efficiency of GIS database management and reduce the cost of GIS database maintenance.

1.3.2 User Perspective

From the user’s perspective, there are two main reasons for on-line, distributed GIServices. The first reason is the need for distributed GIS processing to cope with the increasing size and variety of geospatial data sets, which impede GIS processing. Large files are time-consuming to download and convert, and processing them may not always be feasible on smaller workstations. With the expected increases in data volume and variety, traditional GISystems will be less able to handle increasingly complex geospatial data sets in a single, centralized architecture.
One possible solution is to establish a dynamic, distributed processing arrangement in which a user can send encapsulated GIS processing components to a large data clearinghouse. Data would be processed dynamically at the server, and the results would be encapsulated within the processing component and returned to the client. Distributed processing capability will facilitate the use of distributed geospatial data sets and free GIS processing from the constraints of running on local machines.

The second reason for on-line, distributed GIServices is the need for customizable GIS modules that allow software packages to be specialized. Most GIS software platforms are strongest at specific processing tasks. For example, some but not all packages can handle differential segmentation (breaking up linear features on the basis of a particular attribute); others are adept at merging field data with vector features; still others provide excellent address matching as a primary function. The complexity of the modeling tasks undertaken by most GIS analysts increasingly demands a working knowledge of several GIS packages. In a truly distributed geographic processing environment, GIS analysts can federate GIS processing commands to the most appropriate GIS package available on the distributed network in order to manage the complexity of spatial modeling. Also, in traditional GIS software, 90% of users utilize less than 10% of an application's features. These users must nonetheless pay for the full monolithic software suite, as opposed to licensing only the modules they require. The remaining 10% of users, who require more advanced features, depend on version update cycles that dictate when new features become available. With distributed component technology, individual software modules may be updated independently.
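The contrast between a monolithic package and independently updatable modules can be made concrete with a small sketch. The Python code below is only an illustration under assumed names: `ComponentRegistry`, `register`, `version_of`, and `invoke` are hypothetical, and everything runs in a single process rather than across a network. It shows one component being replaced by a newer version while the others are left untouched.

```python
from typing import Callable, Dict, Tuple


class ComponentRegistry:
    """Holds named GIS components; each can be replaced on its own."""

    def __init__(self) -> None:
        self._components: Dict[str, Tuple[int, Callable]] = {}

    def register(self, name: str, version: int, func: Callable) -> bool:
        """Install a component unless an equal or newer version is present."""
        current = self._components.get(name)
        if current is not None and current[0] >= version:
            return False
        self._components[name] = (version, func)
        return True

    def version_of(self, name: str) -> int:
        return self._components[name][0]

    def invoke(self, name: str, *args):
        """Run a component by name, as a client request would."""
        return self._components[name][1](*args)


def length_v1(line):
    """Sum of straight segment lengths along a list of (x, y) points."""
    return sum(((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
               for (x1, y1), (x2, y2) in zip(line, line[1:]))


def bbox_v1(points):
    """Bounding box as (min_x, min_y, max_x, max_y)."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))


def bbox_v2(points, margin=0):
    """Updated module: same result by default, plus an optional margin."""
    xs, ys = zip(*points)
    return (min(xs) - margin, min(ys) - margin,
            max(xs) + margin, max(ys) + margin)


registry = ComponentRegistry()
registry.register("length", 1, length_v1)
registry.register("bbox", 1, bbox_v1)
registry.register("bbox", 2, bbox_v2)   # only the bbox module is updated

road = [(0, 0), (3, 4), (3, 10)]
print(registry.invoke("length", road))  # 11.0
print(registry.invoke("bbox", road))    # (0, 0, 3, 10)
```

In this sketch a user who needs only the `length` component never has to install or update `bbox`, which is the licensing and update flexibility the text attributes to component-based GIS.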
A distributed GIServices architecture will provide more flexible services for GIS users, who can combine individual components according to their needs by plugging selected modules together. They will not be constrained to a single GIS package or software vendor. The pricing of GIS software licenses should also become more flexible and lower for individual GIS users. This and other economic aspects will be discussed in detail later.

1.3.3 Implementation Perspective

From the implementation perspective, the first problem in developing on-line, distributed GIServices is the lack of a high-level architecture that can support logical construction methods. Most current on-line geographic information services and research projects adopt a quick, ad hoc, technology-centered approach that provides only a temporary solution for open and distributed GIS. Once the technology changes, every component in the old system is abandoned and a whole new system has to be designed and implemented. Without an appropriate architecture, distributed GIServices cannot be sustained, because of the short life cycle and rapid change of technology. A dynamic, upgradable architecture will move the development of open and distributed GIS from a short-term strategy to a sustainable one.

The second implementation problem is that the current development of open architectures mainly focuses on data interoperability issues. However, GIS are both data-oriented and process-oriented. The GIS community also needs to focus on GIS processing and on the interactions between GIServices. This dissertation will provide a high-level GIService framework that focuses on the dynamic integration of distributed GIS processing.

Three operational issues must be addressed to implement a dynamic GIService architecture in distributed network environments. The first issue is the definition of client/server relationships among distributed GIS components and geospatial data objects.
In distributed network environments, the major obstacle is the integration of, and the interactions among, heterogeneous software (GIS components) and databases (geospatial data objects). A key to this integration is the development of modular, independent GIS components, along with comprehensive definitions of the interactions and relationships between components.

The second issue is the formalization of comprehensive metadata descriptions and GIS functionality. Metadata provides a mechanism for objects and processes to describe themselves, to communicate, and thus to interoperate. In distributed network environments, users can copy or download data objects and programs from one machine to another. Data sets and GIS operators become more dynamic, movable, and interoperable on the Internet. By defining the behaviors and requirements of geospatial data objects and GIS operators, a comprehensive metadata scheme will facilitate the effective and correct use of data sets and GIS components.

The third issue is the problem of information overload in distributed network environments. Distributed network environments enlarge the scope and variety of available data. In distributed computing environments, users may wish to fuse heterogeneous data models from different GIS software. These two aspects (large data files and incompatible data models) will inhibit the implementation of distributed GIServices. Some research projects in the GIS community have addressed this data compatibility issue, by means of the Virtual Data Set (Vckovski, 1998) and the Open Geodata Model (Buehler and McKee, 1996). However, another type of information overload is the complexity of GIS operations and modeling. Distributed network environments enable users to access hundreds of different GIS programs and models on line. Most users may not have adequate knowledge to bridge different models and programs together for their own GIS tasks.
Thus, in addition to data compatibility, GIS users need help in integrating heterogeneous GIS programs and models.

1.4 Chapter Summary

This research is carried out in the discipline of geography to facilitate data integration, program interoperability, and operational metadata for Internet-based GIServices. In order to design such a dynamic architecture, three approaches will be adopted for distributing GIServices on the Internet. First, this research will define a modularized, distributed framework for task-oriented GIS components. Second, an object-oriented metadata scheme will be applied to geodata objects and GIS components, so that they become self-describing and self-managing. Finally, an agent-based communication mechanism is proposed for the integration of heterogeneous, distributed GIServices.

The goal of this dissertation is to establish a dynamic architecture for geographic information services and to provide customizable, reusable, and network-based GIS applications for users. In order to provide such comprehensive services, the design of the GIServices architecture should focus on the process of dynamic construction, the management of distributed objects, and the integration of different GIServices.

The background knowledge for this research will be reviewed in the next two chapters to set a context in two domains, namely distributed computing and distributed GIServices. In computer science, the development of distributed computing provides fundamental technological support for open and distributed architecture. In geographic information science, research on on-line, distributed GIServices motivates the redesign of GIS metadata models and component frameworks. Chapter Four will present the actual design of a dynamic architecture for distributed GIServices via three major approaches.
Chapter Five will introduce software examples, justify the design of the dynamic architecture with three hypothetical scenarios, and compare the advantages and disadvantages of traditional GISystems and distributed GIServices. The implications of distributed GIServices and their possible impacts on geographers, the public, and scientists will be addressed in Chapter Six, the final chapter.

CHAPTER 2. OVERVIEW OF DISTRIBUTED COMPUTING

Today, distributed computing is emerging as one of several trends in information technology, providing a new perspective and scheme for the next generation of GIS. Developments in distributed computing (Schroeder, 1993; Orfali et al., 1996; Armstrong, 1997) provide fundamental technological support for an open, dynamic architecture. “These shifts are not simply due to operating in a distributed or networked environment. Rather, great diversity and innovation of information technology accompanies distributed computing which, in turn, brings new models of the world and new ways of solving problems” (Ganti and Brayman, 1995, p. 33). The following sections introduce the development of network technology, the history of distributed systems and open systems, and three types of distributed component frameworks in depth: the DCOM, CORBA, and Java platforms. These frameworks provide the fundamental software platforms underlying on-line, distributed GIServices. The high-level distributed GIServices architecture will be built upon these low-level component frameworks for its actual implementation.

2.1 The Development of Network Technology

Network technology is the key factor in the emergence of open and distributed systems. Network development began in the late 1960s and early 1970s along with the rapid expansion of telecommunications technology. Network technology evolved from local area networks (LAN) to wide area networks (WAN) to the Internet (inter-networking) (Sloman, 1994).
The progenitor of the modern Internet was a network called ARPANET, set up in 1969 by the U.S. Defense Department to develop a self-adjusting, decentralized networking system (Sloman, 1994). The original goal of the ARPANET project was to provide a reliable telecommunications network that would persist after a nuclear war. In 1983, the ARPANET project adopted the Transmission Control Protocol/Internet Protocol (TCP/IP) as the standardized protocol for communications across interconnected networks, between computers with diverse hardware architectures, and between various operating systems (Newton, 1996). The dramatic success of the Internet and the widespread adoption of TCP/IP pushed network development into a new age. By 1994, at least two million computers had connected to the Internet. Today millions of people use the Internet in more than 120 countries (Nemeth et al., 1995). Along with the rapid development of the Internet, many applications and programs have been developed, such as newsgroups, Gopher, Bulletin Board Systems (BBS), and Telnet. Figure 2-1 shows a screen shot of the Gopher information client running in a Telnet application window.

Figure 2-1. The Gopher information client on a Telnet application window.

The World Wide Web (the Web) is currently one of the fastest growing applications on the Internet for the public to publish and retrieve information. The original idea of the Web was to serve as a pool of human knowledge that would allow researchers at remote sites to exchange ideas on a common project (Berners-Lee et al., 1994). The Web adopts a standardized communication protocol, the HyperText Transfer Protocol (HTTP), for disseminating multimedia documents on the Internet. HTTP was developed at the European Laboratory for Particle Physics (CERN) in Geneva in 1990.
Later the protocol was popularized by the appearance in 1993 of Mosaic, a multimedia browser created at the National Center for Supercomputing Applications (NCSA) (Berners-Lee et al., 1994). The Web provides an integrated method for distributing all types of data across all types of computers in a unified format, the HyperText Markup Language (HTML).

Figure 2-2. The Web page of the Geography Department, the University of Colorado at Boulder.

The main reason for the rapid growth of the Web is its powerful capability for presenting multimedia documents on the Internet, which can include text, sound, pictures, animation, etc. (Figure 2-2). Other Internet applications, such as newsgroups, BBS, and Gopher, provide only text-based information. The Web uses hypertext and multimedia techniques to make its content accessible to anyone. People can easily generate home pages by adding pictures, sounds, and hyperlinks in HTML format and create attractive content on specific topics. Because of the communication power and popularity of the Web, many GIS researchers have launched pioneering research and are developing applications on the Web. These GIS research projects will be addressed in detail later.

2.2 History of Distributed Systems

Along with the rapid progress of network technology, distributed systems have been widely used in the computer industry (Schroeder, 1993). The development of distributed systems can be characterized in four major stages (Table 2-1). The following paragraphs discuss the four stages in terms of their definitions, network features, and system structures.
Stages are listed in order from Closed (first) to Open (last).

Major stage: Stand Alone File Servers (UNIX NFS, Netware, Windows NT Shared Directory)
    Major functions: Files and disk space sharing
    Network topology: Many (clients)-to-One (server) with restricted access

Major stage: Generic Database Servers (Oracle, MS Access)
    Major functions: Query database and get results from servers
    Network topology: Many (clients)-to-One (server) with dynamic access

Major stage: Distributed Database Servers (Oracle) and Distributed File Servers (Windows 2000)
    Major functions: Query database or file sharing from an integrated server group
    Network topology: Many (clients)-to-One (integrated server group); homogeneous servers

Major stage: Distributed Component Object Servers (CORBA, DCOM)
    Major functions: Distributed component objects manipulation by sending requests
    Network topology: Many (clients)-to-Many (distributed servers); heterogeneous servers

Table 2-1. The major development stages of distributed systems.

A. Stand Alone File Servers (1982-). A file server is a device that delivers files to everyone on a local area network (LAN). It allows everyone on the network to reach files in a central storage space on one computer. A file server directs the movement of files and data on a multi-user communication network. Users can store information and access application software on the file server (Newton, 1996). From a network management perspective, file servers usually handle a huge number of transactions, which often makes them a significant bottleneck in a local area network. The system structure of file servers is fixed on both the client and the server side. Different file servers have their own protocols and file formats, which may not be compatible with others.

B. Generic Database Servers (1986-). A generic database server is a standalone computer that sends out database data to users on a LAN the way a file server sends out files. With a database server, the server does the picking, sending only the requested part of the database to users’ workstations. Thus, in a multi-user database system, a database server incurs less network traffic than a file server.
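The difference in network traffic between the two server types can be illustrated with a toy comparison. This Python sketch is a deliberate simplification: the record layout, the functions `file_server_fetch` and `database_server_query`, and the use of JSON byte counts as a stand-in for network traffic are all assumptions made for this example. The file server returns the entire file and leaves the filtering to the client, whereas the database server filters first and returns only the matching records.

```python
import json

# A toy "table" of 1,000 parcel records held on the server (illustrative data).
RECORDS = [{"id": i, "county": "Boulder" if i % 10 == 0 else "Other"}
           for i in range(1000)]


def file_server_fetch():
    """File server: ship the whole file; the client must filter it."""
    return json.dumps(RECORDS).encode("utf-8")


def database_server_query(county):
    """Database server: select on the server, ship only matching records."""
    selected = [r for r in RECORDS if r["county"] == county]
    return json.dumps(selected).encode("utf-8")


# Route 1: download everything, then filter on the client.
whole_file = file_server_fetch()
client_side = [r for r in json.loads(whole_file) if r["county"] == "Boulder"]

# Route 2: let the server do the picking.
answer = database_server_query("Boulder")
server_side = json.loads(answer)

# Both routes yield the same records, but the payload sizes differ greatly.
print(len(client_side), len(server_side))
print(len(whole_file), "bytes vs", len(answer), "bytes")
```

Here only one record in ten matches the query, so the database-style response carries roughly a tenth of the data that the file-style response does.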
A database server also provides better data integrity, since one computer handles all the record and file locking (Newton, 1996). Database servers are more flexible than file server systems, especially on the client side. Multiple users can easily establish new client-side applications to access the same database server. However, the server-side applications are fixed in most cases. It is impossible to access multiple databases at one time or to integrate heterogeneous databases under a single-server architecture.

C. Distributed Database Servers and File Servers (1992-). “A distributed database server appears to a user as a single logical database, but is in fact a set of databases stored on multiple computers. The data on several computers can be simultaneously accessed and modified using a network” (ORACLE, 1992, p. 21-2). Basically, the main functions and capabilities of distributed database servers mimic those of generic database servers, but the physical locations of the databases are distributed across a network. Similarly, distributed file servers appear to a user as a single logical file server, but are physically distributed in different places. However, distributed file servers are designed for file sharing instead of database access. They can provide users with a virtual integration of file servers on a local area network. An example is the active services functionality in the architecture of Microsoft Windows 2000 (Seltzer, 1998).

Both generic and distributed database/file server systems basically follow the traditional client/server architecture, which is restricted to specific internal communications and processing capabilities. The traditional client/server architecture poses several problems for GIS requests and processes, because it cannot provide rich transaction processing and rich data management, or handle overly complex queries or operations.
For example, if a traditional database server receives requests from 500 client-side applications at the same time, the server’s operating system may hang. Without a transaction control function, the traditional database architecture is not appropriate for complex GIS applications. In some cases, transaction processing monitors (TP monitors) have been used to assist major enterprise databases with their transaction services (Orfali et al., 1996).

D. Distributed Component Object Servers (1995-). Distributed component object servers are advanced client/server systems that can handle complex transactions and requests from heterogeneous systems. Distributed component technology adopts the concepts of object-oriented modeling (OOM) and the distributed computing environment (DCE). Currently, both academic and industrial studies of distributed systems are focusing on distributed components in open environments, which can provide new capabilities for the next generation of client/server architecture (Montgomery, 1997). The Common Object Request Broker Architecture (CORBA), developed by the Object Management Group, and the Distributed Component Object Model (DCOM), developed by Microsoft Corporation, are two examples of distributed component frameworks (Orfali and Harkey, 1997). Compared with distributed database/file servers, the main advantages of distributed component object servers are interoperability, reusability, and flexibility for cross-platform applications. Distributed components are described in detail in Section 2.4.

2.3 History of Open Systems

The design of an open system model attempts to solve the problems that arise in a distributed system when systems from different vendors use different data formats and exchange protocols (Worboys, 1995).
The Oxford American Dictionary of Current English (1999) defines the term open as “not closed, spread out, unfolded, public, free to all, willing to talk, and be willing to consider new ideas.” The IEEE Technical Committee on Open Systems (TCOS) defines open systems as “a comprehensive and consistent set of international information technology standard and functional standard profiles that specify interface, services, and supporting formats to accomplish interoperability and portability of applications, data, and people” (Ganti and Brayman, 1995, p. 53). In order to communicate between heterogeneous systems, open systems include the following features. A highly modularized structure permits dynamic interactions between different software, hardware, and operating systems. Generalized interfaces and functionality mean that programmers can easily develop additional functions from the original software. The overall goal of open systems is to create products and technology that conform to non-proprietary industry standards (Ganti and Brayman, 1995). An open system model is becoming more and more important in facilitating the long-term development of distributed systems. Some successful examples of the open system model include the Transmission Control Protocol and Internet Protocol (TCP/IP), the X Window System, and the Linux operating system.

In the GIS domain, an open system model should not only provide interoperability and portability from functional and technical perspectives, but also encourage the whole GIS community to “interact with entire new communities, ... and for geographic information to become even more important to a range of human activities” (Goodchild, 1996). By adopting the open model concept, the GIS community can share spatial analysis theories, geographical knowledge, and GIS technology.
2.4 Distributed Component Frameworks

The original concept of distributed components came from the development of both distributed systems and open system models. Distributed component technology is an advanced scheme for distributed network computing environments (Orfali and Harkey, 1997). The construction of distributed components breaks up the client and server sides of an application into smart components that can interoperate across operating systems, networks, languages, applications, tools, and multi-vendor hardware. Examples of distributed components include roaming agents, rich data management, abstract and generalized interfaces, self-managing entities, and intelligent middleware (Orfali et al., 1996). The current commercial market provides three major infrastructures for distributed component technology: the Common Object Request Broker Architecture (CORBA) developed by the Object Management Group, the Distributed Component Object Model (DCOM) developed by Microsoft Corporation, and Java technology developed by Sun Microsystems Inc. and its subsidiaries, SunSoft and JavaSoft.

The original idea of distributed components came from object-oriented modeling technology, which has developed over the past twenty years (Orfali and Harkey, 1997). Recently, distributed components have become one of the most important trends in the development of software technology. The generic features of distributed components adopt concepts of object-oriented modeling, including encapsulation, polymorphism, inheritance, frameworks and classification, and object relationships (Rumbaugh et al., 1991; Taylor, 1992). The most important contribution of object-oriented technology is to provide an efficient way to construct software from standard, reusable components (Taylor, 1992). Objects correspond to real-world entities such as cars or people. Each object encapsulates related procedures (methods) and data (variables).
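The object-oriented concepts named above can be illustrated with a short Python sketch. The classes here (`GeodataObject`, `VectorLayer`, `RasterLayer`) are hypothetical and are not drawn from DCOM, CORBA, or Java. They only show encapsulation (data is reached through methods), inheritance (subclasses reuse what the parent class defines), and polymorphism (one method name with different behavior in each class).

```python
class GeodataObject:
    """Base class: encapsulates a name and exposes behavior through methods."""

    unit = "items"

    def __init__(self, name):
        self._name = name          # kept internal; read through describe()

    def size(self):
        raise NotImplementedError  # each subclass supplies its own meaning

    def describe(self):
        # Inherited unchanged by every subclass; calls the subclass's size().
        return f"{self._name}: {self.size()} {self.unit}"


class VectorLayer(GeodataObject):
    unit = "features"

    def __init__(self, name, features):
        super().__init__(name)
        self._features = list(features)

    def size(self):
        # For a vector layer, size means the number of features.
        return len(self._features)


class RasterLayer(GeodataObject):
    unit = "cells"

    def __init__(self, name, rows, cols):
        super().__init__(name)
        self._rows, self._cols = rows, cols

    def size(self):
        # For a raster layer, size means the number of grid cells.
        return self._rows * self._cols


layers = [VectorLayer("roads", ["I-25", "US-36"]),
          RasterLayer("elevation", 3, 4)]
for layer in layers:
    print(layer.describe())
# roads: 2 features
# elevation: 12 cells
```

The loop calls `describe()` without knowing which kind of layer it holds, which is the property that lets components built by different parties be combined.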
Encapsulation prevents a program from being interfered with by other programs. Communication between objects depends on calling the methods or functions of each object. Some methods can carry multiple meanings in a single form, which is called polymorphism. Polymorphism can simplify complex systems and improve the efficiency of programming. Many objects can be organized and grouped into hierarchical classes, much like categories in the real world. Classes in a hierarchy share properties through a mechanism called inheritance. Object-oriented modeling allows different parts of the software to be developed simultaneously and to be easily maintained and modified when necessary (Graham, 1994). It also improves the reliability of software and makes the information system more useful and flexible.

By adopting object-oriented modeling technology, distributed components can handle rich and complex requests and prioritize the sequence of requests from the client side. For example, when a data component server is busy, the next distributed request can wait in a queue instead of being canceled. Another important feature is that distributed components provide more flexible access and application on both the client side and the server side. A single system can play both a server’s role and a client’s role. For example, a Colorado local GIS site can access many federal database servers as a client. When other GIS projects require data about Colorado, the Colorado site can act as a database server. Thus, distributed components are appropriate for open and distributed GIS environments since they can provide efficient and flexible client/server applications. Distributed GIS components and applications can freely interact and inter-operate on the Internet.

The following sections provide an in-depth review of the development history and major features of the three types of distributed component technologies: DCOM, CORBA, and the Java platform.
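The object-oriented concepts and the request queue described above can be illustrated with a short sketch. The Python fragment below is an illustration under stated assumptions, not part of any of the three frameworks: the class names (GeoFeature, Road, Lake, DataServer), the feature names, and the numeric values are all hypothetical. It shows encapsulation (state reached only through methods), inheritance (Road and Lake reuse GeoFeature), polymorphism (one describe call, several behaviors), and a server that queues requests rather than canceling them.

```python
from collections import deque


class GeoFeature:
    """Base class: encapsulates a name and exposes it only through methods."""

    def __init__(self, name):
        self._name = name  # leading underscore marks internal state by convention

    def name(self):
        return self._name

    def describe(self):
        # Subclasses override this method (polymorphism).
        return f"{self._name}: generic feature"


class Road(GeoFeature):
    """Inherits name handling from GeoFeature and adds a length."""

    def __init__(self, name, length_km):
        super().__init__(name)
        self._length_km = length_km

    def describe(self):
        return f"{self._name}: road, {self._length_km} km"


class Lake(GeoFeature):
    """Inherits name handling from GeoFeature and adds an area."""

    def __init__(self, name, area_km2):
        super().__init__(name)
        self._area_km2 = area_km2

    def describe(self):
        return f"{self._name}: lake, {self._area_km2} km2"


class DataServer:
    """Hypothetical data component that queues requests instead of dropping them."""

    def __init__(self, features):
        self._features = {feature.name(): feature for feature in features}
        self._pending = deque()

    def submit(self, feature_name):
        self._pending.append(feature_name)

    def process_all(self):
        results = []
        while self._pending:
            requested = self._pending.popleft()  # first in, first out
            results.append(self._features[requested].describe())
        return results


server = DataServer([Road("road-1", 12.5), Lake("lake-1", 2.0)])
server.submit("lake-1")
server.submit("road-1")
print(server.process_all())
```

The server never inspects the concrete class of a feature; it simply calls describe, which is the practical benefit of polymorphism that the text refers to.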
These distributed component frameworks provide fundamental support for the deployment of a high-level distributed GIServices architecture.

2.4.1 Distributed Component Object Model (DCOM)

The Distributed Component Object Model (DCOM) is one of Microsoft’s program interface architectures. In DCOM, client programs can request services from server programs on another computer via a network. DCOM technology is an extension of the Component Object Model (COM), which supports interoperability and reusability of distributed components under Microsoft’s operating systems, such as Windows 95/98 and Windows NT. Many programmers consider COM and DCOM as a single technology that provides a range of services for distributed component interaction. COM is designed for processes running on a single machine, and DCOM is designed for processes operating across heterogeneous networks. The COM/DCOM technology is also closely related to other Microsoft technologies, including Object Linking and Embedding (OLE) and ActiveX. In order to clarify the relationships between COM, DCOM, OLE, and ActiveX technology, which are often confusing to the public and to non-programmers, the following section gives a brief introduction to the development history of DCOM and its related technology (Table 2-2).

Year          Technology development
1990          DDE (Dynamic Data Exchange) with Windows 3.0
1991          OLE 1.0 for compound documents
1993          OLE 2.0 + COM for compound software
1996 spring   ActiveX (the next generation of OLE)
1996 summer   DCOM (the distributed version of COM) with Windows NT 4.0

Table 2-2. The development history of DCOM and its related technologies (Orfali and Harkey, 1997; Chappell and Linthicum, 1998).

2.4.1.1 DCOM Development History

The original idea of COM/DCOM technology comes from the clipboard function created by Apple Computer in the late 1970s (Grimes, 1997). The COPY, CUT, and PASTE tools gave users a friendly way to share documents between different programs.
In 1990, the release of Microsoft Windows 3.0 extended the clipboard idea and the publish-and-subscribe concepts developed by Apple, and introduced Microsoft’s own way to exchange data between applications, called Dynamic Data Exchange (DDE). DDE allowed different Windows applications to communicate with each other via a message-based protocol. In 1991, Microsoft released OLE 1.0, which modified the major functions of DDE and added an Application Programming Interface (API) on top of the DDE messages. The major improvement of OLE 1.0 was the ability to link and embed documents within applications. OLE is a technology that enables an application to create compound documents that contain information from a number of different sources. For example, a document in an OLE-enabled word processor can accept an embedded spreadsheet object. Unlike traditional cut and paste methods where the receiving application changes the format of the pasted information, embedded documents retain all their original properties. If the user decides to edit the embedded data, Windows activates the originating application and loads the embedded document (Microsoft, 1996, p. 1).

Figure 2-3. An example of compound documents in Microsoft Word 97.

The linking function of OLE allowed applications with embedded documents to be linked together dynamically. If the original data were changed, the embedded contents would automatically be updated and vice versa. Figure 2-3 shows an example of a compound document, which includes graphics, pictures, sound clips, and an embedded Excel document. In 1993, the release of OLE 2.0 extended the capability of OLE beyond the compound document to compound software (Brockschmidt, 1996). The popular use of OLE 2.0 generated a shift in Microsoft software development, from an application-centered paradigm to a document-centered paradigm.
The document-centered paradigm allows users to move documents between many different applications without even noticing the transitions between applications. Currently, almost every Microsoft package, including Office 97, Visual Basic, Visual C++, and Excel, relies on OLE 2.0 technology. OLE 2.0 provides a more comprehensive architecture and communication protocols that allow programmers to design applications under Microsoft’s operating systems, such as Windows 98 and Windows NT.

The Component Object Model (COM) was originally designed in 1993 to specify interface interactions and communication protocols between OLE 2.0 components. COM provides the underlying support for OLE components to communicate with other OLE components (Brockschmidt, 1994). “A straightforward way to think about COM is as a packaging technology, a group of conventions and supporting libraries that allows interaction between different pieces of software in a consistent, object-oriented way. COM objects can be written in all sorts of languages, including C++, Java, Visual Basic, and more, and they can be implemented in DLLs or in their own executable, running as distinct processes” (Chappell and Linthicum, 1997, p. 58). COM’s language-independent feature means that components written in different languages can inter-operate via standard binary interfaces.

ActiveX, introduced in 1996, is the next generation of OLE and extends the use of COM/DCOM to Web applications. ActiveX is a lean, stripped-down version of OLE, optimized for size and speed so that it can execute in browser space. ActiveX loosely defines a group of Microsoft technologies, including ActiveX controls, ActiveX scripting, ActiveX documents, ActiveX containers, and so on. The name ActiveX was coined by Microsoft in December 1995 (Grimes, 1997).
Based on marketing considerations, Microsoft decided to re-package the related OLE technology and sell it as ActiveX technology, targeting future markets of Internet applications and competing with Java technology. ActiveX allows the COM architecture to execute in a Web browser as buttons, list boxes, pull-down menus, and animated graphics. Currently, ActiveX is widely used by corporate management information systems (MIS) and independent software vendors (Knapik and Johnson, 1998).

DCOM technology was released with Windows NT 4.0 in mid-1996. The original design of COM assumed that components and their clients were running on the same machine. DCOM extends the COM technology to communicate between different computers on a local area network (LAN), a wide area network (WAN), or even the Internet (Microsoft, 1998). DCOM also includes a distributed security mechanism, providing authentication and data encryption (Chappell and Linthicum, 1997).

Figure 2-4 illustrates the relationships between OLE, ActiveX, COM, and DCOM. In general, COM and DCOM represent low-level technology (interface negotiation, licensing, and event management) that allows components to interact, whereas OLE and ActiveX represent high-level application services (linking and embedding, automation, compound documents) that are built on top of COM/DCOM technology.

[Figure 2-4 shows two layers. High-level services (compound documents, compound software): OLE 2.0 and its new version, ActiveX. Low-level services: COM (interface negotiation, licensing) combined with DCOM (network communication over TCP/IP, HTTP, and DCE RPC).]

Figure 2-4. The relationships between OLE, ActiveX, COM, and DCOM.

2.4.1.2 DCOM Architecture and Interfaces

The architecture of DCOM is established on the client machines with a remote object proxy, and on the server machines with a COM stub (Figure 2-5).
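The proxy and stub pair can be pictured with a small sketch. The following Python fragment is an illustration only and is not DCOM code: JSON text stands in for the binary wire format used by DCE RPC, a direct function call stands in for the network, and the names MapService, Proxy, Stub, and zoom_in are hypothetical. The point is the division of labor: the client calls the proxy as if it were the real object, the proxy packs (marshals) the call, and the stub unpacks it, invokes the real object, and packs the result for the return trip.

```python
import json


class MapService:
    """Hypothetical server-side object that does the real work."""

    def zoom_in(self, x1, y1, x2, y2):
        return {"width": abs(x2 - x1), "height": abs(y2 - y1)}


class Stub:
    """Server side: unpacks the call, invokes the target, packs the result."""

    def __init__(self, target):
        self._target = target

    def handle(self, request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        method = getattr(self._target, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class Proxy:
    """Client side: looks like the remote object but only packs and forwards calls."""

    def __init__(self, transport):
        self._transport = transport  # any callable taking bytes and returning bytes

    def zoom_in(self, x1, y1, x2, y2):
        request = json.dumps({"method": "zoom_in", "args": [x1, y1, x2, y2]})
        reply_bytes = self._transport(request.encode("utf-8"))
        return json.loads(reply_bytes.decode("utf-8"))["result"]


stub = Stub(MapService())
proxy = Proxy(stub.handle)  # an in-process call stands in for the network
print(proxy.zoom_in(0, 0, 10, 5))  # prints {'width': 10, 'height': 5}
```

Because the proxy depends only on a bytes-in, bytes-out transport, the same client code would work unchanged if the transport were replaced by a real network connection, which is the property the proxy/stub design is meant to provide.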
The network communication is accomplished through Microsoft DCE RPC, which is an extension of the Open Software Foundation’s DCE RPC specification (Grimes, 1997). DCOM uses the method of marshalling to format and bundle the data in order to share it among different components (Orfali and Harkey, 1997).

[Figure 2-5 shows a client machine (OLE automation controller, OLE/ActiveX container client, COM, remote object proxy) and a server machine (OLE automation server, OLE/ActiveX controls, COM, COM stub) connected through DCE RPC.]

Figure 2-5. The architecture of DCOM.

Marshalling begins when a COM client calls its remote object proxy on the local machine. The object proxy marshals the call and passes it over the network to a COM stub on the server machine, which unmarshals the parameters and passes them to the server application. When the call is completed, the server COM stub marshals the return values and passes them to the object proxy on the client machine, which returns them to the client-side application (Orfali and Harkey, 1997).

Besides the low-level communication of objects, the architecture of DCOM also incorporates a high-level object management scheme by using OLE automation controllers and servers (Figure 2-5). OLE automation controllers and servers allow distributed components to expose functions and commands for other components to access, and they facilitate the development of programming tools and macro languages that can operate across applications (Chappell and Linthicum, 1997).

The architecture of DCOM specifies the communication mechanism and object management between clients and servers. The actual operations and executions of DCOM objects are accomplished by using the software interfaces between DCOM objects. A DCOM interface is a collection of function calls defined as a binary-type API based on a table of pointers, called a virtual table or vtable. A DCOM interface is given a name starting with a capital “I”, such as IUnknown, IClassFactory, and IDispatch.
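The idea of an interface as a named table of function pointers, reached through run-time negotiation, can be sketched as follows. This Python fragment is a loose analogy and not the COM binary layout: dictionaries of bound methods stand in for vtables, interface names stand in for interface identifiers, and the component and its method names (MapObject, query_interface, display, zoom_in) are hypothetical, chosen to match the map example used in this section. The interface names IDisplay and IZoomIn are likewise illustrative rather than part of any Microsoft API.

```python
class MapObject:
    """Toy component whose functions are reachable only through named interfaces."""

    def __init__(self):
        self.extent = None
        self.window = None
        # Interface table: interface name -> table of callable entries,
        # loosely analogous to a vtable of function pointers.
        self._interfaces = {
            "IUnknown": {"query_interface": self.query_interface},
            "IDisplay": {"display": self._display},
            "IZoomIn": {"zoom_in": self._zoom_in},
        }

    def query_interface(self, interface_name):
        """Run-time negotiation: return the method table, or None if unsupported."""
        return self._interfaces.get(interface_name)

    def _display(self, extent, window):
        self.extent, self.window = extent, window
        return f"drawing {extent} in {window}"

    def _zoom_in(self, x1, y1, x2, y2):
        self.extent = (x1, y1, x2, y2)
        return self.extent


map_object = MapObject()

# A client first asks whether the component supports an interface ...
zoom = map_object.query_interface("IZoomIn")
if zoom is not None:
    # ... and only then calls through the returned table.
    print(zoom["zoom_in"](0, 0, 5, 5))  # prints (0, 0, 5, 5)

# Asking for an interface the component does not implement fails cleanly.
print(map_object.query_interface("IPrint"))  # prints None
```

The negotiation step is what lets a client work with components it was not compiled against: it discovers at run time which interfaces are available instead of assuming them.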
Each DCOM interface has a unique interface identifier, called the interface ID (IID), which is automatically generated by DCOM. Figure 2-6 illustrates an example of a DCOM object, MapObject, with three basic interfaces: IUnknown, IDisplay, and IZoomIn. The IUnknown interface is the most important interface of DCOM and is used for run-time interface negotiation, life cycle management, and aggregation. The IDisplay and IZoomIn interfaces are the function calls for the MapObject. For example, a GIS application can call the IDisplay interface to display the map in a defined window with the following statement: [MapObject.IDisplay(Mapextent, Window’s name)]. If the GIS application needs to zoom in to a specific area, the IZoomIn interface is called as [MapObject.IZoomIn(X1,Y1,X2,Y2)]. With the use of DCOM interfaces, software programmers can easily manipulate the behaviors of MapObject for different types of GIS tasks in their applications.

[Figure 2-6 shows MapObject exposing three interfaces: IUnknown, IDisplay, and IZoomIn.]

Figure 2-6. The interface example in a map object under a DCOM framework.

The binary interfaces of DCOM are created by using the Microsoft Interface Definition Language (Microsoft IDL), which describes the interfaces’ methods and their arguments. Besides Microsoft IDL, DCOM technology provides another language for DCOM automation, called the DCOM Object Definition Language (ODL). DCOM automation allows client programs to discover the methods and properties of DCOM objects at run time and to invoke those methods dynamically (Orfali and Harkey, 1997).

2.4.1.3 Advantages and Disadvantages

The major advantage of DCOM technology is the popularity of Microsoft’s operating systems (Windows 95/98, Windows NT 4.0, and Windows 2000), desktop applications (Word, Excel, PowerPoint, Access, Internet Explorer, and so on), and programming tools (Visual Basic, Visual C++, and Visual J++). Microsoft’s current products are based on DCOM technology, and its future products are expected to be as well.
Thousands of PC-related applications and software packages developed by other companies are also based on Microsoft’s DCOM technology, such as ESRI’s MapObjects and Intergraph’s GeoMedia. For Microsoft Windows-based applications, DCOM is more feasible and more popular for developing distributed components than comparable technologies, such as CORBA and the Java platform.

The second advantage is that DCOM technology evolved from DDE through OLE to ActiveX. The design of DCOM results from extensive implementation experience instead of being derived from pure theory, as in the case of CORBA. Its core concepts and functions have been revised, changed, and extended over almost ten years. DCOM technology has been adopted and implemented in thousands of application programs.

The third advantage of DCOM is the language-independent interface design based on Microsoft’s binary interface structure. Software programmers can develop DCOM components or ActiveX controls in a variety of languages, including Visual Basic, C++, and even Java. Moreover, Microsoft’s J++ development tools provide an integration of Java and DCOM, which allows Java programmers to write Java applications with DCOM easily. This multi-language development ability will attract more involvement in DCOM programming and development.

However, there are some disadvantages to the DCOM technology. The first drawback is that it is not based on a pure object-oriented (OO) implementation. For example, DCOM objects do not support multiple inheritance, which limits the extensibility of DCOM objects. Software programmers have to manually aggregate different COM components by using a complicated software packaging approach in order to work around this limitation. The second problem with DCOM technology is the complexity of DCOM interfaces (Brockschmidt, 1994; Vckovski, 1998).
Once created and defined, the interfaces will exist forever to ensure backward compatibility with future DCOM applications. As a result, hundreds of component interfaces and functions have been specified in different DCOM objects, which increases the difficulty of understanding and developing DCOM applications.

The third problem with DCOM technology is the inadequate support for other platforms, such as UNIX and Macintosh. Currently, Microsoft is working to make DCOM and some other parts of the ActiveX family available on other operating systems. Microsoft has provided ActiveX support for Macintosh (Chappell and Linthicum, 1997). A DCOM implementation for all major UNIX platforms, such as Solaris, Linux, and HP/UX, is also available from a third-party company, Software AG (Microsoft, 1998). The main problem of DCOM on non-Windows platforms remains a lack of popular applications. Most software companies develop DCOM applications on the Windows platform rather than on UNIX and Macintosh because of marketing considerations.

The fourth problem is compatibility with other distributed component frameworks, such as the Java Virtual Machine (VM). Currently, only Microsoft’s own Java Virtual Machine (with IE 4.0 or later versions) can run DCOM components or ActiveX controls. Non-Microsoft Web browsers, such as HotJava and Netscape Communicator, are not able to run DCOM components and ActiveX controls. In general, DCOM has become a Microsoft-dependent technology that does not easily cooperate with software from other companies and is not a fully interoperable framework for distributed components.

To sum up, Microsoft’s DCOM technology is closely connected with current PC desktop applications. However, its close ties to Microsoft’s other products set the future development of DCOM against the original principle of distributed computing: bridging heterogeneous platforms and environments.
CORBA and the Java platform, on the other hand, provide a more vendor-independent capability for distributed components.

2.4.2 Common Object Request Broker Architecture (CORBA)

The Common Object Request Broker Architecture (CORBA) is another distributed component framework, developed and standardized by the Object Management Group (OMG). CORBA provides a standardized interface model and object framework for solving network computing problems in a distributed heterogeneous environment.

2.4.2.1 CORBA Development History

The development of CORBA has been in progress for over ten years and has been led by the Object Management Group (OMG). OMG is a non-profit consortium founded in May 1989 by eight companies: 3Com Corporation, American Airlines, Canon Inc., Data General, Hewlett-Packard, Philips Telecommunications N. V., Sun Microsystems, and Unisys Corporation (Yang and Duddy, 1996). By 1998, OMG included over 800 member companies internationally. The main goal of OMG is to promote theories and practices of object technology in distributed computing, including reusability, portability, and interoperability. OMG does not focus on developing new computing technologies, but rather relies on existing technologies offered by member companies. OMG’s members may propose specifications in response to OMG’s Requests for Proposals (RFPs), based on different commercially available computing technologies. Each proposed specification is reviewed and voted on by the OMG Board of Directors to decide whether it is formally accepted (Yang and Duddy, 1996; Vinoski, 1997). Essentially, OMG is a standards organization. Other examples of standards-building efforts will be discussed in Chapter Three.

OMG released the first specification, CORBA 1.1, to the computer industry in 1991, following the standardized Object Management Architecture (OMA). Later, OMG released CORBA 2.0 in 1994 and CORBA 2.2 in 1998.
The specification of CORBA defines an Interface Definition Language (IDL) and Application Programming Interfaces (APIs), which enable client/server object interaction within a specific implementation of an Object Request Broker (ORB) (Orfali and Harkey, 1997). The architecture of CORBA differs from DCOM in that it does not distinguish between clients and servers, as discussed below.

2.4.2.2 CORBA Architecture and Interfaces

CORBA’s architecture is based on the Object Management Architecture (OMA), a high-level conceptual infrastructure for distributed computing environments proposed by the OMG. OMA provides the means to build interoperable software systems in heterogeneous network-computing environments.

[Figure 2-7 shows Application Interfaces, Domain Interfaces, Common Facilities, and Object Services connected through the Object Request Broker.]

Figure 2-7. OMA Reference Model interface categories (Vinoski, 1997).

The Reference Model (RM) of OMA has been revised continually since it was published in 1990. The 1996 version of the OMA RM added a new category, Domain Interfaces, and introduced Object Frameworks (Thompson et al., 1997). Figure 2-7 shows that the OMA Reference Model consists of an Object Request Broker and four software interface categories (Application Interfaces, Domain Interfaces, Common Facilities, and Object Services).

Object Services are used for the management of distributed object programs and the discovery of other available services. Two examples are the Naming Service, which allows clients to find objects based on names, and the Trading Service, which allows clients to find objects based on their properties. Other Object Services specify software life-cycle management, security, transactions, and event notification (Vinoski, 1997).

Common Facilities provide standardized interfaces to common application services, such as system management, data interchange, printing, and user interfaces. They are oriented towards end-user applications.
An example of such a facility is the Distributed Document Component Facility (DDCF), which permits interchange of objects based on a document model, for example, by linking a spreadsheet object into a report document. Other types of Common Facilities include printing facilities, database facilities, electronic mail facilities, and user interfaces (Yang and Duddy, 1996; Vinoski, 1997).

Domain Interfaces are oriented towards specific task domains. One of the first OMG Domain Interface categories is Product Data Management (PDM), which focuses on the manufacturing industry domain. Other types of OMG Domain Interfaces will soon be issued for the telecommunications, medical, and financial application domains. In Figure 2-7, multiple boxes are shown for Domain Interfaces to indicate the existence of many separate application domains (Vinoski, 1997).

Application Interfaces are developed specifically for a given application. Because they are application-specific, and because the OMG does not develop applications (only specifications), these interfaces are not standardized. However, if over time certain broadly useful services emerge from a particular application domain, they might become candidates for future OMG standardization (Vinoski, 1997). For example, a GIS vendor can develop its own Application Interfaces for a specific GIS product within the framework of OMA and utilize other types of interfaces for system management or object services.

In addition to the Reference Model, OMA also defines an object model and framework. An OMA object is an encapsulated entity defined using object-oriented modeling techniques. The Object Model of OMA defines common object semantics for specifying the externally visible characteristics of objects in a standard and implementation-independent way.
A client-side object can request services from a target object (a server) through a software interface, which is specified in the OMG Interface Definition Language (IDL). The request includes an object reference of the service provider, which is a unique object identifier. The design of each object reference protects its content from client-side intervention (Vinoski, 1997). In general, each object has its own types of interfaces in order to provide its functionality and communicate with other types of objects (OMG, 1998).

Essential to CORBA is the design of the Object Request Broker (ORB). The main function of the ORB is to deliver requests from clients to target objects. In general, the ORB is middleware that maintains client-server relationships for the application programmers. The protocol for client/server interaction is defined through a single specification that is independent of the implementation language, the Interface Definition Language (IDL). IDL definitions can be mapped to different programming languages, such as C++, Java, and Smalltalk. The IDL provides operating-system-independent interfaces to all the services and components that reside on a CORBA bus. Programmers can use the IDL to specify a component’s attributes, the parent classes it inherits from, the exceptions it raises, the typed events it emits, and the methods it supports (Orfali et al., 1996). The current CORBA specification 2.2, released in 1998, provides several language mappings for the IDL, including C, C++, Smalltalk, COBOL, and Java (Vinoski, 1997; OMG, 1998).

Figure 2-8. The CORBA architecture (OMG, 1998).

Figure 2-8 illustrates the CORBA architecture. A CORBA client can use the Dynamic Invocation interface or an IDL stub to make a request to the server-side objects. The client can also directly interact with the ORB for some functions.
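The role of the ORB as a broker between clients and target objects can be sketched in a few lines. The Python fragment below is a toy model under stated assumptions, not an implementation of the CORBA specification or of any ORB product: the class names (ToyORB, ElevationImpl, ElevationStub), the reference format, and the elevation_at operation are all hypothetical, and everything runs in one process. It shows an opaque object reference, a name-based lookup in the spirit of the Naming Service, and the two client paths mentioned above: a static stub and a dynamic invocation.

```python
class ToyORB:
    """Routes requests from clients to registered target objects."""

    def __init__(self):
        self._objects = {}  # object reference -> object implementation
        self._names = {}    # human-readable name -> object reference
        self._next_id = 0

    def register(self, name, implementation):
        self._next_id += 1
        reference = f"objref-{self._next_id}"  # opaque identifier handed to clients
        self._objects[reference] = implementation
        self._names[name] = reference
        return reference

    def resolve(self, name):
        """Naming-service style lookup: name -> object reference."""
        return self._names[name]

    def invoke(self, reference, operation, *args):
        """Dynamic invocation: the operation is selected by name at run time."""
        target = self._objects[reference]
        return getattr(target, operation)(*args)


class ElevationImpl:
    """Hypothetical server-side object implementation."""

    def elevation_at(self, x, y):
        return 1000 + x + y  # placeholder arithmetic, not real terrain data


class ElevationStub:
    """Static stub: a typed client-side wrapper around one object reference."""

    def __init__(self, orb, reference):
        self._orb = orb
        self._reference = reference

    def elevation_at(self, x, y):
        return self._orb.invoke(self._reference, "elevation_at", x, y)


orb = ToyORB()
orb.register("Elevation", ElevationImpl())
reference = orb.resolve("Elevation")

# Path 1: dynamic invocation, operation named as a string.
print(orb.invoke(reference, "elevation_at", 2, 3))         # prints 1005
# Path 2: static stub, operation known when the client was written.
print(ElevationStub(orb, reference).elevation_at(2, 3))    # prints 1005
```

In both paths the client holds only a reference and never touches the implementation object directly, which mirrors how an object reference shields the target from client-side intervention.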
The Object Implementation (server-side object) receives a request as an up-call either through the IDL-generated skeleton or through a dynamic skeleton (Schmidt and Vinoski, 1998). The ORB locates the appropriate implementation code, transmits parameters, and transfers control to the Object Implementation through an IDL skeleton or a dynamic skeleton. In performing the request, the Object Implementation may obtain some services from the ORB through the Object Adapter. When the request is complete, control and output values are returned to the client (Vinoski, 1997; Orfali and Harkey, 1997).

2.4.2.3 Advantages and Disadvantages

The design of CORBA provides a scalable and flexible framework for distributed client/server components on the Internet and on intranets. CORBA follows comprehensive OMA guidelines with the full range of object services, common facilities, domain interfaces, and application interfaces. CORBA developers can create a sophisticated, well-organized object set whose elements can interact dynamically via the ORB. Well-defined object categories can facilitate communication between objects. CORBA implementation procedures can help programmers to address the most critical challenges in distributed network environments, such as monitoring object life cycles, global naming procedures, transaction services, licensing, and security.

A second advantage is that CORBA provides a pure object-oriented model, including encapsulation, inheritance, and polymorphism, in the object implementation and the language mapping methods. At the same time, CORBA objects can be implemented using traditional procedural languages, such as C, FORTRAN, and COBOL.

The third advantage is the extensibility for future development of distributed objects and components. CORBA has been developed for almost ten years, which is much longer than its competitors, DCOM and Java.
CORBA specifications introduced innovative designs and concepts for distributed network environments, such as self-describing, self-managing objects. Other distributed component technologies, such as the Java platform and DCOM, have followed the same concepts from the original CORBA and OMA specifications. Indeed, the development of CORBA illustrates the future direction for distributed network environments and distributed computing.

However, CORBA still has some drawbacks. The first problem with CORBA development is that desktop integration with Microsoft Windows-based environments is difficult. Although CORBA implementations support a wide range of mainframe and workstation UNIX platforms, CORBA provides only limited support for Windows NT applications and other Windows-based environments. Most Windows-based environments use DCOM technology, which, as stated above, is a closed architecture.

The second problem is a marketing issue. CORBA is not a free technology. Users have to purchase the development tools and implementation frameworks to develop CORBA objects, whereas DCOM and Java technology are available as free downloads for programmers to develop their applications. This marketing strategy limits the popularity of CORBA objects and applications.

The third problem is the slower development process compared with other distributed component technologies. Java and DCOM evolve very quickly, with many new functions released every year. The main reason for the slow evolution of CORBA is that all major changes and modifications must be approved by the OMG members in hundreds of different software companies. Although the democratic approach of CORBA can ensure the standardization of methods, it also slows the progress of CORBA relative to its competitors, which are each developed by a single software company. This presents a classic trade-off in standards development, namely the dynamic tension between institutional consensus and market competition.
To summarize, CORBA has a comprehensive, extensible, well-defined architecture to support complicated applications in distributed, heterogeneous network environments. The implementation of CORBA also supports both new and legacy languages and applications, which is a very important feature for the integration of distributed data and programs. The main goal of CORBA is to allow distributed business applications to work together seamlessly across a network. Many programmers prefer CORBA technology because of its innovative design and well-defined architecture. However, the integration of desktop computers will be a critical issue for the future success of CORBA development. One possible solution for desktop integration is to let Java technology bridge the gap between CORBA applications and desktop PCs. Java/CORBA integration has emerged as a new direction for CORBA development (Orfali and Harkey, 1997).

2.4.3 Java Platform

In contrast to DCOM and CORBA, the Java platform was originally developed as a programming language rather than as a distributed object framework. However, with the rapid growth of Java applications, the Java language has developed its own component framework, called JavaBeans, with architecture specifications for distributed computing. Currently, many Java-related technologies are already beyond the scope of a programming language. Its original developer, Sun Microsystems, Inc., calls all related Java technologies and specifications an integrated Java Platform.

“The Java programming language platform provides a portable, interpreted, high-performance, simple, object-oriented programming language and supporting runtime environment” (Gosling and McGilton, 1996, p. 11). The original goal of Java is to meet the challenges of application development in the context of heterogeneous, network-based distributed environments (Gosling and McGilton, 1996).
The key to Java’s power is its “write once, run anywhere” software model. The Java runtime environment executes Java byte-codes on a virtual machine that is available for any supported platform (Hamilton, 1996). Because of this powerful cross-platform capability, many software vendors and organizations have launched projects to explore the potential of the Java language and on-line applications (Halfhill, 1997). The following paragraphs briefly introduce the history of the Java language and the Java platform.

2.4.3.1 Java Development History

The Java language was developed at Sun Microsystems in 1990 by James Gosling as part of a research project to develop software for consumer electronics devices (Anuff, 1996; Lemay and Perkins, 1996; Harmon and Watson, 1998). The new language was originally called Oak. The purpose of Oak was to provide an object-oriented programming language and a software platform for smart consumer electronics, such as cable boxes, video game controls, and so on. In 1991, a team at Sun Microsystems called the Green Project began to work on Oak. Sun renamed the Oak language Java and introduced it to the public in 1995. Java technology has since become one of the most important developments in Internet history.

In 1997, Sun released Java 1.1, which includes many important new features and functions for distributed computing, including JavaBeans, Internationalization, a New Event Model, Jar Files, Object Serialization, Reflection, Security, JDBC, and RMI. One of the most important new features of Java 1.1 is JavaBeans, a model for creating reusable, embeddable software components that is similar to Microsoft’s ActiveX model. Two other significant features in Java 1.1 for distributed computing are the Remote Method Invocation (RMI) API and Java Database Connectivity (JDBC), which allow a Java program, respectively, to invoke methods of remote Java objects and to communicate directly with remote database management systems (DBMS) (Weber, 1997).
In late 1998, the Java 2 Platform was released and provided more advanced network-centered functions and APIs. The new content included a Java version of the ORB for the integration of CORBA, Java 2D APIs, Java Foundation Classes, and Java servlets for enterprise server-side applications (Horstmann and Cornell, 1998; Flanagan, 1999).

2.4.3.2 Java Language and Architecture

The Java language is a pure object-oriented language, designed to enable the development of secure, high performance, and highly robust applications on multiple platforms in heterogeneous, distributed networks (Gosling and McGilton, 1996). From the computer programming perspective, Java looks like C and C++ while discarding the overwhelming complexities of those languages, such as typedefs, defines, preprocessor, unions, pointers, and multiple inheritance (Gosling and McGilton, 1996). The design of the Java language draws on the best concepts and features of previous object-oriented languages, primarily Eiffel, Smalltalk, Objective C, and C++. Java also incorporates garbage collection and dynamic links from Lisp and Smalltalk, interface concepts from Objective C and OMG’s Interface Definition Language (IDL), packages from Modula, concurrency from Mesa, and exceptions from Modula-3 (Harmon and Watson, 1998, p. 62).

Figure 2-9. The Java Platform architecture (Harmon and Watson, 1998, p. 70). Compile-time environment: Java source code (.java), Java byte-codes compiler (javac.exe), Java byte-codes (.class), downloaded from the Internet. Run-time environment (Java Platform): class loader and byte-codes verifier, Java class library, the Java Virtual Machine (Java interpreter, just-in-time compiler, run-time system), operating system, hardware.

The architecture of the Java platform is illustrated in Figure 2-9. The implementation of Java applications involves two environments, a Compile-time environment (server-side) and a Run-time environment (client-side).
The Compile-time environment can be constructed by using the Java Development Kit (JDK) provided by Sun, which includes a Java compiler (javac.exe), a Java interpreter (java.exe), a Java debugger (jdb.exe), and several standardized Java libraries. Programmers can use the Java compiler to translate text-based Java source code into a Java class in byte-code format and put the class on the server-side machine. Then, the Java class is ready for download by client machines. The Run-time environment comprises three components: the Class loader, the Java class library, and the Java Virtual Machine. When a client requests a Java class, the client-side Virtual Machine will download the Java class via the Class loader and combine it with other required Java classes from the library. Then, the Java class will be interpreted or compiled into actual machine code in the Run-time system, which can be executed under the client-side operating system and hardware environment.

Besides the mobile class download functions, the Java platform also supports remote method invocations on objects across different Java Virtual Machines by using Remote Method Invocation (RMI). By using RMI, Java programmers can create a remote Java class with object serialization and create a client stub and a server skeleton for the communication between clients and servers. The implementation of RMI is very similar to the procedure of CORBA object implementation (Orfali and Harkey, 1997).

There are three types of Java programs: Java applications, Java applets, and Java servlets. Java applications are stand-alone programs. They do not need to be embedded inside an HTML file or executed through a Web browser. Java applications have full access to local machine resources, such as writing files and changing database contents.
Also, Java applications run faster than Java applets because the applications do not need to deal with browsers and have full control of the local client environment. A Java applet is a specific kind of application that can only run from within a Web browser which contains the Java Virtual Machine. In contrast to a Java application, a Java applet must be included as part of a Web page in HTML format. Java applets are designed for the WWW and can be dynamically downloaded via the Internet. In order to protect Web users and prevent possible damage to local machines, Java applets execute within a closed, secure Web browser environment and have only limited access to the memory, data, and files on the local machine. More recently, server-side Java programs, called Java servlets, have become more and more important for distributed computing environments and the Internet. Java servlets can let a user upload an executable program to the network or server. These servlets can actually be linked into the server and extend the capabilities of the server. By interacting with server-side applications, a Java servlet can share the load between servers and clients. The result will reduce server load and balance functionality between server and client machines.

Most programmers and technology consultants are very optimistic about the future development of Java technology. The main reason is that the Java language is truly designed for distributed network environments, such as the Internet and Intranets. In the future, Java technology will embrace more new functions and APIs in order to cope with the rapid development of network technology.

2.4.3.3 Advantages and Disadvantages

In general, Java provides a simple and creative way to develop, manage, and deploy distributed client/server applications. It also provides an easy way to distribute and update applications and programs immediately via the Internet.
From a distributed computing perspective, there are four advantages of Java technology. The first advantage of Java is that it provides a dynamic component framework of Java applets and servlets. The use of Java applets and servlets in Web applications can facilitate a more dynamic and efficient interaction between client and server. Therefore, the Java platform can provide a truly distributed computing environment that balances server-side and client-side processes. The second advantage of Java technology is the similarity between the Java language and the C++ language. The similar syntax and statements allow software engineers to develop powerful Java applications without too much struggle, and programmers with C++ experience can shift to Java programming very quickly. Java programming has therefore become more and more popular. The third advantage of Java is its robust performance with cross-platform capability. Traditional programming languages, such as C++, cannot provide such robust, cross-platform programs because “their designs primarily support for programmer-directed memory allocation and de-allocation, pointers and pointer arithmetic, multiple inheritance and procedural features such as functions, structures, union, typedefs, defines, and pre-processor directives, including macros” (Knapik and Johnson, 1998, p.279). Since Java gets rid of many problematic designs and functions in traditional programming languages, the execution of Java programs becomes more robust and reliable across different platforms. The fourth advantage is the dynamic binding feature of Java with the downloadable Java-applets framework. “Imagine a multi-media word processor written in Java.
When this program is asked to display some type of data it has never encountered before, it might dynamically download a class from the network that can parse the data, and then dynamically download another class (probably a Java bean) that can display the data within a compound document. A program like this uses distributed resources on the network to dynamically grow and adapt to the needs of its user” (Flanagan, 1997, p.5). The dynamic download of new classes will facilitate the sustainable growth of Java applications in heterogeneous network environments.

On the other hand, Java technology still has some weaknesses. First, the Java platform does not provide a standardized distributed object infrastructure, such as CORBA’s OMA. Many different software companies develop unique Java libraries and applications with non-standard frameworks. Without standardized categories, open architecture and integration between different packages and libraries will remain very difficult. Second, the performance of Java byte-code programs is slower than that of native machine-level binary code programs written in C++ or other languages. Sun provides some solutions for improving Java program performance, such as the Just-in-Time compiler (JIT) and Java Chips. However, the general performance of Java applets and applications is still slower than that of traditional programs.

To summarize, Java is a simple, object-oriented, distributed, interpreted, robust, secure, architecture neutral, portable, high-performance, multi-threaded, and dynamic binding language (Anuff, 1996). Java technology is still evolving and changing. The great success of Java technology has changed the nature of the Internet and the World Wide Web. In the future, Java technology may extend its territory to smart electronic devices, such as interactive TVs, smart air conditioners, smart microwaves, and palm-size GPS applications (Horstmann and Cornell, 1998).
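The dynamic binding described in this section, in which a program selects and loads a class by name at run time, can be sketched as follows. This is an illustrative sketch under stated assumptions, not an implementation from this research: the `DataViewer` interface, the two viewer classes, and the data type strings are all hypothetical, and the classes are loaded from the local class path rather than downloaded from a network.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical plug-in contract; it is not part of any Java library.
interface DataViewer {
    String render(String data);
}

final class TextViewer implements DataViewer {
    public String render(String data) { return "text:" + data; }
}

final class MapViewer implements DataViewer {
    public String render(String data) { return "map:" + data; }
}

final class DynamicBindingSketch {
    // Data type -> class name. In a networked setting the name could come
    // from metadata; here the table is filled in locally (assumption).
    private static final Map<String, String> VIEWERS = new HashMap<>();
    static {
        VIEWERS.put("text/plain", TextViewer.class.getName());
        VIEWERS.put("application/x-map", MapViewer.class.getName());
    }

    // Resolve the class by its name at run time and create an instance.
    static DataViewer viewerFor(String dataType) {
        String className = VIEWERS.get(dataType);
        if (className == null) {
            throw new IllegalArgumentException("no viewer registered for " + dataType);
        }
        try {
            Class<?> loaded = Class.forName(className);
            return (DataViewer) loaded.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(viewerFor("text/plain").render("hello"));
        System.out.println(viewerFor("application/x-map").render("roads"));
    }
}
```

The calling code depends only on the `DataViewer` interface, so a new data type can be supported by registering another class name without changing `viewerFor`.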
2.5 Chapter Summary

This chapter reviewed the major topics in distributed computing, including the development histories of network technology, distributed systems and open systems, and a detailed introduction to three different distributed component frameworks: DCOM, CORBA, and Java. In general, the progress of network technology provides a modern hardware/software infrastructure for distributed GIServices. The concepts of distributed systems and open systems facilitate the shift of GIS architecture from a centralized system to distributed services. The in-depth description of the three distributed component frameworks (DCOM, CORBA and Java platforms) illustrates the possible choices of technical framework for distributed GIServices, and should help the GIS community and software designers understand the potential capabilities and the technical limitations of distributed GIServices.

In the next chapter, distributed component technologies will be shown to provide great promise for future GIServices development, such as the dynamic embedding and linking of GIS functions via the network, self-managing and self-describing GIS components, and “write-once-run-everywhere” types of software coding. There are constraints on these technologies, such as vendor-dependency, complex software specifications and design, and the lack of integration between different component frameworks. To deploy a distributed GIService architecture, the GIS community has to confront the limitations and drawbacks of distributed component technologies, and utilize their potential capabilities, such as dynamic binding, self-managing components, remote method invocations, etc. The following discussion illustrates the major considerations of adopting distributed component technologies from the GIService-oriented perspective.
First of all, selecting the right component technology for distributed GIServices is extremely difficult. The selected technology should provide a robust, secure, and efficient communication mechanism via the Internet/Intranet. Security and stability will become major considerations for distributed GIServices because many geospatial datasets and services are valuable and critical, and networking brings opportunities for viruses, hacker attacks, network traffic jams, etc. The integration of legacy systems must be another criterion for GIServices because many valuable GIS programs providing essential services reside in legacy systems. The best example of this is federally produced public domain GIS data such as census data. Current distributed component technologies usually provide certain approaches to integrate legacy systems, such as object-wrapping and middleware solutions. However, these approaches may reduce the performance of legacy systems or simply cannot be applied in some specific cases. The third criterion is the future development of these technologies. Many people think that a superior technology will guarantee successful adoption in the future. However, many cases in the computing industry contradict this assumption, such as the failures of the NeXT operating system, OpenDoc, and IBM’s OS/2. The best technology does not ensure ongoing development automatically. Support from software vendors, marketing strategies, and users’ feedback will also decide the future development of component technologies. Thus, to choose an appropriate distributed component technology, one should consider not only its technical features and implementation details, but also actual users’ experiences, vendor support, and marketing strategies.

Second, customizing these technologies for distributed GIServices is the major task for distributed GIServices research.
Distributed component technologies are not designed specifically for GIServices, but for general information services. Many requirements and functionality of GIServices are not considered in the original design of generic distributed component technologies. For example, the complexity of geodata models and functions, the huge volume of geospatial databases and remote sensing images, and the visualization requirements of geospatial information are not taken into account. Adequate GIS functions above the low-level technical frameworks are essential for the successful implementation of distributed GIServices. This dissertation will address detailed design issues in a later chapter. Third, integrating different distributed component technologies is essential for providing truly distributed GIServices. Inevitably, the future development of distributed GIServices will have to tolerate heterogeneous techniques and frameworks, because there is no perfect distributed component technology for all kinds of GIServices. A sound technology should be able to integrate and migrate different frameworks. Currently, most distributed component technologies have proposed solutions for integrating other technologies. However, only few successful cases demonstrate that these approaches really work. Software vendors are not willing to integrate their technology with others because of marketing considerations. The GIS community should push these vendors for a true integration of distributed component technologies, otherwise it will not happen automatically. It is dangerous for the GIS community to just wait and see what happens. Ironically, both the OpenGIS specifications and ISO/TC 211 (discussed in the next chapter) do not propose integration solutions for heterogeneous distributed component frameworks. These tasks are left to software vendors and the computer industry (OGC, 1998). 
This dissertation will propose an agent-based mechanism for the integration of heterogeneous distributed component technologies in a later chapter.

To summarize, maximizing the capability of GIServices by using distributed component technologies is the main goal of distributed GIServices. Traditional GISystems do not provide users with flexible and dynamic services. The future development of distributed GIServices should provide innovative GIS functions and services instead of mimicking the original functions of GISystems. Putting traditional GISystems on-line does not amount to distributed GIServices. The GIS community should invent new services and functions specifically for distributed GIServices, such as digital libraries, distance learning, cyberspace navigating, network-based decision support systems, virtual tourism, etc. Innovative GIS services and functions will raise the development of distributed GIServices to a higher level and provide users with more comprehensive services.

CHAPTER 3. OVERVIEW OF DISTRIBUTED GISERVICES

The development of distributed GIServices has become more and more essential as the GIS community becomes reliant on the Internet. The major need of the GIS community is to “develop a high level GIS federation, i.e., fully interoperable, where users can transparently access remote services, and yet they still maintain their autonomy” (Bishr Yaser, 1996, p. A.1). The early research on on-line, distributed GIServices (Gardels, 1996; Plewe, 1997; Tang, 1997) has been motivated by the adoption of an open and distributed architecture and the re-design of GIS metadata and component frameworks in the GIS community. The literature review of distributed GIServices will focus on development history, on standards for distributed GIServices, and on metadata development.

3.1 The History of GIServices

Three projects are of primary importance in the development of GIServices.
They are important because these projects initiated the design of preliminary distributed GIServices frameworks and the adoption of early Internet technologies, such as HTTP and CGI programming. It is important to note in reading this history that a disconnection exists between a focus on services as provided by programs and a focus on data as specified by standards organizations. Without standardized interfaces between procedures, GIServices cannot become fully distributed.

3.1.1 The Xerox PARC Map Viewer

The Xerox PARC Map Viewer is the earliest prototype of distributed GIServices, concurrent with the rapid development of the Web. The Map Viewer was developed at Xerox Corporation’s Palo Alto Research Center as an experiment in providing interactive information retrieval via the World Wide Web (Putz, 1994). The Map Viewer is an interactive Web application, which combines the ability of HTML documents to include graphical images with the ability of HTTP servers to create new documents in response to user input (Figure 3-1). The Map Viewer used a customized server module (a CGI program) written in the PERL scripting language. Map images in GIF format were generated by two separate utility programs on a UNIX server. The first program, MAPWRITER, produced raster map images from two public domain vector map databases. The second program, RASTOGIF, converted raster images to GIF format. In subsequent work, the Xerox Map Viewer was integrated with the U.S. Gazetteer WWW services created by Plewe (1997) to provide a text-based query function, which is essential for a complete prototype of distributed GIServices.

Figure 3-1. The Xerox Map Viewer.

3.1.2 GRASSLinks

GRASSLinks was developed by Huse (1995). GRASSLinks was the first fully functional on-line GIService connecting GRASS GIS software (from the US Army Corps of Engineers) and the World Wide Web.
GRASS is a grid-based GIS package offering public domain access to environmental and geographical data. The development of GRASSLinks was supported by the Research Program in Environmental Planning and GIS (REGIS) at the University of California at Berkeley. To utilize GRASSLinks, a user only needed a Web browser (Figure 3-2). GRASSLinks encouraged cooperation and data sharing between different environmental agencies. In traditional GIS applications, each agency would maintain its own database as well as data obtained from other sources. GRASSLinks introduced a new model of data sharing, where each agency could maintain the data it produced and access other agencies’ data over the network as needed (Huse, 1995). GRASSLinks could perform many GIS operations, including map display, query, overlay, reclassification, buffering, and area calculation. On-line users could save their work temporarily on the server and retrieve files later. GRASSLinks demonstrates an ideal prototype for high-end distributed GIS functions, and provides an example of the first true on-line GIServices (Plewe, 1997).

Figure 3-2. GRASSLinks and its GIS operations.

3.1.3 Alexandria Digital Library Project

The Alexandria Digital Library (ADL) Project illustrated a digital library framework for heterogeneous spatially-referenced information, which can be accessed across the Internet. The Alexandria Project was launched in 1994 concurrent with five other digital library projects (NSF, 1994). Many important collections of information, such as maps, photographs, atlases, and gazetteers, are currently stored in a non-digital form, and collections of considerable size and diversity are found only in large research libraries.
ADL provides a framework for putting these collections on-line, providing search and access services to a broad class of users, and allowing both collections and users to be distributed throughout the Internet (Goodchild, 1995; Buttenfield and Goodchild, 1996; Buttenfield, 1998).

Figure 3-3. The HTML/CGI version of the Alexandria Digital Library Project.

The major contributions of the ADL project are to introduce the digital library services metaphor for distributed GIServices, and to extend the types of GIServices to cataloging, gazetteer searching, and metadata indexing. A third contribution of ADL is to explore Internet-based interface design processes. ADL utilized three different technologies for the actual implementation. The first version ran as a customized ArcView project. The second version was based on HTML and CGI programs (Figure 3-3). The final version ran as a Java applet and Java application. However, the incompatible technologies caused inconsistencies in data integration and delivery of services. The ADL user interfaces proved difficult to migrate to each new version, and each one cost significant money and effort to redevelop. Overall, the ADL project explored different computer technologies and frameworks, identified major tasks of digital libraries, and became the first online service to provide comprehensive metadata browsing, display, and query functions for geospatial information. More recently, the Alexandria Digital Earth Prototype (ADEPT) has followed on from the Alexandria Digital Library Project (ADL). ADEPT aims to use the digital earth metaphor for organizing, using, and presenting information at all levels of spatial and temporal resolution, with a specific focus on geodata and images in California.

3.2 Standards for Distributed GIServices

The three examples mentioned above illustrate the early development of distributed GIServices and their major contributions.
However, these projects and prototypes utilize different types of Internet technologies and frameworks, and cannot share their data sets with each other. Therefore, the GIS community also aims to develop a standardized framework for the interoperability of GIServices. With the comprehensive architecture for bridging heterogeneous GIServices, researchers and scientists can easily share their geospatial data and GIS models. Two major organizations that set standards for distributed GIServices are the Open GIS Consortium, Inc. (OGC) and the Technical Committee tasked by the International Standards Organization (ISO/TC211), both founded in 1994 (Buehler and McKee, 1998; Rowley, 1998). The main task of OGC is the full integration of geospatial data and geoprocessing resources into mainstream computing and the widespread use of interoperable geoprocessing software and geodata products throughout the information infrastructure (OGC, 1998). ISO/TC211 emphasizes a service-oriented view of geoprocessing technology and a balanced concern for information, application, and systems (Kuhn, 1997). 3.2.1 The OpenGIS Specification In 1993, the Open GIS Consortium, Inc. (OGC) proposed a comprehensive software architecture called the Open Geodata Interoperability Specification (OGIS), which supported distributed geoprocessing and geodata interoperability in distributed network environments. The OGIS project obtained support from federal agencies and commercial organizations. OGC renamed OGIS to the OpenGIS Specification later on. The OpenGIS Specification defines a comprehensive software framework for distributed access to geodata and geoprocessing resources. The OpenGIS Specification includes an Abstract Specification and a series of Implementation Specifications for various distributed computing platforms (DCPs), such as CORBA, OLE/COM, Structured Query Language (SQL), and Java. 
Software developers use OpenGIS conformant interfaces to build distributed GIServices, which include middle-ware, component-ware, and applications. These distributed GIServices will be able to handle a full range of geodata types and geoprocessing functions. “The OpenGIS Specification provides a framework for software developers to create software that enables their users to access and process geographic data from a variety of sources across a generic computing interface within an open information technology foundation” (Buehler and McKee, 1998, p.7). The contents of the OpenGIS Specification (both the abstract and implementation specifications) are based on three conceptual models. The Open Geodata Model (OGM) provides a common data model, using object-based and/or conventional programming methods. OpenGIS Services define the set of services needed to access and process geodata defined in the OGM and provide the capabilities to share geodata with the GIS community. An Information Communities Model employs the OGM and OpenGIS Services in a scheme for automated translation between different geographic feature lexicons. Together, this establishes communication mechanisms among different communities of geodata producers and users (Buehler and McKee, 1998). In general terms, OGM provides a common means for digitally representing the Earth and Earth phenomena, mathematically and conceptually. OpenGIS Services implement geodata access, management, manipulation, representation, and sharing between information communities (Buehler and McKee, 1996; 1998). These three conceptual elements of the OpenGIS Specification formalize the contents of the Abstract Specification and the Implementation Specification. The two types of specifications focus on the establishment of high-level software design and technology-centered implementation.
The following section will introduce the two specifications and outline three conceptual models to demonstrate the OpenGIS standard in practice.

3.2.1.1 The OpenGIS Abstract Specification

The OpenGIS Abstract Specification presents a high-level abstraction defining characteristics for designing geospatial data models and services. The OpenGIS Abstract Specification also demonstrates the two central technology themes of OGC, which are sharing geospatial information and providing geospatial services (OGC, 1998). The contents of the OpenGIS Abstract Specification embrace fourteen different GIS topics (Table 3-1).

Number      Topic
Topic 0     Abstract Specification Overview
Topic 1     Feature Geometry
Topic 2     Spatial Reference Systems
Topic 3     Locational Geometry
Topic 4     Stored Functions and Interpolation
Topic 5     The Open GIS Feature and Feature Collections
Topic 6     The Coverage Type
Topic 7     Earth Imagery
Topic 8     Relations between Features
Topic 9     Quality
Topic 10    Transfer Technology
Topic 11    Metadata
Topic 12    The Open GIS Service Architecture
Topic 13    Catalog Services
Topic 14    Semantic and Information Community

Technology Goals (in the order listed in the table): Prerequisites for Sharing Geospatial Information; Sharing Geospatial Information; Prerequisites for Sharing Geospatial Information; Providing information theoretic content for information communities; Prerequisites for Sharing Geospatial Information; Providing Geospatial Services; Providing information theoretic content for information communities.

Table 3-1. The contents of the OpenGIS Abstract Specification.

The OpenGIS Abstract Specification is modified and edited at each OGC Technical Committee Meeting (roughly twice per year). Following the object-oriented approach developed by Cook and Daniels (1994), the OpenGIS Abstract Specification provides an Essential Model for geographic information representation and processing (Buehler and McKee, 1998). The Essential Model proposed by OGC contains nine levels of abstraction.
1.
Real World: This is the world as it is. The Real World means the collection of all facts, whether they are known by mankind or not.
2. Conceptual World: This is the world of things people have noticed and named. The method by which the conceptual world interfaces to the Real World is the extraction of the essence of a fact.
3. Geospatial World: This is the cartoon-like world of maps and GIS, in which specific things in the Conceptual World are selected to represent the Real World in an abstract and symbolic way, using maps and geodata.
4. Dimensional World: This is the Geospatial world after it has been measured for geometric and positional accuracy.
5. Project World: This is a selected piece of the Dimensional World (certain thematic layers in a GIS, for example) which are structured semantically for a particular purpose, profession, discipline, or industry domain.
6. OpenGIS Points: How points are defined, either generically or for a particular Project World, in a way that all software systems can relate.
7. OpenGIS Geometry: How geometry is constructed based on OpenGIS Points, in a way to which all software systems can relate.
8. OpenGIS Features: How features are constructed from geometry, attributes, and a spatial referencing system, in a way that lends itself to use in open interfaces for geoprocessing. OpenGIS Features are digitally coded abstractions of real-world entities that have a geometric representation, and spatial, temporal, and other attribution.
9. OpenGIS Feature Collections: A Feature Collection is the unit of trade in a geoinformation sharing transaction, and the primary object of manipulation within a geospatial software processing environment. OpenGIS Feature Collections can be of any size and content depending on the context of the transaction. (Buehler and McKee, 1998, p.
38 and 41)

The first five levels of the OGC’s Essential Model are based on Cook and Daniels’ theory (1994), which deals with the abstraction of real world facts, and are not modeled in software. The final four levels of the OGC’s Essential Model deal with mathematical and symbolic models of the world, and thus are subject to being modeled in software (Buehler and McKee, 1998). However, the Essential Model proposed by the OGC focuses on the design of generic data models without considering different GIS operations and distributed processes via the networks. In fact, network-based GIS processes may require different types of data models. If geospatial information is distributed across the Internet, the design of the data model will need to emphasize the integration of heterogeneous data formats and the adoption of distributed computing technologies in order to facilitate distributed GIS operations.

3.2.1.2 OpenGIS Implementation Specifications

OpenGIS Implementation Specifications follow the models proposed in the OpenGIS Abstract Specification. The Implementation Specifications give explicit instructions for interoperability with other OpenGIS Specification-conformant software written by other developers around the world. Application developers and software programmers are the primary users of the OpenGIS Implementation Specifications, which define explicit Application Programming Interfaces (APIs) for accessing geodata and geoprocessing functions (Buehler and McKee, 1998). In contrast to the OpenGIS Abstract Specification, which is created by the OGC’s Technical Committee, Implementation Specifications are created by GIS software vendors, based on specific Distributed Computing Platforms (DCP). As parts of the Abstract Specification are completed, the OGC invites proposed Implementation Specifications from software vendors (Buehler and McKee, 1998).
Vendors submit DCP-specific Implementation Specifications (such as OLE/COM, CORBA, Java, or SQL), which are reviewed by the Technical Committee. Once accepted, the OpenGIS Implementation Specifications specify in DCP-specific terms the functionality of particular OpenGIS interfaces and services. Currently, few OGC Implementation Specifications are available. For example, three Implementation Specifications (for CORBA, OLE/COM, and SQL) were released for Simple Features in March 1997 (OGC, 1997a; 1997b; 1997c). Simple Features Specifications implement topic 8 in the OpenGIS Abstract Specification. These Simple Feature Specifications provide interfaces that allow GIS software engineers to develop applications incorporating the definitions of OpenGIS Features and Geometry (from the Essential Model) using several DCP-specific technologies. These standardized specifications will facilitate horizontal software integration (with graphical user interfaces, database connectivity, or task management) and vertical software integration (with different GIS software vendors and packages under the same DCP) (Buehler and McKee, 1998).

3.2.1.3 The OpenGIS Standard in Practice

Two fundamental geographic types are recognized in the OpenGIS Specification, called features and coverages. Both features and coverages can be used to map real world elements or phenomena to an OpenGIS Specification representation, which ultimately provides a common way to describe these elements in software design (Buehler and McKee, 1998). A feature is a representation of a real world element or an abstraction of the real world. It includes a spatial domain, a temporal domain, or a spatial/temporal domain as an attribute. Features are usually managed in groups as feature collections. A GIS thematic map layer that shows roads, for example, is a collection of features.
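The relationship between features and feature collections can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the class names `Feature` and `FeatureCollection`, their fields, and the `select` helper are hypothetical and are not taken from any OpenGIS Implementation Specification.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple

Coordinate = Tuple[float, float]


@dataclass
class Feature:
    """A digital abstraction of one real-world element (hypothetical structure)."""
    feature_id: str
    geometry: List[Coordinate]                      # spatial domain: a simple vertex list
    valid_time: Optional[Tuple[str, str]] = None    # temporal domain (start, end), if any
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class FeatureCollection:
    """A group of features managed together, such as one thematic map layer."""
    name: str
    features: List[Feature] = field(default_factory=list)

    def add(self, feature: Feature) -> None:
        self.features.append(feature)

    def select(self, key: str, value: Any) -> List[Feature]:
        """Return the features whose property `key` equals `value`."""
        return [f for f in self.features if f.properties.get(key) == value]


# A roads layer is a collection of road features.
roads = FeatureCollection("roads")
roads.add(Feature("r1", [(0.0, 0.0), (1.0, 1.0)], properties={"class": "highway"}))
roads.add(Feature("r2", [(1.0, 1.0), (2.0, 1.5)], properties={"class": "local"}))
highways = roads.select("class", "highway")
print([f.feature_id for f in highways])
```

In this sketch the collection, not the individual feature, is the unit that would be exchanged or queried, which mirrors the role the text assigns to feature collections.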
Features, as the basic elements of geospatial information, include geometry, semantic properties, and metadata (Gardels, 1996; Buehler and McKee, 1998).

A coverage is an association of points within a spatial/temporal domain. “A coverage in the OpenGIS Specification is simply a function which can return its value at a geometric point. Scalar fields (such as temperature distribution), terrain models, population distributions, satellite images and digital aerial photographs, bathymetry data, gravitometric surveys, and soil maps can all be regarded as coverages” (Buehler and McKee, 1998, p. 42). In a coverage, a data value is associated with every location. Since coverages have all of the characteristics of features, they become a subtype of feature (Buehler and McKee, 1998).

In practice, OpenGIS Services create feature collections, share feature and project schema, share metadata, discover data through catalogs, traders, and standard imaging functions (Buehler and McKee, 1998). The architecture for these services basically follows the ISO Reference Model for Open Distributed Processing (RM-ODP). The specification for the OpenGIS Services Architecture is called the OpenGIS Technical Reference Model, which is still under development. The Reference Model includes definitions of interfaces and the behaviors that they may support. The same services may be supported by multiple, well-known geospatial data types (WKTs) without modification (Buehler and McKee, 1998). Following the design of ISO RM-ODP, the OpenGIS Services include applications, shared domain services, common facilities, distributed computing and object services, platform services, and external entities (OGC, 1998, topic 12). These services interoperate according to the architecture shown in Figure 3-4.

Figure 3-4. The Open GIS Technical Reference Model (OGC, topic 12, 1998).

Applications are custom-built computer programs that allow users to perform specific tasks (e.g. a buffering program).
Shared Domain Services are computer programs that are specific to a single information domain (e.g. transportation, healthcare, geospatial domain, etc.). Common Facilities are computer programs that provide general support across multiple domains (e.g. spreadsheets and word processors). Platform Services and External Entities are the software that communicates between the hardware and operating system and the other devices on the platform, such as databases and modems (OGC, 1998, topic 12).

Figure 3-5. The Geospatial Domain Services (OGC, topic 12, 1998).

Distributed GIServices relate in particular to the Shared Domain Services called Geospatial Domain Services (Figure 3-5). “Geospatial Domain …[Services] will be defined by the Open GIS Consortium to ensure that the Open GIS Services Architecture can be realized with standards-based, Commercial-Off-The-Shelf (COTS) products available from multiple vendors” (OGC, 1998, topic 12, p. 9). The intention is that, once complete, the OpenGIS standards will allow (for example) a feature generalization routine written by one vendor to interoperate with a coordinate transformation routine written by another vendor, transparently to the user. However, neither the service standards nor the architecture underlying the services has been developed.

Several problems need to be addressed here based on the current specification of the OpenGIS Service Architecture. First, the classification of Geospatial Domain Services is quite arbitrary and ambiguous, and it lacks high-level integration. Some contents overlap with others. For example, some contents of Geospatial Domain Access Services duplicate those of Feature Manipulation Services and Geospatial Display Services. Many similar types of services sharing the same GIS functions should be included under a single service domain; Geospatial Annotation Services and Geospatial Symbol Management Services, for example, could be integrated.
Second, the separation between geospatial datasets and remote sensing images will cause a serious problem for the integration of true geospatial information services because many GIServices will require both types of datasets at the same time. Third, the OpenGIS Service Architecture does not specify any approach for dynamic binding of these geospatial domain services or business objects. The specification should address how to combine these GIServices for different applications across the Internet. This dissertation will provide a different perspective of GIServices architecture with a high-level classification and integration of GIServices, and propose an agent-based approach for the dynamic binding and combination of different GIServices.

A third component of the OpenGIS standard in practice is the OGC’s Information Communities Model. The Model provides for automated translation between different geographic feature lexicons, and establishes communications among different communities of geodata producers and users. The purpose of the Information Communities Model is to share information between geospatial databases with inconsistent definitions, and help communication between different communities, who describe geographic features in different ways (Buehler and McKee, 1998).

An information community is a collection of people (a government agency or group of agencies, a profession, a group of researchers in the same discipline, corporate partners cooperating on a project, etc.) who, at least part of the time, share a common digital geographic information language and share common spatial feature definitions. This implies a common world view as well as common abstractions, feature representations, and metadata. The feature collections and geoprocessing functions that conform to the Information Community's standard language, definitions, behaviors, and representations belong to that Information Community (Buehler and McKee, 1998, p. 53).
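The idea of automated translation between feature lexicons can be illustrated with a small sketch. The Python below is a minimal illustration under stated assumptions, not an OGC-defined interface: the two vocabularies, the mapping table `HIGHWAY_AGENCY_TO_COUNTY`, and the function `translate_feature` are all hypothetical.

```python
from typing import Dict

# Hypothetical lexicons: a highway agency and a county office name the
# same kinds of road features differently.
HIGHWAY_AGENCY_TO_COUNTY: Dict[str, str] = {
    "interstate": "major_road",
    "arterial": "major_road",
    "collector": "minor_road",
}


def translate_feature(feature: Dict[str, str], mapping: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `feature` with its type renamed into the target lexicon.

    Raises KeyError when the source term has no agreed equivalent, so that
    untranslatable features are reported rather than silently mislabeled.
    """
    source_type = feature["type"]
    if source_type not in mapping:
        raise KeyError(f"no translation for feature type {source_type!r}")
    translated = dict(feature)          # leave the source record untouched
    translated["type"] = mapping[source_type]
    return translated


source = {"id": "I-25", "type": "interstate"}
target = translate_feature(source, HIGHWAY_AGENCY_TO_COUNTY)
print(target)
```

A lookup table of this kind only handles one-to-one or many-to-one renaming; communities whose feature definitions differ in structure as well as in name would need a richer mapping than this sketch assumes.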
Although the OpenGIS Information Communities Model has not been fully developed, there are some interesting perspectives on its preliminary design. One is that inter-community sharing of geodata will be achieved through software Semantic Translators. A Semantic Translator can be used by an Information Community to filter its view (in the database sense) of the data in another Information Community. “The Semantic Translator will contain all of the information it needs to find and translate feature collections from the source to the target semantics” (Buehler and McKee, 1998, p. 55). Another interesting concept, called a Trader, will help an Information Community determine what information will be exposed and with whom it will be shared. However, current OGC documents do not specify any actual implementation approaches or functions for Semantic Translators and Traders. This dissertation will introduce an agent-based mechanism, similar in role to the Semantic Translators and Traders, to achieve the goal of OGC’s Information Communities Model. In contrast to the design of a single, full-knowledge-based Semantic Translator and Trader, the agents designed in this dissertation will focus on the collaboration of multiple agents, GIService components, and geospatial datasets. An in-depth explanation of agent-based communication mechanisms will be addressed in a later chapter.

To summarize, the primary emphasis of OGC’s Specifications is on the interoperability of geospatial data models instead of distributed processing. However, fully distributed GIServices cannot happen without open standards and communication mechanisms for distributed processes. This dissertation will focus on the network features of distributed GIS processing and the communication mechanisms for GIServices. The OGC’s standard-building effort illustrates that the success of establishing comprehensive distributed GIServices requires collaboration among GIS community members.
Another major player is ISO/TC 211, setting distributed GIServices standards for the international community.

3.2.2 The ISO 15046 Standard and ISO/TC 211

ISO/TC 211 is the Technical Committee of Geographic information/Geomatics tasked by the International Organization for Standardization (ISO) to prepare a family of geographic information standards in cooperation with other ISO technical committees preparing related information technology standards. ISO is a worldwide federation of national standards bodies (ISO member bodies) including many different stakeholders from government, authorities, industry and professional organizations. Currently, ISO/TC 211 is working on the International Standard ISO 15046, which is a multi-part International Standard under the general title, Geographic Information, consisting of 20 different parts. The ISO 15046 Standard specifies methods, tools, and services for managing, processing, analyzing, accessing, presenting, and transferring geospatial data in digital form between users, systems, and locations. ISO/TC 211 now has 25 active member nations from all over the world, plus 16 observing member nations. In addition there are 19 liaison organizations, of which OGC was one of the first. Two other liaison organizations that will base their future revisions on ISO/TC 211 standards include the North Atlantic Treaty Organization (NATO), through its Digital Geographic Information Working Group (DGIWG), and the maritime society through the International Hydrographic Organization (IHO) (Kuhn, 1997; Buehler and McKee, 1998; ISO/TC211/WG 1, 1998b). ISO 15046 proposes a standard framework for the description and management of geographic information and geographic information services.
The main goals of the ISO 15046 Standard are to: “increase the understanding and usage of geographic information; increase the availability, access, integration, and sharing of geographic information; promote the efficient, effective, and economic use of digital geographic information and associated hardware and software systems; and contribute to a unified approach to addressing global ecological and humanitarian problems” (ISO/TC211/WG 1, 1998a, p. V). The standard framework of ISO 15046 is based on five major areas, which incorporate information technology concepts to standardize geographic information (Figure 3-6). The first area is Framework and Reference Model, which identifies how components fit together. The Reference Model provides a common basis for data sharing and communication. The second area is Geographic Information Services, which define the encoding of information in transfer formats and the methodology for cartographic presentation of geographic information. Services also include satellite positioning and navigation systems. Data Administration is the third area, which focuses on the description of quality principles and quality evaluation procedures for geographic information datasets. Data Administration also includes the description of metadata, together with feature catalogues. The fourth major area, Data Models and Operators, is concerned with the underlying geometry of the globe and how geographic or spatial objects may be modelled (as points, lines, surfaces and volumes). The fifth area, Profiles and Functional Standards, considers the technique of putting together packages/subsets of the total set of standards to fit individual application areas or users. For example, different countries may have different profiles for their own geospatial datasets. This supports rapid implementation and penetration in user environments. 
Equally important is the task of absorbing existing de facto standards from the commercial sector and harmonizing them with profiles of the emerging ISO standards (ISO/TC211/WG 1, 1998a).

[Figure 3-6: diagram in which geographic information concepts (spatial reference, temporal reference, spatial properties, spatial operations, topology, quality) and information technology concepts (Open Systems Environment, Information Technology Services, Open Distributed Processing, Conceptual Schema Languages) feed the five areas of the standard.]

Figure 3-6. Integration of geographic information and information technology in ISO 15046 Standard (ISO/TC211/WG 1, 1998a).

Based on the framework of ISO 15046, ISO/TC 211 created five Working Groups (WG) and identified twenty parts of the ISO 15046 Standard in order to cover the full range of standardization issues that need to be addressed. Working Group 1 is tasked with the framework and reference model: providing an overview, defining terminology for the standards, and specifying a conceptual schema language along with methods for testing conformance. WG1 is also involved in the integration of imagery and grid data standards. Working Group 2 focuses on the geospatial data models and operations, which include the spatial schema and temporal schema for geographic information, the rules for application schema, the classification of geographic objects and their relationships, and the spatial operators for accessing, querying, managing, and processing geographic information.
Working Group 3 focuses on geospatial data administration, including the feature cataloguing methodology, the guidelines for spatial referencing by both coordinates (geodetic reference) and geographic identifiers (indirect spatial reference), geographic data quality principles and evaluation procedures, and the definition of metadata schema. Working Group 4 is tasked with the geospatial services, including positioning services, portrayal definition, encoding rules for geographic information, the service interface, and the relationship to the Open System Environment Model. Working Group 5 focuses on the profiles and functional standards. WG5 will define guidelines for profiles and functional standards for other international standardization efforts, and will try to harmonize those standards with the ISO/TC 211 standards (ISO/TC 211 Secretariat, 1998; 2000).

3.2.2.1 The Reference Model for ISO 15046 Standard

The core concept of the ISO 15046 Standard is illustrated in the Reference Model, which is a guide to ensure an integrated and consistent approach to structuring the ISO 15046 Standard. The Reference Model for the ISO 15046 Standard plays the same role as the OpenGIS Abstract Specification for the OpenGIS Standard. This reference model uses concepts derived from the ISO/IEC Open Systems Environment (OSE) approach for determining standardization requirements, the ISO/IEC Open Distributed Processing (ODP) Reference Model, and other relevant ISO standards and technical reports (ISO/TC211/WG 1, 1998a). Similar to the three conceptual models (OGM, OpenGIS Services, and Information Communities) in the OpenGIS Abstract Specification, the Reference Model has four conceptual components: Conceptual Modelling, the Domain Reference Model, the Architectural Reference Model, and Profiles. Their relationships and major tasks are summarised and explained in the following paragraphs.
Conceptual Modelling is used to describe and define services for transformation and exchange of geographic information. Conceptual Modelling is the process of creating an abstract description of some portion of the real world or a set of related concepts. ISO’s Conceptual Modelling module specifies the languages (EXPRESS, ISO Interface Definition Language, Object Modelling Technique), approaches (conceptual schema, conceptual schema languages, and conceptual formalism), and principles (the 100% principle, the conceptualisation principle, the Helsinki principle) for the standardization of Conceptual Modelling and the integration of ISO Standards (ISO/TC211/WG 1, 1998a). The Domain Reference Model provides a high-level representation and description of the structure and content of geographic information. The Domain Reference Model includes a General feature model, which defines what kinds of descriptive information shall be recorded about features and the relationships that exist between features and this information. The Domain Reference Model encompasses both the information and computational viewpoints, focusing most closely on the structure of geographic information in data models and operations, and the administration of geographic information (ISO/TC211/WG 1, 1998a). The Architectural Reference Model describes the general types of services that will be provided by computer systems to manipulate geographic information and enumerates the service interfaces across which those services must interoperate with each other. This model also provides a method of identifying specific requirements for standardization of geographic information that is processed by these services. Standardization at these interfaces enables services to interoperate with their environments and to exchange geographic information (ISO/TC211/WG 1, 1998a). Profiles combine different parts of ISO 15046 Standard and specialize the information in these parts in order to meet specific needs. 
Profiles and functional standards facilitate the development of geographic information systems and application systems that will be used for specific purposes (ISO/TC211/WG 1, 1998a).

To summarize, the ISO 15046-1 Reference Model provides a comprehensive development framework for the 15046 family of standards. The contents of the Reference Model are very similar to the OpenGIS Abstract Specification. For example, Conceptual Modelling is related to the first four levels of OGC’s Essential Model. The scope of the Domain Reference Model is similar to the last four levels of OGC’s Essential Model and some parts of the Open Geodata Model. The contents of the Architectural Reference Model and Profiles overlap with the OpenGIS Services Model and the OGC’s Information Communities Model. The Reference Model provides a general understanding of the underlying principles and requirements of the ISO 15046 Standard, the detailed presentation of system implementation approaches, and data standard conformance. The following sections will provide a more practical overview of the Reference Model and related works of ISO/TC 211.

3.2.2.2 The Geospatial Data Model of ISO 15046 Standard

In contrast to the OpenGIS Implementation Specifications, ISO/TC 211 did not specify actual implementation specifications for different platforms and the private sector. Instead, ISO/TC 211 defines a high-level data model for the public sector, such as governments, federal agencies, and professional organizations. The geospatial data model is described in the view of the Domain Reference Model (Figure 3-7).
[Figure 3-7: class diagram relating Application Schema, Metadata Dataset, Geographic Information Service, and Dataset (Feature, Spatial Object, Position, Coverage) across the application model level and the instance level.]

Figure 3-7. High-level view of the Domain Reference Model (ISO/TC211/WG 1, 1998a, p. 15).

The Domain Reference Model defines a high-level view of the geospatial data model, which includes four major components: Dataset, Application Schema, Metadata Dataset, and Geographic Information Services. A Dataset consists of Features, Spatial Objects, Positions, and Coverages. Features define feature attributes, feature relationships, and feature functions. Spatial Objects describe the spatial aspects of features. A Position describes the spatial object’s location by using units of measure provided by reference systems. Coverages associate attribute values with individual positions within a defined space or geographic area. For example, a Coverage assigns the values of one or more attributes to geographic locations over a region of interest. The Application Schema provides a description of the semantic structure of the dataset and identifies the spatial object types and reference systems. Data quality elements and data quality overview elements are also included in the application schema. The Metadata Dataset allows users to search for, evaluate, compare and order geographic data. It describes the administration, organization, and contents of geographic information in datasets. The structure of the Metadata Dataset is standardized by ISO 15046-15, which will be mentioned later. Geographic Information Services define how to implement software programs operating on geographic information.
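The Dataset components described above can be sketched in code. The following Python is a rough illustration only: the class names echo the terms of the Domain Reference Model, but the fields, the `evaluate` method, the `"WGS84"` label, and the sample elevation function are assumptions made for this example and are not defined by ISO 15046.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Position:
    """A location expressed in the units of a named reference system."""
    x: float
    y: float
    reference_system: str = "WGS84"   # assumed label, for illustration only


@dataclass
class SpatialObject:
    """Describes the spatial aspect of a feature as a list of positions."""
    positions: List[Position]


@dataclass
class Feature:
    name: str
    attributes: Dict[str, object]
    spatial_object: SpatialObject


@dataclass
class Coverage:
    """Associates an attribute value with every position in a region."""
    attribute: str
    value_at: Callable[[Position], float]

    def evaluate(self, position: Position) -> float:
        return self.value_at(position)


@dataclass
class Dataset:
    features: List[Feature] = field(default_factory=list)
    coverages: List[Coverage] = field(default_factory=list)


# Hypothetical data: an elevation surface rising 10 m per unit of x,
# and one point feature located on it.
elevation = Coverage("elevation_m", lambda p: 1600.0 + 10.0 * p.x)
well = Feature("well-7", {"depth_m": 42}, SpatialObject([Position(2.0, 5.0)]))
dataset = Dataset(features=[well], coverages=[elevation])

site = well.spatial_object.positions[0]
print(elevation.evaluate(site))
```

The sketch keeps features and coverages side by side in one dataset so that a feature's position can be used to query a coverage, which is the kind of combined use the model is meant to support.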
These services reference information in the Metadata Dataset in order to perform retrieval operations correctly as well as manipulation operations such as transformation and interpolation (ISO/TC211/WG 1, 1998a).

3.2.2.3 The ISO Standard in Practice

There are many ISO Standards proposed by ISO/TC 211. One of the most important for implementation in practice is the definition of software architecture, the Architectural Reference Model. The Architectural Reference Model defines a structure for geographic information services and a method for identifying standardization requirements for those services. This model provides an understanding of what types of services are defined in the different parts of ISO 15046 and distinguishes these services from other information technology services (ISO/TC211/WG 1, 1998a). GIS developers and GIS users can use the Architectural Reference Model to establish standardized geographic information services. The Architectural Reference Model is shown in Figure 3-8. The model shows application systems and services residing at different computing sites linked by a network. Services are capabilities provided for manipulating, transforming, managing, or presenting information. Service interfaces are boundaries between applications, external storage devices, communications networks, and human beings across which services are involved.

[Figure 3-8: two computing sites, each with GIS applications, Geographic Information Services, and Information Technology Services, linked by a network. Legend: API - Application Programming Interface; HTI - Human Technology Interface; ISI - Information Services Interface; CSI - Communications Services Interface; NNI - Network to Network Interface. Data sharing and transfer are based on common conceptual models.]

Figure 3-8. The Architectural Reference Model.
The Architectural Reference Model identifies four general types of interfaces in order to enable the interoperability of GIS in distributed computing environments: Application Programming Interface (API), Communications Services Interface (CSI), Human Technology Interface (HTI), and Information Services Interface (ISI). The Application Programming Interface is the interface between services and application systems, which is used to invoke geographic information services. The Communications Services Interface is the interface for accessing data transport services and communicating across a network. Different computing networks may be connected through a special interface. The Human Technology Interface allows the end-user to access the computing system, and includes graphical user interfaces and keyboard specifications. The Information Services Interface is a bridge across heterogeneous database services that allows persistent storage of data (ISO/TC211/WG 1, 1998a). The Architectural Reference Model defines these service interfaces in order to enable a variety of applications with different levels of functionality to access and use geographic information. Geographic information system and software developers will use these interfaces to define and implement geographic information services (ISO/TC211/WG 1, 1998a).

Besides the specification of service interfaces, the Architectural Reference Model also identifies six classes of generic services for geographic information. Model/Information Management Services (MS) are services for management of the development, manipulation, and storage of metadata, conceptual schemas, and datasets. Human Interaction Services (HS) are services for management of user interfaces, graphics, multimedia, and for presentation of compound documents. Workflow/Task Services (WS) are services for support of specific tasks or work-related activities conducted by humans.
… Processing Services (PS) are services that perform large-scale computations involving substantial amounts of data. Examples include services for providing the time of day, spelling checkers, and services that perform coordinate transformations… Communication Services (CS) are services for encoding and transfer of data across communications networks. System Management Services (SS) are services for the management of system components, applications, and networks (ISO/TC211/WG 1, 1998a, p. 28).

There are several potential problems in ISO/TC 211’s Architectural Reference Model. First, the four types of service interfaces (API, CSI, HTI, and ISI) are too generic and emphasize the computational view rather than the service view. Moreover, the definitions of ISO’s interfaces are not compatible with current practice in the computer industry and may cause software development problems. For example, the Communications Services Interface (CSI) should be defined as the network communication protocols, such as TCP/IP or FTP. The Human Technology Interface (HTI) should be defined as the user interface, and the Information Services Interface (ISI) should be the database connectivity. Second, with the interface standardization proposed by the Architectural Reference Model, it is difficult to achieve software interoperability. Different computer languages have their own APIs, which are not compatible with others. Different component technologies (DCOM, CORBA, and the Java platform) likewise have their own interface frameworks, which are quite different from those of the Architectural Reference Model. Moreover, the standardization of the user interface (HTI) is problematic because different GIS applications have different GIS tasks, which will require unique and application-oriented user interface design. The standardization of interfaces is only reasonable for the database connectivity (ISI) and network communication protocol (CSI).
Third, the six generic classes of geographic information services are too ambiguous to guide the implementation of GIServices components. From a GIS processing perspective, it is difficult to distinguish between Workflow/Task Services (WS) and Processing Services (PS) because each GIS task always involves complex computation and substantial amounts of data. Also, it is difficult to separate the Human Interaction Services (HS) from the WS and PS because the design of HS is highly dependent on the fundamental features of WS and PS. In fact, geographic information services are unique and should have their own domains and classifications. Thus, the strategy of ISO/TC 211, mapping the geographic information service domains into a general information service architecture, is not appropriate because the uniqueness of geographic information services requires specialized design and considerations. This dissertation will propose a classification method for GIServices domains that differs from the ISO Architectural Reference Model. GIServices will be designed and classified by a task-oriented approach, and the integration of interfaces will be accomplished by an agent-based mechanism. The detailed descriptions of task-oriented design and agents will be addressed in a later chapter.

3.2.3 Comparison between OGC and ISO/TC 211

The above sections reviewed the major concepts and models developed by both OGC and ISO/TC 211. In general, ISO is an international organization and its members are mainly from the public sector, including national standards bodies and organizations. For example, the US national standards body is the American National Standards Institute (ANSI). On the other hand, OGC’s members come mainly from the private sector, including software vendors and GIS companies, such as ESRI Inc., ERDAS Inc., INTERGRAPH Corp., AutoDesk Inc., etc.
Since the backgrounds and resources of OGC and ISO/TC 211 members are quite different, the strategy and emphasis of their open and interoperable GIServices frameworks are not really compatible. Basically, OGC focuses on both abstract definitions of OpenGIS frameworks and technically oriented implementations of data models and (to a lesser extent) services. ISO/TC 211 mainly focuses on the high-level definitions of the GIS standards from an institutional perspective. Although both OGC and ISO/TC 211 were formed in 1994, the early development and activities of the two organizations were isolated as two parallel processes. By early 1997, there was a strong case for reassuring the market that the two activities were compatible and would produce a consistent family of standards (Rowley, 1998). Both OGC and ISO/TC 211 are currently dedicated to harmonizing their work, models, and standards with each other.

ISO/TC211 Work Item | OGC equivalent
1 Reference Model | Essential Model
2 Overview | OpenGIS Guide
3 Conceptual Schema Language | up to RFP authors
4 Terminology | draft document including ISO/TC211 terms
5 Conformance and Testing Methodology | Project Document 97-200
6 Profiles | RFP is profiling, submissions define profiles
7 Spatial Subschema | Abstract Spec provides General Feature Model
8 Temporal Subschema | None
9 Rules for Application Schema | Domain Working Groups
10 Feature Cataloguing Methodology | None
11 Spatial Referencing by Coordinates | Project Document 97-017
12 Spatial Referencing by GID's | None
13 Quality Principles | Quality Topic in Abstract Specifications
14 Quality Evaluation Procedures | None
15 Metadata | RFI results, scenarios
16 Positioning Services | Abstract Specification, topic 2
17 Portrayal of Geographic Information | WWW Mapping SIG
18 Encoding | Well-Known Structures Transfer Technology RFP
19 Services | Abstract Specifications, topic 12
20 Spatial Operators | Abstract Specifications, topic 4

Table 3-2.
Areas of overlap between ISO/TC 211 and OGC (modified from Kuhn,1997, p.8). In general, the contents of the OpenGIS Specification and the ISO 14056 Standard have significant overlaps, but adopted different frameworks in their data models and architectures. Table 3-2 lists the major areas of overlaps in the programs of OGC and ISO/TC 211. Beside the overlap of work programs and topics, both ISO/TC 211 and OGC have their own advantages and disadvantages in their programs. Table 3-3 and 3-4 illustrate a crystal comparison between two organizations (Rowley, 1998). The comparison is based on the four perspectives of business cases: strengths, weaknesses, opportunities, and threats. 53 THE ISO/TC 211 PROCESS STRENGTHS WEAKNESSES ï‚· Recognized in global inter-government ï‚· Starting new work is a cumbersome agreements process ï‚· Rules of process and maximum time ï‚· Perceived (wrongly) to be a slow steps are set by a high level body (ISO) cumbersome process overall ï‚· Consensus is drawn from a wide range ï‚· Perceived to produce ‘academic’ of interests at national levels. 
standards ï‚· European implementations should hold ï‚· Does not engage interest of software 15046 mandatory developers (engineering implementation requires additional ï‚· TC 211 standards are abstract or consensus process) foundational in character and find an extended ‘family’ in other ISO ï‚· Client (not client-server) centric and standards not process centric ï‚· Provides a ‘natural’ support ï‚· Incomplete as to the implementation (foundation) for OGC interface details specifications OPPORTUNITIES THREATS ï‚· Cooperation across a broad spectrum of ï‚· Can be subverted by (divergent) ISO activity implementation specifications if these are widely supported ï‚· Ability to enjoin outside groups through the Liaison process ï‚· There is never enough volunteer resource (diluted vote endangers ï‚· Use of liaisons to achieve intimate corporate support) bonding with implementations ï‚· Inter-operability may not result from ï‚· Ability to bring on board groups implementation of the standards supporting functional standards alone ï‚· Growth through support for information ï‚· Competition with OGC for domain domains (discrete areas of geospatial standards application) ï‚· Potential for non-compliant ï‚· Replacement of CEN / TC 287 family functional standards from a third of standards party ï‚· Lack of commitment to maintenance Table 3-3. The process comparison of ISO/TC211 (Rowley, 1998). In order to build a close working relationship among ISO/TC 211 and OGC, both organizations adopted some actions and modified their work programs. First, OGC is currently a Class-A external liaison with ISO/TC 211, which means OGC’s experts can provide their knowledge and contribute to the setting of the ISO 14056 Standard. However, OGC does not have a vote seat in ISO/TC 211 to participate the decision-making processes. On the other hand, Olaf Østensen, Chairman of ISO/TC 211, holds a voting seat on OGC's Management Committee. 
Second, the two groups have formally committed to work closely together to converge and match their respective work plans in order to avoid duplicate or divergent work (Kuhn, 1997).

THE OPEN GIS PROCESS

STRENGTHS
• Engages a wide range of interests including software developers, integrators, academics and consumers (of software, data, both)
• Members can devise rules which suit the speed at which they will work
• New work or method can be initiated rapidly
• Reactive to market needs and new technology
• Certainty that results can be implemented and provide the interoperability
• OGC process trusts TC 211 processes
• Focus on market
• Focus on interfaces
• Builds on market technology (e.g. SQL, OMG)

WEAKNESSES
• Incremental cost of operation
• Non-members have limited access to the process including adoption votes
• Can be a bottom-up process
• Specific effort needed to overcome continent bias
• Interfaces could be implemented without the benefit of compliance with TC 211 conceptual models

OPPORTUNITIES
• Close cooperation with similar de facto groups
• Growth through support for specific information domains
• Global technical and market acceptance of interfaces

THREATS
• The de jure standards process
• Insufficient core membership and support
• Insufficient acceptance in marketplace

Table 3-4. The process comparison of OpenGIS (Rowley, 1998).

In 1998, a formal cooperation agreement was set out in the ISO/TC 211-N472 document, which specifies the cooperation tasks between OGC and ISO/TC 211 through planning, coordination and quality control activities.
These tasks emphasize that the cooperation should lead to conformance with a single industry Reference Model; provide confirmation that TC 211 standards remain relevant and conformant with market-driven requirements; provide confirmation that OGC technology is conformant with TC 211 standards; and provide the opportunity for stable OGC interface specifications to be transposed into ISO endorsed documents (Rowley, 1998).

OGC AND ISO/TC 211: THE COORDINATED PROCESSES

STRENGTHS
• Results of coordinated work items or parts deliver a vertically integrated family of standards
• Delivers interoperability across disciplines within core
• The single family is delivered faster
• Guarantee of interoperability within domains
• Domain standards will draw from two cooperating institutions
• TC 211 gains a channel for maintaining its standards in alignment with the marketplace

WEAKNESSES
• Speed of progress proportional to resources dedicated

OPPORTUNITIES
• Industry-wide commitment
• Competition in a genuine global market
• Provides guidance to contributors on how to allocate resources to standards development
• Ensures conformant engineering implementation
• Global institution community involvement: one voice
• Single voice in information society
• Exposes TC 211 to changes in the marketplace

THREATS
• Possible slowing-down of the ISO process (in current scope, but savings in overall scope)
• Possible frustration in the OGC TC
• Potential for competition for functional domain standards development resources

Table 3-5. Comparison between ISO/TC211 and OpenGIS (Rowley, 1998).

One example of the cooperation between ISO/TC 211 and OGC is the metadata standard. ISO/TC 211 proposed a joint project with OGC to demonstrate mutual cooperation and the feasibility of implementing the emerging ISO metadata standard (ISO WD 15046-15 Geographic information - Part 15: Metadata).
In the future, OGC will adopt the metadata standard developed by ISO/TC 211 and create several scenarios based on the ISO metadata model. Table 3-5 illustrates the advantages and disadvantages of the coordinated processes between OGC and ISO/TC 211 (Rowley, 1998).

In general, ISO has broader goals and is working at a level of abstraction above OGC, so the two efforts complement each other, and both are necessary. ISO's work is not likely to result in immediate implementation-level specifications, so it is in both organizations' mutual interest to see that OGC's implementation specifications fit into the ISO framework as implementation profiles (Kuhn, 1997). According to document ISO/TC 211-N563, the general principles of the cooperative agreement between ISO/TC 211 and OGC are:

a. The OGC produces publicly available industry specifications through an open, consensus based process with international participation by hundreds of individuals and organizations;

b. ISO/TC 211 wishes to adopt suitable industry specifications as ISO deliverables;

c. Both organizations desire to harmonize their procedures. Initial technical development work relevant to this agreement is done primarily in the Consortium with provisions for ISO participation. Once the technical work is stable and editorial state is satisfactory, final editorial and independent assessment technical work, eventually resulting in an International Standard, is done primarily in the ISO/TC 211 with provisions for OGC participation; and

d. Both organizations desire that OGC Implementation Specifications be adopted as ISO standards or deliverable as quickly as is feasible and with only minimal changes based on an agreed set of criteria.

(ISO/TC 211 Chairman, 1998, p. 3)

To summarize, the collaborations between ISO and OGC are essential for the future development of distributed GIServices.
However, because of the differences in their backgrounds, members, and development strategies, many potential problems reside in their development agendas and work items. OGC is a highly commercially oriented organization and is supported by GIS vendors. On the other hand, ISO is a non-profit organization and relies on the support of national bodies and academia. In the future, ISO will have a significant impact on the development of the National Information Infrastructure (NII) in the U.S. and many other countries. ISO works closely with other organizations, such as FGDC and ANSI, in order to deploy a standardized geographic information infrastructure for nationwide use. The focus of OGC is more market-oriented, aimed at the future development of GIS software and applications. Therefore, how to reconcile the different perspectives of the two organizations and pursue an integrated framework for distributed GIService architecture will be the most important issue for both OGC and ISO.

In the next few decades, GIS will integrate with other information technologies and become a member of the IT industry family. However, without a comprehensive deployment infrastructure, the integration of GIS and IT may be delayed or even prevented. Thus, the GIS community should encourage the cooperation between ISO and OGC in order to provide a comprehensive infrastructure that harmonizes different types of IT. The collaborative work on the GIS metadata standard is a good start for both OGC and ISO. The GIS metadata standard will be used to support many distributed GIS applications. The details of metadata standards are discussed in the next section.

3.3 Metadata Development

The development of geospatial metadata is essential to distributed GIServices. Metadata can provide users and systems with the descriptions needed for accessing, archiving, and operating geodata objects and components, and can make them self-describing and self-managing in distributed network environments.
Both the OGC’s Information Communities Model and the Domain Reference Model of ISO 15046 propose a metadata framework. In general terms, metadata describe the content, quality, condition, and other characteristics of data. The major uses of metadata include: organizing and maintaining an organization's investment in data; providing information to data catalogs and clearinghouses; and providing information to aid data transfer (FGDC, 1995). The concept of metadata is widely used in daily life. For example, the reference list in a research article is a form of metadata, which indicates authors, years, titles, and publishers. It tells readers where and how to find references to explore the article's topics in depth.

A conceptual model for metadata includes description, history, and findings (Gardel, 1992). Description focuses on the generic features of the data itself. History refers to the derivation, update, and processing chronology of the data sets. Findings consist of aspects such as precision, consistency, and accuracy. The conceptual model indicates a wide range of potential metadata functionality. In the development of distributed network environments, metadata play a key role in the interoperability of heterogeneous systems and data models.

The use of metadata in GIS applications began at the Federal level with the work of the Spatial Data Transfer Standard (SDTS) committee in the 1980s (Moellering, 1992). The goal of SDTS is to provide a common ground for data exchange by defining logical specifications across various data models and structures (Fegeas, Cascio, & Lazae, 1992). The original concepts of metadata were described in the data quality component of SDTS, but without a detailed specification (Wu, 1993). Fifteen years later, a content standard for digital geospatial metadata was approved by FGDC on June 8, 1994.
The standard includes seven major components: identification, data quality, spatial data organization, spatial reference, entity and attribute, distribution information, and metadata reference information. Hundreds of fields need to be filled in to complete a comprehensive, standardized metadata record.

Recently, the Alexandria Digital Library (ADL) project designed its own metadata scheme by extending the FGDC metadata standard and combining it with the USMARC standard, a national metadata scheme for libraries. Neither the FGDC nor the USMARC standard can fully represent digital spatial data or map materials. Therefore, the ADL project created a spatial data extension that adds fields for specialized items not found in either standard. Figure 3-9 illustrates an example of ADL metadata records.

Currently, one of the major metadata standards is ISO 15046-15, the ISO Standard for GIS Metadata. The ISO metadata standard proposes a conceptual framework and an implementation approach for geospatial metadata. A detailed description of the ISO metadata scheme is given in the next section.

Figure 3-9. One example of metadata records in the Alexandria Digital Library project.

3.3.1 The ISO Standard for GIS Metadata

The ISO GIS metadata standard is one of the most comprehensive metadata schemes for distributed GIServices. It will be adopted and used by both OGC and FGDC. FGDC will develop extensions and profiles to the ISO metadata standard. OGC will focus on actual metadata applications and the association with OGC’s Feature Collections and geospatial services. OGC documents leave the definitions of standardized metadata elements and entities to ISO/TC 211 and FGDC (OGC, 1998, topic 11).

The metadata structure proposed by ISO/TC 211 includes a standardized metadata schema and its relationship to metadata datasets and geographic datasets (Figure 3-10).
A metadata schema is represented by a conceptual schema language at the application-model level. This schema provides the metadata element definitions (or types of metadata elements) for the metadata in a metadata dataset. A metadata dataset describes the administration, organization, and content of a dataset at the data level. The metadata dataset provides the information necessary to support access to, and transfer of, the dataset. The metadata dataset is shown as conforming to the standardized metadata schema (ISO/TC211/WG 1, 1998a).

[Figure 3-10: diagram relating the conceptual schema language (meta-model level), the metadata schema, metadata element definitions and application schema (application model level), and the metadata dataset and dataset (data level), with their multiplicities.]

Figure 3-10. Details of ISO/TC 211 metadata relationships.

The ISO 15046-15 metadata schema includes three major scopes. The first scope is the mandatory and conditional metadata sections, metadata entities, and metadata elements. These sections include the core or minimum set required to serve the full range of metadata applications: data discovery, determining data fitness for use, data access, data transfer, and use of digital data. The second scope is the optional metadata elements, which allow for a more extensive standard description of geographic data. The third scope is a method for extending metadata to fit specialized needs (ISO/TC211/WG 3, 1998). The design of mandatory, conditional, and optional items in the ISO GIS metadata standard makes the implementation of metadata standards more flexible and dynamic and easier to adopt in distributed network environments.

3.3.2 Metadata Conformance

Based on the scopes of the ISO metadata standard, ISO/TC 211 identifies two levels of conformance for metadata elements.
Conformance Level 1 is the minimum metadata required to identify a dataset uniquely. This level of conformance shall be used ONLY for the purpose of cataloging datasets and dataset series and to support data clearinghouse activities facilitating data discovery. Thus, only one type of metadata schema is specified on this level, which is called Cataloguing information. The Cataloguing information schema includes sixty metadata elements, including title, initiative identification information, responsible party information, dataset extent, category, abstract, metadata date, etc. (ISO/TC 211/WG 3, 1998).

Conformance Level 2 provides the metadata required to document a dataset completely. This level of conformance fully defines the complete range of metadata required to identify, evaluate, extract, employ, and manage geographic information. This level also specifies a method for extending metadata to accommodate user-defined requirements. Eight information sections and five supporting entities are specified on this level. Information Sections include Identification information, Data quality information, Lineage information, Spatial data representation information, Reference system information, Feature catalogue information, Distribution information, and Metadata reference information. Supporting Entities include Citation information entity, Responsible party information entity, Address information entity, Extent information entity, and On-line resource information entity (ISO/TC 211/WG 3, 1998).

Over 300 metadata elements are defined in Conformance Levels 1 and 2. Each metadata element has a descriptor indicating whether the element shall always be present or not. The descriptor has three types of values: Mandatory (M) means the element shall be present; Conditional (C) means the element shall be present if the dataset exhibits the characteristics defined by that element; Optional (O) means the element may be present or not.
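The descriptor logic just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names, the `ElementRule` class, and the `missing_elements` function are hypothetical and are not taken from ISO 15046-15; the sketch merely shows how Mandatory, Conditional, and Optional descriptors could drive a presence check on a metadata record.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Descriptor codes as named in the text: Mandatory, Conditional, Optional.
MANDATORY, CONDITIONAL, OPTIONAL = "M", "C", "O"


@dataclass
class ElementRule:
    """Rule for one metadata element (hypothetical helper, not from ISO)."""
    descriptor: str
    # For conditional elements: returns True when the dataset exhibits
    # the characteristic that makes the element required.
    applies: Optional[Callable[[Dict[str, object]], bool]] = None


def missing_elements(record: Dict[str, object],
                     rules: Dict[str, ElementRule]) -> List[str]:
    """Return the names of required elements that are absent from record."""
    missing = []
    for name, rule in rules.items():
        if rule.descriptor == MANDATORY:
            required = True
        elif rule.descriptor == CONDITIONAL:
            required = rule.applies is not None and rule.applies(record)
        else:  # OPTIONAL: the element may be present or not
            required = False
        if required and record.get(name) in (None, ""):
            missing.append(name)
    return missing


# Assumed rule set for illustration; element names are invented.
RULES = {
    "title": ElementRule(MANDATORY),
    "abstract": ElementRule(MANDATORY),
    "vertical_extent": ElementRule(
        CONDITIONAL, applies=lambda rec: bool(rec.get("is_3d"))),
    "browse_graphic": ElementRule(OPTIONAL),
}

record = {"title": "Boulder County roads", "is_3d": True}
print(missing_elements(record, RULES))  # ['abstract', 'vertical_extent']
```

Keeping the descriptors in a plain table separate from the checking function means the same check can be reused with a different table of descriptor values.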
The values of descriptors can be modified by individual communities, nations, or organizations, which will develop a community profile of the ISO standard. A community profile is used for customizing the values of descriptors to meet the actual needs of specific communities. The community profile provides a flexible and customizable framework for the ISO metadata standard. Another flexible approach proposed by the ISO metadata standard is the metadata extension capability specified in the extent information entity of Conformance Level 2. The extent information entity schema provides the rules for defining and applying additional metadata to meet special user needs. Users can add a new metadata element, a new metadata entity type, or even a new metadata section based on their needs.

Another significant feature of the ISO metadata standard is that it provides a language-based implementation framework for metadata structure and encoding. ISO/TC 211 suggests that metadata software will support the input and output of metadata entries using Standard Generalized Markup Language (SGML), as defined by ISO. Each metadata entry will be encoded as an SGML document entity including an SGML declaration, a base Document Type Declaration (DTD), and the start and end of a base document element. The same format is also used in the Extensible Markup Language (XML) DTD. Although the metadata implementation method using SGML and XML is not mandatory for the ISO metadata standard, metadata encoding using the XML DTD format will become a major advantage for the future development and implementation of metadata datasets, especially for Web-based applications, as many research projects indicate that XML will replace HTML and become the main language used by Web applications in the future.

To summarize, the use of metadata can facilitate the identification, interoperability and auto-transfer functions in distributed GIServices.
A comprehensive metadata structure is essential for the future development of open and distributed GIServices. However, complicated metadata standards may undermine the real use of metadata and its implementation procedures. Since different types of geospatial data sets have their own unique data structures and formats, the contents of metadata should represent those unique data features. In fact, the construction of metadata should be flexible and offer alternative methods for different data types because metadata is both data-oriented and application-oriented. Current metadata standards by FGDC are sophisticated, but restricted. An alternative approach is to establish metadata exchange mechanisms instead of enforcing the standardization of metadata structure (Gardner, 1997). Currently, the metadata standard proposed by ISO/TC 211 illustrates a flexible framework for the construction of geospatial metadata, which has great potential to become the de facto metadata standard in the future.

Besides the debate between establishing a standardized format and a standardized exchange mechanism, current metadata research also focuses on potential functionality, such as machine-readable features, error propagation, and data lineage (Wu, 1993; Lanter and Surbey, 1994; FGDC, 1995). The use of metadata will become more and more important for bridging the heterogeneous frameworks in distributed network environments.

3.4 Chapter Summary

This chapter reviewed three major areas in distributed GIServices: the development of distributed GIServices, standards for distributed GIServices, and metadata development. Together these cover the fundamental background knowledge in distributed GIServices. The following discussion will identify the major problems and tasks of developing distributed GIServices based on the previous reviews.
First of all, the development history covers three examples of distributed GIServices, each of which represents a milestone achievement. The Xerox Map Viewer provided a preliminary technical solution for distributed GIServices by using HTTP servers and CGI programs. The technical framework of the Map Viewer was followed by many early on-line GIServices applications. The Xerox Map Viewer also indicated that a single GIS service, map browsing, is not sufficient and needs to be extended with other GIS capabilities, such as text-based searching and gazetteers. GRASSLinks illustrated a comprehensive prototype, which can provide full traditional GIS functions, such as map browsing, buffering, overlaying, etc. However, both the Xerox Map Viewer and GRASSLinks were built only to mimic traditional GIS functions. The Alexandria Digital Library Project introduced a new content of GIServices, a digital library metaphor, and provided sophisticated library functions, including collection holdings, catalog searching, and metadata indexing.

Although these three examples are successful and well recognized in the GIS community, there is a fundamental problem: these examples were all developed in different database frameworks and used different GIS technologies. The heterogeneous techniques and software programs prevented the integration and sharing of information among these three examples. The development of ADL user interfaces also illustrates the difficulty of technology migration from one platform to another.

The early development history of distributed GIServices shows that a technique-centered design of distributed GIServices is problematic and will prevent the future integration of GIServices. The GIS community recognized the lack of a standardized framework for distributed GIServices, and two organizations, OGC and ISO/TC 211, were formed to provide solutions for this critical problem.
The OpenGIS Specification and the ISO 15046 Standard illustrate the efforts made by the GIS community to solve the problem mentioned above. The approaches and specifications provided by both organizations are feasible and offer the GIS community a promising future. The early development of specifications and standards in the two organizations was isolated, with no connections between them. Fortunately, both organizations recently realized the importance of integration and cooperation and have since worked together on the harmonization between ISO/TC 211 and OGC. However, based on the documents and specifications from both organizations, there are several potential problems in their model designs, which were mentioned in the previous sections. Examples include the lack of a high-level classification of GIServices, the need for dynamic combination and integration of distributed GIS components, and the ambiguous definition of Semantic Translators and Traders in OGC’s Information Communities Model. The main problem in OGC’s and ISO/TC 211’s specifications and proposals is that they neither identify distributed GIS processing nor design an integration framework for heterogeneous GIS components. This dissertation will address these problems in a later chapter and propose possible solutions.

The development of geospatial metadata is closely related to the standardization of GIS data models. Establishing a standardized metadata schema has become one of the major tasks in setting up GIS data standards, such as SDTS and ISO 15046. The early development history of metadata standards, such as SDTS and FGDC’s metadata standard, indicates that a rigid, ad hoc metadata standard may not be well accepted by the GIS community and GIS users. The ISO metadata standard addresses the inflexibility of metadata structure by proposing an extension capability and customizable profiles to meet the various needs of different GIS users.
The ISO metadata standard has received support from OGC, FGDC, and other GIS community members. However, there are still potential problems in the design of the ISO metadata standard. First, the contents of the metadata standard are designed only for geospatial datasets, without considering distributed GIS components. A comprehensive metadata scheme for distributed GIServices should consider both geospatial datasets and GIS components, as well as the interactions between GIS operators and datasets. The metadata for distributed GIS components should be designed and specified in order to provide self-managing, self-describing GIS components, which can be freely combined and used with various datasets to provide comprehensive GIServices. Second, the ISO metadata standard does not specify how to protect the contents of metadata in distributed network environments or how to ensure the connections between metadata and geodata. The contents of metadata need to reside in a safe place, where the metadata will neither be modified without authorization nor lose connections with the data. One possible solution is to encapsulate metadata elements into data objects and protect the contents of metadata automatically by the object-oriented modeling mechanism. The deployment of both geospatial metadata and GIS component metadata, together with metadata encapsulation, will be discussed in detail in Chapter Five.

To summarize, Chapter Two and Chapter Three review the development history of two main topics, distributed computing and distributed GIServices, to provide background knowledge for open and distributed GIServices. The dramatic progress of distributed computing technology provides a low-level technical infrastructure for the current design of distributed GIService platforms. The development and standardization of distributed GIService domains provide a high-level service framework for the future directions of GIService architecture.
To establish a truly interoperable, distributed GIService architecture, both the mainstream IT industry and the GIS community should play essential roles in setting these standards, including cross-platform environments, communication protocols, and collaboration among different GIS components and data sets.

This chapter also identifies several potential problems in the current development of distributed computing and distributed GIServices. Four important issues need to be solved in the future for the development of distributed GIServices. First, the Information Technology (IT) industry needs to improve the interoperability among different component technologies, such as Java, CORBA and DCOM. Currently, the integration among different component frameworks is not really seamless and smooth. The main problems of interoperability derive not only from their technology deployments but also from their development strategies, marketing support, and software management. A truly interoperable platform environment will need cooperation among different software vendors, government agencies, professional communities, and users. That means the major software development companies, such as Microsoft, Sun, Apple, and IBM, should talk to each other and work out a solution for everybody. However, true collaboration among these companies is very unlikely because of their own marketing considerations.

Second, the future development of a standardized GIS service framework needs collaboration between the two main organizations, OGC and ISO. The specification and standardization of distributed GIService architecture require the participation and contributions of both GIS vendors and federal agencies.

Third, the design of GIService architecture should consider the balance between flexibility and standardization. The standardization of GIService architecture can facilitate the wide adoption of GIS technology and the reusability of GIS software and applications.
On the other hand, flexibility allows the GIService architecture to sustain the long-term development of GIS technology in rapidly changing network environments. One example of balancing the flexibility and standardization requirements is the metadata scheme for geospatial data objects. In general, how to provide a standardized distributed GIService framework with flexible extensions that can cope with future network technology will become a major task for the GIS community.

Fourth, the GIS industry has to be integrated into the mainstream of the IT industry in order to provide comprehensive GIServices for a wide range of users. Traditionally, the GIS industry has been separated from the mainstream of the IT industry because of the uniqueness of GIS software, applications and knowledge. Recently, it has become a major trend for GIS vendors to utilize state-of-the-art computer technology to improve the functions of GIS software. The future development of distributed GIServices will rely heavily on the modern IT industry for technical support and frameworks. In fact, GIS is a perfect case study for distributed component technology in the IT industry. The GIS community should play a more active role in the IT industry. GIS professionals should contribute their GIS knowledge to the development of distributed component technology in order to design a GIS-oriented component technology. The collaborations between IT and GIS professionals are essential for the future development of distributed GIServices.

In general, the design of distributed GIService architecture will need collaboration among different organizations, the integration of heterogeneous technologies and specifications, and communication between the GIS community and the IT industry. This dissertation proposes an integrated development strategy for the GIS community to overcome the problems mentioned above.
The integrated development strategy is to utilize LEGO-like GIS components and geodata objects under an operational metadata scheme and an agent-based communication interface. The next chapter, Research Design, will illustrate how these distributed component platforms, metadata models and agent-based mechanisms are used for the deployment of an open and distributed GIServices architecture.

CHAPTER 4. RESEARCH DESIGN

The previous discussions indicate potential problems in developing and integrating distributed GIServices. The first problem is how to provide dynamic migration and integration of distributed components and geospatial data objects via networks. The second problem is the formalization of comprehensive metadata descriptions and structure. The third problem is information overload in heterogeneous distributed network environments. To solve the three problems, this research proposes dynamic construction of distributed GIServices, an object-oriented metadata scheme, and an agent-based mechanism to deploy a dynamic architecture for distributed GIServices.

The deployment requires clear specifications and visual representations. This dissertation adopts the Unified Modeling Language (UML) as the modeling language for specifying and constructing distributed GIServices architecture (Booch, Rumbaugh, and Jacobson, 1998). Prior to UML, there was no clearly leading Object-Oriented (OO) modeling language for deploying a distributed GIS architecture. Three major OO methods, Object Modeling Technique (OMT) (Rumbaugh, et al, 1991), Object-Oriented System Engineering (OOSE) (Jacobson, et al, 1992), and Booch ’93 (Booch, 1994), are well-known models adopted by many OO projects. In general, OOSE is a use-case oriented approach that provides excellent support for business engineering and requirements analysis. OMT and its successor, OMT-2, are especially designed for analysis and data-intensive information systems.
Booch ’93 is particularly expressive for the design and construction of engineering-intensive projects. UML fused the concepts of Booch, OMT, and OOSE into a single, common, and widely usable modeling language. UML promotes a development process that is “use-case driven, architecture-centric, and iterative and incremental” (Booch, Rumbaugh, and Jacobson, 1998). This dissertation will use UML to specify the architecture of geodata objects and GIS components and to represent use cases. For clarity and brevity, UML notation will not be introduced in detail in this dissertation. Instead, each diagram that uses UML notation explains the meaning of that notation at the bottom of the diagram.

4.1 Dynamic Integration for Distributed Components and Data Objects

This dissertation will establish a dynamic architecture for distributed GIServices from an operational, distributed processing perspective. The term dynamic indicates that the architecture of GIServices is constructed dynamically by temporarily connecting or migrating data objects and GIS components across networks. Different user tasks or requests will cause the GIServices architecture to be reconstructed or reorganized on the fly. The capability of dynamic construction is achieved by linking and combining distributed components and data objects.

Currently, both academic and industrial studies of distributed GIServices focus on distributed component technologies, which can provide extensive capabilities and flexible services for the next generation of GIS. DCOM, CORBA and the Java platform are the three major technical frameworks for distributed GIServices mentioned in the literature review. In this section, distributed components will refer to a general concept instead of specific vendor-based frameworks.
Under a general definition, a distributed component is "a ready-to-run package of code that gets dynamically loaded into your system to extend its functionality" (Pountain, 1997). For example, a Java applet, an ActiveX control, or even a plug-in function for a Web browser can be called a distributed component. In principle, distributed components should be plug-and-play, interoperable, portable and reusable, self-describing and self-managing, and freely combinable in use (Orfali et al., 1996; Pountain, 1997).

In practice, distributed components are LEGO-like pieces of binary code. The LEGO metaphor refers to the well-known children’s toy blocks that can be interlocked and stacked. Similarly, the idea is to create software modules (GIS processes) that stack and interlock to form a dynamic GIS package. A given LEGO architecture may persist only briefly, for the completion of a single GIS task. Then the LEGO modules are broken down, rearranged and restacked to form a new architecture for a new GIS task. LEGO-like components can be moved, combined and used in distributed network environments.

One important advantage of distributed components is their independence from operating systems, hardware, network environments, vendors, and applications. The same component can be copied, moved, and executed on different machines with different configurations. Distributed components interact with each other or are combined to provide integrated services to users. The development of distributed components shifts the software paradigm from a monolithic, feature-heavy approach to a flexible, modularized, plug-and-play approach. This modularized, reusable software framework can shorten program development cycles and improve the efficiency of software engineering.
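As a rough illustration of the plug-and-play idea, the following Python sketch models a component that describes itself, contains sub-components, and can be unplugged from one host package and plugged into another. It is only a minimal sketch: the names `Component`, `Application`, `plug`, `unplug` and `describe` are hypothetical, and nothing here corresponds to the actual DCOM, CORBA or Java component APIs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Component:
    """A self-describing, pluggable unit (hypothetical illustration)."""
    name: str
    run: Callable[[str], str]  # the behavior this component contributes
    parts: List["Component"] = field(default_factory=list)  # sub-components

    def describe(self) -> List[str]:
        """Return this component's name followed by its sub-components' names."""
        names = [self.name]
        for part in self.parts:
            names.extend(part.describe())
        return names


class Application:
    """A host package that components can be plugged into or removed from."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.slots: Dict[str, Component] = {}

    def plug(self, component: Component) -> None:
        self.slots[component.name] = component

    def unplug(self, name: str) -> Component:
        return self.slots.pop(name)


# Two sub-components and a map display component assembled from them.
projection = Component("projection control", lambda data: f"projected({data})")
vector = Component("vector display", lambda data: f"drawn({data})")
map_display = Component(
    "map display",
    lambda data: vector.run(projection.run(data)),
    parts=[projection, vector],
)

word_processor = Application("word processor")
gis_package = Application("GIS package")

# The component is unplugged from one host and plugged into another, unchanged.
word_processor.plug(map_display)
gis_package.plug(word_processor.unplug("map display"))
print(gis_package.slots["map display"].run("roads"))
```

The hosts know nothing about the component beyond its name and its `run` behavior, which is the property that lets the same piece be restacked for a different task.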
Figure 4-1 shows an example of a LEGO-like module: a map display component that can be used, for example, in a word processing application or a GIS package. The word processing application combines several distributed components, including a graphic user interface component, a spell checker, and so on. The map display component is independent to the extent that it can easily be plugged into other packages when users need a map display function. Moreover, the component strategy is hierarchical. Here, the map component is made up of sub-components, including (in this case) a projection control and a vector display control. Alternative sub-components, such as a symbol display control, can be added to a map display component to extend its display functions.

[Figure 4-1 labels: Word Processor (Fonts and Formats Control, Spelling Check Component, Graphic User Interface Component, Print Preview Component, Map Display Component); Map display component (Projection Control Component, Vector Display Component, Symbol Display Component).]
Figure 4-1. LEGO-like distributed GIS components.

Another advantage of distributed component technology is that components operate independently of software environments, database servers, and computer platforms. Distributed component technology allows a program (component) to operate on different computer platforms and to access heterogeneous database servers via a standardized protocol. For example, a Java applet (component) can be downloaded onto different computer platforms (Mac or Windows) and can access different types of databases, such as Microsoft SQL servers or Oracle database servers, via Java Database Connectivity (JDBC) (Figure 4-2).
[Figure 4-2 labels: a Java applet is downloaded to client components on Machine-A (Windows) and Machine-B (Mac); JDBC connections lead to an Oracle database on Machine-C (UNIX) and an MS SQL database on Machine-D (Win NT). Notes: operate in different types of computer platforms; access heterogeneous database servers via a standardized database connection.]
Figure 4-2. The independent operations from software environments and computer platforms.

However, the design of dynamic GIServices must take into account whether they will interoperate with a thin client model or a thick client model. In networking terminology, a thick client model executes the major operations and calculations on the client side. A thin client model, on the other hand, may require that selected operations run on the server side. Whether the client-side GIS component should be thick or thin depends on the user task and the associated performance requirements. For example, it may be appropriate to use thick clients for map display services so that the GIS user can make many intuitive decisions on graphic design, layout, etc. Network routing or location modeling may be better run on the server side, because complicated calculations and algorithms may be handled more efficiently by the server without an intervening network. The balance of functionality between client components and server components is a critical issue in the deployment of distributed GIServices.

[Figure 4-3 labels: GIS user; GIS node (Client); GIS nodes (Servers); GIS component; Geodata object. Dynamic construction follows the user scenario: GIS task, GIS node profile, network performance.]
Figure 4-3. Dynamic construction of distributed GIServices by migrating and connecting geodata objects and GIS components.

Currently, many research papers argue about whether the thin client or thick client model is the appropriate approach for on-line GIServices.
This dissertation proposes a dynamic approach that resolves the thin-or-thick client dilemma by dynamically rebuilding the GIS components according to different GIS tasks. The GIServices architecture is dynamically constructed from LEGO-like GIS components, based on different user scenarios, which include user tasks, client-machine profiles, networking performance, etc. Such a dynamic approach for constructing distributed GIServices is illustrated in Figure 4-3. The arrangement of GIS components and geodata objects is based on the following criteria of the user scenario.

The first criterion is the user-defined GIS task, which refers to the goals that the party using the system or service wants to achieve. Different GIS tasks require different types of GIService architecture. For example, the GIS architecture for a map query task will focus on GIS database functions. A TIN modeling task will require access to GIS analysis procedures.

The second criterion is the GIS-node hardware profile: the computer hardware specifications of the GIS nodes, including CPU speed, RAM, available hard disk space, etc. Different types of GIS hardware require different design strategies for the distributed GIService architecture, such as the thin or thick client model.

Networking performance is the third essential factor for the deployment of a GIService architecture. Different bandwidths and connection types, such as Ethernet, ATM, DSL, and cable modem services, require different types of design and deployment. A distributed GIService with a 56K connection may have difficulty uploading huge data sets to the server-side machines. A GIService with a 100 Mbps Ethernet connection can easily upload or download geodata sets to different machines.

GIServices are established collaboratively by several GIS nodes, a group of network-based GIS workstations.
Two kinds of elements are stored in these GIS nodes in order to provide comprehensive GIServices: GIS components (programs) and geodata objects. The geodata objects and GIS components can be re-arranged and linked dynamically among GIS nodes to establish dynamic, flexible, and distributed GIServices. Figure 4-4 illustrates how the dynamic architecture of distributed GIServices is constructed on demand by migrating or connecting GIS components and geodata objects among several GIS nodes, based on specific GIS tasks and scenarios. In this scenario, Mike needs to display the [Colorado Roads] map on his machine, GIS node-A. He submits his GIS task to the local machine, and the local GIS node-A creates a dynamic service architecture by connecting or migrating GIS components and geodata sets that were originally located on node-B and node-C. After Mike’s GIS task is completed, the data objects and components are restored to their original places and wait for the next calls.

[Figure 4-4 labels. Original setting: GIS node-A, GIS node-B, GIS node-C. Dynamic construction for the user scenario "Map Display (Colorado Roads)": Mike's machine, GIS node-A (client), is linked to GIS node-B and GIS node-C (servers) by migrating or connecting GIS components and geodata objects. Task completed: the original setting is restored.]
Figure 4-4. Build GIServices “on-the-fly”.

4.1.1 The Design of Dynamic GIService Architecture

Figure 4-5 illustrates a simplified dynamic architecture for distributed GIServices using UML. The UML notation uses diamond shapes to represent aggregation. For example, the map coverage class is an aggregation of geometric features and attributes. A line between two objects indicates an association between them. The text next to the line is the association name, and the arrow represents its direction. For example, lines can be buffered to create a buffer zone, which is the association between lines and buffer zone.
The arrow indicates that the direction of this association is from line to buffer zone.

[Figure 4-5 labels: Users, User Scenarios, GIServices Architecture, GIServices; associations: have, dynamically construct, provide, fulfill. User Scenarios aggregates GIS tasks, Hardware profile, and Network performance. GIServices Architecture aggregates Distributed GIS Components (Client-side and Server-side GIS components) and Distributed Geodata objects (Client-side and Server-side Geodata objects). UML notations: Assembly Class: Map Coverage (Aggregation), with Part-1: Geometric Features and Part-2: Attributes; Association: Lines, Buffer, Buffer zone.]
Figure 4-5. The dynamic architecture of distributed GIServices in UML.

Based on the UML notation, Figure 4-5 identifies four major object classes in the deployment of distributed GIServices: [Users], [User Scenarios], [GIServices Architecture], and [GIServices]. There are four types of associations. [Users] have [User Scenarios]. [User Scenarios] are used to dynamically construct the [GIServices Architecture]. The [GIServices Architecture] provides [GIServices]. [GIServices] fulfill the needs of [Users].

Four aggregation relationships are illustrated in Figure 4-5. [User Scenarios] aggregates [GIS tasks], [Hardware profile], and [Network performance]. [GIServices Architecture] is constructed from [Distributed Geodata objects] and [Distributed GIS Components]. [Distributed Geodata objects] consists of [Client-side Geodata objects] and [Server-side Geodata objects]. [Distributed GIS Components] consists of [Client-side GIS components] and [Server-side GIS components].

The dynamic construction is established through the association between [User Scenarios] and [GIServices Architecture]. For example, different [Users] will have different [User Scenarios], which aggregate different GIS tasks, machines, and network environments. After users define the scenario, the [GIServices Architecture] is constructed by combining GIS components and data objects located on either server-side or client-side machines.
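The classes and relationships of Figure 4-5 can be sketched as plain Python data classes. This is only an illustrative reading of the diagram: the class and field names follow the figure labels, while the `CATALOG` lookup and the `construct` function are hypothetical stand-ins for whatever mechanism actually selects components and data objects for a scenario.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class UserScenario:
    """Aggregation of the three parts shown in Figure 4-5."""
    gis_task: str
    hardware_profile: str
    network_performance: str


@dataclass(frozen=True)
class Element:
    """A GIS component or geodata object, tagged with where it resides."""
    name: str
    kind: str  # "component" or "geodata"
    side: str  # "client" or "server"


@dataclass
class GIServicesArchitecture:
    """Aggregation of distributed GIS components and distributed geodata objects."""
    components: List[Element]
    geodata: List[Element]

    def provide(self) -> List[str]:
        """Association 'provide': list what this architecture offers."""
        return [f"{e.name} ({e.side}-side)" for e in self.components + self.geodata]


# Hypothetical catalog of the elements available for each GIS task.
CATALOG: Dict[str, List[Element]] = {
    "map display": [
        Element("map display component", "component", "client"),
        Element("Colorado Roads", "geodata", "server"),
    ],
}


def construct(scenario: UserScenario, catalog: Dict[str, List[Element]]) -> GIServicesArchitecture:
    """Association 'dynamically construct': build an architecture for one scenario."""
    elements = catalog[scenario.gis_task]
    return GIServicesArchitecture(
        components=[e for e in elements if e.kind == "component"],
        geodata=[e for e in elements if e.kind == "geodata"],
    )


scenario = UserScenario("map display", "700 MHz PC", "Ethernet")
architecture = construct(scenario, CATALOG)
print(architecture.provide())
```

Because the architecture object is built per scenario and then discarded, a different scenario simply yields a different combination of client-side and server-side elements.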
4.1.2 The Network Strategies for Constructing Dynamic GIServices

Two kinds of strategies accomplish dynamic construction for distributed GIServices: object migration and remote connection. The basic elements that are connected in a dynamic construction are GIS components (programs) and geodata objects. The following paragraphs explain both types of dynamic construction for both elements.

[Figure 4-6 labels. Dynamic construction for geodata objects. a) Connecting data objects remotely: a GIS application on GIS Node-A uses a database connectivity API (JDBC or ODBC) to reach a data object in an SQL database server on GIS Node-B. b) Migrating data objects: a data object on an FTP server on GIS Node-B is transferred by FTP to a data container (local disk) used by the GIS application on GIS Node-A.]
Figure 4-6. Two types of data connection for geodata objects.

Two different approaches access distributed data objects dynamically (Figure 4-6). The first approach, remote connection, uses SQL and distributed database functions through database connectivity Application Programming Interfaces (APIs), such as JDBC, ODBC or OLE DB, to establish a database connection between two systems. The second approach, object migration, uses an FTP server to download a required data object to the target system and save it on the local disk or in a data container. The data migration approach usually requires both automated and manual procedures to download different data types and convert them into a single local GIS database. In the first case, a link is established between distributed databases; in the second, a data object is actually transported via FTP.

Likewise, there are two different approaches for accessing distributed GIS components dynamically (Figure 4-7). The first approach invokes GIS operators remotely, by using distributed component technologies and Remote Procedure Calls. Several technology frameworks are available for this approach, including DCOM, CORBA and Java RMI.
A GIS application sends a request to the client-side component service to invoke a remote GIS component. The client-side service uses its client stub to build a connection with the server skeleton on the server side. The server-side component service then invokes the required GIS component. The communication between the two systems uses RPC or another suitable protocol, such as the Internet Inter-ORB Protocol (IIOP).

[Figure 4-7 labels. Dynamic construction for GIS components. a) Invoking GIS components remotely: on GIS Node-A, a GIS application, a client-side component service and a client stub; Remote Procedure Calls to GIS Node-B, with a server skeleton, a server-side component service and the GIS component. b) Migrating GIS components: a GIS component on an HTTP server on GIS Node-B is transferred by HyperText Transfer Protocol to a component container used by the GIS application on GIS Node-A.]
Figure 4-7. Two types of GIS components invocation for distributed GIServices.

The second approach is to actually move GIS components from one site to another. The migration process uses an HTTP server to download the required GIS procedure dynamically into the target GIS application. The downloaded GIS component is stored inside a component container, which binds with the local GIS application. This approach is available in several technology frameworks, such as Java applets with a Java Virtual Machine, or ActiveX controls with an ActiveX container.

Currently, most on-line GIS services, such as ArcIMS by ESRI and MapGuide by Autodesk, cannot use both network strategies because they lack a high-level integration framework. To show how object migration techniques and remote connection frameworks can be integrated under a single service architecture, the two types of GIS component access (remote invocation and component migration) and the two types of geodata object access (data migration and remote connection) are described next in two pairs of scenarios that explain how these technologies might operate in practice.
The following scenarios demonstrate the advantages of such a dynamic architecture for distributed GIServices and its requirements.

4.1.2.1 Two Scenarios for Distributed GIS Components Access

Figure 4-8 illustrates two representative scenarios that demonstrate the advantages of dynamic GIS component access. In the first scenario, a GIS user needs to perform a road buffering operation on a 700 MHz Pentium III PC with a 10 Mbps Ethernet connection. Since the client machine has enough computing power to handle a buffering operation, the best solution is to download the GIS buffering component to the client side and integrate it with the other client GIS components. This is a thick client design in networking terminology. After the buffering component is downloaded, the client-side component services initiate dynamic binding with the local GIS application and perform the buffering operation.

In the second scenario, a GIS user requests a TIN modeling function on a DEM dataset using a 100 MHz PC with a 56K modem connection. Since the client machine is not powerful and the network is too slow to download the TIN-model component, the best solution is to send the TIN-modeling request to the server machine and perform the GIS task there. The result of the TIN modeling is sent back to the client machine. The second scenario is a thin client model because the major TIN modeling calculations are executed on the server machine.

[Figure 4-8 labels. Scenario 1: GIS task: buffering; client: 700 MHz PC; network: 10 Mbps Ethernet. Solution: dynamic construction of a thick client; the GIS buffering component is downloaded from the server components to the client components. Scenario 2: GIS task: TIN modeling; client: 100 MHz PC; network: 56K modem. Solution: dynamic construction of a thin client; the GIS TIN-model component remains with the server components and the TIN modeling result is downloaded from the server.]
Figure 4-8. Two scenarios for GIS component access: thin client and thick client.

Under a dynamic distributed GIServices architecture, the thick client model and the thin client model can be dynamically built or switched based on different GIS scenarios. The thin-or-thick architecture design dilemma of legacy on-line GIS no longer exists (Tsou and Buttenfield, 1998b). Thus, the design of on-line GIS can focus on the contents of GIS applications instead of on the technical architecture.

In general, dynamically migrating or connecting GIS components requires a decision-making process for choosing an appropriate architecture, a self-describing GIS component framework, and a comprehensive distributed component service. Current distributed component technologies, such as DCOM, CORBA, and Java, can provide some low-level distributed component services, such as object migration, global naming, life-cycle management, and object implementation. However, a dynamic GIService architecture also needs high-level distributed component services, including an object-oriented metadata scheme for a self-describing GIS component framework and an agent-based mechanism for decision-making processes. These issues will be discussed in Chapter Five.

Besides the dynamic migration approach for GIS components, a dynamic GIService architecture should also provide migration capability for geodata objects. The next section uses two representative scenarios to describe how geodata object migration works in practice.

4.1.2.2 Two Scenarios for Distributed Geodata Object Access

[Figure 4-9 labels. Scenario 1: GIS task: buffering [Colorado Roads]; client: 700 MHz PC; network: 10 Mbps Ethernet. Solution: data migration; [Colorado Roads] (lines) is downloaded from the server-side data objects to the client machine, where the client-side buffering component uses it. Scenario 2: GIS task: TIN modeling; client: 100 MHz PC; network: 56K modem. Solution: remote data access; the Colorado DEM data object is left on the server, the client-side TIN-modeling component is connected to it remotely, and the TIN modeling result is downloaded from the server.]
Figure 4-9. Two scenarios for geodata access: data migration and remote data access.

Figure 4-9 illustrates two representative scenarios for geodata object access. The two scenarios are similar to the previous examples, except that the data objects now reside on the server machine. In the first scenario, a GIS user needs to perform a road buffering operation on a 700 MHz Pentium III PC with a 10 Mbps Ethernet connection. Since the [Colorado Roads] data object is on the server, the best solution is to download [Colorado Roads] to the client-side machine and then perform the buffering operation locally. This is called the data migration model.

In the second scenario, a GIS user requests a TIN modeling function for the Colorado DEM from a 100 MHz PC with a 56K modem connection. Since the [Colorado DEM] data object is huge and the available network bandwidth (56K) is narrow, a workable solution is to leave the [Colorado DEM] data object on the server and build remote database connectivity between the client-side [TIN-modeling component] and the [Colorado DEM] data object by using JDBC or OLE DB. This scenario is a remote data access model. The result of the TIN modeling is sent back to the client-side GIS components and the user.

To summarize, the four representative scenarios above illustrate the advantages of a dynamic GIService architecture with LEGO-like GIS components and data objects.
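The reasoning behind the four scenarios can be condensed into a small decision rule. The Python sketch below is an assumption-laden illustration, not the decision mechanism of this dissertation: the thresholds, the data sizes (5 MB for the roads layer, 200 MB for the DEM) and the function names are all hypothetical and were chosen only so that the two example machines fall on opposite sides of each rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    task: str
    client_mhz: int        # client CPU speed
    bandwidth_kbps: int    # network bandwidth in kilobits per second
    data_size_mb: float    # size of the geodata object in megabytes (assumed)


# Hypothetical thresholds, for illustration only.
MIN_CLIENT_MHZ = 300
MIN_BANDWIDTH_KBPS = 1_000
MAX_TRANSFER_SECONDS = 60


def component_strategy(s: Scenario) -> str:
    """Choose between migrating the component and invoking it remotely."""
    if s.client_mhz >= MIN_CLIENT_MHZ and s.bandwidth_kbps >= MIN_BANDWIDTH_KBPS:
        return "migrate component (thick client)"
    return "invoke remotely (thin client)"


def data_strategy(s: Scenario) -> str:
    """Choose between downloading the data object and connecting to it remotely."""
    # 1 MB = 8,000 kilobits, so size * 8,000 / kbps gives seconds.
    transfer_seconds = s.data_size_mb * 8_000 / s.bandwidth_kbps
    if transfer_seconds <= MAX_TRANSFER_SECONDS:
        return "data migration"
    return "remote data access"


buffering = Scenario("buffering", client_mhz=700, bandwidth_kbps=10_000, data_size_mb=5)
tin = Scenario("TIN modeling", client_mhz=100, bandwidth_kbps=56, data_size_mb=200)

for s in (buffering, tin):
    print(s.task, "->", component_strategy(s), "/", data_strategy(s))
```

A real decision process would weigh more factors (server load, licensing, data sensitivity); the point of the sketch is only that the choice is computed per scenario rather than fixed at design time.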
The flexibility of distributed GIS components can provide customizable services for different users, heterogeneous platforms, and various network connections. Moreover, the design of a dynamic GIService architecture shifts from ad hoc, fixed systems to modularized, changeable component combinations. The architecture can be modified or updated later if users face different scenarios (for example, users may change their network connection from ISDN lines to T1 lines, or add a new category of services to their GIS applications). The dynamic combination and migration of GIS components and data objects will benefit GIS processing and analysis in current distributed network environments. After clarifying the relationship between client and server and the dynamic architecture of GIServices, the next step is to classify the appropriate GIS components and their functionality.

4.1.3 Categorizing GIS Components by a Task-oriented Approach

One of the major design issues with distributed GIS components is their classification. Distributed GIS components will be designed and allocated by software programmers based on different categories. In theory, where GIS components are placed on the network does not seem to matter, because the components can be moved and re-organized whenever a user needs them. In practice, however, the movement and re-organization of GIS components requires significant computing resources. The best strategy is to design a categorization of GIS components that requires minimal movement and re-organization effort. This dissertation proposes a task-oriented design of GIS component classification. Task-oriented design of distributed components will reduce unnecessary movement and re-organization.
Based on current GIS applications and GIS task research (Davies, 1995; Albrecht, 1996; Davies and Medyckyj-Scott, 1994; 1996), this dissertation identifies six representative GIS tasks for on-line geographic information services: map display, spatial and text-based query, data download and integration, spatial analysis, database maintenance and update, and extending GIS functions. Figure 4-10 adopts UML use-case diagrams to represent interactions among the six representative GIS tasks and three types of GIS actors. In UML notation, a use case is rendered as an ellipse, which describes what a system or a component does in order to fulfill a user’s task. An actor represents a role that interacts with use cases. The arrows represent the interactions between actors and use cases. Figure 4-10 shows that all six GIS tasks are related to server-side services, which need to interact with [GIS software developer] and [GIS data provider]. The first four GIS tasks are also related to client-side services, which need to interact with [GIS user].

[Figure 4-10 labels. GIS tasks (use cases): Map display; Spatial and text-based query; Data download; Spatial analysis; Maintain and update GIS databases; Extend GIS functions. Actors: GIS User, GIS Data Provider, GIS Software Developer. UML notations: Actor; Use Case (Tasks).]
Figure 4-10. The relationships between six GIS tasks and three actors.

Based on the six representative GIS tasks, Figure 4-11 defines six sub-classes of distributed GIS components: Map display, Spatial/Text query, Data download, Spatial analysis, Database, and GIS extensions. The UML notation uses a triangle to represent the generalization of objects. For example, the Geodata object is the super class of the Vector data object and the Raster data object.
[Figure 4-11 labels. Distributed GIS Components, with six sub-classes: Map display component, Spatial/Text query component, Data download component, Spatial analysis component, Database component, GIS extensions component. UML notations: Super Class: Geodata (Generalization); Sub-class: Vector data; Sub-class: Raster data.]
Figure 4-11. Six representative GIS components in UML.

The major advantage of task-oriented GIS components is that they are ready to use for specific GIS tasks. Users can select these GIS components based on applications and tasks without worrying about compatibility with their systems or about installation details. These GIS components can offer plug-and-play functionality by adopting standardized communication protocols and specified metadata. The details will be presented in the next chapter. Task-oriented, LEGO-like GIS components allow users to modify module combinations or extend existing functions, for example by adding an image processing function or a data conversion capability.

Currently, there are two major classification systems for distributed GIS components, proposed by OGC and ISO/TC 211 (mentioned in Chapter Three). The OpenGIS Specification proposed fifteen GIServices domains (page 70), which are based on low-level, generic GIS functions. One potential problem of this system is that it separates all image processing functions from vector-based GIS functions. On the other hand, ISO/TC 211 proposed six high-level GIServices domains based on a generic software architecture, without considering the uniqueness of GIServices. This dissertation suggests that a middle-level categorization of distributed GIS components should be based on generic, independent GIS tasks, which are defined by GIS users (Figure 4-12). Different users may require specific component categories and should be able to design their own component functions. The low-level OGC functionality or the high-level ISO/TC 211 domains will only be used by software developers or research institutions rather than actual GIS users.
[Figure 4-12 labels. OGC low-level classification (p. 69): GDAS, GCTS, GAS, IMS, GFMS, IES, FAS, FGS, IMGS, GDS, ISS, IGMS, GIES, GSMS, IUS. ISO/TC 211 high-level classification (p. 82): 1. Geographic Human Interaction Services; 2. Geographic Model Management Services; 3. Geographic Workflow/Task Services; 4. Geographic System Management Services; 5. Geographic Processing Services; 6. Geographic Communication Services. “Task-oriented” GIS components, middle-level classification: Map Display, Data Download, Spatial Analysis, Spatial/Text Query, Database Management, GIS Extensions.]
Figure 4-12. Three types of GIS component classification.

Moreover, distributed geodata objects and GIS components require mobility, remote connectivity and dynamic binding with local systems in distributed network environments. To provide such dynamic construction, the architecture needs to acquire certain information from GIS components and geodata objects that helps users and client/server machines understand them. An object-oriented, operational metadata scheme in both geodata objects and GIS components will facilitate this dynamic construction and linkage. The scheme is described in the next section. The information collection and decision-making process will be accomplished by an agent-based mechanism, described in section 4.3.

4.2 An Object-oriented, Operational Metadata Scheme

For distributed GIServices, metadata is the information that supports exchange and integration between clients and servers. In general terms, metadata describes the content, quality, condition, and other characteristics of data. An object-oriented metadata scheme for both GIS datasets and components addresses the problem of formalizing geospatial data sets and GIS operators in distributed network environments (Tsou and Buttenfield, 1998b). Currently, many GIS projects are conducting metadata research (FGDC, 1995; Smith, 1996; ISO/TC 211/WG 3, 1998).
Existing work presents metadata schemes that emphasize the establishment of a standardized format and adopt traditional relational database concepts, where each metadata item is represented as an individual record. However, ad hoc approaches to metadata issues do not scale and thus cause interoperability problems (Baldonado et al., 1997). The standardization of metadata formats may undermine their application to services, because it is impossible to design a single standard for heterogeneous geospatial data processing methods. For example, a single standard would be inadequate to describe both a TIN data model and a raster data model without many extraneous fields. Likewise, a single service-based metadata model designed to describe both interpolation and buffering would be both cumbersome and inefficient. Thus a single standard for metadata is unlikely to be feasible.

Metadata standards developed by ISO/TC 211 (Kuhn, 1997; OGC, 1998) demonstrated the need for extensions, such as adding to an existing data element or adding a new metadata element. However, many current metadata schemes still detach metadata from data and store them separately. This detachment jeopardizes the availability of metadata when geospatial data sets are frequently moved, downloaded or modified in a dynamic network environment. It is quite possible to lose metadata during data processing and copying. Geodata objects should therefore embed metadata as encapsulated items within the data object itself. The contents of the metadata should also follow an operational metadata scheme that facilitates dynamic interaction between geodata objects and GIS components (programs). Figure 4-13 contrasts the two metadata schemes.
[Figure 4-13 labels. Traditional relational metadata scheme: Data 1, Data 2, and separate Metadata Records #1, #2, #3. Encapsulated metadata scheme: a Data Object or Service Object that contains a Metadata Object.]
Figure 4-13. Two metadata schemes (relational and object-oriented).

Encapsulated metadata has several advantages. First, metadata objects provide a flexible way to construct metadata through object-oriented modeling. Object-oriented modeling methods permit flexible storage in which metadata are tailored to the type of object they describe. Second, when a user moves or copies geodata objects, the metadata is automatically carried along. Users never have to worry about where to find the metadata for their data objects. Third, encapsulation protects the metadata from the exterior environment. Only authorized programs (with electronic access-key codes) can access the metadata information. Moreover, when a new geodata object is generated, the new metadata can inherit the parent metadata information and then add new metadata information of its own. For example, if a subset area is clipped from a satellite image, the new metadata object could inherit the information about image resources, sensor types, and resolution from the original image and add the spatial boundary coordinates of the new object.

In a comprehensive metadata object scheme, each data object should be able to generate its own metadata automatically and encapsulate it in the data object in the process. Each geodata object must have metadata in distributed geographic information environments, and metadata objects can be retrieved from data objects, saved in a repository, and accessed by other application programs. This dissertation proposes that the metadata scheme can be applied to both geodata and GIS components, which will be discussed next.
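The three properties described above (metadata travels with the object, access requires a key, and derived objects inherit parent metadata) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the class name `GeodataObject`, the `clip` method, the plain-string access key and the sample metadata values are all hypothetical, and the actual clipping of the image payload is omitted.

```python
import copy
from typing import Any, Dict, Tuple

Bounds = Tuple[float, float, float, float]


class GeodataObject:
    """A data object that carries its own metadata (hypothetical sketch)."""

    def __init__(self, name: str, payload: bytes,
                 metadata: Dict[str, Any], access_key: str) -> None:
        self.name = name
        self.payload = payload
        # Double-underscore attributes are name-mangled, which discourages
        # (but does not strictly prevent) access from outside the class.
        self.__metadata = dict(metadata)
        self.__access_key = access_key

    def metadata(self, access_key: str) -> Dict[str, Any]:
        """Return a copy of the metadata to callers holding the access key."""
        if access_key != self.__access_key:
            raise PermissionError("invalid access key")
        return dict(self.__metadata)

    def clip(self, name: str, bounds: Bounds, access_key: str) -> "GeodataObject":
        """Derive a subset object that inherits the parent metadata."""
        inherited = self.metadata(access_key)
        inherited.update({"bounds": bounds, "derived_from": self.name})
        # Clipping the payload itself is outside the scope of this sketch.
        return GeodataObject(name, self.payload, inherited, access_key)


image = GeodataObject(
    "scene",
    b"",
    {"sensor": "example sensor", "resolution_m": 30, "bounds": (0, 0, 100, 100)},
    access_key="secret",
)
subset = image.clip("subset", (10, 10, 20, 20), "secret")
duplicate = copy.deepcopy(image)  # copying the object copies its metadata too

print(subset.metadata("secret"))
```

Returning copies from `metadata` keeps callers from modifying the encapsulated record in place; a production design would need a real authorization mechanism rather than a shared string.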
4.2.1 The Design of Operational Metadata for Geodata Objects

Operational metadata for geodata should facilitate access to geodata objects, remote GIS database connectivity, and the migration of geodata objects from one location to another. Two categories of metadata elements for geodata, described below (Figure 4-14), are essential for distributed geodata objects.

GIS-operation metadata bridges the connections between geodata objects and GIS components. For example, in order to interact with a specific GIS component, [Map Display], GIS-operation metadata should include scale, projections, map units, map extent, cartographic symbol mapping, etc. Other examples of GIS-operation metadata include topology for buffering, map accuracy for overlaying, etc.

Data connectivity metadata connects the geodata objects with remote systems to facilitate the migration of geodata objects. Data connectivity metadata should identify the acceptable connectivity mechanisms, such as JDBC or ODBC, and the preferred types of database engines, such as ORACLE, Informix, or MS Access databases. Data connectivity metadata should also include the type of migration, such as COPY (duplicate the data object on the client and leave the original object on the server) or MOVE (duplicate the data object on the client and delete the original object on the server); the LIFECYCLE (whether the geodata object resides temporarily or permanently on the client machine); and data transformation methods for heterogeneous database environments.

[Figure 4-14 labels: Geodata Object; Feature (X, Y coordinates, lines, topology, etc.); Attributes (Polygon Attribute Table, Arc Attribute Table); Metadata (GIS-operation metadata, data connectivity metadata).]
Figure 4-14. The content of encapsulated metadata for geodata objects.

Figure 4-14 also illustrates the concepts of encapsulated metadata.
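The two metadata categories described in this subsection could be represented as simple typed records. The sketch below is one possible reading, not the dissertation's implementation: the class names, field names, and the `migrate` helper are all hypothetical, and the server and client are modeled as plain dictionaries.

```python
from dataclasses import dataclass
from enum import Enum


class Migration(Enum):
    COPY = "copy"   # duplicate on the client, keep the original on the server
    MOVE = "move"   # duplicate on the client, delete the original on the server


class Lifecycle(Enum):
    TEMPORARY = "temporary"
    PERMANENT = "permanent"


@dataclass(frozen=True)
class GISOperationMetadata:
    scale: int
    projection: str
    map_units: str
    extent: tuple


@dataclass(frozen=True)
class DataConnectivityMetadata:
    mechanism: str          # e.g. "JDBC" or "ODBC"
    database_engine: str    # e.g. "ORACLE"
    migration: Migration
    lifecycle: Lifecycle


def migrate(name, server, client, connectivity):
    """Apply the migration type recorded in the object's connectivity metadata."""
    client[name] = server[name]                 # both types duplicate on the client
    if connectivity.migration is Migration.MOVE:
        del server[name]                        # MOVE also deletes the original


server, client = {"colorado_roads": "<geodata>"}, {}
conn = DataConnectivityMetadata("JDBC", "ORACLE", Migration.MOVE, Lifecycle.TEMPORARY)
migrate("colorado_roads", server, client, conn)
print(server, client)
```

Keeping the migration type in the metadata, rather than in the calling program, lets any client or agent move the object correctly without prior knowledge of it.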
Under a comprehensive object-oriented scheme, the two elements of metadata should be wrapped inside data objects, and the encapsulation will protect the critical contents of metadata from outside intervention. With the help of operational metadata, distributed geodata objects will become more accessible, self-describing, and self-managing. Therefore, distributed GIS components and client/server machines will be able to handle distributed geodata objects more automatically and more efficiently.

4.2.2 The Design of GIS Component Metadata

An operational metadata scheme can also be applied to GIS components. For on-line GIS services, different GIS components will be designed and developed for specific user tasks and functions. The contents of GIS component metadata need to facilitate dynamic interactions with geodata objects and plug-in functions for client/server machines. Metadata embedded in a GIS component should include two major parts (Figure 4-15):

[Figure 4-15 labels: Geodata Object with GIS-operation metadata (A, B, C, D, E, F); GIS component with GIS component metadata, comprising GIS-operation requirement metadata (A, B) and system integration metadata; "Check if the geodata object fulfills the GIS-operation requirements"; GIS component interface (APIs); other GIS components; client/server machines.]
Figure 4-15. The contents and functions of GIS component metadata.

System integration metadata describes the available functions, methods, and behaviors of GIS components for system control, plug-in functionality, and collaboration with other GIS components. System integration metadata will also include the type of GIS component (DCOM, CORBA, or JavaBean) and the component migration methods. Many current distributed component frameworks, such as CORBA and DCOM, already include some system integration metadata in their design.

GIS-operation requirement metadata specifies the data requirements for specified GIS operations.
For example, a [Map Display] GIS component will embed a GIS-operation requirement metadata item, which specifies the data requirements for the map display function, such as map units, projection method, symbols, and scales. The [Map Display] component will check whether the available geodata object meets the requirements of a map display operation by accessing the GIS-operation metadata inside the geodata object. If the requested metadata are not available, the [Map Display] component cannot operate on the geodata object. With the collaboration between the two types of metadata, distributed GIS components will become reusable, modularized, self-describing, and self-managing. To summarize, the use of operational GIS component metadata is the key to interoperability and plug-and-play functionality for distributed GIS components.

Figure 4-16 uses UML notation to specify the hierarchy of metadata objects in distributed GIServices. There is one generalization relationship: the two types of [Metadata object] in distributed GIServices are [Geo object metadata] and [GIS component metadata]. There are four aggregation relationships: [Geodata object] consists of [Geo object metadata], [Feature], and [Attribute]; [Distributed GIS component] consists of [GIS component metadata] and [GIS component interfaces]; [Geo object metadata] includes [GIS-operation metadata] and [Database connectivity metadata]; and [GIS component metadata] includes [System integration metadata] and [GIS-operation requirement metadata].

[Figure 4-16 labels: Metadata object; Geo object metadata (GIS-operation metadata, Database connectivity metadata); GIS component metadata (System integration metadata, GIS-operation requirement metadata); Geodata objects (Geo object metadata, Feature, Attribute); Distributed GIS components (GIS component metadata, GIS component interfaces (APIs)); UML notation: generalization, aggregation.]
Figure 4-16. The metadata class relationship and hierarchy in UML.
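The requirement check performed by a component such as [Map Display] can be sketched in a few lines. The `GISComponent` class, its method names, and the metadata keys below are assumptions made for illustration; the dissertation specifies the behavior, not this interface.

```python
class GISComponent:
    """A component that carries GIS-operation requirement metadata."""

    def __init__(self, name, required_metadata):
        self.name = name
        self.required_metadata = tuple(required_metadata)

    def missing_metadata(self, geodata_metadata):
        """List required items absent from a geodata object's GIS-operation metadata."""
        return [item for item in self.required_metadata
                if item not in geodata_metadata]

    def can_operate_on(self, geodata_metadata):
        """True only when every required metadata item is available."""
        return not self.missing_metadata(geodata_metadata)


map_display = GISComponent(
    "Map Display", ["map_units", "projection", "symbols", "scale"])

# GIS-operation metadata of two hypothetical geodata objects.
roads = {"map_units": "feet", "projection": "UTM", "symbols": "road_classes",
         "scale": 24000, "topology": True}
dem = {"map_units": "meters", "projection": "UTM"}

print(map_display.can_operate_on(roads))   # True
print(map_display.missing_metadata(dem))   # ['symbols', 'scale']
```

Reporting which items are missing, rather than only refusing to operate, would give an agent enough information to look for a conversion or an alternative data source.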
The operational metadata design emphasizes three important concepts for the future use of metadata. First, the design changes the traditional function of metadata from descriptive information into processing-oriented, operational, machine-readable content. The processing-oriented metadata scheme will facilitate distributed GIS processing, accurate map display and overlay analysis, and automatic data conversion and exchange via the networks. Second, encapsulating metadata in the data object will protect the metadata from being lost in the network environment and prevent accidental modification of critical metadata content. Third, the operational metadata scheme can be applied to both geodata objects and GIS components. In contrast to the traditional metadata design, which can only be applied to data objects, metadata sharing between GIS components and geodata objects will improve the efficiency of GIS operations and analysis.

GIS components and geodata objects with encapsulated metadata will be able to communicate with each other in distributed network environments. An agent-based communication mechanism will retrieve information from the metadata and associate it with different GIS processes and tasks.

4.3 An Agent-based Communication Mechanism

Recently, software agents have become a new trend in both user interface design and artificial intelligence (AI) research. The goal of intelligent software agents is to reduce user work and information overload (Maes, 1994). Software agents can provide services in data filtering, information searching, on-line tutoring, and so on. Current research suggests that intelligent software agents will be widely used and implemented in the future, especially in open, distributed systems (Thomas and Fischer, 1997; Bradshaw, 1997). Dozens of intelligent software agent applications have been proposed and are under development (Knapik and Johnson, 1998).
Within a traditional GISystems environment, users deal with only one centralized system, which has its own data model and command syntax. In a distributed GIServices environment, on the other hand, users may need to interact with heterogeneous data models and different types of programs in different distributed component frameworks. This situation is intensifying due to the complicated nature of GIS tasks. Moreover, many GIS professionals and researchers cannot share their data sets, programs, or GIS models with others because they lack an appropriate distribution channel. With the help of software agents, people can easily search for the data they need, share or exchange their existing data sets, or distribute new GIS programs via the Internet. This dissertation introduces an agent-based communication mechanism for two main reasons: 1) to simplify the complexities of distributed GIServices environments and 2) to overcome the limitations of the existing monolithic GIS software approach.

A software agent is a software entity which functions continuously and autonomously in a particular environment (Shoham, 1997). Each agent has some specific functions and responds to specific events, based on pre-defined knowledge rules, the collaboration of other agents, or users’ instructions. Agents can help users to search for information, interpret or translate different data formats, and make logical decisions or suggestions. For example, Web search engines like Excite, Lycos, and AltaVista use software agents to help users search for information on the Web. By adding user-defined rules and appropriate knowledge bases, software agents will bridge the communication between heterogeneous data objects and distributed components. The original idea of the software agent came from artificial intelligence research.
Unlike the traditional AI approach, the software agent approach locates knowledge bases in hundreds of small distributed agent programs instead of a single, huge, omnipotent computer. In a distributed GIService environment, users may need to interact with heterogeneous data models and different types of programs on different computer platforms. Intelligent software agents can help users to access distributed data objects and GIS components on heterogeneous GIS platforms by interpreting, filtering, and converting related information automatically (Tsou and Buttenfield, 1998a).

4.3.1 The Roles of Software Agents

Three fundamental roles of software agents are essential to distributed GIServices: information finder/filter, information interpreter, and decision maker (Tsou and Buttenfield, 1998a). The following paragraphs introduce and explain these three roles for GIServices applications.

4.3.1.1 Information finder/filter role

The first role of a software agent is that of an information finder/filter, which helps users find the requested information and filters out unnecessary elements to a reasonable scope according to a specified user task. Moreover, software agents can play a more active role than a simple information filter. If the task cannot be executed or completed, the software agent may suggest modifying the request or the task, or offer the user an alternative choice. Figure 4-17 shows the role of software agents as an information finder/filter.

[Figure 4-17 labels: SITE-A (Choice-1 to Choice-200) and SITE-B (Choice-1 to Choice-300); information finder (searching); information filter (filtering) with KEYWORDS (user-defined rules); reasonable choices A, B, and C presented to users. Note: search information from different locations and simplify choices according to the specified rules defined by users.]
Figure 4-17. The information finder/filter.
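The finder/filter role outlined above can be illustrated with a minimal sketch. It assumes that each site exposes a plain list of item descriptions and that the user-defined rule is a list of keywords; the function names and the fallback to partial matches are illustrative choices, not a prescribed design.

```python
def find_items(sites):
    """Information finder: gather (site, item) candidates from every site."""
    return [(site, item) for site, items in sites.items() for item in items]


def filter_items(candidates, keywords, match=all):
    """Information filter: keep items whose description matches the keywords."""
    wanted = [k.lower() for k in keywords]
    return [(site, item) for site, item in candidates
            if match(k in item.lower() for k in wanted)]


def search(sites, keywords):
    """Return (choices, suggestion); suggestion is None when the request succeeds."""
    candidates = find_items(sites)
    exact = filter_items(candidates, keywords, match=all)
    if exact:
        return exact, None
    # Active role: the request cannot be met as stated, so offer an alternative.
    partial = filter_items(candidates, keywords, match=any)
    return partial, "No item matched every keyword; showing partial matches."


sites = {
    "SITE-A": ["Colorado roads 1998", "Colorado DEM", "Utah roads"],
    "SITE-B": ["Colorado roads TIGER", "Kansas soils"],
}
choices, suggestion = search(sites, ["colorado", "roads"])
print(choices)
```

A request that no item satisfies in full, such as `search(sites, ["kansas", "roads"])`, returns the partial matches together with a suggestion instead of an empty result.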
One of the design issues in information finders/filters is the search mechanism. Three types of agent search methods are commonly used in current software agent models: message broadcasting, agent roaming, and metadata repositories (Knapik and Johnson, 1998). Message broadcasting is a traditional network communication approach, whereby the sender broadcasts messages to local networks and waits for machines to respond (Peterson and Davie, 1996). The second type of search is agent roaming, in which a software agent moves or copies itself to a different location or server, collects the requested information directly, and then returns to its original location. The third type of search mechanism is to build connections between agents and database servers directly and to generate a metadata repository of the connected databases in advance. When software agents receive a request from users, they can search the metadata repository immediately and then redirect the link to the target database server.

Another issue in the information finder/filter is the filter function. The first type of filtering is to follow user-defined rules directly. For example, users can define several keywords to limit the scope of a search. The second type of filtering is information priority. The priority of an information item is decided by whether it can be used immediately without any preprocessing, such as conversion or partitioning. The priority of information can be defined and generated from the descriptions in the metadata. For example, if a GIS user only uses a specialized GIS package, ARC/INFO, to process GIS data, the highest-priority information formats should be Coverages or Shapefiles, which can be used directly with this GIS package. The second priority will be other formats that may require data conversion, such as Digital Line Graph (DLG).
The lowest priority will be raw data items, such as texts or tables, which need substantial preprocessing and partitioning. The third type of filtering mechanism is the agent’s predefined preferences and capability. For example, a software agent may only be able to process 500 items at one time due to the limits of its virtual memory and storage capacity. Therefore, the information items after the first 500 search results will not be retrieved or displayed for GIS applications.

4.3.1.2 Information interpreter role

The second role of software agents is that of an information interpreter, which can access and convey information from one side to the other. In distributed network environments, heterogeneous data models and systems cannot communicate directly. An agent can bridge heterogeneous information islands and translate different data types and models for different systems. In order to translate the information correctly, this agent has to acquire some knowledge and methods of translation procedures. The knowledge and methods can be defined and encapsulated in the metadata of object-based components (Tsou and Buttenfield, 1998b). The encapsulated metadata information can help an agent to interpret the information correctly. Figure 4-18 shows the role of software agents as information interpreters.

[Figure 4-18 labels: GIS components with metadata ([Buffering] in UNIX, [Address matching] in Win 2000); information interpreter; metadata become the source of knowledge bases; geodata with metadata (Arc Coverages, Shapefiles, Landsat images).]
Figure 4-18. The information interpreter.

As an information interpreter, software agents can help users perform GIS operations more accurately and efficiently. For example, suppose a GIS user, Mike, wants to perform a 200-foot buffering operation on [Colorado roads] on his UNIX-based ARC/INFO GIS workstation. The requested data object, [Colorado roads], is stored on a remote Windows NT machine in the ArcView Shapefiles format.
Software agents can help Mike to download [Colorado roads] and convert it from ArcView Shapefiles to ARC/INFO coverage format automatically. By retrieving the original map units from the metadata of [Colorado roads], software agents can also help Mike transform the unit of the buffer parameter from “200 feet” to “61 meters” immediately. With the help of an information interpreter, GIS users can easily work with different types of GIS data and programs together.

4.3.1.3 Decision maker role

The third role of software agents is that of a decision maker, which can make decisions autonomously based on rational rules defined by its own knowledge base, user-defined rules, or the collaboration of other agents (Ferber, 1999). A software agent can collect and analyze information for a specific event, make an optimal decision based on collaboration with users and other agents, and execute the final decision (Figure 4-19). For example, a GIS user, Mike, wants to run a GIS buffering operation on [Colorado roads]. However, Mike finds that his workstation might not have enough disk space to save the resulting [Roads buffered] ARC/INFO coverage. When Mike initiates the GIS buffering operation, he can assign a special rule to the software agent: “if the local storage is full, find another network disk and save the result.” When the software agent receives Mike’s rule, it will begin to monitor the local disk to see whether the output coverage will exceed the limit of disk storage. If the agent finds that the result of the buffering operation cannot be saved on the local disk, it will ask other agents to provide network storage information. Then, information finders will search for available network disk volumes, and information interpreters will convert the output data from the format of the original platform to a format compatible with the remote network disk.
With the help of other agents, the decision maker will execute a new command to save the new buffered data onto network drive Z with appropriate data conversion (Figure 4-19).

[Figure 4-19 labels: Agent (decision maker); Events; Agents/Users Collaboration among users (user-defined rules), agents (information finder), and agents (information interpreter); Actions. Example. Event: if the local disk is full, find another network disk. Agents/Users Collaboration: the user defines the rule of events; the information finder searches network volumes; the interpreter converts the data format for the destination environment. Action: Save [New disk-drive]: [Converted Data object].]
Figure 4-19. The decision maker.

There are several issues in the design of decision makers. First, setting decision-making rules will require the design of appropriate user interfaces and procedure verification during the period of collaboration between users and software agents. The second issue is collaboration among software agents, which requires defining an appropriate communication protocol and formalizing agent interaction mechanisms. The third issue is how to choose the participants in the decision-making process and how to set the decision-making procedures. Since a decision maker will have more power than other types of software agents, the design of software agents will need to define a hierarchy of software agents.

In general, the three roles of software agents are essential to distributed GIServices in order to provide comprehensive and self-managing GIServices for different users and applications. Two implementation approaches can be adopted for the design of the roles of software agents. The first approach is to define the three roles (finder/filter, interpreter, and decision maker) as three different agent types, with each agent playing a permanent role in the network environment. The alternative approach is to grant each agent multiple roles during its runtime.
For example, a software agent can act as an information finder in one GIS case and become an information interpreter in another. The dynamic role of a software agent will provide flexible functions for GIServices. However, it will also make the agent framework more difficult to implement. The dynamic agent role approach can be implemented by using “polymorphism” from object-oriented modeling. Polymorphism is the technique of hiding different implementations behind a common object or interface. With polymorphism, the same software agent can perform different types of functions and interact differently with other agents, and therefore produce different but appropriate results. The dynamic design of software agents’ roles can provide more flexible and comprehensive GIServices for different users and applications. The next section introduces the design issues of software agents.

4.3.2 The Design of Software Agents

The design of software agents needs to consider three issues: mobility, functionality, and security. The mobility of agents is related to the dynamic nature of network environments and the communication between different machines. The functionality of agents refers to their interaction and responsibility for different tasks, such as communication among machines, GIS components, and geodata objects. The security issue focuses on actual implementation problems, potential security problems, and countermeasures.

4.3.2.1 Agent Mobility

The mobility of agents is related to the dynamic nature of the network environments where agents reside and the communication approach between software agents.
According to recent research, such as the specification of Mobile Agent System Facilities in the Common Object Request Broker Architecture (CORBA) framework (OMG, 1998) and IBM’s Aglets (Java Mobile Agents) project (Lange and Oshima, 1998), two types of agents, stationary agents and mobile agents, are distinguished by whether or not they can move around (Knapik and Johnson, 1998).

[Figure 4-20 panels: (a) Stationary agent: Machine-A, Machine-B, Stationary Agent-01, Stationary Agent-05, remote procedure call. (b) Mobile agent: Machine-C, Machine-D, Mobile Agent-03, Mobile Agent-06, copy of Mobile Agent-03 (HTTP).]
Figure 4-20. The communication mechanisms of stationary agents and mobile agents.

Figure 4-20 illustrates the two types of software agents, stationary and mobile. Stationary agents work as an independent task on one processor or one machine. Stationary agents remain in a single location throughout the duration of their execution. If an agent needs information that is not on that system, or needs to interact with an agent on a different system, it typically uses a communication mechanism such as a remote procedure call (RPC). A mobile agent can be moved to another machine and executed remotely. It can also move to a network infrastructure close to the destination system in order to achieve the benefit of locality. The ability to travel allows a mobile agent to move to a system and take advantage of being on the same host or network (Lange and Oshima, 1998). The movement of mobile agents may use standard Internet transfer protocols, such as the File Transfer Protocol (FTP) or the HyperText Transfer Protocol (HTTP) used on the World Wide Web (Peterson and Davie, 1996).

Currently, major research on software agents focuses on mobile agents because the dynamic movement of software agents has several advantages in distributed network environments. These advantages have been identified by Lange and Oshima (1998). They are illustrated in the context of distributed GIServices in the following paragraphs.
Reducing network load. Mobile agents can carry GIS operations, upload them to remote GIS databases, and process GIS operations remotely. Since many GIS data objects are huge and difficult to download or copy remotely, mobile software agents that initiate remote GIS operations can provide more efficient GIServices than the traditional approach of downloading GIS data objects to a local machine.

Overcoming network latency. Mobile agents can provide real-time responses at the remote GIS site and monitor the distributed GIS process independently. For example, if a GIS operation is carried out on a remote database server, mobile agents can promptly migrate themselves to the remote server and monitor the whole GIS operation process more closely. If the storage device on the remote server is too full for the next GIS procedure, the mobile agent can quickly search for alternative network storage devices to ensure completion of the GIS operations without aborting the procedure.

Executing asynchronously and automatically. Mobile agents can perform more stable GIS operations without being affected by network breakdown or disconnection over fragile network connections. One of the current problems in distributed GIS operations is the difficulty of providing a stable network connection for the entire period of a GIS operation, usually several hours. Mobile agents can help users solve this problem by conducting GIS operations remotely even when the network is disconnected during the operation. The mobile agent can always reconnect later and send the final results back to users.

Dynamic adaptation. Mobile software agents can detect or monitor changes in a GIServices environment and adopt the optimal GIS configuration dynamically and autonomously.
For example, if a GIS software package has been upgraded to a new version or a specific GIS data format has been changed by software vendors, mobile agents can adjust the GIS procedures to the new environment by adding the new functions or converting old data to the new data format.

However, there are significant problems in the implementation of mobile agents. The following paragraphs illustrate the challenges and problems in developing mobile software agents:

Security. Security is the top concern in implementing mobile agents. Because of its mobility and flexibility, the mobile agent model also creates high risks of virus attacks or hacker invasions, and is more vulnerable in a distributed network environment (Jansen and Karygiannis, 1999). In fact, a mobile software agent can become a computer virus because it can transport itself from one system to another and interact with data objects or system programs. Without a safe runtime environment for mobile agents, it is not feasible to adopt mobile agents in the future framework of distributed GIServices. Currently, several countermeasures are available for the protection of mobile agents, such as password protection, the sandboxing software model, digital signatures, and agent travel verification (Karnik, 1998). These countermeasures will be discussed in detail in the next section.

Cross-platform implementation. Implementation across different platforms is also a major problem with mobile agents. Since mobile agents need to travel between different systems and frameworks, choosing an appropriate software development framework becomes a major challenge for the implementation of mobile agents.

Size and functions. Another concern with mobile agents is the size of agent programs versus the functions of agents. In theory, the size of a mobile agent should be minimized, as smaller agent programs travel faster across networks.
However, agent developers tend to design more complicated agent programs and embed more functions into mobile agents, which makes mobile agents bigger. Many current mobile agent research projects face this dilemma between an agent’s size and its functionality. One possible solution is to use more lightweight programming technologies, such as scripting languages (JavaScript or VBScript), to reduce the actual program size of mobile agents during their runtime.

Protocol development. Designing an appropriate communication protocol is another challenge for mobile agents. An agent communication protocol is different from traditional network communication protocols, such as the Transmission Control Protocol and Internet Protocol (TCP/IP) or HTTP. A software agent communication protocol is a high-level, application-oriented protocol, which focuses on the exchange of knowledge bases, user-defined rules, the control of agent behaviors, and the interactions between agents and systems (FIPA, 1998; Ferber, 1990). The protocol will need to consider the mobility of agents and provide dynamic naming and data transfer services. More importantly, the protocol must be accepted by the GIS community and be customizable for different GIS applications.

Level of control. The final design problem with mobile agents is the level of control. Since mobile agents are movable and dynamic, the definition of their behaviors and movements in the networks needs to be formalized into several types of control mechanisms. For example, by defining a master-slave relationship, a master agent can copy or migrate a slave agent from one location to another. By adopting a user-agent relationship, a GIS user can assign a task to a software agent or call back agents from a remote site. This type of control mechanism has not been defined clearly in current software agent research and GIServices.
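As a rough illustration of the control relationships described under "Level of control", the sketch below models copy, migrate, task assignment, and call-back as operations of a single controller. The `Agent` and `Controller` classes, their fields, and the host names are hypothetical; a real framework would also move code and state across the network, which is not modeled here.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class Agent:
    name: str
    host: str
    task: Optional[str] = None


class Controller:
    """Plays the master (or user) side of the control relationship."""

    def __init__(self, home_host):
        self.home_host = home_host

    def copy(self, agent, destination):
        """Master-slave: duplicate the agent at the destination; the original stays."""
        return replace(agent, name=agent.name + "-copy", host=destination)

    def migrate(self, agent, destination):
        """Master-slave: relocate the agent itself to the destination."""
        agent.host = destination

    def assign(self, agent, task):
        """User-agent: give the agent a task to carry out."""
        agent.task = task

    def call_back(self, agent):
        """User-agent: recall the agent from a remote site to the home host."""
        agent.host = self.home_host


master = Controller(home_host="machine-a")
worker = Agent("geodata-agent-01", host="machine-a")
master.assign(worker, "buffer [Colorado roads]")
master.migrate(worker, "machine-b")
clone = master.copy(worker, "machine-c")
master.call_back(worker)
print(worker, clone)
```

Routing every movement through one controller is only one possible control mechanism; the text leaves open how such mechanisms should be formalized.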
Although there are significant challenges in mobile agent design, as mentioned above, the flexibility and dynamic adaptation of mobile agent models are the key to providing distributed GIServices and utilizing dynamic network environments. This research will adopt both stationary agent and mobile agent frameworks in providing flexible and dynamic GIServices. The next section introduces another element in the design of software agents: their functionality.

4.3.2.2 Agent Functionality

The functionality of agents refers to their interaction and responsibility for different tasks, such as communication among machines, the integration of GIS components, and the search for requested geodata objects. Based on their functionality, three kinds of agents are proposed for distributed GIServices: component agents, geodata agents, and machine agents.

Component agents (mobile agents) are designed to access and integrate heterogeneous GIS components. The major tasks of component agents include retrieving metadata information from GIS components, migrating GIS components, supporting dynamic interaction between heterogeneous component frameworks, such as DCOM, CORBA, and the Java platform, and searching for GIS components located in other GIS nodes. Since component agents require intensive communication with GIS components on local machines or remote servers, the mobility of component agents is essential in achieving dynamic interactions between component agents and GIS components.

Geodata agents (mobile agents) are designed to retrieve heterogeneous geodata objects via different database servers. The major tasks of geodata agents include retrieving metadata information from geodata objects, transferring or converting geodata objects, supporting dynamic interaction between geodata objects and GIS components, and searching for geodata objects located in other GIS nodes.
Similar to component agents, geodata agents also require mobility when communicating with geodata objects located on different machines.

Machine agents (stationary agents) are designed to access the local machine’s hardware profile, peripherals, and network performance. Machine agents can interact with component agents and geodata agents to perform a GIS task according to the information provided by those agents. Machine agents can make a collaborative decision on the location-allocation of GIS components and geodata objects according to their machine profiles, network performance, and GIS tasks. The major task of machine agents is to retrieve platform information. Thus, machine agents will be stationary in order to communicate with local machines efficiently. Each machine should have only one machine agent, with a unique identification associated with the machine.

The detailed design of the agent-based GIServices architecture is illustrated in the Unified Modeling Language (UML) and emphasizes the design of a high-level communication mechanism for heterogeneous GIServices. An agent-based communication mechanism will help users deal with heterogeneous distributed databases and GIS components in dynamic network environments. The collaboration among agents will provide a flexible and scalable framework for distributing GIServices on the Internet.

[Figure 4-22 labels: Machine-A and Machine-B; machine agents; component agents; geodata agents; GIS components with metadata (Map Display component, Spatial Analysis component); geodata objects with metadata (Colorado Roads (vector), Colorado DEM (raster)).]
Figure 4-22. Collaborations among component agents, geodata agents, and machine agents.

Figure 4-22 demonstrates the collaboration among distributed GIS components, geodata objects, and the three types of agents. As mentioned in previous sections, metadata information must be built into both GIS components and geodata objects to facilitate the collaboration.
The main function of agents is to interpret different GIS components and data objects according to their metadata information. Also, the machine agents will retrieve information from client/server machines and communicate with component agents and geodata agents. The combination of agents, metadata and distributed components creates a self-describing, dynamic, reusable and interoperable GIService environment and facilitates a modularized GIS infrastructure.

Figure 4-23 illustrates the classifications and hierarchy of agents in UML. The relationships among different agents are described in the following paragraphs.

Figure 4-23. The agent relationships and hierarchy in UML. [Diagram: UML generalization relationships among Agent, Stationary Agent, Mobile Agent, Machine Agent, Component Agent, and Geodata Agent.]

There are three types of generalization relationships. Two types of [Agents] in distributed GIServices are [Stationary Agents] and [Mobile Agents]. One type of [Stationary Agents] in distributed GIServices is [Machine Agents]. Two types of [Mobile Agents] in distributed GIServices are [Component Agents] and [Geodata Agents].

The UML model mentioned above provides a preliminary framework for the future design of the software agent model. In the future, the actual implementation of software agents will require more detailed specifications of the software agent model, especially on the dynamic roles of agents and their interfaces. Also, the dynamic relationships among agents, GIS databases, programs, and network frameworks will need to be specified in UML to provide guidelines for software prototyping.

4.3.2.3 Agent Security

Agent security is one of the most serious problems in the design of software agents. In general, security is central to all kinds of computer systems and information services, whether they are standalone workstations, network computers, or those systems specialized in geographic information or financial transactions.
As long as people share information and exchange data via computers and networks, the security problem will always be a major concern for system administration and implementations. This section will discuss the security issues with software agents for both mobile agents and stationary agents within their runtime environments.

The security threat usually comes from viruses, hackers, fraudulent users, or incompetent employees. Although software agents share the same security problems with other types of computer systems, software agents will require special considerations because of their specialized purpose and mobile functions. In general, there are three types of security issues in the design of the software agent model: disclosure of information (interception), denial of service (DOS), and the corruption of information (Jansen and Karygiannis, 1999).

The first type of security problem in software agents is the disclosure of information. Since software agents usually carry important information, such as user accounts or command lists, the information items carried or encapsulated by software agents may be intercepted or retrieved by other programs or on-line users. The information can be disclosed in many situations, such as the unintentional release of a password, the exposure of sensitive data, or unauthorized access by other agents or programs. One example of this type of security problem is when an unauthorized software agent claims the identity of another agent in order to gain access to services and resources.

The second type of security problem in software agents is the denial of service (DOS), where anonymous network computers and programs can launch attacks against the agent servers or systems by consuming an excessive amount of the agent platform’s computing resources. DOS is currently one of the most serious security threats to on-line information services because it can prevent other users or agents from connecting to or accessing the servers.
The mechanism of DOS is to attack target agents or systems by repeatedly sending millions of messages in a very short time period. In February 2000, the first mass distributed DOS attack was launched against many commercial web servers, including Yahoo, E*TRADE, eBay, CNN.com, etc. (Scambray et al., 2001). This attack took down these web servers for several days and caused significant financial loss for these companies. Although the targets of this DOS attack example are commercial Web servers, software agents are also vulnerable to DOS attacks.

The third type of security problem in software agents is the corruption of information, where the original GIS tasks can be altered with different operation procedures, or the embedded sensitive geodata can be damaged or corrupted. The corruption of information can occur in both data processes and agent programs. A computer virus is one example. Usually, if a software agent is infected by a computer virus, the infected agent is no longer the original agent and may generate wrong information or damage the agent’s platform or runtime environment. In general, the corruption of information may reduce the accuracy of geodata, cause GISystems to crash, or lead to malicious GIS operations.

These security problems may jeopardize the use of software agents in distributed GIServices and cause serious loss of critical information or resources. Fortunately, several countermeasures are available with modern computer technologies. These technologies can be applied in the platform of software agent models to prevent potential security problems. The following paragraphs will introduce these countermeasures in detail.

The first type of countermeasure is the adoption of encrypted information transmission. This approach is the easiest and most effective way to prevent the disclosure of information and related security problems. Many software platforms and web servers are equipped with this feature already.
It has been used for many transmissions of sensitive information, such as credit card numbers, user passwords, or financial transactions. The encryption of information can prevent the potential security problem of sensitive information being intercepted by other programs for unauthorized access while it is transmitted via public networks. Currently, many Web browsers and servers provide these types of protections by implementing public keys and private keys for transmitting encrypted information (Scambray et al., 2001).

The second approach is to have a recovery plan (backup procedures) for mission-critical information and software agent systems. Any security problems or accidental mistakes of system administration may cause systems to crash or damage software agents. A comprehensive recovery plan is the most important procedure in protecting software agents and their runtime systems. The plan may include creating a mirror site for software agent systems and regular backup procedures for both on-line and off-line media.

The third approach is to design a sand-boxing model for the runtime environment of software agents. Although the isolation of the software agent environment from the computer platform will limit the capabilities of agents, the sand-boxing model can prevent malicious software agents from directly accessing critical elements, such as memory, operating systems, and local hard disks. The sand-boxing model will prevent software-based faults or potential memory leak problems. Currently, many distributed software component frameworks are adopting this approach, such as Java’s virtual machine with Java applets and the ActiveX container with ActiveX controls.

The fourth type of countermeasure is the digital signature, where software agents can carry a signed document to confirm their own authenticity and integrity.
The digital signature can be assigned by a server and the authorized software agent will carry the document to access specified systems or networks. When software agents move to the specified computer or network environment, the client-side machine will send a request to the server to verify the signature and then grant the access permission to the software agents. The adoption of the digital signature will provide more flexible access controls for software agents because different agents can carry different signatures (documents) and access different types of client-side machines. Also, a single software agent can carry multiple signatures assigned by different servers for accessing different types of networks. The mechanism of digital signature mentioned here is similar to the function of visas in the real world. For example, different countries can issue different types of visas for travelers to enter or visit their countries.

The final approach is the implementation of agent travel logs, where software agents will keep an authentic record of their travel histories and events. The travel histories will indicate the possible security problems and maintain the integrity of software agents. For example, a software agent that only travels around the Intranet or local area networks is safer than one that travels to the Internet or wide area networks. By analyzing the travel logs of software agents, agent servers or remote access machines will be able to detect potential problems or security threats carried by malicious software agents.

Currently, only a few approaches, such as encrypted information transmission and the sand-boxing model, have been applied in actual agent implementations. Other types of countermeasures for software agent security are still under development. In the GIS community, software programmers and GIS professionals need to understand the security problem in software agents for distributed GIServices.
With appropriate implementation of these countermeasures, the GIS community will enjoy flexible and comprehensive GIServices provided by software agents and will not worry that software agents may jeopardize the runtime environment of GISystems, the accuracy of geographic information, or the proper procedures of GIS analysis. Another important issue in the implementation of security countermeasures is the awareness of security problems for GIS users and system developers. “Technology by itself cannot solve security problems. Technology for security must be complemented by an awareness of security issues and disciplined application of the techniques” (Harrision, et. al., 1997, p.238). Without proper training and education plans, curious GIS users and incompetent system operators may become the major source of security threats. The awareness of agent security is the key in preventing the misuse of software agents and in securing mission-critical information and resources. The next section will introduce the security approaches adopted by this research by using sand-boxing model and the design of agent container. 4.3.2.4 The Design of Agent Container The design of agent containers has two goals for software agent model. The first goal is to provide a runtime environment for mobile agents and the second goal is to enhance the security of software agents by implementing the sand-boxing model within the agent containers. The design of agent containers will provide a runtime environment for mobile agents to perform their operations and interact with local data objects and GIS components (Figure 4-24). Without agent containers, software agent model will have difficulty to move mobile agents from one location to another due to the security consideration and the heterogeneous computer platforms. Agent containers will provide an interoperable runtime environment for mobile agents operating among heterogeneous platforms without modifying their original coding. 
Agent containers will also provide a better security model for GIServices. Three connection channels, agent-to-database, agent-to-programs, and agent-to-agent connections, will be established in the agent containers (Figure 4-24).

Figure 4-24. The design of agent containers. [Diagram: agent containers on Machine-A and Machine-B hosting Agent #1, Agent #2, and Agent #3, with connections to a database and to programs.]

Currently, a few software platforms are available for the implementation of agent containers, such as Java Virtual Machines developed by Sun Microsystems, ActiveX/COM container developed by Microsoft, and the XML with Web browser developed by W3C. Due to limited time and effort, this research will only introduce the concept of agent containers and does not focus on their detailed design.

Another important issue in the implementation of agent containers is the design of an agent communication language (ACL) and an agent communication protocol (ACP). Currently, many research projects focus on the development of ACP and ACL, including the Knowledge Query and Manipulation Language (KQML) project (Finin and Weber, 1993; Finin, Labrou, & Mayfield, 1997), agent communication language (ACL) specifications by FIPA (FIPA, 1998), and the Internet Inter-ORB protocol (IIOP) for CORBA’s Mobile Agent Facility Specification (OMG, 1998). The final chapter will provide detailed descriptions and suggestions for the actual implementation of agent containers.

To sum up, the design of agents in this dissertation introduces three major concepts for the deployment of a distributed GIService architecture. First, the agent-based mechanism will help users deal with heterogeneous distributed databases and GIS components in a dynamic network environment. The adoption of agent-based communications will prevent the information overload problem for users and software designers.
These agents will become the interface between users and client/server machines and bridge the gaps among client/server machines, GIS components and geodata objects. Second, the design of software agents emphasizes that different tasks should be assigned to different types of agents. Different from the traditional concepts of AI, each agent in this design only focuses on its specific task and limited responsibility. Individual agents will need to collaborate with other types of agents in order to complete a GIS task. The agent-based mechanism can provide a flexible and scalable architecture for distributed GIServices. Third, the design of stationary agents and mobile agents can utilize the dynamic feature of current network technology and provide a network-centered solution for agent communications. The deployment of stationary agents and mobile agents will provide a more efficient communication mechanism and processing capability for the distributed GIService architecture. The next section will introduce an integrated architecture for distributed GIServices by using LEGO-like GIS components, an object-oriented metadata scheme, and software agents.

4.4 An Integrated Architecture for Distributed GIServices: GIS Nodes

In order to provide truly dynamic and distributed GIServices, the implementation design has to abandon the traditional concepts of the client/server model. This dissertation proposes an integrated architecture for distributed GIServices, called GIS nodes. GIS nodes are the minimum processing unit under a truly distributed GIServices architecture.

Figure 4-25. A GIS node under a distributed GIServices framework. [Diagram: a GIService workstation (a GIS node) containing hardware profiles (CPU, OS, CRT, printer, scanner), a GIS component container, a geodata object container, and an agent container holding the machine agent, component agent, and geodata agent; M denotes metadata.]

A GIS node can be implemented on different computers with different operating systems and hardware.
In order to collaborate with each other, a GIS node should have network access capability and four main elements: GIS component container, Geodata object container, Agent containers, and Hardware profiles (Figure 4-25). All GIServices and GIS tasks can be accomplished through the collaboration between GIS nodes using these elements. The following paragraphs will explain the major functions of the four elements inside a GIS node. Hardware profiles store all hardware and operating system information on the GIS nodes. This information is automatically retrieved from the operating system and updated whenever the change in hardware and peripherals happens. The hardware profiles also include the network performance of local networking and node-to-node network connectivity in real time. The information in the hardware profiles is interpreted by the machine agents inside GIS nodes. Machine agents use hardware profiles as one of the criteria for the deployment of a dynamic GIService architecture. GIS component containers are used to store GIS components and facilitate the connections between GIS component and component agents. Currently, several commercial software packages perform similar functions, such as the ActiveX container and the Java Virtual machine. In general, GIS component containers combine different GIS components either locally or remotely by using the plug-and-play and remote invocation mechanism. Component agents use the GIS component containers as the storage devices for downloading distributed GIS components via the network. Also, GIS component containers provide universal, virtual environments for GIS components to be executed on different types of GIS platforms. Geodata object containers store geodata objects and associate them with different types of database engines, such as Oracle, Informix, and Access. Geodata objects will be stored in and retrieved from geodata object containers based on the requests of geodata agents. 
Geodata agents will use geodata object containers as the storage devices to migrate geodata objects from one node to another.

Agent containers are used for storing different types of agents and transferring a mobile agent from one GIS node to another. The actual design of agent containers is already illustrated in the previous section. Agent containers facilitate the communications between agents and other elements of GIS nodes, including hardware profiles, GIS component containers, and geodata object containers. Three types of agents (machine agents, component agents and geodata agents) are stored in the agent containers and communicate with each other.

With the four elements, each GIS node becomes an independent GIS-processing unit and is able to collaborate with other nodes to perform a complete GIS task. GIS nodes broaden the capability of GIS from an isolated system into a group-based, collaborative GIS network. Figure 4-26 illustrates three possible types of collaboration among GIS nodes, including local area networks (LAN), the Intranet, and the Internet. The interaction of GIS nodes in the local area network can benefit data sharing and the integration of GIS applications inside a secure local network environment, such as an office building or a geography department. The collaboration between GIS nodes can also be extended to the Intranet level to share data and GIS components within a company or a university. The third type of GIS node network will use the Internet to share and distribute GIS data objects and components on a global scale. Figure 4-26 illustrates that a GIS task can be distributed from a GIS node in the Geography Department at the University of Colorado to other universities and organizations, including San Diego State University (SDSU), a software vendor (ESRI), and the U.S. Geological Survey (USGS).
With the collaboration of the Internet-based GIS nodes, advanced GIS users can launch more complicated and large-scope GIS research and analysis, such as the problem of global environmental changes. With the scalable architecture and independent processing units, GIS users can accomplish their GIS tasks more efficiently and effectively.

Figure 4-26. The collaborations of GIS nodes in three network levels. [Diagram: a GIS task distributed among GIS nodes in a Local Area Network (Geography Department), an Intranet (University of Colorado), and the Internet (GIS nodes at USGS, SDSU, and ESRI).]

To summarize, this section introduces the design of GIS nodes as an independent GIS processing unit. Each GIS task is completed in a distributed environment by collaborations between GIS nodes on the network. The design of GIS nodes emphasizes the scalability of distributed GIService architecture and the dynamic integration of GIS components and geodata objects. The design of GIS nodes provides a comprehensive framework for storing and migrating LEGO-like GIS components and geodata objects, and different types of GIService agents. The next section will use a simple GIS task to demonstrate the detailed procedures and work flows under a dynamic, distributed GIService architecture.

4.5 A Walk-through Example for a Dynamic GIService Architecture

In order to demonstrate the capability of dynamic distributed GIServices architecture, the following scenario will illustrate the conceptual procedures of a GIS task (map display) and the interactions among geodata objects, GIS components, agents, and GIS nodes.

4.5.1 Scenario Description

• GIS user: Mike; GIS nodes: node#A (Mike’s machine), node#B, node#C, node#D
• Network type: Local Area Network (LAN)
• GIS Task: Mike wants to display a Colorado road map on his workstation (node #A).

4.5.2 GIS Operation Procedures

1.
Mike sends his GIS task: [Display: Colorado Roads] to GIS node #A.

2. The machine agent in node #A forms criteria based on Mike’s GIS task as follows:
   • User-side Node: GIS node #A
   • Required GIS components: [Map Display]
   • Required geodata objects: [Colorado Roads]
   • GIS-node #A hardware profile: Pentium II 400 MHz CPU, 128 MB RAM, local B/W Laser Printer. Network type: 100 Mb/sec Fast Ethernet.
   • [Node-to-Node] network performance*: A-to-B: 32 Mb/sec, A-to-C: 20 Mb/sec, B-to-C: 5 Mb/sec, B-to-A: 10 Mb/sec, C-to-A: 10 Mb/sec, C-to-B: 7 Mb/sec.
   * (Note: The upload and download bandwidth between two GIS-nodes could be different due to the different types of networking technologies.)

3. The node #A machine agent asks the node #A component agent and geodata agent if the local machine has a [Map Display] GIS-component and a [Colorado Roads] geodata object. If the client machine has the required component and data object, the client machine agent will dynamically link the [Map Display] component with the [Colorado Roads] geodata object, then perform the GIS task.

4. If the client machine does not have the [Map Display] component and [Colorado Roads] data object, the client-side component agent and geodata agent (on node#A) will send requests to other agents on node#B, node#C, and node#D, and search for the specific GIS components and geodata objects (Figure 4-27).

Figure 4-27. Searching for requested geodata objects and GIS components. [Diagram: the user’s tasks pass to the machine agent on node#A; the client geodata agent (node#A) searches for the required geodata objects through the geodata agents on node#B, node#C, and node#D, and the client component agent (node#A) searches for the required GIS components through the component agents on the other nodes.]

5. One server-side component agent finds the requested component [Map Display] in its machine (Node#B) and another server-side geodata agent finds the requested geodata [Colorado Roads] in its machine (Node#C).

6.
In order to decide whether the GIS task needs to download the [Map Display] component and the [Colorado Roads] object to the client machine (node#A), three machine agents (node#A, node#B, and node#C) initiate communication based on their hardware profiles, network performance, and the GIS task (Figure 4-28).

Figure 4-28. The decision-making of the relocation for GIS components and data objects. [Diagram: the client (node#A), node#B, and node#C machine agents, each with its hardware profile, exchange messages about network performance. Messages passed between machine agents will decide the re-locations of GIS components and geodata objects during processing.]

7. The three machine agents then reach an agreement and decide to re-locate both [Colorado Roads] and [Map Display] to the client machine (node#A). The detailed decision-making processes will be explained later. The next steps are illustrated in Scenario-A. (Scenario-B illustrates the case in which these agents decide to re-locate only the [Map Display] component and connect it remotely with the [Colorado Roads] data object.)

Scenario-A

1. The client machine agent (node #A) tells the client-side component agents and geodata agents to download the display component from node#B and the road data object from node#C.

2. The GIS node#A component agent moves itself to node#B, then collaborates with the node#B component agent in order to download the [Map Display] component (Figure 4-29).

Figure 4-29. The dynamic download of [Map Display] from GIS node#B. [Diagram: the client machine (GIS node#A) and GIS node#B, each with a hardware profile, GIS components, geodata objects, and an agent container holding machine, component, and geodata agents; the client component agent has moved from node#A into the node#B agent container, and [Map Display] appears on both nodes.]

3.
The node#B component agent retrieves the metadata of the [Map Display] component as follows and sends the information to the node#A component agent:
   • System Integration Metadata:
     Component type: JavaBean
     Required environment: Java Virtual Machine 2.0
   • GIS-operation requirement metadata:
     Required items: Coordinates, map projections
     Optional items: Map units, map symbol, data accuracy

4. The node#A component agent brings the [Map Display] component back to the client machine, integrates it with the Java Virtual Machine on GIS node#A, and sends the GIS-operation requirement to the node#A geodata agent.

5. The node#A geodata agent moves to node#C and uses the GIS-operation requirement information to verify the metadata contents of [Colorado Roads] with the help of the geodata agent in node#C. The node#C geodata agent retrieves the metadata of [Colorado Roads] as follows:
   • GIS operation metadata:
     Coordinates: N: -102.32, S: -103.12, E: 92, W: 94.
     Map projection: Robinson
     Map units: meters
     Symbols: 1: red, 2: brown, 3: blue.
   • Data connectivity metadata:
     Data format: ARC/INFO Coverage
     Database: Oracle 8.0; Remote Access: SQL; Available migration type: COPY

6. The node#A geodata agent confirms that [Colorado Roads] satisfies the GIS-operation requirement for [Map Display]. Then, the node#A geodata agent downloads the [Colorado Roads] object back to GIS node#A and integrates it with local geodata objects (Figure 4-30).

Figure 4-30. The dynamic download of [Colorado Roads] from GIS node#C. [Diagram: the client machine (GIS node#A) and GIS node#C, each with a hardware profile, GIS components, geodata objects, and an agent container holding machine, component, and geodata agents; the client geodata agent has moved from node#A into the node#C agent container, and [Colorado Roads] appears on both nodes.]

7.
When the re-arrangement is complete, the client-side machine agent will receive the messages from both the component agent and the geodata agent; the client machine agent will then link the GIS components and geodata object and perform the GIS task.

Scenario-B

1. If the three machine agents decide to re-locate [Map Display] to node#A only and leave [Colorado Roads] on node#C, the client machine agent will tell the client-side component agents to download the [Map Display] component. The client machine agent will also tell the client-side geodata agent to build a remote database connection with node#C based on the data connectivity metadata of [Colorado Roads].

2. The client-side component agent downloads the [Map Display] component into the client machine and integrates it with the local Java Virtual Machine.

3. The client-side geodata agent interacts with the node#C geodata agent in order to build a remote data connection with [Colorado Roads].

4. When the component re-arrangement and remote data connection are done, the client machine agent will perform the GIS task.

The real world scenario for distributed GIService may be more complicated than the example described above. The reason for choosing such a simple task (map display) is to illustrate the basic operations in a distributed GIService architecture, including the relocation of geodata objects and components, searching for required geospatial information, and the collaborations between agents. Chapter Five will provide more realistic scenarios in a distributed network environment.

4.5.3 The Algorithm for the Location-allocation Decision Making

In the previous scenario, the machine agents in three GIS-nodes, #A, #B, and #C, made a collaborative decision on the location-allocation of the [Map Display] component and the [Colorado Roads] object. The following paragraphs will explain the detailed decision-making processes and a possible algorithm for relocation.
The reason for explaining the detailed algorithm is to identify the knowledge rules used in that scenario and to demonstrate a possible approach for agent collaborations based on their embedded rules.

Goal: select the optimal location for [Map Display] and [Colorado Roads] in GIS node#A, node#B or node#C.

Step 1. Collect the machine hardware and network profiles from the participating GIS nodes
   • Node#A profile: 400 MHz CPU, 128 MB RAM, 100BT Ethernet.
   • Node#B profile: 200 MHz CPU, 32 MB RAM, 10BT Ethernet.
   • Node#C profile: 500 MHz CPU, 128 MB RAM, 10BT Ethernet.
   • Network performance: A-to-B: 32 Mb/sec, B-to-A: 10 Mb/sec, B-to-C: 5 Mb/sec, C-to-B: 7 Mb/sec, C-to-A: 10 Mb/sec, A-to-C: 20 Mb/sec.
   • Required GIS component: [Map Display, size: 1 Mb] on node#B
   • Required geodata object: [Colorado Roads, size: 3 Mb] on node#C
   • Result destination: node#A

Step 2. Calculate the effort values for different movements and access events

Relocation Effort = ObjectSize / Bandwidth (FromNode, ToNode)
   • The effort to relocate [Map Display] from node#B to node#A = 1 (Mb) / 10 = 0.1
   • The effort to relocate [Map Display] from node#B to node#B = 0
   • The effort to relocate [Map Display] from node#B to node#C = 1 / 5 = 0.2
   • The effort to relocate [Colorado Roads] from node#C to node#A = 3 / 10 = 0.3
   • The effort to relocate [Colorado Roads] from node#C to node#B = 3 / 7 = 0.43
   • The effort to relocate [Colorado Roads] from node#C to node#C = 0

Connection Effort = TotalObjectSize / BandwidthMean
   • The efforts to remotely connect [Map Display] and [Colorado Roads]:
     between node#A and node#B = (1 + 3) / ((32 + 10) / 2) = 4 / 21 = 0.19
     between node#A and node#C = (1 + 3) / ((10 + 20) / 2) = 4 / 15 = 0.27
     between node#B and node#C = (1 + 3) / ((5 + 7) / 2) = 4 / 6 = 0.67

Step 3.
Generate the total effort values for different scenarios*

Total Effort = Relocation Effort + Connection Effort
   • Scenario-1: relocate both [Map Display] and [Colorado Roads] to GIS node#A = (0.1 + 0.3) + 0 = 0.4
   • Scenario-2: download [Map Display] into node#A and access [Colorado Roads] remotely on node#C = 0.1 + 0.27 = 0.37
   • Scenario-3: download [Map Display] into node#A, move [Colorado Roads] to node#B, and access [Colorado Roads] remotely on node#B = (0.1 + 0.43) + 0.19 = 0.72
   * (Note: Since the GIS user is located on node#A, the [Map Display] component has to be downloaded into the user-side machine. The three representative scenarios listed above fulfill this requirement.)

Step 4. Select the scenario with minimum total effort, then re-arrange GIS components and geodata objects:

Scenario-2 has the minimum total effort value (0.37). This scenario downloads the [Map Display] component to node#A, then remotely accesses [Colorado Roads] at node#C, as described in section 4.5.2, Scenario-B.

To summarize, the network-centered design of GIS nodes aims to provide flexible, dynamic GIServices in distributed network environments. The map display scenario introduced in this section illustrates the mechanism for the dynamic distributed GIService architecture, including the network searching for geodata objects and GIS components, the dynamic download of GIS components, and the collaborative decision making for the location-allocation of GIS components and geodata objects. This section also introduces a possible algorithm for the machine agents to make a collaborative decision on location-allocation problems. Other GIS professionals may develop their own algorithms for the decision-making process in the future. The main purpose of this scenario is to illustrate the detailed mechanisms of the distributed GIService architecture proposed in this dissertation and to demonstrate the scalability and flexibility of the architecture.
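The effort calculation in Steps 1 to 4 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not an implementation used in this research: the names BANDWIDTH, SIZE, HOME, relocation_effort, connection_effort, total_effort, and best_data_location are hypothetical, the bandwidth and size values are copied from Step 1, and the [Map Display] component is assumed to be always downloaded to the user-side node, as stated in the note to Step 3.

```python
# Hypothetical sketch of the location-allocation effort calculation.
# Bandwidth (Mb/sec) keyed by (from_node, to_node), values from Step 1.
BANDWIDTH = {
    ("A", "B"): 32, ("B", "A"): 10,
    ("B", "C"): 5,  ("C", "B"): 7,
    ("C", "A"): 10, ("A", "C"): 20,
}
# Object sizes and the nodes where the objects were found (Step 1).
SIZE = {"Map Display": 1, "Colorado Roads": 3}
HOME = {"Map Display": "B", "Colorado Roads": "C"}


def relocation_effort(obj, to_node):
    """Relocation Effort = ObjectSize / Bandwidth(FromNode, ToNode)."""
    from_node = HOME[obj]
    if from_node == to_node:
        return 0.0  # the object stays where it is
    return SIZE[obj] / BANDWIDTH[(from_node, to_node)]


def connection_effort(node_x, node_y):
    """Connection Effort = TotalObjectSize / BandwidthMean."""
    if node_x == node_y:
        return 0.0  # both objects on one node, no remote connection
    mean = (BANDWIDTH[(node_x, node_y)] + BANDWIDTH[(node_y, node_x)]) / 2
    return sum(SIZE.values()) / mean


def total_effort(component_node, data_node):
    """Total Effort = Relocation Effort + Connection Effort."""
    return (relocation_effort("Map Display", component_node)
            + relocation_effort("Colorado Roads", data_node)
            + connection_effort(component_node, data_node))


def best_data_location(user_node="A", candidates=("A", "B", "C")):
    """Pick the node for [Colorado Roads] that minimizes total effort,
    with [Map Display] fixed on the user-side node."""
    efforts = {n: total_effort(user_node, n) for n in candidates}
    return min(efforts, key=efforts.get), efforts


if __name__ == "__main__":
    best, efforts = best_data_location()
    for node, value in efforts.items():
        print(f"[Colorado Roads] on node#{node}: total effort = {value:.2f}")
    print(f"Selected location: node#{best}")
```

Run with the values above, the sketch evaluates the same three scenarios as Step 3 (data object on node#A, node#C, or node#B) and selects node#C for [Colorado Roads], which corresponds to Scenario-2. Because the sketch does not round intermediate values, its totals agree with the hand calculation only after rounding to two decimal places.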
4.6 Chapter Summary

This chapter introduces three essential concepts for the deployment of distributed GIService architecture:
1. A dynamic construction approach for distributed GIServices with object migration and remote invocation,
2. An object-oriented, operational metadata scheme for creating self-describing, self-managing GIS components and geodata objects, and
3. An agent-based communication mechanism for bridging the heterogeneous distributed network environments.

This chapter also illustrates an integrated framework (GIS nodes) for distributed GIServices in which software agents, GIS components, and databases collaborate. The UML diagrams in this chapter illustrate a specification, which can be converted into a computer-based model easily. In general, the design of such a dynamic, distributed GIService architecture focuses on the balanced functionality of clients and servers, the integration scheme for GIS components and geodata objects, and the communication mechanism among heterogeneous networks. The distributed GIService architecture proposed in this dissertation is an integrated approach for dynamic, customizable, intelligent, and self-managing GIServices.

The next chapter will illustrate the advantages of a dynamic, distributed GIService architecture by introducing software examples. Three different scenarios will be given to compare traditional GISystems solutions with dynamic GIServices solutions.

CHAPTER 5. SOFTWARE EXAMPLES AND USER SCENARIOS

In this chapter, two software examples and three representative user scenarios will be used to demonstrate the advantages of the dynamic GIService architecture. The two software examples will focus on their flexibility and modularized approaches by comparing their software frameworks and implementation procedures to traditional monolithic software packages.
Three user scenarios are used to illustrate the differences between traditional GISystems and distributed GIServices with the detailed deployment of a dynamic architecture.

The first software example is the plug-in modules for World Wide Web browsers. Web plug-ins are software applications which work with Web browsers to access specific data formats or perform unique multimedia functions that the browsers alone cannot handle. The Web plug-in example demonstrates the advantages of the dynamic software architecture and a feasible approach for automatic software download and distribution. The second software example is the OpenGIS Web Map Server (WMS) Implementation Interface Specifications (version 1.0) (OGC, 2000). The WMS specifications provide guidelines for the current development of Web map servers and standardize the HTTP contents and the Uniform Resource Locator (URL) communication syntax. The WMS specifications also indicate the major tasks and functions of Web map servers. The framework proposed in the WMS specifications can be applied in the development of distributed GIServices architecture.

The three user scenarios focus on the actual implementation of distributed GIServices. The first scenario emphasizes on-line mapping services and illustrates the advantages of data sharing and remote database access. This case also demonstrates the potential for integrating other information services (hotel reservation) and intelligent information search. The second scenario focuses on the issue of distributed GIS procedures and automatic data conversion. The adoption of distributed GIS components can facilitate the efficient use of computing resources. The third scenario illustrates one solution for cross-platform applications and real-time data update and download. The design of the GIS nodes can extend to different types of computer platforms, such as palm-size PCs and Personal Digital Assistant (PDA) devices.
There are three major goals in this chapter. First, the software examples demonstrate the advantages of the modularized software architecture by comparing it to the traditional software model and monolithic GISystems. Second, the three user scenarios identify the requirements of operational metadata for different GIS components and data objects. Third, the communication mechanisms adopted in these scenarios justify the behaviors and responsibilities of agents for integrating distributed GIServices.

5.1 Software Examples

5.1.1 The Plug-ins for Web Browsers

Web plug-in modules are software applications which can access specific data formats or perform unique multimedia functions that the Web browsers alone cannot handle. The earliest example of the plug-in software model is the Helper program in Macintosh machines developed by Apple Computer, Inc. The development of Web-based plug-ins was first introduced by Netscape with its Web browser, Netscape Navigator (this browser was renamed Communicator in a later version). Later on, another Web browser, Internet Explorer developed by Microsoft, adopted a similar plug-in model in its ActiveX control framework.

Since the early development of Web browser technologies mainly focused on the display of hypertext and limited image formats, the capability of Web browsers did not include multimedia files or 3D representations, such as Apple Computer’s QuickTime movies or Virtual Reality Modeling Language (VRML) display. Therefore, many different Web plug-ins, such as VRML viewers and QuickTime movie players, were developed to expand the limited functions of Web browsers. The Web plug-in software model introduces a flexible software architecture and allows Web users to access new types of data or multimedia files on the Internet.

Figure 5-1. The MIME configuration on a Web server.
Traditionally, the display of multimedia files in a Web document is defined in a Multipurpose Internet Mail Extensions (MIME) table (Figure 5-1). MIME was originally proposed as a way to attach non-textual information to e-mail messages. With the growth of the World Wide Web, MIME tables became the standard way to register multimedia files and other data formats together with their associated software applications. The MIME table tells Web browsers which application to use to open each type of multimedia file. For example, if a file extension is AVI, which indicates a video format, the Web browser will launch Microsoft Media Player to play the video file. The design of MIME tables allows Web browsers to access different types of multimedia files and open them with their associated software applications.

However, if a new media format is not specified in a MIME table, the Web browser may not be able to display the new data objects or multimedia files. With Web plug-ins, client-side Web browsers can access new data formats which may not be defined in the original MIME table and display new multimedia presentations automatically. The design of Web plug-ins uses two special HTML tags to indicate new types of data: the <EMBED> tag and the <OBJECT> tag. The <EMBED> tag was introduced first by Netscape and used by its Navigator browsers. The <OBJECT> tag was introduced by Microsoft in Internet Explorer and was then adopted in the HTML 4.0 specification by the W3C. These HTML tags are used to specify the Web plug-in folders and programs that handle new forms of data or multimedia. The Web plug-in framework allows users to handle different types of data, sounds, movies, and pictures. The following section will introduce two examples of HTML statements for accessing new data formats and multimedia files.

The first example uses the <OBJECT> tag. The following HTML code includes a Macromedia Flash movie in a Web document.
    <OBJECT CLASSID="clsid:D34CDB7G-BC6D-11cd-96B8-333553540000"
            WIDTH="100" HEIGHT="100"
            CODEBASE="http://active.macromedia.com/flash5/cabs/swflash.cab#version=5,0,0,0">
      <PARAM NAME="MOVIE" VALUE="moviename.swf">
      <PARAM NAME="PLAY" VALUE="true">
      <PARAM NAME="LOOP" VALUE="true">
      <PARAM NAME="QUALITY" VALUE="high">
    </OBJECT>

In this HTML document, the CLASSID inside the <OBJECT> tag is the identification number of the control that handles the Flash movie document. The WIDTH and HEIGHT attributes define the size at which the Flash movie is displayed on the screen. The CODEBASE attribute tells the browser where to find the Flash Player for automatic download. If the Player is not already installed, Internet Explorer will prompt the user with a dialog asking whether to auto-install the plug-in. The four <PARAM> tags indicate the parameter values for the plug-in player. This <OBJECT> tag example illustrates the advantages of plug-ins and the automatic download procedure. Since there was a high demand for Web plug-in applications, the W3C adopted the <OBJECT> tag in the new HTML 4.0 specification.

“HTML 4 introduces the OBJECT element, which offers an all-purpose solution to generic object inclusion. The OBJECT element allows HTML authors to specify everything required by an object for its presentation by a user agent: source code, initial values, and run-time data.” (W3C, 1999, p. 160).

Besides the <OBJECT> tag, the <EMBED> tag is another popular approach for adding plug-ins to a Web document. The <EMBED> tag allows users to display the output from a plug-in application in an HTML document. Unlike the <OBJECT> tag, the <EMBED> tag requires that the plug-in application already be installed on the local machine. When an HTML document containing the <EMBED> tag is loaded, the plug-in runs automatically. The following is an example of an HTML document with an <EMBED> tag for a MIDI music file.
    <EMBED src="bgsound.mid" width="145" height="60" align="right"></EMBED>

In general, the functions of the <OBJECT> and <EMBED> tags are very similar. The major difference between the two tags is that they were originally designed for different Web browsers. The <OBJECT> tag was designed for Microsoft Internet Explorer and the <EMBED> tag was used for Netscape’s Web browsers. Therefore, the major problem for Web plug-ins is that different platforms and browsers may need different types of plug-ins and tags. For example, a Flash movie plug-in for Netscape Communicator is different from a Flash movie ActiveX control in Microsoft Internet Explorer. One possible solution is to detect the version of the Web browser and the working environment on the client side and then download the right version of the plug-in for the client-side user.

Recently, many Web documents have used JavaScript, a scripting language developed by Netscape, to simplify the plug-in download process and to provide automatic plug-in installation services. By embedding JavaScript in a Web document, a Web server can detect which version of Web browser is running on the client, and with that information the server can provide the Web plug-ins for automatic download. The client browser information for automatic download procedures can also be acquired from the User-Agent information recorded in the HTTP 1.1 Web logs. For example, Internet Explorer 5.5 running on a Windows 2000 Professional machine will send the following User-Agent header string to the server:

    User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)

Therefore, the Web server can use JavaScript to retrieve the client-side information and then provide the appropriate version of the plug-in software automatically. The following JavaScript example is from the ArcIMS Java Viewer installation, which can automatically detect the browser and software configuration on the client machine.
The JavaScript retrieves the client browser’s information and then redirects the HTML document to “ie.htm” (Internet Explorer) or “netscape.htm” (Netscape browser).

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
    <html>
    <head>
    <title>ARCIMS 3.0 Viewer - Checking Browser</title>
    </head>
    <body>
    <SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript">
    // navigator.appName contains "Explorer" only in Internet Explorer.
    var browser = navigator.appName;
    if (browser.indexOf("Explorer") == -1) {
      document.location = "netscape.htm";
    } else {
      document.location = "ie.htm";
    }
    </SCRIPT>
    </body>
    </html>

Figure 5-2. The ArcIMS Java Viewer installation.

The ArcIMS Java Viewer example (Figure 5-2) illustrates the advantages of the Web plug-in architecture with JavaScript auto-detection. By retrieving client-side information, such as the browser version and the operating system, the Web server can distribute modularized plug-in software automatically. In fact, the software architecture of Web plug-ins with auto-download functions is very similar to the distributed GIServices architecture proposed in Chapter Four. Both software frameworks provide dynamic software integration and information services. However, the actual application of distributed GIServices will involve not only software download but also software upload (from clients to servers) and data relocation (both data download and upload).

The examples of Web plug-ins demonstrate the advantages of the dynamic software architecture and the feasibility of automatic download/distribution mechanisms. The automatic software download procedures in the previous example can be applied in the development of software agent behaviors. The next section will introduce the OpenGIS Web Map Server Implementation Interface Specification. This software example will illustrate the modularized software architecture for Internet Map Servers. The Web Map Server software framework in the specifications can be applied in the dynamic GIServices architecture proposed in this research.
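The same detection step can also be performed on the server side from the User-Agent header discussed in this section. The following Python sketch is only an illustration under stated assumptions: the function names, the `unsupported.htm` fallback page, and the simple matching rules are hypothetical, and the rules reflect only the two browser families discussed here (many other browsers also send a string beginning with "Mozilla/").

```python
import re

# Installation pages per browser family; "ie.htm" and "netscape.htm" follow
# the ArcIMS example, while the fallback page name is a hypothetical placeholder.
PLUGIN_PAGES = {"ie": "ie.htm", "netscape": "netscape.htm"}


def detect_browser(user_agent):
    """Classify a User-Agent header value as 'ie', 'netscape', or 'unknown'."""
    # Internet Explorer announces itself with an "MSIE <version>" token.
    if re.search(r"\bMSIE \d", user_agent):
        return "ie"
    # Assumption: any other "Mozilla/" string is treated as a Netscape browser.
    if user_agent.startswith("Mozilla/"):
        return "netscape"
    return "unknown"


def plugin_page(user_agent, default="unsupported.htm"):
    """Return the installation page that matches the client's browser."""
    return PLUGIN_PAGES.get(detect_browser(user_agent), default)


if __name__ == "__main__":
    header = "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
    print(plugin_page(header))
```

A software agent could apply the same pattern to other client properties, such as the operating system token, before choosing which component to distribute.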
5.1.2 The OpenGIS Web Map Server Implementation Interface Specifications

The OpenGIS Web Map Server (WMS) Implementation Interface Specifications are one of the Web Map Server Specifications introduced by the OGC in 2000 through a series of activities of the OpenGIS Web Map Server Testbed (WMT) initiative (OGC, 2000). Many software companies and GIS professionals were involved in the design of the WMS Specifications. The OpenGIS Web Map Server Implementation Interface Specifications provide guidelines for current Web Map Servers by specifying the HTTP contents and the Uniform Resource Locator (URL) communication syntax. The WMS Specifications also lay out the major tasks of Internet Map Servers, which can be applied in the architecture of distributed GIServices.

The major content of the OpenGIS WMS specifications focuses on how to describe a Web Map Server and its map services with standardized URL syntax and semantics. A URL is a short string that identifies a resource on the Web. The format of URL strings indicates the syntax and semantics of formalized information for locating and accessing resources via the Internet. Currently, many Internet-based map servers use URLs to communicate between clients and servers, such as the Xerox Map Viewer and ESRI’s ArcView Internet Map Server (IMS). A URL contains the name of the scheme being used (http, ftp, gopher, etc.), followed by a colon and then a string (://map.sdsu.edu). In most URL schemes, the sequences of characters in different parts of a URL are used to represent sequences of octets used in Internet protocols. For example, in the FTP scheme, the host name, directory name, and file names are such sequences of octets, represented by parts of the URL (Berners-Lee et al., 1994).
The following statements are URL examples:

    ftp://ftp.is.co.za/rfc/rfc1808.txt
    http://www.math.uio.no/faq/compression-faq/part1.htm
    mailto:mduerst@ifi.unizh.ch

The OpenGIS WMS Specifications standardize the syntax and semantic contents of the URLs for Web Map Servers and focus on three major tasks. In general, “a standard web browser can ask a Map Server to do these things just by submitting requests in the form of Uniform Resource Locators (URLs). The content of such URLs depends on which of the three tasks is requested” (OGC, 2000, p.9). The WMS Implementation Interface Specification indicates that a Web Map Server should be able to:
1. Produce a map (as a picture, as a series of graphical elements, or as a packaged set of geographic feature data),
2. Answer basic queries about the content of the map, and
3. Tell other programs what maps it can produce and which of those can be queried further. (OGC, 2000, p.9)

In order to accomplish these three major tasks, the OpenGIS WMS Implementation Interface Specification provides three types of interface in the 1.0 version: GetMap, GetFeature, and GetCapabilities. The following section will introduce the three types of URL interfaces.

1. The Map Request (GetMap) Interfaces. The design of the Map Request Interfaces focuses on the display and production of Web-based map services. “To produce a map, the URL parameters indicate which portion of the Earth is to be mapped, the coordinate system to be used, the type(s) of information to be shown, the desired output format, and perhaps the output size, rendering style, or other parameters” (OGC, 2000, p. 9). The parameters of the Map Request Interfaces include the map layers, picture format, picture size, background color, etc. Table 5-1 illustrates the parameters used in the map request interfaces.

Table 5-1. The Map Request Interfaces.

2. The Feature Request (GetFeature) Interfaces.
The Feature Request Interfaces identify the request mechanisms for map contents and feature attributes. To query the content of the map features, the URL parameters indicate which map (layer) is being queried and which location on the map is of interest (X, Y coordinates). Table 5-2 indicates the elements of the Feature Request Interfaces.

Table 5-2. The Feature Request Interfaces.

3. The Capabilities Request (GetCapabilities) Interfaces. The Capabilities Request Interfaces are used to provide extensive map services, such as catalog services or metadata queries, in addition to the basic map display and attribute query (Table 5-3). For example, to ask a map server about its holdings, URL parameters such as “Database=Colorado+California” can be included in the Capabilities request. However, the current OpenGIS WMS Specifications do not specify the exact contents of the GetCapabilities Interfaces. The WMS Specifications only suggest possible uses of the GetCapabilities Interfaces and leave the detailed design of the interfaces and contents to software vendors with their vendor-specific parameters.

Table 5-3. The Capabilities Request Interfaces.

Besides the specification of the three major Web Map Server tasks, the WMS Specifications also identify four main processing stages in a Web Map Server: Filter Service, Display Element Generator, Render Service, and Display Service (Figure 5-3).

Figure 5-3. The four processing stages in a Web Map Server.

The Filter Service stage is the procedure that creates a connection between a map server and the GIS database (geodata source). Usually, the connection can be established by standard Database Management System (DBMS) communication techniques, such as SQL, ODBC, or OLE DB. The goal of the Filter Service is to retrieve, from a large GIS database, the subset of geographic data items that matches the user's request.
The Display Element Generator (DEG) Service focuses on how to process the information items received from the Filter Service and to convert the information from a database format (coverages, shapefiles, DLG) into well-defined geodata objects (vector or raster) with appropriate symbols and colors associated with each geographic theme.

The Render Service is the stage that converts the graphic display elements (geodata objects associated with symbols and colors) into an actual graphic display format, such as GIF, JPG, PNG, or vector-based graphics.

The Display Service is the final stage, which displays the graphic images or vector-based graphics (generated by the Render Service) on the client-side screen or in the Web browser. Usually, the display service is provided by Web browsers. If the graphic format is unique or vendor-specific, the Web browsers may need to download plug-ins or viewers to display Web maps.

These four stages of Web Map Services (Display, Render, DEG, and Filter Services) may be located on client-side machines or server-side machines, depending on the design of the Web Map Services. The WMS Specifications identify three possible software frameworks for a Web Map Server: Thin Client, Medium Client, and Thick Client (Figure 5-4). The thin client framework locates only the display service on the clients and puts the rest of the services on the servers. The medium client model allows client machines to provide both the display and render services, while the server machines are responsible for the DEG and filter services. The thick client model locates only the filter (database) services on the server machines and puts the rest of the services on the client-side machines.

Figure 5-4. The three types of client models for Web Map Servers.
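The three client models described above differ only in where the four-stage pipeline is split between server and client. The short Python sketch below expresses that split; it is an illustration only, and the names `STAGES`, `CLIENT_STAGE_COUNT`, and `partition` are assumptions introduced here rather than terms from the WMS Specifications.

```python
# The four processing stages, ordered from the data source to the user's screen.
STAGES = ("Filter", "DEG", "Render", "Display")

# How many stages, counted from the Display end, run on the client machine.
CLIENT_STAGE_COUNT = {"thin": 1, "medium": 2, "thick": 3}


def partition(model):
    """Return which stages run on the server and which on the client."""
    split = len(STAGES) - CLIENT_STAGE_COUNT[model]
    return {"server": STAGES[:split], "client": STAGES[split:]}


if __name__ == "__main__":
    for model in ("thin", "medium", "thick"):
        placement = partition(model)
        print(model, "client: server =", placement["server"],
              "| client =", placement["client"])
```

Under this view, moving from one client model to another amounts to shifting a single split point along the pipeline.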
Three Types of Web Map Service Examples

Based on the four stages of Web Map Services, the OpenGIS WMS specifications illustrate three types of Web Map Service examples: the Picture Case (thin client), the Graphic Element Case (medium client), and the Data Case (thick client).

1. The Picture Case. The Picture Case is the thin client framework, within which the client-side machines are only responsible for display services (Figure 5-5). The servers retrieve the requested geodata from GIS databases, generate the image files, and send the files to the client’s Web browser for display. Most early examples of Web Map Servers are Picture Cases, such as the Xerox Map Viewer, GrassLinks, and ArcView/MapObjects IMS.

The advantage of the Picture Case is that, since current Web browsers already support the display of image files such as GIF or JPEG pictures, the map services can be offered in HTML documents to a regular Web browser without specialized plug-ins or viewers. Map users do not need to download and install software extensions in addition to the regular Web browser. The implementation of the Web Map Server is easier than in the other cases in terms of installation procedures and server-side programming (such as CGI or Java servlets). However, map display functions, map symbols, and user interactions are very limited in the Picture Case.

Figure 5-5. The Picture Case. (OGC, 2000, p. 15)

2. The Graphic Element Case. The Graphic Element Case is the medium client model, in which the client-side machines provide both display and render services (Figure 5-6). The servers process the geodata from the GIS databases and generate well-defined geodata objects with associated symbols and colors. Currently, Autodesk’s MapGuide is one example of the Graphic Element Case.
The advantage of the Graphic Element Case is that the combination of render and display services allows more interactive user manipulation of map features, such as vector-based highlights/selections and dynamic graphic display elements. In the Graphic Element Case, map users can create a new graphic element on the client side and send it back to the server for updates or new data ingest (such as the Map Notes function in MapGuide). The response time and display performance are faster and better than in the Picture Case, especially for zoom-in and zoom-out display functions. However, map users have to download specialized Web plug-ins, ActiveX controls, or Java applets in addition to the regular Web browser in order to see the graphic elements. The implementation of the Web Map Server is more difficult than in the Picture Case because the Graphic Element Case needs to modify the functions of the HTTP server and to add middleware on the server, such as a Java Servlet Engine or CGI, for the communication between Web servers, GIS databases, and client-side viewers.

Figure 5-6. The Graphic Element Case. (OGC, 2000, p. 15)

3. The Data (Feature) Case. The Data Case is the thick client architecture, where the client-side machines perform the display, render, and DEG services (Figure 5-7). The servers are only responsible for the communication between GIS databases and the client-side map viewers. The communication between the client-side map viewers and servers may use XML or GML to specify the geodata elements and map display properties. All the map tasks, such as projections and symbols, are performed locally in a viewer. Currently, ESRI’s ArcIMS feature service is one example of the Data Case.

The advantage of the Data Case is that it allows users the most freedom in manipulating geographic data items. Users can change the symbols and colors of map features locally without sending requests to the servers.
Also, users can display Web-based map features together with data layers stored on the hard drives of their local machines. Since the client viewer already has all the display capabilities, map users may use the client viewer to perform basic GIS operations, such as buffering and overlay operations. However, map users may need to pay a client-side software license fee in the Data Case because such powerful client-side map browsers can be used as regular GIS software packages.

Figure 5-7. The Data Case. (OGC, 2000, p. 15)

In general, the three case examples have their own advantages and disadvantages. Currently, the Picture Case is the most popular framework adopted by the GIS industry. However, the Picture Case only provides limited map display functions and fewer user interactions. Along with the progress of Web mapping facilities and information technologies, the Data Case and the Graphic Element Case may become more popular than the Picture Case in the future. The WMS Implementation Interface Specifications (version 1.0) only focus on the Picture Case (thin clients) with the standardization of URL syntax and semantic contents. The next version of WMS may focus on the Graphic Element Case or the Data Case with XML specifications or GML applications.

The three cases in the WMS Specifications demonstrate that different types of GIServices may need to adopt different types of software architecture. However, the software models proposed in the OpenGIS WMS Specifications do not provide an approach for dynamically changing the architecture of Web map services. For example, the software framework in the Picture Case cannot be upgraded to the Graphic Element Case or the Data Case if client-side map users ask for a higher level of map services or want to change their map applications. The ad hoc WMS Specifications do not provide a flexible mechanism for migrating a software framework from one case to another.
Figure 5-8. The dynamic architecture for Web Map Services. (Two panels: Scenario-A, thin client, and Scenario-B, medium client. Each panel shows a client-side and a server-side GIS component container holding the Display, Render, DEG, and Filter service elements; in Scenario-B the Render element is downloaded to the client.)

One possible solution for providing an upgrade-able software framework for Web map servers is to adopt the dynamic GIService architecture proposed in this dissertation. By adopting the dynamic framework proposed in Chapter Four, the WMS software framework can be easily upgraded from the Picture Case to the Graphic Element Case or the Data Case by relocating the map service elements. Figure 5-8 illustrates such a dynamic architecture for Web Map Services, where each service element can be freely moved or relocated among client-side and server-side machines. This dynamic architecture provides a flexible software architecture for Web Map Services.

Figure 5-8 illustrates that different map users can access the same server, which provides map services in either the Picture Case (scenario-A) or the Graphic Element Case (scenario-B). For example, scenario-A could be that a map user wants to display road maps of Boulder, Colorado, and the client machine only requires display services (the Picture Case is the best choice). Scenario-B could be that a map user wants to find the ten cities in the U.S. with the highest population growth rates. This scenario may require advanced map query capabilities and more flexible map display functions. Thus, the client machine could dynamically download a Render service element from the server (the Graphic Element Case). By introducing the GIS component container and the dynamic GIService architecture, map users can download different types of map service components from servers to clients, or vice versa, based on their needs.
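The upgrade from scenario-A to scenario-B can be pictured as copying one service element between two component containers. The Python sketch below is a hypothetical illustration: the `ComponentContainer` class, the `download` function, and the choice to copy rather than move the element (so that the same server can keep serving thin clients) are assumptions made here, not part of the WMS Specifications.

```python
class ComponentContainer:
    """A GIS component container holding map service elements on one machine."""

    def __init__(self, name, elements=()):
        self.name = name
        self.elements = set(elements)


def download(element, source, target):
    """Copy a service element from the source container to the target container.

    Assumption: the source keeps its own copy after the download.
    """
    if element not in source.elements:
        raise LookupError(f"{element} is not available in {source.name}")
    target.elements.add(element)


def client_case(client):
    """Name the WMS case that matches the elements held by the client."""
    cases = {
        frozenset({"Display"}): "Picture Case (thin client)",
        frozenset({"Display", "Render"}): "Graphic Element Case (medium client)",
        frozenset({"Display", "Render", "DEG"}): "Data Case (thick client)",
    }
    return cases.get(frozenset(client.elements), "unclassified")


if __name__ == "__main__":
    server = ComponentContainer("server", {"Render", "DEG", "Filter"})
    client = ComponentContainer("client", {"Display"})
    print(client_case(client))           # scenario-A
    download("Render", server, client)   # upgrade on demand
    print(client_case(client))           # scenario-B
```

In this sketch the architecture of a session is simply the current contents of the two containers, so it can change while the service is in use.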
The dynamic change of the architecture will provide more flexible, upgrade-able, and user-oriented Web map services for users.

The next section will introduce three user scenarios and compare the solutions offered by traditional GISystems, the OpenGIS WMS framework, and the distributed GIServices framework. The three user scenarios will be used to identify the requirements of GIS components, geodata objects, and their metadata schemes. Also, the comparisons between traditional GISystems, OpenGIS Specifications, and distributed GIServices solutions in these scenarios will justify the design of the dynamic architecture and the advantages of distributed GIServices.

5.2 Scenario One: Travel Plan (On-line Mapping)

5.2.1 Scenario Description

A GIS user, Mike, plans a trip from Boulder, Colorado to Utah’s Arches National Park. He needs to acquire map information and make a hotel reservation for two nights in the city of Moab, Utah. Based on the scenario, Mike will need the following geographic information:
1. Colorado/Utah state highway road map;
2. The Arches National Park trails map;
3. Moab city road map; and
4. The hotel/motel locations in Moab, Utah.

Justification: This scenario demonstrates four advantages of distributed GIServices: on-line, multi-layer mapping; data integration; on-line transactions; and smart information searching.

GIS tasks: The scenario performs four major GIS tasks (Figure 5-9):
1. Display maps
2. Create a shortest route
3. Print out maps
4. Make a hotel reservation

In this scenario, there are three major human actors. Mike, a GIS user, wants to accomplish the trip plan. Tina, a map designer, designs appropriate symbols and layouts for several national park maps, including the Arches National Park trails, and puts them on a GIS server for on-line access. In fact, there should be other map designers for highway roads, city roads, and hotel locations.
Since the map designers’ activities are similar, this scenario will only identify Tina as a representative map designer. Kevin, the hotel reservation manager, will provide the hotel reservation information (price, room availability, etc.) and accept the reservation from Mike. Besides the three human actors, several software agents will be mentioned later.

Figure 5-9. The travel plan scenario. (Actors: Mike, GIS user; Tina, map designer; Kevin, hotel reservation manager. Tasks: display, routing, print out, hotel reservation.)

5.2.2 Traditional GISystems Solution

Mike purchases a CD-ROM called [Travel Plan U.S.A] and tries to design his trip plan. The CD-ROM includes detailed roads for all 50 states and hotel information. However, the data sets on the CD-ROM do not include trail maps of the Arches National Park. Mike has to purchase another CD-ROM called [U.S. National Parks]. Since the two CD-ROMs use different types of databases and map formats, there is no way for Mike to integrate the trail maps with the road maps. Therefore, Mike prints out two different sets of travel maps: one for highways/hotels/roads and one for national park trails. However, the trail maps use many color-based symbols and the printouts look terrible on Mike’s black-and-white laser printer. Mike decides to use the color laser printer in the main office to print out the trail map again.

The next task for Mike is to make a hotel reservation. Mike uses the CD-ROMs to retrieve ten hotels located in Moab, Utah and their phone numbers. He makes ten phone calls, one to each hotel, to compare room availability and prices. Finally, he makes the reservation. He spends roughly ten hours completing his travel plan.

5.2.3 OpenGIS Solution

This user scenario focuses on four types of geographic information services: map display, network routing, map printout, and on-line hotel reservation.
The OpenGIS Consortium is developing several Implementation Specifications, including the Web Map Server (WMS) (OGC, 2000), Catalog Services (OGC, 1999), and Simple Features (see section 3.2.1 and section 5.1.2 for details). These OpenGIS Specifications can be applied in the user scenario. The following solution adopts the OpenGIS Specifications in the actual implementation plan.

First of all, three types of GIServices may be provided in the framework of the OpenGIS WMS Specifications as follows:
• Map display services can be provided by the design of GetMap interfaces.
• Network routing services can be provided by the design of GetFeature (Query) interfaces.
• Map printout services can be provided by the design of the GetCapabilities interface. Nevertheless, the OpenGIS WMS framework does not indicate the detailed implementation procedure for map printout functions.

In addition, the current OpenGIS WMS Specifications do not specify mechanisms for bridging GIServices with other types of on-line information services, such as on-line hotel reservation services.

In general, there are two advantages in the OpenGIS Specification framework for this user scenario. The first advantage is that the WMS Specifications can provide a software framework for sharing multiple map servers and distributed GIS databases. If all client-server communication interfaces for map display services are standardized, a client map browser can access different map servers at the same time. The second advantage is that the OpenGIS Simple Feature Specifications can be applied to the data structure for Web map services. With the adoption of the OpenGIS Simple Feature Specifications, Web map servers can provide more comprehensive map query and dynamic map display functions for users. Simple Feature Specifications can provide a structured, content-oriented scheme for disseminating geographic information across the Internet.
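To make the first advantage concrete, the following minimal Python sketch shows how a single client could build standardized GetMap requests for several map servers. It assumes the key-value parameter names of the later WMS 1.1.1 specification (SERVICE, VERSION, REQUEST, LAYERS, STYLES, SRS, BBOX, WIDTH, HEIGHT, FORMAT), which may differ from the draft cited above. The endpoint paths, layer names, and bounding box are hypothetical and do not describe any real server.

```python
from urllib.parse import urlencode

# Hypothetical endpoints and layer names, for illustration only.
MAP_SERVERS = {
    "http://www.aaa.com/wms": "us_highways",
    "http://www.national-park.gov/wms": "arches_trails",
    "http://www.moab.ci.us/wms": "moab_city_roads",
}

# Rough, assumed extent covering Boulder, CO to Moab, UT (minx, miny, maxx, maxy).
TRIP_BBOX = (-110.0, 38.0, -105.0, 40.5)


def build_getmap_url(endpoint, layer, bbox, width=600, height=400):
    """Return a GetMap request URL using WMS 1.1.1 style parameters."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",                # empty value asks for the default style
        "SRS": "EPSG:4326",          # geographic longitude/latitude
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return endpoint + "?" + urlencode(params)


def build_all_requests(bbox=TRIP_BBOX):
    """Build one request per server; the same code works for every server."""
    return [build_getmap_url(ep, layer, bbox) for ep, layer in MAP_SERVERS.items()]
```

Because every server accepts the same parameter set, the client needs only one request builder; overlaying the returned images is then a display task on the client side.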
There are three potential problems in adopting the current OpenGIS framework in this case. The first problem is that the OpenGIS WMS Specifications do not indicate any type of software distribution mechanism for downloading or uploading distributed GIS components between clients and servers. Thus, Web map servers under the OpenGIS framework will not be able to migrate from one software architecture to another. The second problem is that information services are limited in the OpenGIS framework, because the OpenGIS Specifications do not indicate any mechanism for integrating other types of on-line information services, such as financial transaction services or on-line hotel reservation services. Finally, the OpenGIS Specification framework does not provide an alternative solution for legacy map servers or unique software packages, which may not adopt the OpenGIS standards or interfaces. Those obsolete map servers and legacy databases will not be able to share geodata or map services with Web map servers that follow the OpenGIS specifications and standards.

On the other hand, the dynamic GIServices solution proposed in this research can adopt software agents to bridge the gap between legacy map servers and OpenGIS map servers. For example, if a unique map server does not follow the OpenGIS standards, GIS programmers can design a software agent to convert the server interfaces from the vendor-specific format to OpenGIS-compatible standards. The following section illustrates in detail the advantages of the distributed GIServices solution.

5.2.4 Distributed GIService Solution

In order to achieve such a scenario in distributed GIServices, four players need to collaborate: Mike, Tina, Kevin, and several software agents. Their actions are described in the following paragraphs.

Mike’s actions: Mike’s first action is to log on to his GIS workstation and open a local GIS component called Map Display.
In the menu, Mike selects the Extended Component: Travel Plan option. The Map Display component loads the Travel Plan component. Mike describes his travel plan to the Travel Plan component with the following information:
Start point: Boulder, Colorado
End point: Moab, Utah
Required maps: 1) highway roads, 2) the Arches National Park trails, 3) Moab city roads, and 4) hotel locations.

One minute later, Mike sees four maps overlaid and displayed in the Map Display window. Mike’s machine accesses the data sets remotely via the Internet and reports a match index for the qualification of each data set (Figure 5-10). He selects the Generate the shortest route function, and the component generates the shortest road path from Boulder, CO to Moab, UT and displays the route on his screen. Then, Mike picks a few hotels in Moab and opens the On-line Reservation Agent function from the Agents menu. He specifies the acceptable prices and the dates of his trip. The hotel reservation agent retrieves the information Mike requests and sends the results back in ten seconds. Mike compares prices and availability. Then, Mike selects one hotel and makes an on-line reservation for his two-day vacation. Twenty seconds later, Mike receives a confirmation message from the on-line hotel reservation system.

Figure 5-10. The travel plan component.*
* (Note: this interface is a mockup window, designed with MS Visual Basic.)

After Mike finishes the hotel reservation, he prints out the travel maps. Mike selects the Map printout function in the Travel Plan component with the black-and-white laser printer. A warning message is sent back to Mike:

The current map display is designed for a color screen display or color printer.
Suggestion (Please enter your choice):
1. Reselect a color printer.
2. Use the pre-defined B&W display format to print out your maps.

Mike decides to print out the color map remotely in the main office, which has a color laser printer.
He goes to the main office and picks up the map printouts. Mike spends 30 minutes finishing his travel plan.

Tina’s actions: Tina updates and maintains the databases of the National Park maps on the www.national-park.gov GIS server. In order to provide smart map display and printout functions, Tina designs appropriate legends and pre-defined symbols for different types of media, including CRT screens, B/W printers, and color printers. The different types of map representation and symbols are encapsulated inside the [National Park trails] geodata object.

Kevin’s actions: Kevin updates hotel information (pricing, room availability, reservation list) by accessing the on-line hotel reservation system. The on-line hotel reservation system can accept the reservation from Mike by collaborating with the [hotel reservation agent].

Agents’ actions and collaborations: The machine agent located on Mike’s workstation collects the first request from Mike’s action as follows:
• Requested GIS component: [Travel Plan].
• Client machine profile: 200 MHz PC, B/W laser printer, 10BaseT Ethernet.

Then, the machine agent asks the component agent for the Travel Plan extension. Mike’s component agent determines that the Travel Plan component is already located on Mike’s machine and integrates it with the Map Display component. Mike’s geodata agent collects Mike’s second request as follows:
• Map extent: from [Boulder, CO] to [Moab, UT]
• Required geodata objects: [U.S. Highways], [the Arches National Park trails], [Moab city roads], and [hotel locations].

Then, the geodata agent broadcasts the request to several GIS nodes on the Internet and identifies archives of possible geodata objects:
• U.S. Highways: www.aaa.com (100% match), www.randmcnally.com (80% match).
• The Arches National Park trails: www.national-park.gov (100% match).
• Moab City roads: www.moab.ci.us (95% match)
• Hotels and motels: www.hotels.com server (90% match)

The geodata agent selects the highway data set on aaa.com instead of the one on randmcnally.com based on the match grade. Then, the five GIS machine agents (Mike’s workstation, www.aaa.com, www.national-park.gov, www.moab.ci.us, and www.hotels.com) negotiate whether these data objects should be downloaded to Mike’s machine. Their final decision (based on the network performance and machine profiles) is to leave these data objects on the servers and build remote database connections to Mike’s machine. With support from the geodata agent, the Travel Plan component can remotely access four data objects located on different servers and display four layers on Mike’s screen.

When Mike requests the price check for several hotels, the hotel reservation agent connects to the on-line hotel reservation system and searches for the hotels that fulfill Mike’s request. The search results are displayed on Mike’s screen ten seconds later. When Mike decides to reserve one hotel and gives his credit card information, the hotel reservation agent confirms his reservation by accessing the on-line hotel reservation system.

When Mike requests the printout function, the machine agent suggests that Mike print out the trail maps on a color printer instead of a B/W printer, based on the metadata descriptions in the [National Park trails] object, which indicate a color requirement. Then, Mike’s office machine accepts the request and prints out the maps on the color laser printer.

5.2.5 The Deployment of the Dynamic GIService Architecture

This scenario is established dynamically on five GIS nodes and one external system (Figure 5-11).
1. Mike.colorado.edu GIS node (Mike’s machine)
2. www.aaa.com GIS node
3. www.national-park.gov GIS node
4. www.moab.ci.us GIS node
5. www.hotels.com GIS node
6. hotel-reservation.com (on-line hotel reservation system)

5.2.5.1 The Arrangement of Distributed GIS Components and Geodata Objects

This case utilizes two distributed GIS components: Map Display and Travel Plan. Both reside on Mike’s GIS node. To accomplish his task, Mike needs remote database access to four geodata objects located on distributed GIS nodes (www.aaa.com, www.national-park.gov, www.moab.ci.us, and www.hotels.com). Their access methods are defined in the metadata contents, which are described in the next section.

[Figure 5-11 diagram: GIS node Mike.colorado.edu (Map Display and Travel Plan components; machine, component, geodata, and hotel reservation agents; color printer) connected to GIS nodes www.aaa.com (U.S. Highways), www.national-park.gov (The Arches N. P.), www.moab.ci.us (Moab city roads), and www.hotels.com (Hotels and motels), and to the on-line hotel reservation system hotel-reservation.com (database: Ramada, Days Inn, Super 8). M marks metadata attached to a data object.]
Figure 5-11. The dynamic architecture of the travel plan scenario.

5.2.5.2 Required Operational Metadata Contents

In order to provide intelligent, self-managing data objects, this case needs to define the contents of the GIS-operation Requirement Metadata for two GIS components, [Map Display] and [Travel Plan], and the Database Connectivity Metadata for four distributed geodata objects, as follows.

The contents of GIS-operation Requirement Metadata for the [Map Display] component should include the map extent and coordinate system for georeferencing and map overlay functions, and the arrangement of pre-defined symbols and legends for different media output formats (CRT screen, color printer, B/W printer, etc.).

The contents of GIS-operation Requirement Metadata for the [Travel Plan] component should include road types and network segmentation for two data objects, [U.S. Highways] and [Moab city roads], in order to generate the shortest route, and a hotel reservation ID for the [hotel] data object in order to access the on-line reservation system.
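The media-specific symbol sets described for the [Map Display] component can be sketched in a few lines of Python. This is a minimal illustration that assumes a simple in-memory layout; the class name GeodataObject, its field names, the symbol set names, and the function check_output_medium are hypothetical and are not taken from the prototype or from any existing library.

```python
from dataclasses import dataclass, field


@dataclass
class GeodataObject:
    """A geodata object carrying its own display metadata (hypothetical layout)."""
    name: str
    map_extent: tuple          # (min_x, min_y, max_x, max_y)
    coordinate_system: str
    default_medium: str        # medium the current map display was designed for
    symbol_sets: dict = field(default_factory=dict)  # medium -> symbol set name


def check_output_medium(obj, medium):
    """Return (symbol_set, warning); warning is None when no advice is needed."""
    symbols = obj.symbol_sets.get(medium)
    if symbols is None:
        return None, f"[{obj.name}] has no symbol set for {medium}; reselect an output device."
    if medium == obj.default_medium:
        return symbols, None
    warning = (f"[{obj.name}] is designed for {obj.default_medium}; "
               f"the pre-defined {medium} symbol set will be used instead.")
    return symbols, warning


# Example object; extent values and symbol set names are assumed.
trails = GeodataObject(
    name="National Park trails",
    map_extent=(-109.7, 38.5, -109.4, 38.9),
    coordinate_system="decimal degree",
    default_medium="color printer",
    symbol_sets={
        "CRT screen": "trails_screen",
        "color printer": "trails_color",
        "B/W printer": "trails_bw",
    },
)
```

Calling check_output_medium(trails, "B/W printer") returns the black-and-white symbol set together with a warning, which corresponds to the choice the machine agent offers Mike before printing.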
The contents of Database Connectivity Metadata should include the data format, database type, remote access method, and migration tool for remote database connections.

5.2.5.3 Required Agents’ Responsibilities

Four types of agents are required in this scenario: machine agents, component agents, geodata agents, and hotel reservation agents.

The machine agent’s responsibilities include accepting Mike’s request and retrieving system information on Mike’s workstation, passing Mike’s request to other agents, and verifying the printout options with other agents.

The component agent’s responsibilities include searching for the [Travel Plan] component, integrating the [Travel Plan] component with the [Map Display] component, and communicating with geodata agents about map display requirements.

The geodata agent’s responsibilities include searching for [Highway roads], [The Arches National Park Trails], [Moab city roads], and [Moab hotel locations]; establishing remote database connections for distributed geodata objects; communicating with the hotel reservation agent to retrieve the hotel ID; and communicating with the component agent for map display.

The hotel reservation agent’s responsibilities include accepting Mike’s requests on hotel pricing and reservation dates, retrieving the hotel ID from the [Moab hotel] data object with the help of geodata agents, and communicating with the on-line hotel reservation system.

5.2.6 Discussion

This scenario illustrates four advantages of distributed GIServices. First, the network-based, dynamic architecture provides an efficient mapping solution for travelers and GIS users. On-line mapping services can help federal agencies and private companies provide accurate geographic information to the public and their users in an effective and efficient way. Compared with distributing information on paper or on CD-ROM, distributing maps on-line is cheaper, easier, and more efficient.
Second, this architecture illustrates data integration from four different GIS nodes and displays multiple layers (highways, city roads, park trails, and hotel locations) together at once. Traditional media, such as paper maps and CD-ROMs, are difficult to integrate and display together because of different data formats and heterogeneous software. The integrated map display and printout functions can provide more flexible services for different types of GIS tasks.

Third, this scenario demonstrates the potential capability of on-line transaction services for distributed GIServices. The connection between GIS nodes and external databases (the hotel reservation system) will extend the scope of GIServices into many different types of services. Users can use distributed GIServices to reserve hotel rooms, rent cars, or even order a pizza on-line. These on-line services will increase the total value of distributed GIServices. For example, on-line hotel reservation services create additional value for the original on-line mapping service in this case.

Fourth, the dynamic architecture can provide smart information searching and customizable representation by using agent-based communication mechanisms. With the help of intelligent agents, GIS users can get the best-qualified data sets and produce well-designed maps. The agent-based communication will reduce the complexity of tasks for GIS users and provide friendlier, more effective information services.

5.3 Scenario Two: Wal-Mart Site Selection (Spatial Analysis)

5.3.1 Scenario Description

A GIS spatial analyst, Dick, wants to locate a new Wal-Mart store in Boulder. He needs to obtain related map information and perform a GIS overlay analysis for this task. The following criteria must guide the Wal-Mart site selection:
1. The land use must be in a residential urban area.
2. The site must lie above the 500-year flood plain.
3. The site must be located within 200 meters of a major road.
4. The neighborhood of this site (within 1 mile) should have appropriate demographic characteristics: high incomes (annual salary > $50,000), young ages (median age < 40), higher population density (population density > 1,000 per square mile), and a low crime rate (crime risk index <= 2)*.
5. The new store will need 50,000 square feet in a compact shape.
6. The site must fulfill the previous criteria (1-5) and have the lowest land value.
* (Note: The crime risk index scales from 1 (lowest) to 5 (highest). The crime risk index is generated by the Boulder County Police Department based on the annual crime records for each census tract in Boulder County.)

Based on this scenario, Dick will need the following data for his analysis:
1. Land use;
2. Flood zone;
3. Roads;
4. Census data and crime risk index; and
5. Land values and parcel records.

Justification: This scenario demonstrates four major tasks for distributed GIServices: distributed GIS processing; automated data conversions; customizable GIS software packages; and sharing GIS models and knowledge.

[Figure 5-12 diagram: Dick (GIS analyst), Jack (GIS software vendor), and Matt (GIS programmer) linked to the tasks Overlay analysis, Buffering analysis, Shape fitting analysis, and Data conversion.]
Figure 5-12. The Wal-Mart site selection scenario.

GIS Tasks: This scenario requires the following GIS tasks (Figure 5-12):
1. Overlay analysis;
2. Buffering analysis;
3. Shape fitting analysis; and
4. Data conversion.

In this scenario, three human actors need to collaborate. Dick, a GIS analyst, wants to select a site for Wal-Mart. Jack is a GIS software vendor who sells GIS packages and provides a general solution for GIS projects. Matt is a freelance software programmer who can customize software functions by writing GIS components.

5.3.2 Traditional GISystems Solution

Dick goes to the Boulder County Planning Department and asks for the required data sets. The Planning Department has the [Land use], [Flood zone], and [Census] data sets.
Dick finds out that [Land value and parcel records] is stored in the Tax Assessor Department, [Roads] is stored in the Colorado Department of Transportation (CODOT), and the [Crime Risk Index] is stored in the Police Department. Dick goes to each department one by one and finally gets all the required data sets. Dick spends two weeks contacting people and requesting data from these departments. Dick also finds out that the Police Department does not have the [Crime Risk Index] for the current year. The Police Department tells Dick that the updated index will be released in its annual report six months later. So, the Police Department gives him the original crime report records of this year in a text-based format.

Dick comes back to his office and uses his GIS software to display the data sets provided by the Planning Department, CODOT, and the Tax Assessor Department. He uses the overlay and buffering GIS functions and generates a map called [procedure-A], which includes the area above the 500-year flood plain, within the residential urban area, and within the 200-meter buffer of major roads. The [procedure-A] map includes four candidate polygons (Figure 5-13).

Figure 5-13. The [procedure-A] layer in the Wal-Mart Site location.

Next, Dick converts the crime data from thousands of text records into the [Crime Risk Index] in a map format. Dick gets the equation for calculating the crime risk index from the Police Department. Dick uses desktop GIS software to convert these text data by address-matching, classifying, and overlaying the crime rate areas with the census tracts. The conversion task is very time-consuming because Dick needs to reformat the crime rate records, import them into the GIS database, and write a program to calculate the index. Dick spends one week finishing the crime rate data conversion. Finally, Dick generates another map layer called [Crime Risk Index], which integrates the demographic data and crime rate in census tracts.
Dick’s next step is to generate a one-mile buffer from the center points of the four [procedure-A] polygons, then identify the demographic characteristics and [Crime Risk Index] within these buffer zones. Dick compares the demographics statistically and chooses one polygon with the lowest crime rate, high population density, and highest incomes as the potential Wal-Mart site (Figure 5-14).

Figure 5-14. The buffer procedure in the Wal-Mart Site location.

The final step is to determine footprints for the exact location of the Wal-Mart building that can fit inside the candidate land parcels. However, Dick finds out that his GIS software has no rectangle shape fitting function. He makes a phone call to Jack, the GIS software vendor, and asks if extension modules are available. Five hours later, Jack calls back and says that no such function is available right now, but the GIS software company is willing to develop this new function for the next version release next year. Dick decides to take an alternative approach. Dick uses the graphic tools in his GIS software to draw several squares inside the potential areas (Figure 5-15). He inserts these squares into another map layer and overlays them on the land value parcel layer in order to generate the cost of land for each potential site. Since Dick is doing the drawing and calculation manually, he can only test 10 possible squares in the candidate site. Finally, Dick picks four sites with the lowest land value. The four sites will be considered as the possible locations for the new Wal-Mart store in Boulder. Dick spends four weeks finishing this GIS project.

Figure 5-15. The shape fitting analysis for the Wal-Mart site selection.

5.3.3 OpenGIS Solution

This user scenario illustrates four types of geographic information services: overlay analysis, buffering operations, shape fitting analysis, and data conversion.
The OpenGIS solution will focus on the adoption of the WMS Specifications, Catalog Services, and Simple Features proposed by the OGC. According to the OpenGIS WMS Specifications, three types of GIServices can be provided as follows:
• Overlay analysis services can be provided by the design of GetFeature interfaces.
• Buffering operations and shape fitting analysis services can be provided by the GetCapabilities interface. Nevertheless, the OpenGIS WMS framework does not illustrate detailed implementation procedures for buffering operations and shape fitting analysis services.

In addition, the current OpenGIS WMS Specifications do not specify a mechanism for converting geodata objects from one format to another (data conversion).

The major advantage of adopting the OpenGIS solution in this scenario is that the Catalog Services in the OpenGIS Implementation Specification (OGC, 1999) can provide an efficient way to search for geodata items in spatial data clearinghouses and distributed GIS databases. With the help of the OpenGIS Catalog Services, users can retrieve the metadata of spatial information from geodata clearinghouses and process related information, such as data access URLs, authors, and data accuracy, for further use.

There are three potential problems with the current OpenGIS framework in this scenario. The first problem is that the OpenGIS WMS Specification framework mainly focuses on basic map display and query functions instead of spatial analysis functions. Therefore, the future development of OpenGIS Web Map Servers will focus on map-driven capabilities (zoom-in, zoom-out, pan, etc.) instead of analysis-driven functions (overlay, buffering, network analysis, etc.). The map-driven capabilities are in high demand in the AM/FM industry, federal and local government agencies, and the public. However, what the scientific research community really needs is analysis-driven GIS capabilities.
The OpenGIS WMS Specifications do not have such analysis-driven functions in their framework.

The second problem is that the OpenGIS Specifications do not indicate an automatic data conversion mechanism for distributed GIServices. Data conversion is one of the most important tasks in GIS projects and distributed GIServices. However, the OpenGIS Specification frameworks leave this essential task to GIS vendors, who must develop their own data conversion tools. Because vendor-specific data formats raise marketing and copyright issues, individual companies and GIS vendors are unlikely to develop an automatic data conversion mechanism.

The third problem is that the OpenGIS Specifications do not provide an independent software development framework that allows individual GIS programmers or small companies to develop new modularized GIS components for distributed GIServices. Current software development frameworks proposed by the OpenGIS Specifications are all system-specific or vendor-specific, such as the COM-based framework, the SQL framework, or the CORBA framework. On the other hand, in the distributed GIServices solution proposed in this research, the software development framework (the GIS nodes) is independent of systems and software vendors. Software programmers can develop their own GIS components under this framework and distribute their components dynamically to other GIS nodes. The following section illustrates in more detail the advantages of the distributed GIServices solution.

5.3.4 Distributed GIServices Solution

In order to complete the site selection using distributed GIServices, four players must collaborate: Dick, Matt, Jack, and agents.

Dick’s Actions: On his GIS node, Dick launches his GIS software. He uses the Map Display component and asks for the required Boulder maps (flood zone, land use, census data, roads, crime risk index, parcel records). These data sets are displayed on his computer screen 20 minutes later.
The screen indicates that these data are retrieved from the GIS servers in the Boulder Planning Department, the Police Department, the Tax Assessor Department, and the CODOT. The screen also displays a message that an automatic procedure for converting the crime risk index resides on the server of Boulder’s Police Department. Five minutes later, the geodata agent informs Dick that the conversion is complete and displays the crime risk index map on Dick’s screen.

Dick creates a GIS operation model provided by his GIS software to formalize his GIS modeling procedures (buffering and overlaying) for the Wal-Mart site selection. Dick sends out his GIS operation procedure to the component agent on his GIS workstation. Twenty minutes later, the final result layer, [Procedure-A], is sent back to Dick’s workstation. Dick overlays four [Procedure-A] polygons with the [Crime Risk Index] and census data and identifies a candidate area (with the lowest crime rate, high population density, and highest incomes) for the Wal-Mart site.

The next step is to find a rectangle shape fitting function for the land parcel records within the candidate area. The component agent tells him that there is no such function on his workstation. However, there may be a similar component on other nodes on the Internet. Dick sends a request to search other nodes. Ten minutes later, the search results indicate three similar GIS components available for downloading. Dick reviews the descriptions and functionality of the three GIS components and purchases one for $100 for one day’s use. Dick downloads the new GIS component and plugs it into his GIS software. Then he uses this new component to perform the rectangle shape fitting analysis. He gets 100 candidate sites and automatically generates a list of the cost of land parcels for 100 candidate sites. The new GIS component helps Dick to display the ten sites which have the lowest land value.
The total process of rectangle shape fitting takes about one hour to finish. Dick spends a total of six hours finishing this GIS analysis of the Wal-Mart site selection.

Jack’s Actions: Jack is the GIS software vendor who provides desktop GIS software for general use. His responsibility is to provide modularized GIS software modules with external interfaces, which can accept new components or functions from third-party developers.

Matt’s Actions: Matt is a freelance GIS software developer. He develops customizable GIS components for specific tasks, such as shape fitting analysis, hydrological models, and 3D visualization. He puts his GIS components on the Internet for downloading and charges a usage fee.

Agents’ Actions and Collaborations: The machine agent located on Dick’s workstation collects the first request from Dick’s action:
• Requested GIS component: Map Display.
• Requested geodata objects: land use, census data, flood zone, crime risk index, roads, and land value and parcel records.
• Client machine profile: 600 MHz PC, color ink-jet printer, 10BaseT Ethernet.

The machine agent asks the component agent to search for [Map Display]. Dick’s component agent determines that the [Map Display] component is already located on Dick’s machine. The component agent launches the [Map Display] component. The machine agent also sends a request to the geodata agent, which broadcasts the request for geodata objects to several GIS nodes on the Internet. The geodata agent locates these geodata objects on four GIS nodes, at the Planning Department, the Police Department, the Tax Assessor Department, and CODOT. The geodata agent downloads the [land use], [roads], [flood zone], [land value and parcel records], and [census data] objects into Dick’s local GIS database. However, the geodata agent finds that the crime risk index in the Police Department is two years old and decides to generate a new index based on the recent crime records report.
The geodata agent on Dick’s machine asks the remote geodata agent in the Police Department to convert the crime records into the crime risk index format. The geodata agent in the Police Department launches an Auto Data Conversion tool and converts the text-based crime records into an ARC/INFO coverage, [Crime Risk Index]. Ten minutes later, the converted data object is sent back to Dick’s machine and stored in the local GIS database.

After the data conversion, the component agent accepts the second request from Dick, who wants to perform a GIS modeling operation called [Procedure-A]. The content of [Procedure-A] includes three GIS operations. The first operation is to buffer 200 meters from [Roads] and generate a new data item called [Buffer zone]. The second operation is to overlay [Land use], [Flood zone], [Crime Risk Index], and [Buffer zone]. The final operation is to re-select [Land use = residential area], [Flood zone = 500 (years)], [Buffer zone = inside buffer], and [Crime Risk Index < 2]. Dick’s component agent brings the [Procedure-A] model to the related data servers and performs the whole operation across the networks (Figure 5-16). After the [Procedure-A] GIS operation is complete, the machine agent accepts the next request from Dick, [Shape Fitting Analysis].

[Figure 5-16 diagram: the GIS model template [Procedure-A] travels from Dick’s GIS node to CODOT (Roads), the Tax Assessor Department (Land value and parcels), the Police Department (Crime Risk Index), and the Planning Department (Flood zone, Land use).]
Figure 5-16. The roaming [Procedure-A] operation in the Wal-Mart site selection.

The component agent finds that there is no [Shape Fitting Analysis] function on Dick’s desktop. The component agent broadcasts the request to the Internet and identifies a GIS node called Matt-GIS.com that can provide the [Shape Fitting Analysis] component via the Internet. The component can be downloaded for one day’s use by paying $100.
The agent downloads the component and plugs it into the desktop. Dick runs the shape fitting analysis and finishes the site selection analysis. The entire process takes six hours.

5.3.5 The Deployment of the Dynamic GIService Architecture

This scenario is established dynamically on six GIS nodes (Figure 5-17):
1. Dick.colorado.edu (local GIS node)
2. Boulder-Planning-Department.ci.us (public GIS node)
3. Boulder-Police-Department.ci.us (public GIS node)
4. Boulder-Tax-Assessor.ci.us (public GIS node)
5. CODOT.ci.us (public GIS node)
6. Matt’s-GIScomponent-shops.com (commercial GIS node)

[Figure 5-17 diagram: GIS node Dick.colorado.edu (Map Display, Overlay Analysis, Buffering, and Shape Fitting Analysis components; Procedure-A; machine, geodata, and component agents; local copies of the data objects) connected to GIS nodes Boulder-Planning.ci.us (Land use, Flood zone), Boulder-Tax-Assor.ci.us (Land value and parcels), CODOT.ci.us (Roads), Boulder-Police.ci.us (text-based crime records, Auto Data Conversion, Crime Risk Index), and Matt-GIS.com (Shape Fitting Analysis, 3D-shading, Spatial Statistic Model). M marks metadata attached to a data object.]
Figure 5-17. The dynamic architecture of the Wal-Mart Site selection.

5.3.5.1 The Arrangement of Distributed GIS Components and Geodata Objects

Five distributed GIS components are used in this case: [Map Display], [Overlay Analysis], [Buffering], [Shape Fitting Analysis], and [Auto Data Conversion]. The first three components are located on Dick’s machine. The [Shape Fitting Analysis] component is transferred from Matt’s GIS node to Dick’s GIS node for the site selection task. The [Auto Data Conversion] component is located on the Boulder Police Department node, where Dick’s GIS node can invoke the remote process and launch the conversion procedure in the Police Department. Several distributed geodata objects are downloaded into Dick’s GIS database.
These geodata objects include [Land use], [Roads], [Flood zone], [Census data], [Land value and parcel], and [Crime Risk Index]. The data are downloaded, rather than accessed through remote database connectivity as in Scenario One, because the buffering and overlay operations may require complicated data processing, and local database management will be more efficient than remote database access. Also, the [Procedure-A] template roams from one node to another during the overlay operations and brings the results back to Dick’s node.

5.3.5.2 Required Operational Metadata Contents

In order to provide such an intelligent GIS operation, the metadata content of GIS-operation Requirement must be defined for five GIS components: [Map Display], [Overlay Analysis], [Buffering], [Shape Fitting Analysis], and [Auto Data Conversion]. The GIS-operation metadata for the [Map Display] component should include the map extent, the coordinate system for georeferencing and map overlay functions, and the pre-defined arrangement of symbols and legends for different media output formats (CRT screen, color printer, black-and-white printer, etc.). The metadata for the [Overlay Analysis] component will include specifications of a data type (point, line, or polygon) and of a map extent for defining the processing area. The metadata for the [Buffering] component will include specifications of map units and ground units for specifying the buffering distance, and verification of topology to qualify the data for the buffering operation. The [Auto Data Conversion] component will require specifications of a data format for the data conversion tools and an inventory of available conversion tools for identifying the tools/components for the conversion. The [Shape Fitting Analysis] component will include data type (point, line, or polygon), map unit, distance unit, and topology to qualify the data for the shape fitting operation.
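The per-component requirements listed above can be sketched as a small lookup table that an agent might consult before running an operation. The following Python fragment is only an illustration under stated assumptions, not the implementation proposed in this research: the field names, the `GeodataObject` class, the `missing_fields` helper, and the sample values for [Roads] are all hypothetical, and the idea that an agent checks a data object's metadata against a component's requirements is one possible reading of how the GIS-operation Requirement metadata would be used.

```python
from dataclasses import dataclass, field

# Hypothetical field names; each list transcribes the requirements
# that the text above names for one GIS component.
REQUIRED_METADATA = {
    "Map Display": ["map_extent", "coordinate_system", "symbols_and_legends"],
    "Overlay Analysis": ["data_type", "map_extent"],
    "Buffering": ["map_unit", "ground_unit", "topology_verified"],
    "Auto Data Conversion": ["data_format", "conversion_tools"],
    "Shape Fitting Analysis": ["data_type", "map_unit", "distance_unit",
                               "topology_verified"],
}


@dataclass
class GeodataObject:
    """A geodata object that carries its own operational metadata."""
    name: str
    metadata: dict = field(default_factory=dict)


def missing_fields(component: str, data: GeodataObject) -> list:
    """Return the metadata fields the component needs but the object lacks."""
    return [key for key in REQUIRED_METADATA[component]
            if key not in data.metadata]


# Illustrative values only; they are not taken from the thesis.
roads = GeodataObject("Roads", {
    "data_type": "line",
    "map_extent": (0.0, 10.0, 0.0, 10.0),
    "map_unit": "meter",
    "ground_unit": "meter",
    "topology_verified": True,
})

print(missing_fields("Buffering", roads))
print(missing_fields("Shape Fitting Analysis", roads))
```

With these sample values the first call reports nothing missing, while the second reports that `distance_unit` is absent, which is the kind of gap an agent would need to resolve before launching the shape fitting operation. Keeping the requirements in a plain table rather than in each component's code is a design choice made here for brevity.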
5.3.5.3 Required Agents’ Responsibilities

Machine agents, component agents, and geodata agents are required in this scenario. Although most of the agents’ responsibilities are similar to those in the First Scenario, some responsibilities unique to this case should be emphasized.

First, the component agents generate new metadata for each new data object after the overlay and buffering operations. This process relies on the inheritance feature of Object-Oriented Modeling. Figure 5-18 shows an example of an overlay operation in which some metadata contents, including map extent and coordinate system, are generated or inherited from the parents’ metadata. The input data objects are [Land use] and [Flood areas]. The output data object is [Criterion-A]. The overlay operation also updates several metadata items, such as Data ID and Data Producer. Finally, the overlay operation generates a new metadata item called Data Lineage in the [Criterion-A] object.

The geodata agent searches for requested data on the Internet and establishes the download procedures for distributed data objects. The download method is determined by the [Database connectivity metadata] encapsulated in the distributed objects, the network configuration, and machine profiles. Additional considerations include database integration and compatible storage formats.

Figure 5-18. The new metadata generated by an overlay operation. The figure shows two input data objects, [Land use] and [Flood areas], each carrying Database connectivity Metadata and GIS-operation Metadata, passed through the Overlay Analysis component agent (Union) to produce [Criterion-A].

GIS-operation Metadata of input 1:
• Map extent: x1, x2, y1, y2
• Coor. System: decimal degree
• Data producer: Boulder county
• Data ID: CO-10323-32

GIS-operation Metadata of input 2:
• Map extent: a1, a2, b1, b2
• Coor. System: decimal degree
• Data producer: USGS
• Data ID: USGS-133252-352

GIS-operation Metadata of [Criterion-A]:
• Map extent: max(x1,a1), min(x2,a2), max(y1,b1), min(y2,b2)
• Coor. System: decimal degree
• Data producer: Dick GIS consulting company
• Data ID: DICK-235252-232
• Data lineage: date (2-12-200), operation (union), input-1 (CO-10323-32), input-2 (USGS-133252-352)

The component agent needs to access remote GIS nodes (Matt’s machine) and negotiate the download procedure for distributed GIS components (Shape Fitting Analysis). Permission to download a GIS component is established by sending the usage fee to Matt’s GIS node. The component agent also needs to integrate different types of components, such as Java applets and ActiveX controls, into the component container.

Finally, the data conversion processes require collaboration between geodata agents and component agents. The geodata agent identifies the type of the source data and the target data format. The geodata agent also retrieves the name of the conversion program by accessing the Database connectivity Metadata in distributed geodata objects. After the geodata agent tells the component agent the name of the conversion program, the component agent remotely invokes the conversion program and performs the conversion automatically on the remote machine.

5.3.6 Discussion

This scenario illustrates four advantages of distributed GIServices. First, the dynamic architecture can facilitate the efficient use of computing resources. This scenario demonstrates that the collaboration between four GIS nodes can improve computing speed and database access capability. The six distributed data objects can be accessed and downloaded into Dick’s local databases for the overlay GIS operation. Second, the scenario illustrates that the interactions between agents and GIS components can help users convert geodata objects automatically by accessing appropriate conversion programs and the metadata descriptions inside the geodata objects. The [Auto Data Conversion] component is executed on the remote machine at the Police Department, which has better data conversion tools.
With the help of geodata agents and component agents, GIS users will no longer struggle with data conversion problems and will be able to concentrate on the actual GIS tasks. The automatic conversion process can provide more accurate and efficient use of geographic information. Third, this scenario indicates that LEGO-like GIS components can extend the functionality of traditional GIS software. The extensible functions will provide more efficient use of GIS software and facilitate more user-oriented GIS applications. Fourth, distributed GIServices provide a valuable communication channel among GIS professionals, programmers, and researchers. For example, by accessing and downloading Matt’s specialized GIS component, GIS professionals can provide a more comprehensive location analysis for their customers. Some researchers may put a new hydrological model on the Internet, where it can be accessed by other hydrological researchers. The exchangeable programs and GIS models will encourage communication within the GIS community through the sharing of knowledge and experience.

5.4 Scenario Three: GPS Navigation (Cross-platform Application)

5.4.1 Scenario Description

Eva wants to visit her friend in Superior, Colorado. She will use her Palm-size PC with Colorado maps, connected to the Global Positioning System (GPS) device in her car. She wants to locate her friend’s home address, 1400 Begonia St., Superior, Colorado, and a Chinese restaurant close to her friend’s home for dinner. Based on this scenario, Eva will need the following geographic information:
1. Roads in Colorado
2. Points of Interest in Colorado

Justification: This scenario demonstrates four major advantages of distributed GIServices: cross-platform GIS applications on Palm-size PCs; real-time data update and wireless data download; integrated GIS/GPS applications; and distributed data ingest and modification.

GIS task: This scenario requires the following GIS tasks (Figure 5-19):
1. Map display and data download on a Palm-size PC;
2. GPS navigation;
3. Address matching; and
4. Data ingest and modification.

In this scenario there are three human actors. Eva, a GIS user, wants to use the GPS navigation system to find her friend’s home and also locate a Chinese restaurant nearby. Ron, a GPS map data provider, provides current geographic information for GPS users. Dave, a Chinese restaurant owner, needs to update the restaurant address in the GPS map server when he relocates his restaurant.

Figure 5-19. The GPS navigation scenario.

5.4.2 Traditional GISystem Solution

Eva uses her desktop computer and a [US roads ’97] CD-ROM to convert a map area that includes Boulder and Superior into a Palm-size PC-readable map format. Eva connects her Palm-size PC to her desktop computer using serial cables and uploads the map into her Palm-size PC. Eva also uses this CD-ROM to find a Chinese restaurant located at 1100 Rock Creek Parkway, which is close to her friend’s home. Eva connects the GPS device in her car and links it to the Palm-size PC. Eva starts her car and types the destination address, 1400 Begonia St., Superior, Colorado. However, the Palm-size PC shows the error message “Unable to match this address” on the screen. The problem is that Eva’s friend lives in a new community and Begonia Street was built only three years ago. Obviously, the [US roads ’97] CD-ROM does not reflect the changes in this area over the past three years. Eva has to call her friend to ask for detailed directions. She gives up on the idea of using the GPS navigation tool. After thirty minutes, Eva arrives at her friend’s home. Two hours later, they decide to have dinner together. Then they drive to Rock Creek Parkway and try to find the Chinese restaurant shown in the [US roads ’97] CD-ROM.
Unfortunately, they cannot find the restaurant on Rock Creek Parkway. Again, the CD-ROM does not contain the updated restaurant address. Eva and her friend have to change their dinner plans.

5.4.3 OpenGIS Solution

This scenario illustrates four types of geographic information services: map display, GPS navigation, address matching, and distributed database update and modification. The OpenGIS solution will focus on the adoption of the WMS Specification and the Geography Markup Language (GML) Specifications. According to the OpenGIS WMS Specifications, two services can be provided as follows:
• Map display can be provided by the GetMap Interfaces
• Address matching can be provided by the GetFeature Interfaces

However, the OpenGIS WMS Specifications do not specify services for distributed database update and modification, nor the integration of GPS navigation services.

The advantage of adopting the OpenGIS solution in this scenario is that the OpenGIS GML Specifications use the Extensible Markup Language (XML) to develop the metadata scheme and feature objects (OGC, 2001). GML can provide dynamic, multiple styles in map display and data representation. Therefore, OpenGIS map servers can generate a map document written in GML and distribute the GML map document to different types of devices, such as a CRT monitor, a PDA screen, or the display window of a cellular phone. Since GML-based map documents specify only the geographic features and map properties, the actual map display mechanisms, such as symbol widths, text size, and color/contrast values, can be adjusted automatically in different map browsers designed for different devices. By adopting GML-based map documents, Web map servers will be able to provide more flexible map display choices for different computer environments.

On the other hand, three problems are found in the current OpenGIS framework in this scenario.
The first problem is that the OpenGIS Specifications do not indicate how to update or modify distributed databases. Distributed GIS database management is essential in providing accurate geospatial information services across the Internet. The second problem is that the OpenGIS Specifications do not address GIServices for mobile devices or GPS units. With the progress of micro-computers and information technology, mobile devices such as PDAs, Pocket PCs, and cellular phones will be integrated with wireless computer networks and Internet services in the future. The final problem is that the development of the OpenGIS Specifications and Standards is too slow compared with the advancement of information technology in the GIS industry. For example, the current version of the OpenGIS WMS Implementation Interface Specification (1.0) only specifies the Picture Case scenario. However, many commercial GIS products, such as ESRI’s ArcIMS and Autodesk’s MapGuide, already provide software solutions for the Data Case or Graphic Element Case. The WMS Specifications provided by the OpenGIS are probably one year behind the current GIS market. Because of the gap between commercial software solutions and the OpenGIS Specifications, it will be difficult to ask the GIS industry to follow the OpenGIS standards in the future.

The distributed GIServices solution will not provide standards or specifications for the GIS industry. Instead, this research only provides a development framework that allows different technologies to co-exist and allows different map services to communicate with each other via software agent communication mechanisms. The following section describes the distributed GIService solution for this scenario in more detail.

5.4.4 Distributed GIService Solution

In order to realize this scenario in a distributed computing environment, four players need to collaborate (Eva, Ron, Dave, and software agents).
Eva’s Actions:
Eva uses her Palm-size PC with a modem connection to download the Boulder-Superior maps directly from the Internet using her cellular phone. Eva connects the Palm-size PC to the GPS in her car. Eva types in her friend’s address, 1400 Begonia St., Superior, Colorado. The Palm-size PC shows the destination location on the screen and designs an appropriate route for Eva. On the way to Superior, Eva also looks for a Chinese restaurant close to her friend’s house and identifies the Empress restaurant located on McCaslin Blvd. Eva visits her friend and they have a very nice dinner together.

Ron’s Actions:
Ron’s task is to provide updated GPS data for multi-purpose GPS users. He needs to maintain a database server from which GPS users may download geographic information, including roads and points of interest. The data server must be updated frequently by database administrators. Therefore, the GPS database should provide password-protected access so that distributed data providers can update their own geospatial information. For example, the owner of the Chinese restaurant, Dave, should be able to update the address of his restaurant from his home PC.

Dave’s Actions:
Dave’s task is to make sure the address of his restaurant is correct in the GPS databases. He will get the access password from Ron for permission to update the Points of Interest data object in the GPS database. When the restaurant moves to another location, Dave can use his PC and modem to access the GPS databases and update the change immediately.

Figure 5-20. Compaq Palm-PC with Trimble CrossCheck AMPS Cellular mobile unit.

Agents’ Actions and Collaborations:
The machine agent in Eva’s Palm-PC will configure the hardware installation for the GPS and cellular phone automatically by accessing the plug-and-play descriptions in these peripherals and hardware drivers. The machine agent identifies two built-in components, [Map display] and [GPS navigation], in the Palm-PC.
The component agent launches both the [Map display] and [GPS navigation] components. When Eva sends her request for the destination address 1400 Begonia St., Superior, Colorado, the geodata agent looks for the [Roads, Colorado] geodata object inside the Palm-size PC and verifies the date and version of the requested information. The geodata agent reads the date in the metadata of [Roads, Colorado] and [Points of Interest, Colorado], 3-20-1997, and decides that both data objects need to be updated. Then, the geodata agent sends a ‘need to update’ request to the GPS-Data.COM server by wireless communication (cellular phone). Another geodata agent located in GPS-Data.COM gets the request from Eva’s Palm-PC and initiates the download procedure for [Roads, Colorado] and [Points of Interest, Colorado]. Ten minutes later, the data download is complete. When Eva requests the Chinese restaurant, the address is correct because Dave already updated it two months ago when the restaurant was relocated. When Dave updates the address, the geodata agent in GPS-Data.COM verifies Dave’s access privilege and allows Dave to update the databases. The geodata agent also updates the date information for [Points of Interest, Colorado] since the object has been changed.

5.4.5 The Deployment of the Dynamic GIService Architecture

This scenario will be established dynamically on two GIS nodes and one PC (Figure 5-21):
1. Eva.palm-pc.net (Eva’s Palm-PC GIS node)
2. GPS-Data.com (Ron’s GPS/roads database)
3. Dave’s PC (Dave’s Chinese Restaurant Desktop)

Figure 5-21. The dynamic architecture of GPS navigation. (In the figure, Dave’s PC sends the update “current address of Chinese restaurant: 1300 McCaslin Blvd” to the GPS-Data.com node.)

5.4.5.1 The Arrangement of Distributed GIS Components and Geodata Objects

Two GIS components are used in this case: [Map Display] and [GPS navigation], both of which reside in the Palm-PC. These two components are cross-platform compatible for both PCs and Palm-PCs. In order to provide such a function, the component container in the GIS node will need to accept these components and convert them into machine-readable programs. Two distributed geodata objects are used in this case: [Roads (Colorado)] and [Points of Interest (Colorado)]. These geodata objects will be updated automatically by accessing the GPS-data.com server.

5.4.5.2 Required Operational Metadata Contents

In this case, automatic download and update functions are provided, and metadata must be specified for two GIS components, [Map Display] and [GPS Navigation]. The metadata for the [Map Display] component will include the specification of a map extent, a coordinate system for georeferencing and map display, the pre-defined arrangement of symbols and legends for different media output formats, and the screen size of the Palm-size PC (240x300). The metadata for the [GPS navigation] component will include the specification of a coordinate system for converting the original GPS data projection to the target geodata projection for georeferencing.

5.4.5.3 Required Agents’ Responsibilities

The responsibility of machine agents is to integrate the external hardware, GPS and modem, with the OS and GIS components. Component agents will need to integrate the [Map Display] and [GPS navigation] components in order to display the real-time GPS location in the map display window. The geodata agent’s responsibility is to check the current version of the data on the local disk and initiate the download procedure via the GPS-Data.COM server.

5.4.6 Discussion

This scenario illustrates four advantages of distributed GIServices. First, the design of the GIS component container will provide the capability of cross-platform applications.
The same [Map Display] component can be used in both traditional PC environments and Palm-size PCs, which have different operating systems and hardware. Second, with the help of geodata agents, this scenario demonstrates that real-time data download and update can improve the accuracy of geographic information and the quality of information services. Third, the LEGO-like GIS components make the integration of GIS operations and GPS services easier and more efficient. Fourth, distributed database update and maintenance can provide more efficient database management.

5.5 Chapter Summary

This chapter introduces two software examples and demonstrates distributed GIServices solutions for three real-life GIS tasks. The software examples illustrate a possible software framework for distributed GIServices and the advantages of a modularized GIS component architecture. The scenarios examine the actual requirements of the dynamic architecture, the efficient use of GIS components and geodata objects, and the collaboration among machine agents, component agents, and geodata agents. The detailed deployment of the dynamic architecture also illustrates the main responsibilities of different agents and the considerations for operational metadata contents. As opposed to traditional, monolithic GISystems solutions, distributed GIServices can provide modularized, customizable, and user-oriented applications for GIS professionals and users.

The first scenario emphasizes on-line mapping services and illustrates the advantages of data sharing and remote database access. This case also demonstrates the potential for integrating other information services (hotel reservation) and for intelligent information search and representation. The second scenario emphasizes distributed computing and automatic data conversion.
The adoption of distributed GIS components facilitates the efficient use of computing resources and provides a dynamic framework for customizable GIS software and specialized GIS functions. The third scenario illustrates one solution for cross-platform applications and for real-time data update and download. The design of GIS nodes can extend the use of geographic information to different computer platforms, such as Palm-size PCs and Personal Digital Assistant (PDA) devices. With the help of agents, the integration of GPS and GIS applications and distributed database management become easier, faster, and more efficient.

Several issues are highlighted through the scenarios. First, these three scenarios demonstrate that some GIS components are essential for general GIS operations while other components may be used only for specific tasks. For example, the [Map Display] component is used in all three GIS scenarios, whereas the [Shape Fitting Analysis] component is used only for site location analysis. Therefore, the LEGO-like design of GIS components can facilitate the reusability of GIS components and the customizability of GIS applications. Second, to provide cross-platform capability for GIS components and data objects, the design of GIS nodes must provide appropriate containers (like a virtual machine and a virtual database) to link heterogeneous hardware and databases to GIS data objects and programs. Third, by defining the actual contents of operational metadata, the three scenarios illustrate that different GIS tasks will require different metadata content for both GIS components and geodata objects. Thus, the design of operational metadata should provide flexibility and extensibility for actual metadata implementation. By using the object-oriented modeling technique, the operational metadata in GIS components and data objects become more flexible and extensible.
Also, the object-oriented metadata can facilitate automatic metadata generation for new data objects by inheriting from their parent objects. Fourth, network security, program stability, and data protection are important for the quality of distributed GIServices. Since the distributed architecture allows users to access GIS databases and geodata objects via the network, protecting the data from outside intruders or crackers, preventing data input mistakes by distributed data providers, and providing robust GIS components for different computer platforms will become major considerations in the actual implementation of distributed GIServices.

In summary, these three scenarios serve as important lessons for distributed GIServices:
1. Dynamic, distributed GIServices can provide flexible map services and add value to information by sharing geographic processes with other people, systems, and services.
2. Collaboration among different types of agents is the key to the success and integration of distributed GIServices.
3. Operational metadata for geodata objects and GIS components can make geographic information services more convenient, intelligent, and effective.

CHAPTER 6. SUMMARY AND IMPLICATIONS

6.1 Overview of the Research

The goal of this dissertation is to establish an intelligent, dynamic, and flexible architecture for distributing geographic information services on the Internet. Thirty-six years ago, when the first GIS project, the Canada Geographic Information System, was established by Dr. Tomlinson (Star and Estes, 1990), computer systems were so huge that several rooms were required to house the hardware and vacuum tubes. At that time, no one could imagine the dramatic development of GIS in subsequent decades. Today, people can hold a Palm-size PC in their hands and search for the location of City Hall in their hometown, or use Web map applications to plan their vacations in California during the spring break.
Dynamic and interactive GIS applications have permanently changed the way people live and work. On-line geographic information services have become more and more important to the public and to researchers. Currently, hundreds of World Wide Web servers provide on-line mapping functions, including urban plans, natural resource management, and census data (Limp, 1997; Coleman, 1999). The need for global access and decentralized management of geographic information is pushing the GIS community to distribute mapping services on the Internet.

This research provided an overview of current developments in distributed computing and on-line GIS services in Chapters Two and Three. In Chapter Four, this dissertation described a dynamic architecture for distributing GIServices on the Internet, which will facilitate on-line geographic information services and improve the efficiency and development of GIS software. In Chapter Five, it identified the required elements and responsibilities for metadata and software agents by reviewing three hypothetical scenarios and two software examples in GIServices. By adopting distributed GIS components, an operational metadata scheme, and agent-based communication mechanisms, distributed GIServices will encourage the sharing of analysis methods, spatial models, and geographic knowledge.

The results of this research are intended to help the GIS community adopt a sustainable, technology-independent strategy in developing open and distributed GIServices. It clarifies the operational relationships between clients, servers, geospatial data sets, and GIS operations, and justifies the role of metadata and agents for distributed GIServices. By combining GIS components and data objects dynamically across a network, computing resources may be utilized more effectively.

The real value of distributed GIServices lies not in the architecture itself, but in the actual content underlying the architecture.
The members of the GIS community, including geographers, GIS professionals, and GIS users, are the people who actually create the valuable content in distributed GIServices. Well-designed applications will energize the growth of GIService networks. Content creators are the key to the success of distributed GIServices. This chapter will argue that as more people participate in networking GIServices, the GIService networks will become more valuable. Hopefully, the increasing value of GIServices will attract more people to participate and ensure the sustainable development of the GIService networks in the long run.

A central issue in this dissertation is to design a sustainable framework that can cope with the rapidly changing technology and various needs of the twenty-first century. Our world is changing fast. New technologies are invented every month. New milestones are reached in scientific research every year. Every day the Internet and the World Wide Web are reshaping the whole world with the power of “E” (e-commerce, e-mail, e-business, e-trade, etc.). With the rapid change of information technology, traditional computing systems can no longer survive in the jungle of information technologies. A flexible and upgradable architecture for distributed GIServices is essential to ensure the sustainable development of GIS in the next few decades.

Another underlying reason for the adoption of Internet-based GIServices is the need for communication in the GIS community. The reason for distributing GIServices on the Internet is not that Internet technology is fancy and popular, but that the Internet can help people communicate more efficiently and effectively. For example, local and federal governments can use the Internet to disseminate maps and geospatial data to citizens. GIS software vendors can use the Internet to deliver upgraded software products or get feedback from users.
Geographers can use the Internet to exchange hydrological models and spatial analysis procedures. The Internet-based communication of GIServices will facilitate synergy within the GIS communities through the sharing of geospatial data, programs, models, and knowledge.

6.2 Implications

The previous discussion illustrates that the main goal of this dissertation is to establish a dynamic architecture for comprehensive information services rather than to design a robust information system. In contrast to the traditional focus of GISystems, distributed GIServices emphasize a blending of roles between GIS users and service providers. The deployment of distributed GIS should focus on user-centered design and interactive GIS operations between users and software developers. The following sections will discuss the unique features and implications of distributed GIServices under three topics: service-oriented applications, value-added information processes, and the exponential growth of GIS networks.

6.2.1 Service-oriented Applications

The first unique implication of distributed GIServices is the establishment of service-oriented applications. The term service-oriented indicates that geographic information services are provided to help people accomplish their work and meet the needs of the public. Three criteria for distributed GIServices characterize the service perspective.

Distributed GIServices are user-centered. Traditional GISystems and projects usually isolate the users from the actual system design and operations because of complicated user interfaces and programming languages. Most GIS users can only submit their requests to GIS professionals. The GIS professionals then use GISystems to generate the analysis results and give them back to the end-users. This isolation of actual users from GIS operations is inefficient. Misuse of geographic information results from miscommunication between the GIS users and the GIS professionals.
Many GIS users cannot get the information they really need from traditional GISystems because geographic information reaches them second-hand. In order to solve these problems, LEGO-like GIS components and agent-based communication permit GIS users to submit their own GIS operations with the help of agents, and to modify their GIS software modules based on specific tasks. In Chapter Five, a GIS user (Mike) adds a [Travel Plan] extension to his GIS package. Eva integrates the GPS navigation component with her Palm-size PC and utilizes the GPS navigation service module. In short, distributed GIServices allow end-users more choices in modifying their GIS programs, models, and spatial operations. Customizable GIS operations will help GIS users develop their own applications based on their needs.

Distributed GIServices focus on long-term, interactive relationships between users and service providers. Traditional GISystems adopt specific information technology without considering a long-term development strategy. In contrast, distributed GIServices emphasize long-term information services and interactive relationships. For instance, the on-line mapping services and GPS navigation services mentioned in the previous chapter are intended to provide daily information services. A well-designed GIService may be used for ten or twenty years or more. For example, the Canada Geographic Information System was developed over two decades and was still being used in the mid-1990s (Foresman, 1998). Thus, the design of distributed GIServices should focus on extended commitments between information providers and users. Moreover, user feedback is essential for the success of long-term distributed GIServices. By accepting user feedback and suggestions, service providers can improve the quality of information services. The relationship between users and service providers will become closer, more interactive, and mutually dependent.
Diversified GIServices are required to solve real-world problems. Traditional GISystems are usually fixed and cannot provide customizable services without special software modification. In the real world, different users require different information services, different tasks require different information services, and different countries require different information interfaces. This research proposes a dynamic architecture and customizable GIS components in order to provide diversified, customizable information services for different people and different situations. Geodata objects and GIS components in the dynamic architecture are used to provide comprehensive services for users.

The modern financial service framework can be used as an example to illustrate the focus of service-oriented applications in distributed GIServices. By using network telecommunications technology, modern financial services allow users to access accounts or withdraw money easily via ATMs or telephones. People can use computer modems to transfer money between different bank accounts. With the help of network technology, financial services become more flexible and customizable. End users have more choices in managing money in different accounts, which in turn builds a long-term relationship between users and their financial service providers. Distributed GIServices now face the same challenge: flexibility and customizability for different users and tasks.

In the future, geographic information services will be integrated with other types of information services. For example, the first scenario (travel plan) mentioned in Chapter Four illustrates integration between an on-line hotel reservation service and on-line mapping services. The integrated services can demonstrate the benefits of electronic information services, which are fast, efficient, and customizable. These integrated frameworks for different types of information services may be called e-services.
In the future, e-services will be applied in our daily lives, in areas such as transportation, financial management, shopping, and entertainment. The role of geographic information services will be like the compass for e-services, which can provide directions and geo-referencing functions for related themes and services. By distributing GIServices on the Internet, the public can enjoy the integrated e-services with many powerful GIS capabilities, such as spatial query, location analysis, map representation, and visualization.

6.2.2 Value-added Information Processes

The second implication of distributed GIServices is to add value to information processes. A value-added information process is an information service that uses available facilities to add additional services and in doing so, increases the total value of services. By combining distributed GIS components and geodata objects on the Internet, GIServices will be able to provide more types of services and generate added worth to traditional GISystems. Two types of value-added processes could work this way.

First, distributed GIServices can generate new services and add new values by combining different GIS components (Figure 6-1). Traditional GISystems have limited GIS functions inside the closed system framework. The value (X1) generated from a single GISystem is constrained by system functions and closed architecture. With the advantages of LEGO-like GIS components, distributed GIServices can combine different functions/components to combine values (X1 + X2 + X3) under the dynamic architecture.
[Figure 6-1: Traditional GISystems turn raw data items into Service-A (new value: X1) through the Map Display component. Distributed GIServices turn the same raw data items into Service-A (new value: X1) through the Map Display component, Service-B (new value: X2) by combining [Map Display] and [Travel Plan], and Service-C (new value: X3) by combining [Map Display] and [Spatial Analysis].]

Figure 6-1. The new values generated from the usage of GIS components.

The second type of value-added process is that multiple GIS projects can access a single GIS database remotely via the network, adding value to it (Figure 6-2). Traditional GISystems lack remote database access capabilities and require manual data conversion procedures. By adopting standardized communication protocols and agent-based data conversion approaches, distributed GIServices can facilitate the reuse of databases and reduce the cost of generating new data for GIS projects. For example, a completed Boulder City GIS database can be used for three different projects, such as Wal-Mart site selection, urban zoning, and school bus routing analysis. The total value of the Boulder City database becomes greater than if it were used by a single application. The more a database is used, the more valuable that database becomes. In other words, distributed database access and management will encourage reuse of existing geodata objects and prevent redundant data production. Because data production takes up 70% of the total application cost, this reduction in data production can also save labor and programming costs and add value to GIS applications.

[Figure 6-2: Project-A (Wal-Mart site selection), Project-B (urban zoning), and Project-C (school bus routing) share the geodata objects (Boulder data sets) of the Boulder City GIS Database. The total value of the GIS database has increased; each project cost has been reduced.]

Figure 6-2. The value-added process in distributed databases.
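The cost argument above can be made concrete with a small worked calculation. The following is only a sketch under stated assumptions: the total application cost of 100,000 is a hypothetical figure, the function name `cost_per_project` is introduced here for illustration, and the data production cost is assumed to be split evenly among the projects that share the database. Only the 70% share and the three Boulder projects come from the text.

```python
def cost_per_project(total_cost: float, data_share: float, projects: int) -> float:
    """Cost borne by each project when `projects` applications share one database.

    Assumption (not from the thesis): the data production cost is split
    evenly among the sharing projects, and all other costs stay per project.
    """
    if projects < 1:
        raise ValueError("at least one project is required")
    data_cost = total_cost * data_share      # cost of producing the database once
    other_cost = total_cost - data_cost      # labor, programming, and so on
    return other_cost + data_cost / projects


# Hypothetical total application cost; 70% is the data production share in the text.
standalone = cost_per_project(100_000, 0.70, projects=1)
shared = cost_per_project(100_000, 0.70, projects=3)

print(f"one project, own database:     {standalone:,.0f}")
print(f"three projects, one database:  {shared:,.0f} each")
```

Under these assumptions, the per-project cost falls as more projects reuse the same database, while the non-data costs remain the floor that sharing cannot reduce.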
In general, the life cycle of information value for distributed GIServices is quite different from traditional GISystems. Figure 6-3 illustrates the life cycle of information value in traditional GISystems.

[Figure 6-3: two plots over time (T0 to TN). The value plot starts at V0 and shows one curve with update and one without update. The cost plot starts at C0.]

Figure 6-3. The life cycle of information value in traditional GISystems. (Costs are assumed to account for inflation over time.)

The initial value of information is fixed (V0) in traditional GISystems. If information items are not updated, the value will decrease gradually over time due to the decrease in data quality and currentness and the increase in data uncertainty. Therefore, GIS database managers have to update information items frequently in order to keep the information value relatively stable. The cost of database management in traditional GISystems is also illustrated in Figure 6-3. The initial cost carries a significant startup expense due to the creation of new databases. The subsequent operation cost will be lower than the initial cost because the GIS database has been established, but the cost will increase again concurrently with data update operations.

Figure 6-4 indicates a different life cycle of information value in distributed GIServices. The value of information items will not decrease but increase gradually over time by sharing information with others.

[Figure 6-4: two plots over time (T0 to TN), each divided into phases I, II, and III. One plot shows information value (labeled Vs); the other shows cost, starting at C0.]

Figure 6-4. The life cycle of information value in distributed GIServices. (Costs are assumed to account for inflation over time.)

There are three phases in the life cycle of information value in distributed GIServices. The first phase (I) is the initial stage of data sharing and exchange. The number of users is limited at this time because not many users are aware that the information item is available on-line. As more people become aware, data will be used in multiple applications (phase II), and the value of information will increase during this time.
In phase III, information sharing will decrease because the data become out of date and are used only for specialized (historical) tasks. The growth of information value will slow down. When the factor of information uncertainty becomes higher than the factor of corrected errors, the growth of information value will drop or perhaps stabilize, unless the data retain historical value.

The cost of database management in distributed GIServices also has three stages (Figure 6-4). The initial cost requires a significant startup expense in the first phase (I), the same as for traditional databases. The subsequent operation cost in phase II will be much lower than that of traditional databases because, as more people use the information, they will report errors or incompleteness, helping data providers correct errors and improve the data quality. When the number of users decreases in phase III, the cost of GIS database management will increase again concurrently with data update operations.

[Figure 6-5: the number of corrected errors (Cn) and the information uncertainty (Un) plotted over time (T0 to TN), divided into phases I, II, and III.]

Figure 6-5. The life cycle of information uncertainty in distributed GIServices.

Figure 6-5 illustrates the life cycle of information uncertainty in distributed GIServices. There are three phases similar to the life cycle of information value. The first phase (I) is the initial data sharing and exchange. Little user feedback and few corrections occur at this stage, and the information uncertainty increases gradually as temporal currentness declines. In phase II, the number of corrected errors (Cn) will increase along with the popularity of the information items and allow the original data providers to improve the data quality. The feedback from users will slow down the growth of information uncertainty. Eventually, the information uncertainty (Un) will rise again over time because of declining temporal currentness and the decrease of corrected errors from users (phase III).

To summarize, distributing GIServices can add value to information.
Distributed GIServices can generate new information value by sharing geodata objects, by combining GIS components, and by exchanging knowledge and metadata between the members of the GIS community. Moreover, distributed database access and maintenance can reduce the cost of database management. In short, distributed GIServices will encourage collaborative error checking processes and improve the data quality through user feedback and updates.

In the computer industry, an example of value-added information processes can be found in the development of the Linux operating system (Welsh et al., 1999). Because of the open environment provided by Linux, the feedback from a large number of users improved the Linux kernel, making the operating system more stable and therefore more useful. By giving up autocratic control, the developers of Linux gained thousands of volunteer software engineers and programmers working together via the Internet to improve the operating system. Thousands of Linux applications are being developed by software companies, which will add value to the Linux operating system and to the software companies. Distributed GIServices borrow the development strategy from the Linux example by releasing the control of data objects and GIS components. The sharing of data sets and programs for distributed GIServices will encourage the reuse of these computing resources, improve data quality and reduce uncertainty, and reduce the cost of data production and GIS programming.

6.2.3 The Exponential Growth of GIS Network Values

The third implication of distributed GIServices is the exponential growth of GIS network values. As mentioned earlier, the increasing value of distributed GIServices is derived from sharing and exchanging information and programs. The basic requirement for sharing is the establishment of network connections for GIServices.
The more connections a network builds, the more sharing and exchanging events may happen, and the more value the network generates. Kelly (1998: 23) states that “mathematics says the sum of value of a network increases as the square of the number of members. In other words, as the number of nodes in a network increases arithmetically, the value of the network increases exponentially.” According to his statements, the value of a distributed GIServices network will increase with the growth of its nodes and links. The mathematical results are shown in Figure 6-6.

[Figure 6-6: GIS nodes joined by original links and additional links. Node# = 2, Link# = 4; Node# = 3, Link# = 12; Node# = 4, Link# = 24; Node# = 5, Link# = 40; Node# = 6, Link# = 60; Node# = 7, Link# = 84; Node# = 8, Link# = 112; Node# = 9, Link# = 144; Node# = 10, Link# = 180.]

Figure 6-6. The exponential growth of distributed GIServices network.

Figure 6-6 illustrates the growth of simplified GIS nodes and the network connections in distributed GIServices. In this figure, each GIS node is simplified as one grouped geodata object and one GIS component (two dots inside each box). In the real world, the arrangement of geodata and components may be more complicated in distributed network environments. If we only consider the simplified situation, the growth of network connections comes along with the increasing number of GIS nodes. The number of GIS nodes increases arithmetically from 2, 3, 4 to 10, and the number of network connections increases exponentially from 4, 12, 24, to 180. Along with the exponential growth of connections, the values of the whole network will also grow exponentially. This example shows the unique feature of network development. It also illustrates that the power of a network comes with the increasing number of members in the network.
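The link counts listed in Figure 6-6 can be reproduced with a short calculation. The sketch below rests on an interpretation of the figure rather than on a formula stated in the text: each node holds two objects (one geodata object and one GIS component), and every object is linked to every object in every other node, which gives 2n(n - 1) links for n nodes. The function name `link_count` is introduced here for illustration.

```python
def link_count(nodes: int, objects_per_node: int = 2) -> int:
    """Number of links between objects that sit in different GIS nodes.

    Assumption: every object in one node links to every object in every
    other node; links inside a node are not counted.
    """
    node_pairs = nodes * (nodes - 1) // 2          # unordered pairs of nodes
    links_per_pair = objects_per_node ** 2         # 2 x 2 = 4 links per pair
    return node_pairs * links_per_pair


# Tabulate the counts for the node range shown in Figure 6-6.
for n in range(2, 11):
    print(f"Node# = {n:2d}; Link# = {link_count(n)}")
```

Under this assumption the count grows with the square of the number of nodes, in line with the first sentence of Kelly's statement.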
To show how network connections increase the total value of a network, the following discussion combines the three scenarios mentioned in Chapter Five and puts them together inside an integrated GIServices network. By integrating the computer resources, geodata objects, and GIS components of all three scenarios, the three GIS users (Mike, Dick, and Eva) can generate more valuable services for themselves. The integration also demonstrates the exponential growth of network values. The following paragraphs briefly describe the original design of the three scenarios and illustrate their advantages and new values when they are combined in an integrated GIService network.

The first scenario introduced in Chapter Five (section 5.1.1) illustrates that Mike, a GIS user, plans a trip from Boulder, Colorado to Utah’s Arches National Park during the spring break. He acquires related map information (highways, park trails, city roads, and hotel locations) from the Internet, prints out his travel maps, and makes an on-line hotel reservation in Moab with the help of a hotel reservation agent.

In the second scenario (section 5.2.1), Dick wants to locate a site for a new Wal-Mart store in Boulder. He obtains related map information from the data servers in Boulder’s Planning and Police departments and performs a GIS overlay analysis for his task. He also downloads a GIS component called [Shape Fitting Analysis] to help him finish the location analysis more efficiently.

The third scenario in Chapter Five (section 5.3.1) indicates that Eva, a homemaker, wants to visit her friend in Superior, Colorado. She uses her Palm-size PC, wireless phone, and the Global Positioning System (GPS) device in her car to navigate to her friend’s house. The Colorado data sets in her Palm-size PC have been updated in real time via her cellular phone from the GPS-data.COM server. She also uses the same service to choose a Chinese restaurant close to her friend’s house.
By integrating all computer resources, geodata objects, GIS components, and users in the three scenarios described in Chapter Five, the new GIService framework will be able to provide many new services and add new value for the three GIS users (Figure 6-7).

[Figure 6-7: the three users (Mike, Dick, and Eva) connected to one integrated framework.]

Integrated GIS components in the new framework:
• [Map Display], [Travel Plans], [Hotel Reservation]
• [Overlay Analysis], [Buffering], and [Shape Fitting Analysis]
• [GPS Navigation]

Available geodata objects in the new framework:
• [U.S. Highway], [National Park Trails], [Hotels], etc.
• [Boulder GIS Database] and [Boulder Crime Rate Index]
• [Points of Interest (CO)] and [Roads (CO)]

Figure 6-7. The integrated GIService network for different users.

The following is a list of examples of possible new services in the integrated GIService framework. GIS users can utilize the integrated computer resources and accomplish different GIS tasks according to their needs.

A. Mike can use the [Boulder GIS Database] and [Overlay Analysis] to make a housing plan and find the potential housing area for his family.
B. Mike can use the [Point of Interest (Colorado)] data object to locate the possible gas stations for his spring break vacation.
C. Dick can use the [Roads (Colorado)] data object and the routing function in [Travel Plan] to make a routing plan for the UPS delivering system.
D. Dick can use the routing function in the [Travel Plan] component and [Boulder GIS Database] to generate the shortest route for the school bus.
E. Eva can use the [Travel Plan] component with external database systems to purchase an airplane ticket and rent a car.
F. Eva can use the [Boulder Crime Rate Index] data objects generated from Dick’s machine to monitor neighborhood safety for her community.

These six examples are representative applications and projects that can benefit from combining the three scenarios.
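The integration described above can be sketched as a small data structure. This is a minimal illustration, not the design presented in this dissertation: the class `GISNode` and the function `new_pairings` are hypothetical names, and the assignment of each component and data object to a particular user is inferred from the three scenarios. The sketch pools the components and geodata objects of all nodes and lists the component and data pairings that no single node could form on its own.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Iterable, Set, Tuple


@dataclass
class GISNode:
    """A simplified GIS node: one user with components and geodata objects."""
    owner: str
    components: Set[str] = field(default_factory=set)
    data_objects: Set[str] = field(default_factory=set)


def new_pairings(nodes: Iterable[GISNode]) -> Set[Tuple[str, str]]:
    """(component, data object) pairs available only after integration."""
    nodes = list(nodes)
    components = set().union(*(n.components for n in nodes))
    data_objects = set().union(*(n.data_objects for n in nodes))
    pooled = set(product(components, data_objects))
    # Pairs each node could already form alone, before integration.
    local = set().union(*(set(product(n.components, n.data_objects)) for n in nodes))
    return pooled - local


# Ownership below is an assumption inferred from the scenarios, not stated in Figure 6-7.
nodes = [
    GISNode("Mike", {"Map Display", "Travel Plan", "Hotel Reservation"},
            {"U.S. Highway", "National Park Trails", "Hotels"}),
    GISNode("Dick", {"Overlay Analysis", "Buffering", "Shape Fitting Analysis"},
            {"Boulder GIS Database", "Boulder Crime Rate Index"}),
    GISNode("Eva", {"GPS Navigation"},
            {"Points of Interest (CO)", "Roads (CO)"}),
]

for component, data in sorted(new_pairings(nodes)):
    print(f"[{component}] + [{data}]")
```

Example D in the list corresponds to the pairing of [Travel Plan] with [Boulder GIS Database]. Not every pairing is a meaningful service, and the sketch does not capture cases such as example A, where a user borrows both the component and the data from another node.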
Under an integrated GIServices framework, both geospatial data objects and GIS programs are shared and can provide more flexible information services for different users. In fact, dozens of new services can be generated from the integrated network of the three scenarios.

In short, the exponential growth of network values is a very attractive advantage for the deployment of a distributed GIServices network. The exponential growth will attract more people and projects to participate in the network with a distributed, dynamic architecture. As more people join the GIS network, more information and programs will be available for sharing and exchanging, and thus the GIS network will become more valuable. As the GIS network becomes more valuable, it will attract more people to join. This positive reinforcement can therefore ensure sustainable growth of distributed GIServices and the popular use of geographic information in the future.

6.3 Future Impact

Along with the popular use of the Internet and telecommunications technologies, on-line geographic information services are essential in providing us with an informative and convenient living environment. This section will discuss the major impact on three groups in the GIS community: the GIS industry, geographers, and the public.

6.3.1 Future Impact on the GIS Industry

The development of distributed GIServices has three major impacts on the GIS industry. First, the future design of GIS software will adopt advanced distributed component frameworks, such as DCOM, CORBA, and Java. By adopting these dynamic, modularized frameworks, distributed GIServices will exploit the reusability and compatibility of GIS software and data objects. Currently, the software design in traditional GISystems rarely emphasizes reusability due to a framework that is closed, isolated, and vendor-proprietary. Heterogeneous software and database engines have caused serious problems in data sharing and processing incompatibility.
With the help of modern software engineering tools, software reusability will generate higher productivity, increase the efficiency of software programming, and provide a mechanism for prototyping when developing a new system or adopting a new technology (Yourdon, 1993). Reusable GIS software can reduce the programming workload significantly. GIS users will have more choice in developing their own GIS applications without the constraints of GIS software vendors.

The second impact on the GIS industry is that the design of distributed GIServices can help the GIS industry migrate gradually from legacy systems and adopt new technologies. It is expensive for traditional GISystems to adopt new technologies due to their ad hoc design and lack of modularized frameworks. The LEGO-like GIS components, agent-based communication, and operational metadata scheme proposed in this dissertation can help the GIS industry migrate from the legacy GISystems to a new framework of distributed GIServices.

Future IS [Information System] technology should support continuous, incremental evolution. IS migration is not a process with a beginning, middle, and end. Continuous change is inevitable. Current requirements rapidly become outdated, and future requirement cannot be anticipated. The primary challenge facing IS and related support technology is the ability to accommodate change (e.g. in requirements, operation, content, function, interfaces, and engineering) (Brodie and Stonebraker, 1995, p. xvii).

The design of a dynamic GIServices framework in this dissertation will provide a possible solution for the migration challenge and facilitate a long-term, sustainable development of GIServices. By modularizing the dynamic framework of distributed GIServices into three containers (agent containers, component containers, and data containers) within GIS nodes, the GIS industry can gradually migrate these components from the legacy GISystems into a new framework.
For example, the migration process can start with the user interface components, continue with the upgrade of GIS database engines, and finish with the replacement of the core GIS programs. Brodie and Stonebraker referred to this type of migration as the incremental approach, which can be carried out in small incremental steps until the desired long-term objective is reached. Each step requires a relatively small resource allocation. Another type of migration mentioned in their book is the Cold Turkey approach, which attempts to rewrite the legacy IS from scratch to produce the target IS, using modern software techniques and the hardware of the target environment (Brodie and Stonebraker, 1995). Their conclusion is that the incremental approach is much better than the Cold Turkey approach, as the latter usually fails in real-world cases. In short, the modularized framework of distributed GIServices will help the GIS industry adopt the incremental approach for the migration from the legacy GISystems and the adoption of new technologies.

The third impact on the GIS industry is that distributed GIServices will change the development strategies of GIS software vendors and turn the current monopolized GIS market into an open, freely competitive environment. Traditional markets for GISystems are usually targeted at GIS professionals, GIS consulting companies, government agencies, and higher education institutions. Along with the development of distributed GIServices, the major markets of GIS software will shift towards the public and end users. Small, modularized, Web-based GIS components will replace huge, workstation-type, all-in-one GIS software packages. The design of GIS components will emphasize extensible GIS operations and customizable tools for different applications. The pricing for GIS software will also change from year-based site licensing to usage-based, individual charges.
For example, a user may be charged for a 10-time use or a three-day use of a buffering module. The different pricing schemes of GIS software will encourage more people to use GIServices in their daily lives and short-term activities.

Moreover, distributed GIServices will shift control over GIS software from the major GIS vendors to individual GIS software programmers and small software companies. Since the future development framework of GIS software will provide application programming interfaces (APIs) in a distributed component framework, such as the Microsoft COM-based applications and the Java platform, individual programmers can access or invoke GIS programs without knowing the original source code. Thus, GIS software programmers can easily extend the functionality of GIS packages by adding new GIS components and develop their own specialized products.

In the future, the traditional GIS software vendors who currently control the major GIS markets will shift their focus of software development from all-in-one GIS products to the main GIServices engines and frameworks. The GIServices framework will include the core programs for major GIS operations, the Internet map server for disseminating GIServices, and the spatial database engines for efficient data storage and management. These products developed by the major vendors will allow third-party companies to develop extensions and additional functions for these core products.

Distributed GIServices will change the market focus in the GIS industry and assign new responsibilities to its players, including the consulting companies, software vendors, and application users. The dynamic architecture and open APIs will facilitate more effective and collaborative software development among different GIS companies and programmers. GIS users will have more choice of GIS software in the future thanks to the open and freely competitive GIS market.
The distributed, modularized architecture can provide a long-term, sustainable development strategy for the GIS industry and help it adopt new technologies over time. Since “today’s new system will be tomorrow’s maintenance problem” (Yourdon, 1993, p. 276), methods to upgrade legacy systems and take advantage of new technologies will be the major challenge for the GIS industry in the twenty-first century. The design of distributed GIServices in this research may provide a possible solution to the challenge and help the GIS industry adopt new technologies in a smooth and efficient fashion.

6.3.2 Future Impact on Geographers

Contemporary GIS promises geographers comprehensive spatial analysis functions and modeling techniques as a powerful tool for synthesis (Abler, 1987). However, spatial analysis and modeling functions are in fact weak and premature in traditional GISystems. Even if some GISystems do provide limited spatial analysis functions, most users rarely use the analysis or modeling functions in their GIS applications, which causes the problem of system under-use (Davies and Medyckyj-Scott, 1996). A possible explanation is that "developments in the GIS industry largely reflect the demands of the GIS marketplace. This has been dominated for the past decade by applications in resource management, infrastructure and facilities management, and land information. In these areas, GIS tends to be used more for simple recordkeeping and query than for analysis" (Goodchild et al., 1992, p. 410). The market-oriented development of GISystems is inappropriate for individual users, such as geographers and spatial scientists. What geographers really need are spatial-analysis-oriented and question-driven GIServices. By adopting the dynamic framework of distributed GIServices, geographers may be able to develop specialized programs and models that focus on spatial analysis theories and geographic problems.
The following paragraphs will describe three major impacts on geographers and spatial scientists in adopting the dynamic framework of distributed GIServices in the future.

First of all, geographers and spatial scientists can build more realistic models to solve their research problems by combining the LEGO-like GIS components and models in distributed GIServices. Currently, geographic research and scientific problems focus on large-scope, multidisciplinary issues, such as global change, sustainable development, and urban growth, which involve many experts and specialists from different disciplines. However, traditional GISystems cannot easily integrate their specialized knowledge and models into an integrated framework. For example, urban growth research may require spatial statistical analysis functions, the graphic display of population profiles, network analysis for transportation, digital elevation models, and hydrological models for groundwater conditions. In the past, these specialized models and tools were developed separately in different software and platforms and were not compatible. By using distributed GIServices and the LEGO-like component framework, different types of GIS models can be easily integrated and provide a more realistic modeling environment for scientists. Geographers can combine different GIS components in order to provide better explanations for geographic problems. Moreover, geographers can share their expertise and models with other scientific communities by distributing their models and programs over the Internet. In short, the LEGO-like frameworks of distributed GIServices will facilitate GIS modeling in a question-driven and exploratory manner (Fischer et al., 1996). Geographers and spatial scientists can then easily share or exchange their GIS models and analysis methods under the dynamic architecture of distributed GIServices and facilitate the continuous progress of Geographic Information Science.
Second, intelligent GIServices will help scientists and geographers focus on the domain of problems rather than the mechanism of system implementations. Traditional GISystems, with unfriendly user interfaces and complicated programming tools, prevent many geographers from adopting GIS modeling tools to solve their research problems. In the past, the primary obstacle to using GIS tools was the mechanism of GIS model implementation. To construct a GIS model, geographers had to understand the details of implementation mechanisms, including macro language programming, database management, data conversion, and hardware device configuration. These tasks require comprehensive programming experience and computer literacy, which may go beyond the regular training geographers receive. By adopting distributed GIServices, the intelligent agents can help geographers construct their GIS models more easily and lift most implementation tasks from geographers’ shoulders. The collaborations among intelligent agents will take care of system-level problems and the details of model implementation, such as database management, data conversion, hardware configuration, and software compatibility. Geographers will have more energy to concentrate on spatial analysis issues and the examination of geographic problems.

Finally, the flexible data access approach and the operational metadata scheme in distributed GIServices will help geographers utilize on-line information more efficiently and facilitate the reusability of geospatial data for geographic research. In a traditional GISystems environment, the highest cost of GIS implementation is in data input and data conversion (Korte, 1992). Most geographic research requires multiple geographic data sets, which need to be generated by costly procedures, including map digitizing, image scanning, or the classification of remote-sensing images. These procedures require both hardware investment and labor.
For an individual geographer, the high cost of digital data hinders the construction of GIS models and spatial analysis procedures. With the help of distributed GIServices, geographers will be able to exchange and share spatial data sets and be free from the constraints of heterogeneous data models and isolated system environments. Moreover, the operational metadata described inside geodata objects will improve the efficiency of data modeling and help geographers get the appropriate data sets for their research projects. Such a dynamic, distributed GIServices architecture will provide a cost-effective way for geographers to download, exchange, or share spatial data sets in distributed network environments. Geographers will get better data for better use with the help of distributed databases and operational metadata.

To summarize, distributed GIServices will help geographers in the establishment of GIS models, the design of spatial analysis tools, and the better use of geospatial data sets. Moreover, the adoption of intelligent agents and operational metadata will encourage geographers to formalize geographic knowledge for the training of agents and the definition of object behaviors for geodata objects and components. The construction of a geographic knowledge base will facilitate the future progress of Geographic Information Science. Geographers will have more energy to dedicate to geographic research and scientific contributions.

6.3.3 Future Impact on the Public

One of the major differences between traditional GISystems and distributed GIServices is the extension of target users. Traditional GISystems are designed for GIS professionals, geographers, or spatial analysts, who are associated with consulting companies, universities, or government organizations. Distributed GIServices are designed for the public and provide useful information in our daily lives.
The public will be the group that benefits most if the whole society adopts the dynamic framework of distributed GIServices. However, the development of distributed GIServices also brings some negative impacts on the public. The following sections will identify both positive and negative aspects of GIServices from a public service perspective.

6.3.3.1 Positive Aspects

There are two major benefits of distributed GIServices for public services. The first benefit is to provide transparent, ubiquitous GIServices in daily life. Distributed GIServices will be intricately woven into the web of everyday life, along with ubiquitous computing (Weiser, 1993; Armstrong, 1997). With the popular use of cellular phones, Personal Digital Assistants (PDAs), Auto-PCs, and GPS receivers, daily use of distributed GIServices will become essential yet transparent. For example, many travel-based Web sites, such as Microsoft Expedia and MapQuest, can provide integrated services for travel plans, including the purchase of airplane tickets, hotel reservations, and car rental. Another example is automobile navigation systems, which have become standard equipment for taxis, rental cars, police cars, and luxury sedans. In the future, more and more GIServices will be available in a more transparent way. For example, each bus stop may set up a digital display board, which can indicate when the next bus arrives, whether the bus will be delayed, and the real-time location of the bus. The information can help passengers know how long they will wait before their bus comes and improve the efficiency of public transportation. Another example is automatic parking arrangement in shopping malls. When people drive to a shopping mall, each car with its Auto-PC may establish a network connection with the parking lot server as the car enters the parking lot of the mall.
The parking lot server will tell the car whether parking space is available and indicate the location of the nearest parking lot to a favorite store. Such GIServices will be very popular and feasible in the future with the help of network communication and the intelligent framework of distributed GIServices. The public may not even recognize that these services are GIS applications. With the help of distributed GIServices, such public services will become more friendly and useful for the public.

The second benefit of distributed GIServices is to deliver real-time, integrated services for emergency events. For example, the dispatch of emergency vehicles (ambulances and police cars) with real-time road condition reports can avoid unnecessary delays and ensure the rescue team arrives promptly at the scene. Another example of real-time services is natural hazard reporting and evacuation/rescue planning. Distributed GIServices can facilitate efficient hazard management and rescue/relief plans for floods, tornadoes, earthquakes, or typhoons. An intelligent network of hazard management services with real-time data gathering devices, such as video cameras, GPS receivers, and wired weather stations, will provide essential information for local and federal governments in building effective warning systems and in responding quickly with rescue/relief plans. With the help of telecommunications networks, real-time data reports and monitoring of natural hazard damage can be distributed to related departments immediately. Different departments, such as the police department, transportation department, and fire department, can work together based on the real-time geographic information and provide the necessary services to the damaged areas immediately. With the help of real-time distributed GIServices, federal and local governments can provide more secure protection of life for their citizens.
In general, distributed GIServices can provide the public with a safe civil life and a convenient way to live in the twenty-first century.

6.3.3.2 Negative Impact

Although distributed GIServices can provide useful and essential information for the public, they also have some negative aspects. The major problem in adopting distributed GIServices in public services is the creation of the Information Ghetto (Graham and Marvin, 1996) and the Digital Divide (NTIA, 1999). When society uses computer technology and network facilities as the major tools to provide public information services for its citizens, some people will be deprived of their civil rights and miss opportunities for success because they lack the tools for accessing public information services.

While affluent and elite groups are beginning to orient themselves to the Internet and home informatics and telematics systems, other groups are excluded by price, lack of skills or threaten to be exploited at home by such new technologies. Advanced telecommunications and transport networks open up the world to be experienced as a single global system for some. But others remain physically trapped in ‘information ghettos’ where even the basic telephone connection is far from a universal luxury (Graham and Marvin, 1996, p. 37).

A similar problem was also identified in research on the Digital Divide carried out by the National Telecommunications and Information Administration (NTIA), U.S. Department of Commerce. NTIA initiated its first survey in 1994, then continued it in 1997 and 1998. These surveys discovered that “the situation of Digital Divide – the divide between those with access to new technologies and those without – is now one of America’s leading economic and civil right issues” (NTIA, 1999, p. xiii).
The survey identified user profiles of telephone services and Internet access and cross-tabulated the information according to several variables (such as income, race, age, and education) in three geographic categories: rural, urban, and central city. The first report identified the problem of disproportionate access to the Internet in rural areas and central cities. “Black households in central cities and particular rural areas have the lowest percentages of PCs, with central city Hispanics also ranked low” (NTIA, 1995, p. 3). Two years later, the second survey indicated a widening gap in the Digital Divide along with the rapid growth of the Internet and PCs. The results also indicated that female-headed households lag significantly behind the national average of Internet access and PC usage (NTIA, 1998). The third report, released in 1999, found that “a Digital Divide still exists, and, in many cases, is actually widening over time. Minorities, low-income persons, the less educated, and children of single-parent households, particularly when they reside in rural areas or central cities, are among the groups that lack access to information resources” (NTIA, 1999, p. xiii). For example, “urban households with incomes of $75,000 and higher are more than twenty times more likely to have access to the Internet than rural households at the lowest income levels, and more than nine times as likely to have a computer at home” (NTIA, 1999, p. xv).

In general, the Digital Divide reports indicate that the provision of public services requires in-depth consideration beyond the deployment of technology. The social aspect of public access needs to be considered when local or federal governments deploy the framework of distributed services. Internet technology and distributed GIServices should help the public, especially those who need additional support from governments and social welfare.
NTIA’s research also reveals that “many of the groups that are most disadvantaged in terms of absolute computer and modem penetration are the most enthusiastic users of on-line services that facilitate economic uplift and empowerment” (NTIA, 1995, p. 4). Besides adopting network technology and real-time information services, governments may need to provide facilities for the public to access this information and the Internet in local public libraries, schools, and community centers. Also, some essential geographic information may need to be delivered through alternative media, such as paper maps, telephone services, or public broadcast systems. In general, distributed GIServices should allow the public to access essential information from both public and private places and should not be limited to certain societal groups or classes.

6.4 Future Work

This dissertation focuses on the high-level architecture for the deployment of GIServices. Follow-up work needs to be done, especially in the area of actual implementation and low-level system specifications. The following discussions will identify possible directions of future research for distributed GIServices, which include possible implementation tools, the organization and hierarchy of GIS networks, and the creation of intelligent agents.

6.4.1 The Possible Implementation Tools

In Chapter Two, the literature review introduces some possible technologies and programming tools for the implementation framework of GIServices, including DCOM, CORBA, and the Java platform. Besides these three major technologies, several new languages and techniques may be good candidates for the future development tools of GIServices. These new technologies may be appropriate for constructing the operational metadata scheme, for embedding the rational engines for agents, or for the agent communication protocols.
First of all, XML and XHTML are very likely to be used in the construction of an operational, object-oriented metadata scheme. XML stands for Extensible Markup Language, which was developed by an XML working group formed under the World Wide Web Consortium (W3C) in 1996. “XML is a highly functional subset of SGML. The purpose of XML is to specify an SGML subset that works very well for delivering SGML information over the Web” (Murch and Johnson, 1999, p. 71). XML provides a mechanism to impose constraints on the storage layout and logical structure of documents. W3C released the first version of the XML specification (XML 1.0) in 1998 (Bray et al., 1998). Since then, many software companies have widely adopted XML as a major tool for the development of content-based services. In January 2000, W3C released XHTML 1.0, which it defines as a reformulation of HTML 4.0 in XML. In general, XHTML documents are XML based, and ultimately are designed to work in conjunction with XML-based user agents. “The XHTML family is the next step in the evolution of the Internet. By migrating to XHTML today, content developers can enter the XML world with all of its attendant benefits, while still remaining confident in their content’s backward and future compatibility” (W3C, 2000, p. 3). In general, the concise language format and metadata-oriented structure of XML and XHTML will improve tools for the future development of operational metadata schemes in distributed GIServices.

Next, the Knowledge Query and Manipulation Language (KQML) and the Agent Communication Language (ACL) may be used for the communication protocol among agents in distributed GIServices. These two languages were designed to provide intelligent communication among agents and software.
KQML is a new language and protocol for exchanging information and knowledge developed by the ARPA Knowledge Sharing Effort, which is aimed at developing techniques and methodology for building large-scale knowledge bases which are sharable and reusable. KQML has the goal of developing a high-level communication standard based on speech acts, to allow cognitive agents to cooperate. A KQML message is divided into three layers: the content layer, the message layer, and the communication layer. The content layer refers to the actual content of the messages which can be in any representation language and is up to the application to process it. The communication layer encodes some low level communication details, like sender and receiver of the message, a unique identifier associated to this message etc. The message layer determines the kind of interaction the receiver is willing to initiate with the sender. It is expressed by a keyword selected from a predefined set of keywords (called performatives) and expresses the sender’s intentions (Skarmeas, 1999, p. 12).

Another communication standard is emerging, called the Agent Communication Language (ACL), developed by the Foundation for Intelligent Physical Agents (FIPA). The FIPA Agent Communication Language is also based on speech act theory: messages are actions, but using a more rigorous semantic formula (Ferber, 1999).

The specification consists of a set of message types and the description of their pragmatics, that is the effect on the mental attitudes of the sender and receiver agents. Every communicative act is described in both a narrative form and a formal semantic based on modal logic. … The specification also provides the normative description of a set of high-level interaction protocols, including requesting an action, contract net and several kinds of auction etc (FIPA, 1998, p. vii).

Both KQML and ACL could be used in the agent communication protocols. Each has its own advantages and weaknesses.
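The three KQML layers described above can be made concrete with a small sketch. The Python fragment below is illustrative only and is not part of the architecture proposed in this dissertation: the agent names, the `gis-metadata` ontology label, the XML element names, and the helper `build_metadata_query` are all hypothetical, and the serialization follows only the general parenthesized keyword style of KQML messages rather than a complete implementation of the language. It assumes, for illustration, that the content layer carries a short XML fragment, since KQML leaves the content language to the application.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class KQMLMessage:
    """A simplified KQML-style message, grouped by layer (illustrative only)."""

    # Message layer: the performative expresses the sender's intention.
    performative: str
    # Communication layer: low-level delivery details.
    sender: str
    receiver: str
    reply_with: str  # unique identifier associated with this message
    # Descriptors that tell the receiver how to interpret the content.
    language: str
    ontology: str
    # Content layer: opaque to KQML; the application processes it.
    content: str

    def to_kqml(self) -> str:
        """Serialize to a parenthesized, keyword-style string."""
        escaped = self.content.replace("\\", "\\\\").replace('"', '\\"')
        return (
            f"({self.performative}"
            f" :sender {self.sender}"
            f" :receiver {self.receiver}"
            f" :reply-with {self.reply_with}"
            f" :language {self.language}"
            f" :ontology {self.ontology}"
            f' :content "{escaped}")'
        )


def build_metadata_query(layer_name: str) -> str:
    """Build a hypothetical XML payload asking for a geodata object's metadata."""
    root = ET.Element("metadataRequest")
    ET.SubElement(root, "geodataObject").text = layer_name
    return ET.tostring(root, encoding="unicode")


# A hypothetical component agent asks a geodata agent about a road layer.
message = KQMLMessage(
    performative="ask-one",
    sender="component-agent-1",
    receiver="geodata-agent-7",
    reply_with="query-001",
    language="XML",
    ontology="gis-metadata",
    content=build_metadata_query("city_roads"),
)
print(message.to_kqml())
```

In this sketch the receiving agent would only need to inspect the performative and the delivery fields to decide how to respond, while the XML payload is handed to whatever application logic understands the metadata scheme. A FIPA ACL message could be sketched in much the same way, since it is also built on speech acts.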
Currently, KQML seems more popular than ACL in industry applications. The GIS community may need to consider the uniqueness of GIS knowledge and choose one of them as the communication protocol and the formalization tool for geographic knowledge.

The new tools and languages mentioned above are very promising and useful in establishing an intelligent, comprehensive architecture of distributed GIServices. The GIS community should watch their development closely and adopt some of them if possible. Once these technologies are widely accepted in the computer industry, the GIS community should begin to develop related products and applications immediately and participate in the standardization and specification processes.

6.4.2 The Organization and Hierarchy of GIS Networks

Another important research direction for distributed GIServices is the organization and hierarchy of GIS networks. This dissertation proposes only the basic unit (the GIS node) and its architecture for distributed GIServices. In the future, a truly distributed GIService will be provided by the collaboration among hundreds of GIS nodes. When these GIS nodes are connected together, they will need to be organized into different roles and types based on different GIS tasks. Three levels of organization can be constructed in multi-agent systems (Ferber, 1999):

1. The micro-social level, emphasizing interactions between agents, and in the various linkages between agents. It is at this level that most studies into distributed artificial intelligence have been undertaken.
2. The level of groups, where intermediary structures intervene in the composition of a more complete organization. At this level, the emphasis is placed on differentiation of the roles and activities of the agents, the emergence of organizational structures between agents, and the general problem of the aggregation of agents during the constitution of organizations.
3. The level of global societies (or populations), where interest is concentrated on the dynamics of a large number of agents, together with the general structure of the system and its evolution. Research in relation to artificial life is quite frequently located on this level.
(Ferber, 1999, p. 13)

This dissertation research focuses on the second (group) level of organization, by defining and classifying the tasks of agents. Based on the uniqueness of GIS tasks, three types of agents are defined: geodata agents, component agents, and machine agents. Two classes of agents are defined based on their mobility: mobile agents and stationary agents. Future research may need to extend the classification of agents and their collaborations to the level of global societies, along with the specification of agents’ behaviors and knowledge implementation at the micro level. Other considerations for future research may include coordination between agents, such as the distribution of agent tasks, the relocation of mobile agents, and the voting or negotiating behaviors of agent groups. In general, future research on distributed GIServices may need to emphasize the organization of GIS nodes and make the interactions among GIS nodes more efficient. For example, some GIS nodes may become the data clearinghouse in the colorado.edu domain, the GIS-component center in the gis.com domain, or the security department for sensitive data sharing. The GIS community should re-think the whole strategy of distributed GIServices and develop an appropriate hierarchy for different types of GIS nodes and networks.

6.4.3 The Creation of Intelligent Agents

One of the contributions of this dissertation is the introduction of an agent-based communication mechanism for distributed GIServices. However, the design of agents in this research does not implement intelligence for agents.
The main reason is that the agents described in this research are not necessarily intelligent; they only follow simple rules defined by metadata. These agents can be treated as dummy agents with limited knowledge and behaviors. As a first step beyond such agents, conditional clauses can be used to specify a more diverse set of allowable interactions between agents (Holland, 1998). More advanced intelligence may be implemented using fuzzy logic or neural network engines in each agent. One thing to keep in mind is that these knowledge engines should be small enough to be encapsulated inside the agent itself. Actually, intelligence indicates that the agents pursue their goals and execute their tasks such that they optimize some given performance measures. To say that agents are intelligent does not mean that they are omniscient or omnipotent, nor does it mean that they never fail. Rather, it means that they operate flexibly and rationally in a variety of environmental circumstances, given the information they have and their perceptual and effectual capabilities (Weiss, 1999, p. 2).

Based on this definition, future research on intelligent agents will focus on the construction of rational behaviors and the logical representation of knowledge rules. Agent research will require collaboration among Computer Science, Cognitive Science, and Geography in order to develop appropriate knowledge engines for GIS tasks and applications. Another direction of intelligent agent research is collaboration behaviors and decision-making procedures. Research into multi-agent systems demands integrational rather than analytical science, and prompts us to ask a certain number of questions. What is an agent that interacts with other agents? How can they cooperate? What methods of communication are required for them to distribute tasks and coordinate their actions? What architecture can they be given so that they can achieve their goals?
These questions are of special importance, since the aim is to create systems possessing particularly interesting characteristics: flexibility, the capacity to adapt to change, the capacity to integrate heterogeneous programs, the capacity to obtain rapid results and so on (Ferber, 1999, p. xv).

6.5 The Alternative Futures

The previous discussions in this chapter highlight the promising future of distributed GIServices and the possible on-line GIS applications and research directions for the GIS community. However, nothing guarantees that comprehensive, distributed GIServices will become available automatically. If the GIS industry does not apply distributed component technologies in its products, or if the leading GIS vendors and their Internet map server packages do not adopt the OpenGIS Specifications and Standards, comprehensive GIServices may not become available. The following paragraphs take this different perspective and discuss the alternative prospects for the development of GIServices.

There are three possible paths for the GIS community. The first path is that the GIS community follows the traditional paradigm of GISystems, which focuses on standalone workstations with limited network capability, instead of distributed GIServices. The second path is that the GIS community develops distributed GIServices, but the software frameworks are developed by private GIS vendors without providing an open environment for the integration of different GIServices. The third path is that the GIS community develops distributed GIServices in an open and distributed software framework, as proposed in this research. The following discussion will illustrate the possible future for the first two paths by focusing on the impact on three GIS user groups: the public sector, the private sector, and the scientific community.
6.5.1 The First Path: Centralized GISystems

Many people in the GIS community argue that centralized GISystems are better than distributed GIServices because standalone GISystems are stable, robust, powerful, and worry-free, whereas network-based, distributed GIServices are slow to respond, complicated, insecure, unreliable, and troublesome. Some GIS professionals and research scientists may prefer to continue the traditional development of GISystems rather than explore the possible applications of distributed GIServices. What will happen if the GIS community chooses to retain the paradigm of GISystems? The following discussion will analyze the possible impact on three user groups in the GIS community and consider the advantages and disadvantages of this alternative future.

For the public sector, one advantage of retaining traditional GISystems is that federal agencies and local governments can continue to use existing GISystems and computer facilities without worrying about re-training their IT experts, purchasing new hardware and software, or changing system administration tasks, such as setting up new web servers and re-organizing local network connections. Also, federal agencies and local governments can keep their data production and processing procedures in a traditional GISystems approach, which makes it easier to set up their annual budget plans. If federal agencies or governments decide to provide distributed GIServices for their users, the initial cost of hardware and software will be very high, including purchasing enterprise web servers, testing and installing map server packages, training programs for adopting new technology, and the maintenance of web servers and GIServices. On the other hand, there are three major disadvantages for the public sector.
First of all, the dissemination of geographic information through traditional media such as paper maps and CD-ROMs will be expensive, and changes to geodata will be difficult to propagate. Second, centralized GIS database management may become a problem as the size of GIS databases increases. For example, with modern remote sensing technology and GPS-based data collection, the size of a working GIS database may increase with new information being added every day. Without a distributed GIService architecture, the maintenance of traditional GIS databases will become more difficult. Third, without catalogue services or metadata provided along with distributed GIServices, many GIS projects may duplicate GIS data sets that already exist in other agencies’ databases. For example, a transportation department in a local county government may want to create city road data sets by digitizing a paper map generated by the USGS. However, the USGS may already have the city roads in a digital format. Under the current GISystems framework, it is really difficult to facilitate the sharing and exchange of geospatial information among different agencies and governments.

For the private sector, one advantage of keeping the traditional GISystems paradigm is that the pricing policy for current GIS software is easier to handle. In general, a software company can charge users for a single GIS package or an annual licensing fee, even though the users may use only a fraction of the total software functions or work on a GIS project for just one month during the whole year. Also, with standalone GIS packages it is easier to keep track of licenses and to charge for each user and machine seat. The pricing policy for distributed GIServices is more difficult to set, and the usage of GIServices is harder to track. Also, data protection, system security, and copyright issues are simpler in GISystems than in distributed GIServices.
There are three disadvantages for the private sector. The first disadvantage is that the cost of software development and prototyping is higher and the cycle of new products is longer. Distributed GIServices with distributed component technologies and modularized software frameworks, such as COM, Java, or CORBA, can improve the processes of software development for GIS vendors and software companies. The second disadvantage is that the number of users of traditional GISystems is small compared with other types of IT applications. By adopting distributed GIServices, the number of potential GIS users will increase dramatically. These GIServices include wireless car navigation systems, mobile mapping devices, virtual shopping guides, etc. If the private sector does not provide new types of distributed GIServices, the growth of GIS users will be very limited in the future. Finally, it is difficult to integrate GISystems with other types of computer applications, such as visualization software and statistical packages. The problems of software integration may prevent the further adoption of GIS in many software applications.

For the scientific community, the first advantage of GISystems is that traditional GISystems are easier to handle and learn than distributed GIServices. Scientists can quickly apply GISystems to their research projects. For example, oceanographers can use a basic GIS package (like ArcView or MapInfo) to display their research findings. However, it is more difficult to ask a scientist to set up a Web site or an Internet Map Server.

On the other hand, without adopting distributed GIServices in the future, the scientific community may need to pay a very high price for GIS software packages and may only use 10% of their GIS functions.
Also, the standalone nature of traditional GISystems may prevent or discourage scientists from sharing their data results with other disciplines because of the lack of collaboration between heterogeneous GIS databases.

6.5.2 The Second Path: Private, Vendor-specialized GIServices

Another possible path for the GIS community is to develop distributed GIServices in a private, vendor-specialized framework instead of an open, standardized architecture. People, especially from the private sector, may argue that the vendor-based framework is better because GIS vendors can adopt the most advanced information technology and respond to their customers faster, which means better IT support and services. Also, vendor-specialized products can provide customized services in different domains, such as utilities management, traffic control, or emergency dispatch. The following discussion will illustrate the advantages and disadvantages for the same three groups: the public sector, the private sector, and the scientific community.

For the public sector, the major advantage is that the vendor-based software model may be able to provide better IT support and customer services. Since different software companies have their own software coding and specialized data structures, private GIService frameworks may provide a better data security model and content protection for sensitive geospatial data and remote sensing images.

On the other hand, there are three disadvantages. First, if different federal agencies or local governments adopt different vendor-based frameworks, it will be very difficult to exchange or integrate GIServices from two different software frameworks. Second, with the development of the National Spatial Data Infrastructure, it is essential to have both vertical integration (different locations with the same spatial themes) and horizontal integration (different data layers in a single area) of geodata clearinghouses.
With a private, vendor-based GIService framework, it is very unlikely that all government agencies and data clearinghouses can be asked to adopt the same GIS vendor’s solution. Third, in the long run, it is extremely difficult to migrate GIServices from one vendor-based platform to another. Without an open and standardized GIServices framework, the migration from one vendor package to another will require replacing many software components, including GIS databases, map server engines, client-side viewers, middleware, analysis procedures, etc.

For the private sector, there are many advantages to adopting vendor-based GIServices. First of all, the leading GIS vendors who control the majority of the GIS software market will get the most benefit. A private GIService framework allows these vendors to integrate and monopolize the whole spectrum of GIServices, from data production and data analysis to data distribution, within a single vendor-based architecture. The second advantage for the private sector is that vendors can keep their customers and users indefinitely, since the migration to another system architecture is extremely difficult. Private, vendor-based GIServices can secure the profit growth of major GIS software vendors and their market value in the long run.

The disadvantage of a vendor-based GIService framework for the private sector will be experienced only by small GIS companies and individual GIS consultants. Because it is difficult to develop a customized GIS component or extension under a vendor-specialized development framework, small software vendors and GIS programmers will not be able to develop specialized GIS functions or components for their customers. The whole market of distributed GIServices will be monopolized by a few leading GIS vendors.

For the scientific community, the advantages of a private, vendor-based framework are similar to those for the public sector. The vendor-based GIServices may provide better IT support and services.
Also, it is easier for scientific researchers to rely on commercial products to set up the whole GIService framework from a single package than to build complicated, modularized GIService components step by step.

On the other hand, under a vendor-based GIServices framework, scientists with programming capabilities will not be able to customize GIServices functions by themselves unless the private vendors provide their source code. Second, the functionality of commercial GIService packages will not focus on scientific research tools but on the basic map display and query functions that can attract bigger user groups, such as the AM/FM industry, local governments, and federal agencies. Third, different GIS projects developed in the scientific community may not be able to share data and results with each other if they adopt different frameworks. For example, an Arctic research project may involve multiple nations and different governments. Without an open, standardized GIService framework, it is very difficult to gather and analyze GIS data from different sources.

To sum up, the first alternative path, GISystems, is very unlikely to be the future trend because the majority of the GIS community has already realized the importance of distributed GIServices. Currently, many federal agencies, GIS companies, and scientists are already providing distributed GIServices via the World Wide Web. However, the second path, vendor-based GIServices, is likely to become the trend, as many private software companies are pushing the GIS community in this direction. The public sector and the scientific community may not realize the serious problems they will face in the future if they adopt the private, vendor-based GIServices solution. The only way to prevent a privatized GIService framework is to publicize these negative impacts on the public and the scientific community and to push the private sector to adopt open and standardized GIService frameworks in the future.
Hopefully, by doing so, the GIS community will choose the third path, where distributed GIServices will be developed in the open and distributed software framework suggested in this research.

6.6 Conclusion

This chapter illustrates the unique features and the implications of distributing GIServices on the Internet, which are service-oriented, value-added, and growing exponentially. These discussions show that the adoption of distributed GIServices is not only a technology migration or a new framework for GIS, but also has a significant impact on society, economics, and daily life. Distributing GIServices on the Internet will facilitate the development of reusable, compatible, and upgradeable GIS software and databases. By adopting modularized, real-time GIServices, geographers and spatial scientists can build more realistic models to solve research problems and focus on the domain of geographic problems rather than the mechanisms of system implementation. Geographers can utilize geographic information and share research results and models more efficiently.

In general, the deployment of GIServices will facilitate scientific data management in the geographic domain. The scientific community, through government science agencies, professional societies, and the actions of individual scientists, should improve technical organization and management of scientific data (National Research Council, 1998) in the following ways:

a. Work with the information and computer science communities to increase their involvement in scientific information management;
b. Support computer science research in database technology, particularly to strengthen standards for self-describing data representations, efficient storage of large data sets, and integration of standards for configuration management;
c. Improve science education and reward systems in the area of scientific data management; and
d.
Encourage the funding of data compilation and evaluation projects, and of data rescue efforts for important data sets in transient or obsolete forms, especially by scientists in developing countries. (National Research Council, 1998, p. 13)

Hopefully, the network-based GIServices framework proposed in this research will provide an effective approach to scientific data management and facilitate the creation of self-describing geospatial data sets. One of the main goals of distributing GIServices on the Internet is to encourage people and organizations to use geographic information and to make better decisions. The comprehensive framework of distributed GIServices designed in this dissertation is intended to support high-quality GIServices from both the public and private sectors. The power of GIServices, as decision-making tools and query engines, will be extended from GIS professionals to the public.

In addition to technological implementation, challenges remain for the GIS community to address. One challenge is to make sure that everyone is connected and has access to geographic information services. “Traditionally, our notion of being connected to the nation’s communications networks has meant having a telephone. … To be connected today increasingly means to have access to telephones, computers, and the Internet. While these items may not be necessary for survival, arguably in today’s emerging digital economy they are necessary for success” (NTIA, 1999, p. 77). How can on-line resources be made available to the public? How can more user-friendly interfaces be provided for information search and query? How can the GIS industry be encouraged to collaborate with local communities? These questions will need to be answered in the near future by the members of the GIS community.

The above discussions have illustrated some research directions and problems for distributed GIServices in the future.
All the suggestions, frameworks, and tools designed and specified in this dissertation aim to make distributed GIServices more flexible, intelligent, and feasible. In this new millennium, a fundamental research topic is implied in the design of distributed GIServices: the emergence of GIServices networks. In the future, the network of distributed GIServices will connect hundreds of thousands of GIS nodes and workstations, with collaboration among millions of agents, data objects, and GIS components. Will the collaborations and interactions inside the whole network of GIServices create the emergence of Geographic Information Science? John Holland (1998, p. 225) said that “emergence occurs in systems that are generated. The systems are composed of copies of a relatively small number of components that obey simple laws. … The interactions between the parts are non-linear, so the overall behavior cannot be obtained by summing the behaviors of the isolated components.” Thus, when sophisticated networks are created for distributed GIServices, when thousands of GIS workstations are connected together, and when millions of agents are roaming across the GIServices network and accessing on-line data, it is important for the GIS community to watch the evolution of GIServices networks very closely.

Another interesting area is the development of multi-agent systems (MAS) along with research on distributed artificial intelligence (DAI). “Over the past few years, multi-agent systems have become more and more important in many aspects of computer science (artificial intelligence, distributed systems, robotics, artificial life, etc.) by introducing the issue of collective intelligence and of the emergence of structures through interactions” (Ferber, 1999, p. xv). Currently, research on MAS and DAI is very popular in the computer science community (Kelly, 1994; Moore et al., 1997; Brenner et al., 1998).
Unlike traditional AI research, the applications of MAS and DAI are more feasible and may be applied to all kinds of distributed systems and devices. For example, Web robots and Internet search engines have already incorporated some of these applications. In adopting MAS and DAI research, it is important to consider how intelligent the GIServices network may become and what level of control should be retained.

The third consideration for distributed GIServices is the development of standards and specifications by the GIS industry and government agencies. This dissertation introduces the major standards for distributed GIServices developed by the Open GIS Consortium and ISO/TC 211. These standards and specifications will have a significant impact on the future development of GIServices. However, the GIS community should be aware that these standards may not necessarily be adopted widely in the future. If these standards are not feasible for actual GIS tasks and implementations, they may be replaced by other standards. One of the most famous examples is the competition between TCP/IP and the ISO network protocols.

This drawback became very clear in the United Kingdom in the 1980s when the governments of all world were saying that ISO is the standard for networks; many companies fell into the trap of believing this and spent huge sums of money to support ISO. Of course, history shows that TCP/IP became the de facto world standard for a number of reasons, mostly pragmatic and concerning practical working and speed (ISO was so overdeveloped, it was too slow) (Murch and Johnson, 1999, p. 66).

The lessons from the history of TCP/IP indicate that no matter how carefully articulated, standards must be adopted by a wide enough range of suppliers and corporate bodies to be meaningful (Murch and Johnson, 1999). Currently, the OpenGIS specifications and the ISO 15046 standards are under development by hundreds of experts and specialists.
However, these standards and specifications have not been widely examined in actual implementations. From this author’s perspective, these standards will need to be revised based on real-life applications, including practical working and performance tests of actual GIS tasks, to ensure feasibility. Without considering actual use and practical tasks, standards developed by OGC and ISO/TC 211 may not be adopted widely. It is important for the GIS community to have feasible standards for distributed GIServices and, at the same time, to prevent the standards and specifications from becoming overdeveloped.

In the twentieth century, the visible world was reshaped by the development of engineering technology. People used technology to create skyscrapers, airplanes, space shuttles, dams, etc. In the twenty-first century, the invisible world is going to be changed by the progress of information technology. The development of the Internet, wireless phones, biochemical engineering, DNA research, e-commerce, virtual reality, and the Web has already made our lives very different, and will continue to do so. However, when technology is changing the world inside out, one needs to consider whether to use the technology. “The more interconnected a technology is, the more opportunities it spawns for both use and misuse” (Kelly, 1998, p. 45). While enjoying the power of network-based GIServices in the twenty-first century, one may want to ask whether technology is being used or misused. This dissertation designed a dynamic framework to help the GIS community adopt new technologies in the future. However, the GIS community will have to ensure that these new technologies are used appropriately to facilitate research in geography, the growth of the GIS industry, and a better quality of life for the public.

In conclusion, future research on distributed GIServices should focus on the following areas.
First, the dynamic architecture should adopt new tools and new languages for the communication of agents and the formalization of operational metadata. Second, research on agent intelligence and knowledge implementation for distributed GIServices should be carried out in collaboration with the computer science community, especially in the MAS and DAI domains. Third, the organization and hierarchy of GIS nodes will need to be explored in depth in order to identify the possible evolution and emergence of distributed GIServices networks. This study provides an opportunity to initiate research on distributing GIServices on the Internet. Still, continued research on the deployment of GIServices is needed.

BIBLIOGRAPHY

Abler, R. F. (1987). The National Science Foundation National Center for Geographic Information and Analysis. International Journal of Geographical Information Systems, 1(4), pp. 303-326.

Albrecht, J. (1996). Universal GIS Operations: A Task-oriented Systematisation of Data Structure-independent GIS Functionality Leading Towards a Geographic Modelling Language. Unpublished Doctoral Dissertation. Vechta, Germany: University of Vechta, Geography Department.

Anuff, E. (1996). The Java Sourcebook. New York: John Wiley & Sons, Inc.

Armstrong, M. P. (1997). Emerging Technologies and the Changing Nature of Work in GIS. In Proceedings of GIS/LIS'97, Cincinnati, Ohio, pp. 800-805.

Aronoff, S. (1989). Geographic Information Systems: A Management Perspective. Ottawa, Canada: WDL Publications.

Baldonado, M., Chang, C. K., Gravano, L., & Paepcke, A. (1997). The Stanford Digital Library Metadata Architecture. International Journal on Digital Libraries, 1, pp. 108-121.

Berners-Lee, T., Cailliau, R., Luotonen, A., Frystyk, N., & Secret, A. (1994). The World-Wide Web. Communications of the ACM, 37(8), pp. 76-82.

Bishr Yaser, M. (1996). A Mechanism for Object Identification and Transfer in a Heterogeneous Distributed GIS.
In Proceedings of the 7th International Symposium on Spatial Data Handling, Delft, The Netherlands, pp. A.1-A.13.

Booch, G. (1994). Object-Oriented Analysis and Design with Applications, Second Edition. Redwood City, California: The Benjamin/Cummings Publishing Company, Inc.

Booch, G., Rumbaugh, J., & Jacobson, I. (1998). The Unified Modeling Language User Guide. Reading, Massachusetts: Addison-Wesley.

Bracken, I., & Webster, C. (1990). Information Technology in Geography and Planning: Including Principles of GIS. London: Routledge.

Bradshaw, J. M., editor (1997). Software Agents. Menlo Park, California: AAAI Press.

Bray, T., Paoli, J., & Sperberg-McQueen, C. M. (1998). Extensible Markup Language (XML) 1.0 Specification. W3C. URL: http://www.w3.org/TR/1998/REC-xml-19980210 (date: 5-11-2000).

Brenner, W., Zarnekow, R., & Wittig, H. (1998). Intelligent Software Agents: Foundations and Applications. Berlin: Springer.

Brockschmidt, K. (1994). Inside OLE 2. Redmond, Washington: Microsoft Press.

Brockschmidt, K. (1996). What OLE Is Really About. Redmond, Washington: Microsoft Online Library. URL: http://msdn.microsoft.com/library/techart/msdn_aboutole.htm (date: 5-11-2000).

Brodie, M. L., & Stonebraker, M. (1995). Migrating Legacy Systems. San Francisco, California: Morgan Kaufmann Publishers, Inc.

Buehler, K., & McKee, L., editors. (1996). The Open GIS™ Guide: Introduction to Interoperable Geoprocessing. Wayland, Massachusetts: Open GIS Consortium, Inc.

Buehler, K., & McKee, L., editors. (1998). The Open GIS™ Guide: Introduction to Interoperable Geoprocessing and the OpenGIS Specification, Third Edition. Wayland, Massachusetts: Open GIS Consortium, Inc. URL: http://www.opengis.org/techno/guide.htm (date: 5-11-2000).

Buttenfield, B. P. (1997). The Future of the Spatial Data Infrastructure: Delivering Geospatial Data. GeoInfo Systems, June 1997, pp. 18-21.

Buttenfield, B. P., & Goodchild, M. F. (1996).
The Alexandria Digital Library Project: Distributed Library Services for Spatially Referenced Data. In Proceedings of GIS/LIS'96, Denver, Colorado, pp. 76-84.

Buttenfield, B. P. (1998). Looking Forward: Geographic Information Services and Libraries in the Future. Cartography & Geographic Information Systems, 25(3), pp. 161-171.

Buttenfield, B. P., & Tsou, M. H. (1999). Distributing an Internet-based GIS to Remote College Classrooms. In Proceedings of ESRI International User Conference, San Diego, California, CD-ROM.

Chappell, D., & Linthicum, D. S. (1997). ActiveX Demystified. BYTE, 22(9), pp. 56-64.

Coleman, D. J. (1999). Chapter 22: Geographical information systems in networked environments. In P. A. Longley, M. F. Goodchild, & D. J. Maguire (editors), Geographical Information Systems: Principles, Techniques, Applications and Management, Second Edition. New York: John Wiley & Sons, Inc., pp. 317-329.

Cook, S., & Daniels, J. (1994). Designing Object Systems: Object-oriented Modelling with Syntropy. Englewood Cliffs, New Jersey: Prentice-Hall.

Davies, C. (1995). Tasks and Task Descriptions for GIS. In T. L. Nyerges, D. M. Mark, R. Laurini, & M. J. Egenhofer (editors), Cognitive Aspects of Human-Computer Interaction for Geographic Information Systems. Dordrecht: Kluwer Academic Publishers, pp. 327-342.

Davies, C., & Medyckyj-Scott, D. (1994). GIS Usability: Recommendations Based on the User's View. International Journal of Geographical Information Systems, 8(2), pp. 175-189.

Davies, C., & Medyckyj-Scott, D. (1996). GIS Users Observed. International Journal of Geographical Information Systems, 10(4), pp. 363-384.

Eddy, J. A. (1993). Environmental Research: What We Must Do. In M. F. Goodchild, B. O. Parks, & L. T. Steyaert (editors), Environmental Modeling With GIS. New York: Oxford University Press, pp. 3-7.

Federal Geographic Data Committee (FGDC) (1995). Content Standards for Digital Geospatial Metadata Workbook, Version 1.0. Reston, Virginia: FGDC/USGS.
Fegeas, R. G., Cascio, J. L., & Lazar, R. A. (1992). An Overview of FIPS 173, The Spatial Data Transfer Standard. Cartography and Geographic Information Systems, 19(5), pp. 278-293.

Ferber, J. (1999). Multi-Agent Systems: An Introduction to Distributed Artificial Intelligence, English Edition. Harlow, England: Addison-Wesley.

Finin, T., & Weber, J. (1993). Specification of the KQML Agent-Communication Language (draft). The DARPA Knowledge Sharing Initiative External Interfaces Working Group. URL: http://www.cs.umbc.edu/kqml/ (date: 1-10-2001).

Finin, T., Labrou, Y., & Mayfield, J. (1997). Chapter 14: KQML as an Agent Communication Language. In J. M. Bradshaw (editor), Software Agents. Menlo Park, California: AAAI Press.

Fischer, M. M., Scholten, H. J., & Unwin, D. J., editors. (1996). Spatial Analytical Perspectives on GIS (GISDATA No. IV). London: Taylor & Francis.

Flanagan, D. (1997). Java in a Nutshell: A Desktop Quick Reference, Second Edition. Sebastopol, California: O’Reilly & Associates.

Flanagan, D. (1999). Java in a Nutshell: A Desktop Quick Reference, Third Edition. Sebastopol, California: O’Reilly & Associates.

Foresman, T. W., editor. (1998). The History of Geographic Information Systems: Perspectives from the Pioneers. Upper Saddle River, New Jersey: Prentice Hall.

Foundation for Intelligent Physical Agents (FIPA) (1998). FIPA 97 Specification, Part 2: Agent Communication Language, Version 2.0. Geneva, Switzerland: FIPA. URL: http://www.fipa.org/spec/fipa97.html (date: 5-11-2000).

Frew, J., Freitas, N., Hill, L., Lovette, K., Nideffer, R., & Zheng, Q. (1998). The Alexandria Digital Library System Architecture. In J. Strobel & C. Best (editors), Proceedings of the Earth Observation & Geo-Spatial Web and Internet Workshop '98 (Salzburger Geographische Materialien, vol. 27). Salzburg: Instituts für Geographie der Universität Salzburg. URL: http://www.sbg.ac.at/geo/eogeo/authors/frew/frew.htm (date: 5-11-2000).

Ganti, N., & Brayman, W. (1995).
The Transition of Legacy Systems to a Distributed Architecture. New York: John Wiley & Sons, Inc.

Gardels, K. (1992). A (Meta-) Schema for Spatial Meta-data. In Proceedings of Information Exchange Forum on Spatial Metadata, Reston, Virginia, pp. 83-98.

Gardels, K. (1996). The Open GIS Approach to Distributed Geodata and Geoprocessing. In Proceedings of the Third International Conference on Integrating GIS and Environmental Modeling, Santa Fe, New Mexico, CD-ROM.

Gardner, S. R. (1997). The Quest to Standardize Metadata. BYTE, November 1997, 22(11), pp. 47-48.

Goodchild, M. F. (1995). Alexandria Digital Library: Report on a Workshop on Metadata, Santa Barbara, California. URL: http://alexandria.sdc.ucsb.edu/public-documents/metadata/metadata_ws.html (date: 5-11-2000).

Goodchild, M. F. (1996). Distributed Computing, The White Paper for UCGIS Research Priorities No. 2. URL: http://www.ncgia.ucsb.edu/other/ucgis/research_priorities/paper2.html (date: 5-11-2000).

Goodchild, M. F. (1997). Towards a Geography of Geographic Information in a Digital World. Computers, Environment and Urban Systems, 21(6), pp. 377-391.

Goodchild, M. F., Egenhofer, M., & Fegeas, R., editors. (1999). Interoperating Geographic Information Systems. Dordrecht: Kluwer Academic Publishers.

Goodchild, M. F., Haining, R., & Wise, S. (1992). Integrating GIS and Spatial Data Analysis: Problems and Possibilities. International Journal of Geographical Information Systems, 6(5), pp. 407-423.

Gosling, J., & McGilton, H. (1996). The Java Language Environment, A White Paper. Sun Microsystems. URL: http://www.Java.sun.com/docs/white/langenv/ (date: 5-10-2000).

Graham, I. (1994). Object-oriented Methods, Second Edition. Wokingham, England: Addison-Wesley.

Graham, S., & Marvin, S. (1996). Telecommunications and the City: Electronic Spaces, Urban Places. London: Routledge.

Grimes, R. T. (1997). Professional DCOM Programming. Chicago, Illinois: Wrox Press Inc.

Halfhill, T. R. (1997). Today the Web, Tomorrow the World.
BYTE, January 1997, 22(1), pp. 68-80.

Harrison, C., Caglayan, A., & Harrison, C. G. (1997). Agent Sourcebook: A Complete Guide to Desktop, Internet, and Intranet Agents. New York: John Wiley & Sons.

Holland, J. H. (1998). Emergence: from Chaos to Order. Reading, Massachusetts: Addison-Wesley.

Horstmann, C. S., & Cornell, G. (1998). Core Java 2, Volume 1: Fundamentals. Englewood Cliffs, New Jersey: Prentice Hall.

Huse, S. M. (1995). GRASSLinks: A New Model for Spatial Information Access in Environmental Planning. Unpublished doctoral dissertation. Berkeley, California: University of California at Berkeley, Department of Landscape Architecture.

ISO/TC 211 Chairman (1998). Draft Agreement between Open GIS Consortium, Inc. and ISO/TC 211. ISO/TC 211-N563.

ISO/TC 211 Secretariat (1998). Program of Work, Version 6. ISO/TC 211 N 507.

ISO/TC 211 Secretariat (2000). Program of Work, Version 8. ISO/TC 211 N 854. URL: http://www.statkart.no/isotc211/dokreg09.htm (date: 5-11-2000).

ISO/TC 211/WG 1 (1998a). Geographic Information – Part 1: Reference Model. ISO/TC 211-N623, ISO/CD 15046-1.2.

ISO/TC 211/WG 1 (1998b). Geographic Information – Part 2: Overview. ISO/TC 211-N541, ISO/CD 15046-2.

ISO/TC 211/WG 3 (1998). Geographic Information – Part 15: Metadata. ISO/TC 211-N538, ISO/CD 15046-15.

Jacobson, I., Christerson, M., Jonsson, P., & Overgaard, G. (1992). Object-Oriented Software Engineering – A Use Case Driven Approach. New York: ACM Press.

Jansen, W., & Karygiannis, T. (1999). Mobile Agent Security. National Institute of Standards and Technology, Special Publication 800-19, August 1999.

Jones, J. (1997). Federal GIS Projects Decentralize. GIS World, 10(8), pp. 46-51.

Karnik, N. M. (1998). Security in Mobile Agent Systems. Ph.D. Dissertation, Department of Computer Science, University of Minnesota, October 1998. URL: http://www.cs.umn.edu/Ajanta/ (date: 9-10-2000).

Kelly, K. (1998). New Rules for the New Economy: 10 Radical Strategies for a Connected World.
New York: Penguin Books.

Kelly, K. (1994). Out of Control: The Rise of Neo-Biological Civilization. Reading, Massachusetts: Addison-Wesley.

Knapik, M., & Johnson, J. (1998). Developing Intelligent Agents for Distributed Systems: Exploring Architecture, Technologies and Applications. New York: McGraw-Hill.

Korte, G. B. (1994). The GIS Book, Third Edition. Santa Fe, New Mexico: OnWord Press.

Kuhn, W. (1997). Toward Implemented Geoprocessing Standards: Converging Standardization Tracks for ISO/TC 211 and OGC, White Paper. ISO/TC 211-N418.

Lange, D. B., & Oshima, M. (1998). Programming and Deploying Java Mobile Agents with Aglets. Wokingham, England: Addison-Wesley.

Lanter, D. P., & Surbey, C. (1994). Metadata Analysis of GIS Data Processing: A Case Study. In Proceedings of the 6th International Symposium on Spatial Data Handling, Edinburgh, UK, pp. 314-324.

Lemay, L., & Perkins, C. L. (1996). Teach Yourself Java in 21 Days. Indianapolis, Indiana: Sams.net.

Lewis, C., & Rieman, J. (1993). Task-Centered User Interface Design. Published over the Internet. URL: ftp://ftp.cs.colorado.edu/pub/distribs/clewis/HCI-Design-Book/ (date: 5-11-2000).

Li, B. (1996). Issues in Designing Distributed Geographic Information Systems. In Proceedings of GIS/LIS'96, Denver, Colorado, pp. 1275-1284.

Li, B., & Zhang, L. (1997). A Model of Component-Oriented GIS. In Proceedings of GIS/LIS'97, Cincinnati, Ohio, pp. 523-528.

Limp, W. F. (1997). Weave Maps across the Web. GIS World, September 1997, 10(9), pp. 46-55.

Maes, P. (1994). Agents that Reduce Work and Information Overload. Communications of the ACM, 37(7), pp. 31-40.

Microsoft (1996). OLE Concepts and Requirements Overview. Redmond, Washington: Microsoft Online Library. URL: http://support.microsoft.com/support/kb/articles/Q86/0/08.ASP (date: 5-11-2000).

Microsoft (1998). DCOM Architecture, White Paper. Redmond, Washington: Microsoft Press.

Moellering, H. (1992).
Opportunities for Use of the Spatial Data Transfer Standard at the State and Local Levels. Cartography & Geographic Information Systems, Special Issue, 19(5), pp. 332-334.

Montgomery, J. (1997). Distributing Components. BYTE, April 1997, 22(4), pp. 93-98.

Moore, J., Edmonds, E., & Puerta, A., editors. (1997). Proceedings of IUI'97: International Conference on Intelligent User Interfaces, Orlando, Florida.

Murch, R., & Johnson, T. (1999). Intelligent Software Agents. Upper Saddle River, New Jersey: Prentice Hall.

National Research Council (1998). Bits of Power: Issues in Global Access to Scientific Data. Washington, D. C.: National Academy Press.

National Science Foundation (1994). NSF Announces Awards for Digital Libraries Research. NSF PR 94-52, Washington, D. C.: NSF.

National Science Foundation (1998). Digital Government Program Announcement, NSF 98-121. Washington, D. C.: NSF.

National Telecommunications and Information Administration (NTIA) (1995). Falling Through the Net: A Survey of “Have nots” in Rural and Urban America. NTIA, U. S. Department of Commerce.

National Telecommunications and Information Administration (NTIA) (1998). Falling Through the Net II: New Data on the Digital Divide. NTIA, U. S. Department of Commerce.

National Telecommunications and Information Administration (NTIA) (1999). Falling Through the Net: Defining the Digital Divide: A Report on the Telecommunications and Information Technology Gap in America. NTIA, U. S. Department of Commerce. URL: http://www.ntia.doc.gov/ntiahome/digitaldivide/ (date: 5-11-2000).

Nemeth, E., Snyder, G., Seebass, S., & Hein, R. T. (1995). UNIX System Administration Handbook. Englewood Cliffs, New Jersey: Prentice Hall.

Newton, H. (1996). Newton’s Telecom Dictionary, 11th Edition. New York: Flatiron Publishing, Inc.

Object Management Group (OMG) (1998). The Common Object Request Broker: Architecture and Specification, 2.2 Edition. Framingham, Massachusetts: OMG.

Open GIS Consortium, Inc. (OGC) (1997a).
Open GIS Simple Feature Specification For OLE/COM, Revision 0. Wayland, Massachusetts: Open GIS Consortium, Inc.

Open GIS Consortium, Inc. (OGC) (1997b). Open GIS Simple Feature Specification For CORBA, Revision 0. Wayland, Massachusetts: Open GIS Consortium, Inc.

Open GIS Consortium, Inc. (OGC) (1997c). Open GIS Simple Feature Specification For SQL, Revision 0. Wayland, Massachusetts: Open GIS Consortium, Inc.

Open GIS Consortium, Inc. (OGC) (1998). The OpenGIS Abstract Specification, Version 3. Wayland, Massachusetts: Open GIS Consortium, Inc. URL: http://www.opengis.org/techno/specs.htm (date: 5-11-2000).

Open GIS Consortium, Inc. (OGC) (1999). OpenGIS Catalog Interface Implementation Specification (Version 1.0). Wayland, Massachusetts: Open GIS Consortium, Inc. URL: http://www.opengis.org/techno/specs.htm (date: 1-11-2001).

Open GIS Consortium, Inc. (OGC) (2000). OpenGIS Web Map Server Interface Implementation Specification (Revision 1.0.0). Wayland, Massachusetts: Open GIS Consortium, Inc. URL: http://www.opengis.org/techno/specs.htm (date: 1-11-2001).

Open GIS Consortium, Inc. (OGC) (2001). Geography Markup Language (GML) 2.0. Wayland, Massachusetts: Open GIS Consortium, Inc. URL: http://www.opengis.org/techno/specs.htm (date: 1-11-2001).

ORACLE (1992). Chapter 21: Distributed Databases. In ORACLE 7 Server Concepts Manual, ORACLE, pp. 21.1-21.6.

Orfali, R., & Harkey, D. (1997). Client/Server Programming with Java and CORBA. New York: John Wiley & Sons, Inc.

Orfali, R., Harkey, D., & Edwards, J. (1996). The Essential Distributed Objects Survival Guide. New York: John Wiley & Sons, Inc.

Ostensen, O. (1995). Mapping the Future of Geomatics. ISO Bulletin, December 1995, pp. 13-15.

Peterson, L. L., & Davie, B. S. (1996). Computer Networks: A Systems Approach. San Francisco, California: Morgan Kaufmann Publishers, Inc.

Plewe, B. (1997). GIS Online: Information Retrieval, Mapping, and the Internet. Santa Fe, New Mexico: OnWord Press.

Pountain, D. (1997).
The Component Enterprise. BYTE, May 1997, 22(5), pp. 93-98.

Putz, S. (1994). Interactive Information Services Using World Wide Web Hypertext. In Proceedings of the First International Conference on the World-Wide Web, Geneva, Switzerland. URL: http://www94.web.cern.ch/WWW94/PrelimProcs.html (date: 5-11-2000).

Random House Webster’s Dictionary (1993). New York: Random House, Inc.

Rowley, J. (1998). Draft Business Case for the Harmonisation Between ISO/TC 211 and Open GIS Consortium, Inc., Resolution 47. ISO/TC 211-N472.

Rumbaugh, J., Blaha, M., Premerlani, W., Eddy, F., & Lorensen, W. (1991). Object-Oriented Modeling and Design. Englewood Cliffs, New Jersey: Prentice-Hall.

Scambray, J., McClure, S., & Kurtz, G. (2001). Hacking Exposed: Network Security Secrets & Solutions (Second Edition). Osborne: McGraw-Hill.

Schmidt, D. C., & Vinoski, S. (1998). Object Interconnections: An Introduction to CORBA Messaging. C++ Report, November/December 1998.

Schroeder, M. D. (1993). Chapter 1: A State-of-the-Art Distributed System: Computing with BOB. In S. Mullender (editor), Distributed Systems. Wokingham, England: Addison-Wesley, pp. 1-16.

Seltzer, L. (1998). NT 5.0 Preview. PC Magazine, 17(20), pp. 100-130.

Shoham, Y. (1997). An Overview of Agent-oriented Programming. In J. M. Bradshaw (editor), Software Agents. Menlo Park, California: AAAI Press.

Shuey, R. (1989). Data Engineering and Information Systems. In A. Gupta (editor), Integration of Information Systems: Bridging Heterogeneous Databases. New York: IEEE Press, pp. 11-23.

Skarmeas, N. (1999). Agents as Objects with Knowledge Base State. London: Imperial College Press.

Sloman, M., editor. (1994). Network and Distributed Systems Management. Wokingham, England: Addison-Wesley.

Smith, T. R. (1996). The Meta-Information Environment of Digital Libraries. D-Lib Magazine, July/August 1996. URL: http://www.dlib.org/dlib/july96/new/07smith.html (date: 5-11-2000).

Sondheim, M., Gardels, K., & Buehler, K. (1999).
Chapter 24: GIS Interoperability. In P. A. Longley, M. F. Goodchild, & D. J. Maguire (editors), Geographical Information Systems: Principles, Techniques, Applications and Management, Second Edition. New York: John Wiley & Sons, Inc., pp. 347-358.

Star, J., & Estes, J. (1990). Geographic Information Systems: An Introduction. Englewood Cliffs, New Jersey: Prentice Hall.

Taylor, D. A. (1992). Object-Oriented Information Systems: Planning and Implementation. New York: John Wiley & Sons, Inc.

Tang, Q. (1997). Component Software and Internet GIS. In Proceedings of GIS/LIS'97, Cincinnati, Ohio, pp. 131-135.

The Oxford American Dictionary of Current English (1999). Oxford: Oxford University Press.

Thomas, C. G., & Fischer, G. (1997). Using Agents to Personalize the Web. In Proceedings of IUI'97: International Conference on Intelligent User Interfaces, Orlando, Florida, pp. 53-60.

Thompson, C., Linden, T., & Filman, B. (1997). Thoughts on OMA-NG: The Next Generation Object Management Architecture. Published on the Web. URL: http://www.omg.org/docs/ormsc/97-0901.html (date: 5-11-2000).

Tsou, M. H., & Buttenfield, B. P. (1998a). An Agent-based, Global User Interface for Distributed Geographic Information Services. In Proceedings of the 8th International Symposium on Spatial Data Handling, Vancouver, Canada, pp. 603-612.

Tsou, M. H., & Buttenfield, B. P. (1998b). Client/Server Components and Metadata Objects for Distributed Geographic Information Services. In Proceedings of GIS/LIS'98, Fort Worth, Texas, pp. 590-599.

Vckovski, A. (1998). Interoperable and Distributed Processing in GIS. London: Taylor & Francis.

Vckovski, A., Brassel, K. E., & Schek, H-J, editors. (1999). Interoperating Geographic Information Systems: Proceedings of the Second International Conference, INTEROP’99, Zurich, Switzerland. Berlin: Springer.

Vinoski, S. (1997). CORBA: Integrating Diverse Applications within Distributed Heterogeneous Environments. IEEE Communications Magazine, February 1997, 35(2).

Weber, J., editor.
(1997). Special Edition: Using Java 1.1, Third Edition. Indianapolis, Indiana: Que Corporation.

Weiser, M. (1993). Hot Topics: Ubiquitous Computing. IEEE Computer, October 1993.

Weiss, G., editor. (1999). Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. Cambridge, Massachusetts: The MIT Press.

Welsh, M., Dalheimer, M. K., & Kaufman, L. (1999). Running Linux (Third Edition). Sebastopol, California: O’Reilly & Associates.

Worboys, M. F. (1995). GIS: A Computing Perspective. London: Taylor & Francis.

World Wide Web Consortium (W3C) (1999). HTML 4.01 Specification. W3C. URL: http://www.w3.org/TR/html401/ (date: 5-11-2000).

World Wide Web Consortium (W3C) (2000). XHTML 1.0: The Extensible HyperText Markup Language: A Reformulation of HTML 4 in XML 1.0. W3C. URL: http://www.w3c.org/TR/xhtml1 (date: 5-11-2000).

Wu, C. V. (1993). Object-Based Queries of Spatial Metadata. Unpublished doctoral dissertation. Buffalo, New York: State University of New York at Buffalo, Geography Department.

Yang, Z., & Duddy, K. (1996). CORBA: A Platform for Distributed Object Computing. ACM Operating Systems Review, April 1996, 30(2).

Yourdon, E. (1993). Decline & Fall of the American Programmer. Englewood Cliffs, New Jersey: Prentice Hall.

Zhang, L., & Lin, H. (1996). A Client/Server Approach to 3D Modeling Support System for Coast Change Study. In Proceedings of GIS/LIS’96, Denver, Colorado, pp. 1265-1274.