An Introduction to National Geological Application Grid (NGG)

Yongbo Zhang1 and Shilong Ma2

1 (China Geological Survey Bureau, Beijing 10035, China)
2 (National Lab of Software Development Environment, Beihang University, Beijing 100083, China)

Abstract

The National Geological Application Grid (NGG) is developed for applications of the Chinese national geological survey, to implement resource interconnection, sharing and coordination. The development and deployment of NGG have been completed: a grid platform and an application integration mechanism have been constructed, and an efficiently running prototype system has been implemented. NGG now runs on the three-layer network infrastructure of the China Geological Survey and is ready for real production applications.

Key Words: National Geological Application Grid (NGG), Grid computing, Service, Groundwater resource assessment

1. Introduction

The National Geological Application Grid (NGG) is funded by the High Performance Computer and Software Program under the National 863 Hi-Tech Program. The project is carried out by the China Geological Survey Bureau, collaborating mainly with Beihang University.

Grid is a new-generation distributed computing technology that integrates geographically distributed resources into a “super virtual computer” to achieve complete sharing of computing, storage, data and software resources [1, 2, 3]. As an application grid, NGG is developed for applications of the Chinese national geological survey, to implement resource interconnection, sharing and coordination. Its development and deployment have been completed: a grid platform and an application integration mechanism have been constructed, and an efficiently running prototype system has been implemented. NGG is ready for real production applications and runs on the three-layer network infrastructure of the China Geological Survey Bureau.
To meet the requirements of geological survey applications, the China Geological Survey Bureau, as the portal of the application grid and the service management platform for NGG, monitors and coordinates the running system. The regional Geological Survey centers, as regional centers for data sharing, deploy databases and GIS software. The provincial Geological Surveys collect original data and provide original data services, while the professional research centers, which have high performance computing environments, process large-scale data and share application software.

Supported by the MAPGIS environment, NGG provides data and application software sharing, including the following:
1. a data sharing service for browsing and querying data with MAPGIS,
2. a groundwater resource assessment service that provides information about groundwater resource assessment,
3. a groundwater level forecast service that forecasts the future groundwater level in specific areas from dynamically observed data,
4. retrieval and synthesis of mine-shaping information services in the grid environment,
5. sharing of computing software for solid mineral resource assessment.

2. The Architecture of NGG

NGG consists of basic services, a service flow constructor, tasks, a service flow engine and a portal. A basic service implements a basic function, which can be a data service, transfer service, aggregation service or computing service in geological applications. The service flow constructor composes a task from basic services according to the service flow language specification of NGG. The service flow engine parses, validates and executes the services defined in a task. The portal provides the interface to users. For more details, see [11]. The architecture of NGG is shown in Fig.1.

Fig.1: The architecture of NGG

In the above architecture, there are four important parts as follows.

Portal: a friendly interface to users, located in the China Geological Survey Bureau.
Basic Services: services that implement atomic functions, which can be a data service, transfer service, aggregation service or computing service in geological applications.

Core Services: the MAPGIS Service (Note: the NGG portal communicates with the MAPGIS environment through MAPGIS services, which provide the creation, browsing and transfer of geological vector data) and the Service Registry and Discovery Service (Note: NGG communicates with China National Grid (CNGrid) [7] through the Service Registry and Discovery Service. NGG first registers a node in CNGrid, and then registers all the basic services and tasks that it provides in the NGG node).

Service Flow: a service flow constructor that constructs a task on the basis of basic services, according to the service flow language specification of NGG, and a service flow interpreter that parses, validates and executes the services defined in the task. The service flow engine can thus query relevant services, bind services dynamically and accomplish the user’s request.

A geological application is a task, which is a flow of services. All services are deployed in a distributed manner and registered in the central node of NGG, and the task can manage and monitor every service inside it. Thus, a task is a service flow that accomplishes a geological application.

3. Service Flow for Aggregation and Coordination of Services

3.1. Aggregation and Coordination of Services in NGG

Services are deployed by different providers in NGG. A single service provides an atomic function in geological applications. In many cases a single service cannot accomplish a geological application, and it is necessary to integrate services into a task. For example, in a water quality evaluation application, the user needs to retrieve data from the area, select ingredients from a database that is usually on another computer, submit them to several computers, and get the final result of the evaluation. These services are distributed among different nodes.
The user has to complete every service step by step manually, which costs time and effort unnecessarily. Aggregation and coordination of services are therefore in high demand in NGG, and an automatic mechanism is needed to carry out the whole process of an application and so integrate geological applications.

3.2. Why develop a new service flow model for NGG

Integrating services means not only aggregating them but also defining the relationships between them, such as data transfer and status control. Although both WSFL and XLANG [8, 9, 10] are well-developed service flow definitions, they cannot provide substantial support for geological applications in NGG. Applications in NGG need to be organized dynamically. The state of a service usually affects the execution of the following services or of the whole service flow. One service may be employed repeatedly, each time under a different state control condition. Thus, we developed our own service flow model for NGG, based on the requirements of geological applications.

4. A Service Flow Model for NGG

4.1. A service flow language for NGG

The service flow language for NGG has a three-layer structure: service layer, activity layer and flow layer [11, 12].

The service layer provides basic information, including service name, ID, address, status definition and data interface. The interaction between services is defined on the next layer up, the activity layer.

An activity consists of a number of services and provides an independent function; it is a light-weight flow. The activity layer describes the relationships between services, and its major attributes include name, type of activity, service list, pre/post conditions, data interface and other constraints. Every service defined in the service list has its own behavior, including state control, message transfer, data flow and so on. The interaction between activities is defined on the next layer up, the flow layer.
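The service and activity layers described above can be sketched as plain data structures. This is a minimal illustration only: the paper does not give a concrete schema, so every class name, field name, status value and URL below is a hypothetical choice made for the example, not the NGG service flow language itself.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types: a condition is a predicate over a context dictionary.
Context = Dict[str, object]
Condition = Callable[[Context], bool]


def always_true(_context: Context) -> bool:
    """Default pre/post condition that imposes no constraint."""
    return True


@dataclass
class Service:
    """Service layer: basic information about one atomic service."""
    name: str
    service_id: str
    address: str                     # endpoint where the service is deployed
    status: str = "unknown"          # assumed values: "valid", "failed", ...
    data_interface: Dict[str, str] = field(default_factory=dict)


@dataclass
class Activity:
    """Activity layer: a light-weight flow over a list of services."""
    name: str
    activity_type: str               # assumed value, e.g. "sequence"
    services: List[Service] = field(default_factory=list)
    pre_condition: Condition = always_true
    post_condition: Condition = always_true

    def can_start(self, context: Context) -> bool:
        """True if the pre-condition holds and every listed service is valid."""
        return self.pre_condition(context) and all(
            s.status == "valid" for s in self.services
        )


# Usage with made-up services for a water quality evaluation.
query = Service("data query", "svc-001", "http://node-a.example/query", "valid")
compute = Service("quality computing", "svc-002", "http://node-b.example/compute", "valid")
evaluate = Activity(
    "evaluate", "sequence", [query, compute],
    pre_condition=lambda ctx: "area" in ctx,
)
print(evaluate.can_start({"area": "North China"}))
```

Keeping the pre/post conditions as predicates over a shared context is one simple way to let the state of an earlier service influence whether a later activity may run, which is the kind of state control the text asks for.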
At the top of this model, a flow consists of a set of activities. A flow can accomplish a geological application, and its behavior is determined by its activities. The major attributes of a flow include name, ID, data interface, control table, data table and activity list.

4.2. A service flow engine for NGG

In NGG, a service flow engine is developed to create and execute service flows automatically for users. In the engine, the interaction between activities is managed by the task, so users do not need to carry out every step manually. The service flow engine for NGG consists of the following components.

User Interface: provides users with a friendly interface for creating a service flow.
Syntax/Semantic Check: provides a simple syntax and semantic check of the service flow language.
Service Discovery: queries relevant services according to the service ID in the service flow.
Service Validate: validates the status of the services found in NGG.
Service Binding: binds each logical service in the service flow to a valid physical service.
Execute: executes the services in the service flow step by step.
Service Monitor: monitors the running status of each service in the service flow, catches any exception and submits it to the Analyze/Plan module.
Analyze/Plan: analyzes the current circumstances. In the event of an exception, it rebinds the running service to another valid service and executes the service flow again.

5. Uniform access to heterogeneous data in NGG

Geological data refers to all kinds of information in geological applications. Geological data may therefore take various forms, such as files, Oracle databases, SQL Server databases and so on [4]. NGG adopts the following solutions to provide uniform access to heterogeneous data.

Define a uniform data access service by extracting a model of heterogeneous data access from different applications, and encapsulate it as a standard web service [6] that gives higher-level services uniform data access.
Describe the input and output of data queries using XML [5]. While the application is running, convert the XML input into the corresponding data query language and translate the query results back into XML.

Compress the query results in XML form if necessary.

6. An Example: A Task Running in NGG

Let us illustrate how NGG creates and carries out a task (service flow) with the example of a groundwater quality evaluation application in the North China area.

When users submit a request for groundwater quality evaluation, the NGG portal accepts the request and communicates with the MAPGIS environment to locate the spatial data. The NGG portal forwards these parameters to the service flow engine, which creates an application instance. The instance first queries relevant services from Service Registry and Discovery and selects suitable services for binding. It then starts the profit service and executes the data service, transfer service and computing service step by step. Finally, the instance finishes the profit service, returns the results to the users and logs out. A simple process is shown in Fig.2 and the detailed process is shown in Fig.3.

Fig.2: Simple flow of groundwater quality evaluation

Fig.3: Detailed flow of groundwater quality evaluation

7. Conclusions

NGG is a grid-based, new-generation geological application system that treats resources as services and applications as tasks. NGG implements the registry, discovery, dynamic binding and invocation of geological services, and achieves connectivity, collaboration and sharing of geological resources. NGG provides services to professional users, who organize basic services into new tasks to solve geological problems. To meet the requirements of complex geological applications, we developed our own service flow model for NGG. As an example, this paper illustrated how the NGG service flow model creates and carries out a task (service flow), using the application of groundwater quality evaluation in the North China area.
NGG is one of the most important application grids in China National Grid (CNGrid).

References

1. I. Foster, C. Kesselman, J. Nick, S. Tuecke. The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. Open Grid Service Infrastructure WG, Global Grid Forum, June, 2002.
2. A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, S. Tuecke. The Data Grid: Towards Architecture for the Distributed Management and Analysis of Large Scientific Datasets. Journal of Network and Computer Applications, 23:187-200, 2001.
3. I. Foster, C. Kesselman, J. Nick, S. Tuecke. Grid Services for Distributed System Integration. Computer, 35(6), 2002.
4. China Geological Survey Bureau. Data of Application Grid Technology Communion Conference. October, 2002.
5. Francois Yergeau, John Cowan et al. Extensible Markup Language (XML) 1.1. World Wide Web Consortium Recommendation, February, 2004.
6. Heather Kreger. Web Services Conceptual Architecture (WSCA 1.0). April, 2002.
7. Nong Xiao, Hao Ren, Zhi-Wei Xu, Zhimin Tang, XiangHui Xie, Wei Li. Design and Implementation of Resource Directory in National High Performance Computing Environment. Journal of Computer Research and Development. August, 2002.
8. Frank Leymann. Web Services Flow Language version 1.0. May, 2001.
9. Matthew Duftler and Rania Khalaf. Business processes with BPEL4WS: Learning BPEL4WS. October, 2002.
10. Tony Andrews (Microsoft), Francisco Curbera (IBM) et al. Specification: Business Process Execution Language for Web Services Version 1.1. May, 2003.
11. Yongxun Li, Haoming Guo, Shilong Ma, Yongbo Zhang. Service-Oriented National Geological Application Grid Architecture. Proceedings of the 3rd Asian Workshop of Foundations of Software, Xi’an, China, 2004.
12. Haoming Guo, Shilong Ma. Aggregation and Coordination of Services on Geological Application Grid. The First Grid@Asia Workshop, June, 2005.