A Data Grid Reference Architecture
Draft of February 1, 2001
Ian Foster, Carl Kesselman

Contents

1 Introduction
2 General Statement of Requirements
  2.1 Data Grid and Virtual Data Concepts
  2.2 Data Grid Architecture Elements
3 Architecture Overview
4 Fabric: Building Blocks
  4.1 Storage Systems
  4.2 Compute System
  4.3 Network
  4.4 Code Repositories
  4.5 Catalog
5 Connectivity
6 Resource Access
  6.1 Service Registration and Enquiry
  6.2 Storage Systems
  6.3 Compute Resources
  6.4 Code Repositories
  6.5 Catalogs
7 Collective Services
  7.1 Information Services
  7.2 Replica Management Service
  7.3 Catalog Management Services
  7.4 Community Authorization Services
  7.5 Online Certificate Repository
  7.6 Request Management
8 Issues Not Addressed
Appendix
Acknowledgments
Bibliography
1 Introduction

Our goals in defining a Data Grid Reference Architecture (henceforth DGRA) are threefold:
- To establish a common vocabulary for use when discussing Data Grid systems.
- To document what we see as the key requirements for Data Grid systems.
- To propose a specific approach to meeting these requirements.

Of course, we hope that others will find our specific approach appropriate for their purposes; but even if they do not, we expect that the common vocabulary and clear statement of requirements will facilitate debate about alternatives.

In defining the DGRA, we focus on enabling mechanisms rather than a complete, vertically integrated solution. We take this approach because we believe that while a vertically integrated solution will solve at most one problem well, appropriately chosen mechanisms can be used to solve a wide range of different problems. In addition, we know that many of the disciplines for which this architecture may be useful already have substantial investments in data management systems. Our hope is that by defining this DGRA, we will both define a roadmap for Data Grid development and make clear where the definition of standard protocols and APIs can enable interoperability and code sharing among different Data Grid projects and systems.

We view a Data Grid as providing its users with two basic types of functionality:
- Publication and collection management services focused on enabling community access to data.
- Data access and replication services focused on enabling transparency with respect to location, as a means of improving access performance (with respect to speed and/or reliability).

These services will provide for secure, reliable, and high-speed access to metadata and data, as well as replication and distribution of data in wide area environments. The second element of Data Grid functionality is concerned with virtualizing data location, allowing a consumer to refer to data in the abstract, with high-level Data Grid services mapping a virtual data reference to a specific physical location. This use of replicated virtual data is by now common practice.

Another variety of virtual data is the subject of current research activity within the GriPhyN project: namely, the use of data derivation services to enable transparency with respect to materialization, as a means of facilitating the definition, sharing, and use of data derivation mechanisms. We believe that this type of virtual data can also be supported within our DGRA; however, we do not address these issues here but refer the reader to [4].

We note that all three classes of Data Grid functionality (publication, location transparency, and materialization transparency) build on basic Grid infrastructure services providing security, communication, resource access, resource allocation, resource discovery, and other capabilities of general utility. Here we strive to use services being constructed in the larger Grid community.

While these various components are logically distinct (e.g., data replication services are of interest even if one is not interested in collections or derived data, and Grid services are useful outside the scope of Data Grids), we believe that it is important to discuss all these elements of an eventual DGRA in a coordinated fashion.
At the same time, it is important to leave the door open to interfacing Data Grid services to lower-level services that do not adhere to this reference architecture.

2 General Statement of Requirements

The following text, taken in part from the GriPhyN proposal (see www.griphyn.org), provides some background on the problems we wish to solve. The large physics experiments targeted by GriPhyN (ATLAS, CMS, LIGO, SDSS) share with other data-intensive applications a need to manage and process large amounts of data [3, 6]. This data comprises raw (measured) data as well as metadata describing, for example, how the raw data was generated, how large it is, and so on. The total size of this data ranges from terabytes (in the case of SDSS) to petabytes (in the case of ATLAS and CMS).

The computational and data management problems encountered in the four physics experiments targeted by GriPhyN also differ fundamentally in the following respects from problems addressed in previous work:
- Computation-intensive as well as data-intensive: Analysis tasks are both compute-intensive and data-intensive and can involve thousands of computer, data handling, and network resources. The central problem is coordinated management of computation and data, not simply data movement.
- Need for large-scale coordination without centralized control: Stringent performance goals require coordinated management of numerous resources, yet these resources are, for both technical and strategic reasons, highly distributed and not amenable to tight centralized control.
- Large dynamic range in user demands and resource capabilities: We must be able to support and arbitrate among a complex task mix of experiment-wide, group-oriented, and (thousands of) individual activities, using I/O channels, local area networks, and wide area networks that span several distance scales.
- Data and resource sharing: Large, dynamic communities would like to benefit from the advantages of intra- and inter-community sharing of data products and of the resources needed to produce and store them.

These considerations motivate the definition of a DGRA that will be critical to future data-intensive computing not only in the four physics experiments, but in the many areas of science and commerce in which sophisticated software must harness large amounts of computing, communication, and storage resources to extract information from measured data.

2.1 Data Grid and Virtual Data Concepts

We introduce the Data Grid as a unifying concept to describe the new technologies required to support such next-generation data-intensive applications. We use this term to capture the following characteristics:
- A Data Grid has large extent (national or worldwide) and scale, incorporating large numbers of resources and users on multiple distance scales.
- A Data Grid is more than a network: it layers sophisticated new services on top of local policies, mechanisms, and interfaces, so that geographically remote resources (hardware, software, and data) can be shared in a coordinated fashion.
- A Data Grid provides a new degree of transparency in how data-handling and processing capabilities are integrated to deliver data products to end-user applications, so that requests for such products are easily mapped into computation and/or data retrieval at multiple locations. (This transparency is needed to enable sharing and optimization across diverse, distributed resources, and to keep application development manageable.)

The need for transparency is an important aspect of our DGRA.
In its most general form, transparent access can enable the definition and delivery of a potentially unlimited virtual space of data products derived from other data. In this virtual space, requests can be satisfied via direct retrieval of materialized products and/or via computation, with local and global resource management, policy, and security constraints determining the strategy used. The concept of virtual data recognizes that all data, except irreproducible raw measured data (or computed data that cannot easily be recomputed), need "exist" physically only as the specification for how it may be derived. The Grid may materialize zero, one, or many replicas of derivable data, depending on probable demand and the relative costs of computation, storage, and transport. In high-energy physics today, over 90% of data access is to derived data.

We note that virtual data as defined here integrates the familiar concept of data replication with the new concept of data generated "on the fly" ("derived") in response to user requests. As we note above, the latter is a research problem while the former is common practice. Furthermore, the two concepts are logically distinct: it can be useful and important to replicate data even if derived data is not being generated automatically. However, we believe that there are likely to be advantages to an integrated treatment of the two concepts. For example, an integrated treatment will make it possible for users to choose to regenerate derived data locally, even if the data is already materialized at a remote location, if regeneration is faster than movement. However, we stress that the DGRA defined here is completely consistent with Data Grid deployments that limit the use of virtual data to replication only, or that use no virtual data at all.

2.2 Data Grid Architecture Elements

The discussion above indicates some of the principal elements that we can expect to see in a Data Grid architecture. As illustrated in Figure 1, we need data catalogs to maintain information about the data itself (metadata), the transformations required to generate derived data (if employed), and the physical location of that data. We have the various resources that we must deal with: storage systems, computers, networks, and code repositories. We require application-specific request formulation tools that enable the end user to define data requests, translating from domain-specific forms to standard request formats, perhaps consulting application-specific ontologies. We require code that implements the logic required to transform user requests for virtual data elements into the appropriate catalog, data access, and computational operations and to control their execution: what we term in the figure "request planning" and "request execution."

Notice the clean separation of concerns expressed in this figure between virtual data (catalogs) and the control elements that operate on that data (request manager). This separation represents an important architectural design decision that adds significantly to the flexibility of a Data Grid system. Notice also the various infrastructure elements that we must deal with: computers, storage systems, networks, and code repositories.

[Figure 1: Schematic showing some principal Data Grid components: user applications, request formulation, the request manager (request planner and request executor), data catalogs, storage systems, code repositories, computers, and networks.]
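To make the separation between catalogs and the request manager more concrete, the following sketch outlines how a planner might use catalog lookups to decide between fetching an existing replica and rederiving a virtual data product. It is illustrative only: the catalog and cost-model objects are assumed to expose the methods used here (lookup, locations, recipe, transfer, derive), and none of these names is a proposed API.

    # Illustrative sketch only: catalog objects and the cost model are hypothetical.

    def plan_request(logical_name, metadata_catalog, replica_catalog,
                     derived_data_catalog, cost_model):
        """Map a virtual data reference onto either a replica transfer or a derivation."""
        if metadata_catalog.lookup(logical_name) is None:
            return None                                   # unknown data product

        # Option 1: fetch an existing replica, if any are registered.
        candidates = [(cost_model.transfer(loc, logical_name), ("transfer", loc))
                      for loc in replica_catalog.locations(logical_name)]

        # Option 2: rederive the product from its recorded transformation, if one exists.
        recipe = derived_data_catalog.recipe(logical_name)
        if recipe is not None:
            candidates.append((cost_model.derive(recipe), ("derive", recipe)))

        if not candidates:
            return None                                   # neither materialized nor derivable
        return min(candidates, key=lambda c: c[0])[1]     # cheapest option wins

The point of the sketch is the architectural one made above: the planner consults catalogs but does not own them, and the same planning logic applies whether or not derived (as opposed to merely replicated) virtual data is in use.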
Experience with Grid applications and Grid architectures tells us that this application-specific view of Data Grid architecture does not tell the whole story. When accessing storage or compute resources, we require resource discovery and security mechanisms. When developing request plans, we need to be able to capture and manipulate such plans. We also require logic for moving data reliably from place to place, for scheduling sets of computational and data movement operations, and for monitoring the entire system for faults and responding to those faults. These various mechanisms and services are for the most part application-independent and so can usefully be developed as independent mechanisms and reused in different contexts.

These observations lead us to adopt a view of Data Grid architecture in which Data Grid mechanisms are viewed as being not independent of, but rather built on top of, a general Grid infrastructure [5]. Data Grid-specific mechanisms appear either as extensions to existing services or as entirely new services. We expand upon this view in the next section.

It is important to distinguish between the architecture proposed in these figures and the software architectures developed by specific applications. We anticipate that, to the maximum extent possible, experiment-specific software should be adapted for Grid operation not by rewriting it but rather by layering it on top of the services and APIs provided by the elements just described (request manager, catalogs, Grid services, etc.). Our goal in proposing the DGRA is hence not to dictate to application architects, but rather to identify the Grid services that those architects can use to "Grid-enable" their applications. It is also important to emphasize that while "User Applications" are represented in this figure as a small box, in most applications (e.g., those found in high energy physics) considerable development may well be required at this level in order to construct application systems that can exploit Data Grid mechanisms effectively. However, this machinery is beyond the scope of this article.

3 Architecture Overview

We now proceed to provide a more comprehensive description of the services involved in our DGRA. Figure 2 illustrates the primary functional elements, placing them in the context of a conventional Grid architecture to make clear how they relate to each other and to other Grid services. See [5] for more information on Grid architecture, as well as a discussion of why standard protocols and APIs are needed to enable interoperability and code sharing, and [4] for a proposed catalog architecture for virtual data. In brief, we operate as follows:

- Fabric: This lowest level of our Grid architecture comprises the basic resources from which a Data Grid is constructed. Some of these elements (for example, storage systems) must be extended to support Data Grid operation. In addition, new "fabric" elements are required, such as catalogs and code repositories for virtual data transformation programs. We attempt to define here what Fabric capabilities are required to support Data Grid operation.
- Connectivity: Services at this level are concerned with communication and authentication. We believe that we can use standard services at this level. We provide a brief description and refer the reader to [5] for more details.
- Resource: Services at this level are concerned with providing secure remote access to storage, computing, and other resources. We believe that we can use a variety of standard services here.
  We provide a fairly detailed description of requirements and possible approaches.
- Collective: Services at this level support the coordinated management of multiple resources. Here, we believe that significant new development is required: in particular, catalog services; replica management services; community policy services; coherency control mechanisms for replicated data; request formulation and management functions for defining, planning, and executing virtual data requests; and replica selection mechanisms. We can also reuse existing collective services, for example for resource discovery. We provide a fairly detailed description of requirements and possible approaches.

Specific applications exploit Data Grid capabilities by invoking services at the Collective or Resource level, typically via supplied APIs and SDKs that speak the protocols used to access those services. From an application perspective, the design of these APIs and SDKs is a critical part of the overall DGRA, as it is here that we constrain the set of behaviors that are supported. For example, they might indicate whether an application can ask for data immediately vs. streaming vs. batch.

From the perspective of Data Grid developers, it is the Connectivity and Resource protocols that are perhaps the most important element of the DGRA: these define the "neck" in the protocol hourglass, below which can be placed many different resources and above which can be implemented many services. For example, if standard authentication mechanisms can be defined, then different Data Grid developers can develop different components secure in the knowledge that there are standard mechanisms for establishing the identity of users and resources. Similarly, if a standard access protocol for storage elements can be defined, then developers of storage systems and developers of higher-level capabilities (e.g., replica management) can work independently.

In the following, we discuss each DGRA component in turn. In each case, we first define our terms, then present what we see as the key requirements, and finally present a candidate approach to addressing those requirements. We emphasize that the discussion of specific approaches should be viewed as logically distinct from the discussion of requirements: a reader might agree with a particular set of requirements, but disagree with the specific approaches that we propose.

We present here only an English-language description of components and protocols. Our colleagues in the Architecture Task Force of the European Data Grid are developing a UML description of Data Grid architecture, and we hope to be able to incorporate UML descriptions in a future revision of this document.

To do: Improve and complete this figure; work on introducing UML descriptions if possible.
[Figure 2: A far-from-complete listing of major Data Grid Reference Architecture elements, showing how they relate to other Grid services. Shading indicates some of the elements that must be developed specifically to support Data Grids. The figure shows discipline-specific Data Grid applications layered above Collective services (consistency management, replica selection, replica management, information, co-allocation, resource usage accounting, request management, system monitoring, resource brokering, distributed catalog, community authorization, request planning, and online certificate repository services); Resource-level registration, enquiry, and management protocols for storage, compute, network, catalog, and code services; a Connectivity layer providing communication, service discovery (DNS), authentication, and delegation; and a Fabric layer comprising storage systems, compute systems, networks, catalogs, and code repositories.]

4 Fabric: Building Blocks

The Fabric layer defines the basic resources and elements from which a Data Grid is fabricated. Because of the diversity of resource types and the variations between elements within a resource class, we will inevitably see considerable heterogeneity at the Fabric layer. Nevertheless, there are advantages to defining standard behaviors for Fabric elements. We define five Fabric elements: storage systems, compute systems, networks, code repositories, and catalogs, with the latter two being essentially special cases of the first.

4.1 Storage Systems

We define a storage system not as a physical device but as an abstract entity whose behavior can be defined in terms of its ability to store and retrieve named entities called files. Various instantiations of a storage service will differ according to the range of requests that they support (e.g., one might support reservation mechanisms, another might not) and according to their internal behaviors (e.g., a dumb storage system might simply put files on the first available disk, while a smart storage system might use RAID techniques and stripe files for high performance based on observed access patterns). They can differ according to their physical architecture (e.g., disk farm vs. hierarchical storage system) and expected role in a Data Grid (e.g., fast temporary online storage vs. archival storage). They can also differ according to the local policies that govern who is allowed to use them and their local space allocation policies. For example, an "exclusive" storage service might guarantee that files are created or deleted, and space reserved, only as a result of external requests, while in the case of a "shared" resource these guarantees might not hold. A related issue is whether a storage system guarantees that files are not deleted while in use.

We believe that within the context of a DGRA it is important to reach consensus on a small set of "standard" storage system behaviors, with each standard behavior being characterized by a set of required and optional capabilities. Regardless of the type of storage system, we can establish baseline capabilities that any storage system should have. These required capabilities are:
- Store, retrieve, and delete named sequences of bytes (i.e., files).
- Report on basic characteristics of the storage system, such as total storage capacity and available space.

Additionally, a storage system may support a number of basic optional capabilities, including:
- Reserve space for a file.
- Reserve disk bandwidth (guaranteed transfer rate).
- Support for parallel transfer.
- Ability to report on more detailed attributes such as supported transfer rate, latency for file access (on average or for a specific file), etc.

A minimal interface sketch capturing these required and optional capabilities is given below.
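The following sketch is one way to express the storage system abstraction just described: two required operation groups (store/retrieve/delete and a capability report) plus flags for the optional capabilities. The method and field names are ours and purely illustrative; they are not a proposed API.

    from abc import ABC, abstractmethod
    from dataclasses import dataclass

    # Illustrative abstraction only: names are not a proposed API.

    @dataclass
    class Capabilities:
        total_bytes: int
        available_bytes: int
        can_reserve_space: bool = False      # optional: space reservation
        can_reserve_bandwidth: bool = False  # optional: guaranteed transfer rate
        parallel_transfer: bool = False      # optional: parallel/striped transfers

    class StorageSystem(ABC):
        """Abstract storage system: stores, retrieves, and deletes named byte sequences."""

        @abstractmethod
        def store(self, name: str, data: bytes) -> None: ...

        @abstractmethod
        def retrieve(self, name: str) -> bytes: ...

        @abstractmethod
        def delete(self, name: str) -> None: ...

        @abstractmethod
        def report(self) -> Capabilities: ...

        # Optional capability; a "dumb" storage system may simply refuse.
        def reserve_space(self, nbytes: int) -> bool:
            return False

A disk farm, a scratch cache, and an archival store could all implement this same interface, differing only in which optional capabilities they advertise and in their local policies.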
From the above, we can identify three specific storage system behaviors.

Disk-Based Storage Element: This behavior captures traditional disk-based file systems. These are characterized by a hierarchical file namespace and essentially constant latency for file access. In addition to the capabilities above, the following optional behaviors are supported:
- Fine-grained quotas (i.e., available space may be larger than the largest file a specific user is able to store).
- Co-located compute resources (i.e., this storage element may support "local" access mechanisms, such as Unix read/write operations).

Temporary Storage Element: The behavior is similar to that of a disk-based storage element, with the exception of a local operational policy that does not guarantee the longevity of its data. Scratch storage and network-based storage elements (data caches) fall into this category. Additional optional behaviors include:
- Guarantee that files are not deleted while a transfer is in progress.
- Accept hints about desired data lifetimes (e.g., pinning). [Note: This differs from the above only with respect to operational policy. Does that make sense?]

Archival Storage Element: The behavior that we seek to capture here is an online archival storage system that provides secure remote access to large amounts of data, while also providing higher longevity for its data. Data files may or may not be named within a hierarchical namespace. Specialized optional behaviors for these storage systems include:
- Request staging to reduce the latency of access to specific files.

To do: Flesh out these definitions, and determine the level of detail that is useful in these specifications. If too abstract, we end up saying nothing; if too specific, we limit ourselves unnecessarily.

4.2 Compute System

We define a compute system not as a physical device but as an abstract entity whose behavior can be defined in terms of its ability to execute programs. Various instantiations of a compute service will vary along such dimensions as their aggregate capabilities (CPU, memory, internal and external networking) and the quality of service guarantees that they are able to provide (e.g., batch scheduled, opportunistic, reservation, failover, co-allocation of multiple resources). They may also vary in their underlying architecture (uniprocessor, supercomputer, cluster, Condor pool, Internet computing system), although, as with a storage service, these details should be of less importance to users than the behaviors that are supported. As with storage systems, we are interested in defining standard behaviors. Here are our candidates:
- Compute farm: On-demand access to …
- Parallel computer: Support for parallel execution.

To do: Flesh out these definitions. Design and implement dynamic account allocation mechanisms as a means of providing protection without requiring pre-existing accounts for all users.

4.3 Network

We do not say much about networks for now. Quality of service mechanisms will probably be important in the long term.

To do: Develop requirements for networks.

4.4 Code Repositories

A code repository is a storage service used to maintain programs, in source and/or executable form. We discuss it as a distinct entity because of the critical role that online access to programs plays in Data Grids and because code repositories introduce some specialized requirements. Like a storage or computational resource, a code repository should provide information about its contents, as well as providing access to the actual code elements. The precise behavior of code repositories remains to be defined, but an initial set of capabilities might be:
- Ability to manage multiple revisions of the same application as well as versions for different architectures.
- Provide information about code, including complete platform requirements for execution, performance profiles, and a description of input and output parameters.
- Support for dynamic linking and shared libraries.
- Access control and policy enforcement.

As with compute and storage systems, there are many instantiations of this basic concept (e.g., many physics applications have developed their own code repositories).

4.5 Catalog

A catalog is a storage service used to maintain mappings of some sort. We discuss it as a distinct entity because of the critical role that online access to catalog information plays in Data Grids and because catalogs introduce some specialized requirements. Like storage and computational resources, catalogs should provide information about their contents, as well as providing access to the actual data elements. The precise behavior of catalogs depends on the catalog architecture that we adopt, but the discussion in [4] suggests that we require catalog mechanisms capable of implementing replica catalogs [1] and metadata catalogs [2], as well as (for the purposes of providing transparency with respect to materialization) derived data catalogs, meta derived data catalogs, and transformation catalogs. All of these catalogs require appropriate support for access control and policy enforcement. One important requirement introduced by Data Grids is for additional attributes in the metadata catalogs maintained by applications, as discussed in [4].

5 Connectivity

As we noted above, it is the protocols defined within the Connectivity and Resource layers that are the main concern of the DGRA. Services at the Connectivity layer of the DGRA are concerned with two sets of issues: communication (basic transport, routing, naming, and so forth) and authentication and credential management. We combine the discussion of authentication requirements with a discussion of larger security issues, as these are crosscutting questions that arise at multiple levels in the DGRA but are most easily discussed together. We see the primary Data Grid security requirements as the following:
- Secure, bilateral authentication of users and resources: that is, mechanisms that allow two entities (e.g., user+service, user+resource, service+resource) to verify each other's identity.
- Confidentiality of transferred data.
- Access control (authorization) expressible by resource, code, and data owners. This corresponds to enforcement of local policy. Examples of such local policies include file system and CPU usage quotas, usage priorities, ACLs, etc.
- Ability to delegate authorization decisions to a community. This ability corresponds to the specification and enforcement of global policies: e.g., "ingestion of new data takes precedence over analysis," or "no one ATLAS user may consume more than 30% of the compute resources dedicated to ATLAS."
- Ability to publish and query access control policies.

Proposed approach: We assume the basic Internet protocols for communication and base our security solution on services provided by the public key-based Grid Security Infrastructure (GSI). In brief, the security solution works as follows:
- At the Fabric level, individual storage and computer systems use their local access control mechanisms; Grid-enabled services are built on top of them.
- At the Connectivity layer, we use PKI-based GSI as our basic authentication technology.
  GSI provides mechanisms for authentication of users and for the creation of proxy credentials that allow a computation to assert securely that it is acting on behalf of a specific user.
- At the Resource layer, our various protocols use GSI mechanisms to verify identity and to communicate proxy credentials.
- At the Collective layer, community authorization servers (CASs) act as entities able to generate proxy credentials that a computation can present at a resource to assert that it is authorized by the appropriate community authorities to perform specified operations, thus avoiding the need for every resource to know about every user. These services are currently under design.

The overall authorization and policy architecture allows individual resources (storage systems, computers, etc.) to express authorization requirements in terms of identity and/or community certificates, hence enforcing site-specified combinations of local and community policy. Two services that we describe in more detail below contribute significantly to the usability and scalability of the overall architecture: the CAS just referred to and the Online Certificate Repository (OCR). Together, these ensure that (a) resources do not need to know about individual users and (b) users do not necessarily need to hold a certificate for all operations.

6 Resource Access

We now define the resource-level protocols, services, and appropriate APIs. The Resource layer Grid-enables Fabric resources, building on Connectivity layer communication and authentication protocols to define protocols (and APIs) for the secure initiation, monitoring, and control of sharing operations on individual resources. Resource layer protocols are concerned entirely with individual resources, and hence ignore issues of global state and atomic actions across distributed collections.

Within the DGRA, we assume a resource model in which every resource supports three basic resource-level protocols: registration, enquiry, and control. The roles of these protocols are as follows:
- Registration protocols are used to notify the Data Grid environment of the existence of a resource.
- Enquiry protocols provide status and configuration information about the resource in question.
- Control protocols are used to initiate and control resource use.

We first discuss resource-independent service registration and enquiry protocols and then the resource-specific protocols used to manage storage, compute, code repository, and catalog services.

6.1 Service Registration and Enquiry

A common protocol requirement for all resources and services is service registration. A service registration protocol allows a resource (or service) to notify another entity of its availability and of how to contact it for purposes of enquiry or control. Such a protocol might be used, for example, to register resources with an information service that then constructs an index of known storage systems. A second common protocol requirement is for enquiry protocols, used to determine the structure and state of a resource or service. For example, in the case of a storage system, an enquiry protocol might allow an application to query the storage system concerning its capacity, speed, availability, and functionality.

Proposed approach: We propose to use the Grid Resource Registration Protocol (GRRP) developed within the Globus project [reference to be provided] as a standard registration protocol.
A resource uses this soft-state registration protocol to provide periodic notifications of its existence to interested parties, indicating the enquiry and control protocols that it supports. We propose to use the Lightweight Directory Access Protocol (LDAP) as a standard enquiry protocol. LDAP defines a protocol for enquiry as well as a standard format and data model for information. Note that we are not necessarily advocating the use of LDAP server technology, simply the use of LDAP as a protocol.

We believe that while different types of resources may well use different control protocols, there are significant advantages to defining standard registration and enquiry protocols within the DGRA. In particular, this facilitates the development of resource discovery services.

To do: Complete the definition of GRRP.

6.2 Storage Systems

As we noted in the Fabric discussion above, we advocate the use of common protocols and APIs to provide standard interfaces to what may be highly heterogeneous devices, in terms of both physical structure and behavior. An obligatory set of core protocols means that an application can always access a storage system in a standard way. These protocols may, for example, allow applications or higher-level services to retrieve, store, delete, and copy files. These protocols and APIs should support the following capabilities, with the caveat that it is desirable that Data Grids be able to make reasonable use of "dumb" resources that support only a small subset of these functions:
- Registration: notify interested entities of the storage service's existence.
- Enquiry: request the status of the storage system, pending requests, etc.; determine capabilities (e.g., support for reservation); publish access and allocation policies.
- Control: file manipulation (open, put, get, list, delete, etc.); determine the status of pending requests; provide hints indicating future access patterns; reserve space, bandwidth, etc.

In general, we will want these mechanisms to be integrated with the DGRA security and policy mechanisms, so that access control can be provided for individual operations in a standard fashion. It is important to note that in our definition of a storage system, we concern ourselves only with the mechanics of storing and retrieving data. Issues such as how the data is structured within a file, which storage systems are used for which files, when to delete files, and so on can be expressed in terms of these mechanisms and are generally relegated to higher layers of the system architecture.

Proposed approach. We propose to:
- Adopt GRRP and LDAP as our registration and enquiry protocols, as noted above.
- Define LDAP object classes to represent storage service capabilities.
- Adopt GridFTP as our control protocol, hence providing for both client-side access and third-party transfers, in addition to support for GSI security, parallelism, striping, etc. FTP's ALLO command can be used to indicate space requirements.
- Adopt GRAM-2 as our reservation protocol.

We propose further to structure our storage service protocols in terms of a core set (basically those spoken by GSI-enabled FTP servers) plus a set of extensions supporting, for example, storage reservation. Hence, Data Grid storage service protocols can be used to manage both standard FTP servers and more sophisticated storage resource managers (SRMs), with the latter systems presumably providing richer functionality.
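To give a flavor of the core (FTP-derived) control protocol, the sketch below drives a plain FTP server using Python's standard ftplib. A GSI-enabled GridFTP client would add authentication, parallel streams, and third-party transfer, none of which ftplib provides; host names and paths are placeholders, and not every server accepts ALLO.

    from ftplib import FTP, error_perm

    # Plain-FTP illustration of core storage control operations (list, ALLO, put, get, delete).
    # GridFTP layers GSI security, parallelism, and third-party transfer on this command model.

    def storage_demo(host="storage.example.org", user="anonymous", passwd="guest"):
        ftp = FTP(host)
        ftp.login(user, passwd)

        print(ftp.nlst("/data"))                      # enquiry-style directory listing

        try:
            ftp.sendcmd("ALLO 1048576")               # hint at space requirement (1 MB)
        except error_perm:
            pass                                      # many servers simply ignore or reject ALLO

        with open("local.dat", "rb") as f:
            ftp.storbinary("STOR /data/remote.dat", f)        # put a file

        with open("copy.dat", "wb") as f:
            ftp.retrbinary("RETR /data/remote.dat", f.write)  # get it back

        ftp.delete("/data/remote.dat")                # delete
        ftp.quit()

The core/extension structure proposed above means that exactly this kind of client continues to work against richer storage resource managers, which simply advertise and accept additional commands.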
Storage management functions, such as advising when to pin files on the server, may be mapped either into the GRAM-2 reservation protocol or into the FTP protocol (using the ALLO and SITE commands, for example). We believe that all storage management systems proposed to date can be expressed within this framework, with significant benefits in terms of uniformity. A potential disadvantage of this approach is that the decomposition of system functions into separate enquiry and control protocols may complicate interactions with some systems or obscure some functionality. However, we argue that these potential disadvantages are more than outweighed by the advantages of presenting a uniform model across all the types of storage systems that one can envision for Data Grid systems.

To do: Develop the above further to indicate required and optional capabilities. Define the object classes used to describe a storage service, including capabilities. Develop a more detailed proposal concerning how GRAM-2 and GridFTP are used to provide access to a storage service.

6.3 Compute Resources

The Data Grid does not impose any additional requirements on compute resources beyond those encountered in other Grid settings. We require the ability to determine availability and capability; to reserve and allocate; to initiate computation; and to monitor and control computation once started. As with storage systems, we advocate the use of standard protocols and client-side APIs so that users can interact with remote resources as logical rather than physical entities. Desired functions are (1) management, e.g., compute system allocation and reservation; (2) initiation, monitoring, and control of computation; (3) informational: request the status of the compute system, pending and running tasks, etc., and publish capabilities; and (4) streaming of logging information produced by the executing program to the remote user.

We observe that in many situations Data Grid operations will be used to position data in a location where it can be accessed and processed by a compute resource. To facilitate such operations, it is necessary to know which Grid-enabled storage elements can be accessed from which Grid-enabled compute resources via non-Grid protocols. Such relationships can be provided via the enquiry protocol supported by the compute resource.

Proposed approach: We propose to adopt the protocols and APIs used in other Grid settings, namely:
- Adopt GRRP and LDAP as our registration and enquiry protocols, as noted above.
- Define LDAP object classes to represent compute service capabilities and system configuration. This information may include capacity information as well as environmental characteristics, such as the resource's connection to other networked resources as reported by measurement agents such as the Network Weather Service.
- Adopt GRAM-2 as our access, reservation, and computation management protocol. We will also investigate ClassAds and matchmaking.

To do: Develop the above further to indicate required and optional capabilities. Define the object classes used to describe a compute service, including capabilities. Complete the GRAM-2 specification.
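As an illustration of how a client might combine enquiry information with the storage/compute co-location relationships described above, the sketch below selects a compute resource for a job and assembles a request structure. Attribute names, the request format, and all hosts and paths are hypothetical; in an actual deployment the attributes would come from LDAP enquiry and the request would be submitted via GRAM-2.

    # Hypothetical sketch: resource selection from enquiry data, then a job request.

    resources = [
        {"contact": "cluster1.example.org", "free_cpus": 64, "load": 0.4,
         "nearby_storage": ["se1.example.org"]},
        {"contact": "farm2.example.org", "free_cpus": 8, "load": 0.9,
         "nearby_storage": ["se2.example.org"]},
    ]

    def choose_resource(resources, cpus_needed, input_location):
        """Prefer resources with enough free CPUs that are 'near' the input data."""
        candidates = [r for r in resources if r["free_cpus"] >= cpus_needed]
        candidates.sort(key=lambda r: (input_location not in r["nearby_storage"], r["load"]))
        return candidates[0] if candidates else None

    target = choose_resource(resources, cpus_needed=16, input_location="se1.example.org")
    job_request = {
        "resource": target["contact"] if target else None,
        "executable": "/grid/apps/reconstruct",      # placeholder path
        "arguments": ["run42.raw"],
        "count": 16,
        "stdout_stream": "gsiftp://se1.example.org/logs/run42.out",  # illustrative URL
    }
    print(job_request)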
6.4 Code Repositories

We require enquiry mechanisms for determining available software and, for each piece of software, such information as supported architectures, owner, creator's name, version, time of last update, and performance model. We also require control mechanisms that can be used, for example, to deposit, retrieve, and update program files.

Proposed approach. We propose to:
- Adopt GRRP and LDAP as our registration and enquiry protocols, as noted above.
- Define LDAP object classes to represent code repositories and software.
- Adopt GridCVS as the protocol for specifying and transporting software configurations.

To do: Define the object classes used to represent software attributes. Study existing code repositories, e.g. within the physics experiments, and use this information to develop a more detailed proposal that can coexist with existing systems.

6.5 Catalogs

We require mechanisms for determining catalog characteristics and for accessing and updating catalog contents.

Proposed approach. We propose to:
- Adopt GRRP and LDAP as our registration and enquiry protocols, as noted above.
- Define LDAP object classes to represent catalogs and their contents.
- Adopt LDAP as the protocol for querying and updating catalogs; develop standard APIs.
- Develop mediators to interface with application-specific metadata catalogs (MDCs). This will allow the Data Grid to retrieve information from the various MDCs already developed by applications, as well as to update these catalogs upon data materialization.
- Adopt, in particular, the replica catalog (RC) structure and API defined in [1].

To do: Define the object classes used to represent catalog attributes. Study existing metadata catalogs, e.g. within the physics experiments, and use this information to develop a more detailed proposal that can coexist with existing systems. Develop further the derived data catalog structures defined in [4].

7 Collective Services

Collective protocols and services (and APIs) build on Resource-layer protocols to enable interaction with multiple resources. We assert that, in general, the set of collective services required in a Data Grid system is not yet well understood and must emerge as a result of substantial experience. However, there is clearly a set of basic services that is necessary in any Data Grid: specifically, information, cataloging, replica management, and replica selection services, and presumably others. We focus our attention on these basic services.

7.1 Information Services

An information service supports enquiries concerning the structure, state, availability, etc., of multiple Grid resources. Resource attributes may be relatively static (e.g., machine type, operating system version, number of processors) or dynamic (e.g., available disk space, available processors, network load, CPU and I/O load, rate of accomplishing work, queue status and progress metrics for batch jobs). Different information service structures may be used depending on the types of information to be queried and the types of queries to be supported.

Proposed approach: We propose to adopt the Globus Toolkit's Metacomputing Directory Service (MDS) information service architecture. MDS uses the GRRP and LDAP registration and enquiry protocols to construct Grid Index Information Services (GIISs), which receive GRRP registration messages and maintain a list of active services; use LDAP queries to retrieve resource descriptions, from which they construct indices, with time-to-live information indicating how frequently indices should be updated; and use those indices to process incoming LDAP queries.

To do: Investigate further the GIIS functionality required in Data Grid applications.
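The essential behavior of such an index service is soft state: registrations carry a time to live, stale entries are dropped, and queries are answered from whatever is currently cached. The toy model below illustrates that behavior only; it is not the MDS or GIIS implementation, and all names and attributes are invented for the example.

    import time

    # Toy soft-state index (GIIS-like): registrations expire unless refreshed.

    class SoftStateIndex:
        def __init__(self):
            self._entries = {}   # service name -> (expiry time, attributes)

        def register(self, name, attributes, ttl=300):
            """Handle a registration message: refresh or add an entry with a time to live."""
            self._entries[name] = (time.time() + ttl, attributes)

        def _expire(self):
            now = time.time()
            self._entries = {n: e for n, e in self._entries.items() if e[0] > now}

        def query(self, **required):
            """Return names of live services whose attributes match all required values."""
            self._expire()
            return [n for n, (_, attrs) in self._entries.items()
                    if all(attrs.get(k) == v for k, v in required.items())]

    index = SoftStateIndex()
    index.register("se1.example.org", {"type": "storage", "reservation": True}, ttl=600)
    index.register("cluster1.example.org", {"type": "compute", "cpus": 64}, ttl=600)
    print(index.query(type="storage", reservation=True))   # -> ['se1.example.org']

The soft-state design is what keeps the index robust without global coordination: a resource that disappears simply stops refreshing its registration and ages out of the index.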
7.2 Replica Management Service

An effective technique for improving access speeds and reducing network loads when many users must access the same large datasets is to replicate frequently accessed datasets at locations chosen to be "near" the eventual users. However, organizing such replication so that it is both reliable and efficient can be a challenging problem, for a variety of reasons. The datasets to be moved can be large, so issues of network performance and fault tolerance become important. The individual locations at which replicas may be placed can have different performance characteristics, in which case users (or higher-level tools) may want to be able to discover these characteristics and use this information to guide replica selection. And different locations may have different access control policies that need to be respected.

A replica management service is an entity responsible for keeping track of replicas, providing access to replicas, and generating (or deleting) replicas when required. This service can clearly be layered on functionality described above, specifically the replica catalog and the storage system access protocols.

We note that high-level storage management functions can be created at the Collective level by composing elemental storage systems via Resource-level protocols. For example, hierarchical storage management can be viewed as a replication policy that manages an archival storage element and a networked storage element.

Numerous other replica management strategies can be imagined, but the following basic mechanisms appear to have general utility:
- Reliable, high-speed data movement.
- Fault-tolerant and scalable replica catalogs able to keep track of where replicas have been created and of "in progress" replication efforts.
- Mechanisms for creating new replicas and removing old ones.
- Mechanisms for checking on and/or enforcing the consistency of existing replicas. (How can we do this without knowledge of the internal structure of files?)

Proposed approach. We propose, as a first step, to support the replica management functions that have been proposed within the Globus Data Grid Toolkit [1]. These functions build on the GridFTP data transport protocol and the replica catalog mechanisms described below to provide functions for:
- The registration of files with the replica management service.
- The creation and deletion of replicas of previously registered files.
- Enquiries concerning the location and performance characteristics of replicas.
- The updating of replicas to preserve consistency when a replica is modified. (A formal definition of consistency for the replica management service has yet to be defined.)
- Management of access control at both a global and a local level.

To do: Determine experimentally how effective these techniques are in practice. Develop basic ideas for replica generation/deletion services.
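To make the division of labor concrete, the following sketch pairs a minimal replica catalog with a "create replica" operation that copies a file and records the new location. The copy_file callable stands in for a reliable (e.g., third-party GridFTP) transfer; the catalog layout and method names are illustrative and are not the API of [1].

    # Minimal sketch of a replica catalog plus a create-replica operation (names hypothetical).

    class ReplicaCatalog:
        def __init__(self):
            self._replicas = {}   # logical file name -> set of physical locations

        def register_file(self, logical_name):
            self._replicas.setdefault(logical_name, set())

        def add_location(self, logical_name, physical_url):
            self._replicas.setdefault(logical_name, set()).add(physical_url)

        def remove_location(self, logical_name, physical_url):
            self._replicas.get(logical_name, set()).discard(physical_url)

        def locations(self, logical_name):
            return sorted(self._replicas.get(logical_name, ()))

    def create_replica(catalog, logical_name, source_url, dest_url, copy_file):
        """Copy an existing replica to a new location and record it in the catalog."""
        copy_file(source_url, dest_url)          # e.g., a third-party GridFTP transfer
        catalog.add_location(logical_name, dest_url)

    rc = ReplicaCatalog()
    rc.register_file("run42.raw")
    rc.add_location("run42.raw", "gsiftp://se1.example.org/data/run42.raw")
    create_replica(rc, "run42.raw",
                   "gsiftp://se1.example.org/data/run42.raw",
                   "gsiftp://se2.example.org/cache/run42.raw",
                   copy_file=lambda src, dst: print(f"copy {src} -> {dst}"))
    print(rc.locations("run42.raw"))

Note that the hard problems listed above (reliable movement, in-progress bookkeeping, consistency) live in the transfer mechanism and in the policies around it, not in the catalog data structure itself.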
7.3 Catalog Management Services

The issue of catalog management is complex, as the catalogs deal with both application-specific and Data Grid-specific entities. For example, both the derived metadata catalog (DMDC) and the MDC defined in [4] contain attributes that are specific to an application. Applications have spent considerable effort developing MDCs, and it is not within the scope of this work to design catalog schemas for applications; instead, we want to design interfaces to existing structures. Therefore it is necessary for the DGRA to:
- Utilize currently existing MDCs.
- Provide means of querying and updating the catalogs.
- Provide a derived data catalog (DDC) structure that captures descriptions of the transformations performed by the applications.
- Build DDC and replica catalog (RC) structures that are as generic as possible.
- Provide well-defined interfaces to the catalogs, so that both users and applications can construct relevant queries.

Proposed approach. We propose to:
- Design mediators that can interface with the application-specific catalogs. This will allow the Data Grid to retrieve information from the various catalogs already developed by the applications, as well as to update those catalogs upon data materialization.
- Augment the databases used by the applications with new attributes that support the Data Grid, such as the transformation used to produce the data (in the case of the MDC) or the identifiers of the derived data products (in the case of the DMDC).
- Support in the DDC attributes that are common to all applications, such as transformation names, input file names, and so on.

7.4 Community Authorization Services

Community authorization services (CAS) are an important component of the DGRA authorization architecture. A CAS enables the resources used by a community to implement a shared global policy for the use of these community resources. Examples include enabling group access to a specific collection of files (regardless of where they are stored), or ensuring fair-share use of network bandwidth across all members of the community.

Proposed approach. A CAS architecture is currently under development and is described in a separate document. This CAS definition leverages Connectivity layer security protocols. The central design concept is to use the CAS to issue credentials to a user that are good for a class of operations, and for other services to be able to determine the validity of a requested operation without resorting to any other online service.

7.5 Online Certificate Repository

(Or is it online CAs that we want?)

7.6 Request Management

Resource-level services support the basic mechanisms required to manipulate data and resources on the Data Grid. We anticipate that higher-level management services will need to be defined in order to facilitate use of the Data Grid. These services will capture common usage patterns. A good analogy is the stdio library found on Unix: while applications could code to the low-level read/write system calls, common I/O usage patterns mean that most applications benefit from stdio, which provides buffering, prefetching, and so forth. We anticipate that a range of Data Grid usage patterns will evolve.

Proposed approach. We identify two generic classes of services:
- Request planning.
- Request management.

While the functionality of these services will vary from instantiation to instantiation, we propose to define core protocols and syntax to which all planners and management services must adhere. These include:
- A common control protocol for initiating, tracking, and controlling planning and management operations. GRAM-2 may serve this function.
- A common syntax for expressing requirements and job sequences. We propose that these be XML based; details of the DTD structure need to be defined.
- A standard way of identifying available resource sets and monitoring the status of resources. Planner and management services may participate in GRRP to obtain this information.

8 Issues Not Addressed

This list of services does not address all DGRA requirements. The following are items that we know we have missed:
- Accounting.

To do: Add to the list of missing services.

Appendix

Arie's comments on virtual data, which still need to be integrated:
The concept of "virtual data" is akin to that of "views" in database terminology. In general, a view is a virtual definition of data, possibly including the application of "functions" (similar to the "transformations" in this document). Two things can be learned from the way views are defined, as guidance for the definition of virtual data.

a) The language used to define views is the same language used for queries. So, if such a "request language" exists (the point made above), then it is the same language that should be used to define virtual data. Inventing another declarative mechanism (a question raised under "research challenges") is unnecessary.

b) Some views can become "materialized" or "instantiated." This is done if the view is requested frequently. This is similar to the point made in the Challenges section about users requesting the same virtual data element multiple times. It is a very useful concept, and as a goal materialization could be managed dynamically according to usage.

Ewa's notes on request management:

There are a number of issues that need to be investigated in the area of request management. At this point, we sketch the general behavior of the request manager. Request management (RM) requires six phases:
- Request interpretation, which requires the system to understand the application-specific query posed by the user/application. Possible request languages include ODMG, XML DTDs, and LDAP.
- Request planning. At this point the manager needs to be able to query the MDC for data existence. If the data is materialized, the RM needs to evaluate the performance of accessing the data from the potential locations; in addition, it might be valuable to evaluate the cost of re-deriving the data, in case that is cheaper. If the data is not materialized, the DMDC, DDC, and TC need to be consulted to establish how the data is to be produced and what the cost of producing it is. In both cases, the information services need to be consulted to establish the availability of resources and to estimate the performance costs of data evaluation and retrieval. Based upon the performance models, a plan (or plans) is produced.
- Interface with the user and/or application. It might be desirable to allow the user/application to decide whether, given the estimated cost, to proceed with data materialization, or which way to proceed (batch vs. interactive, for example).
- Request execution. The needed resources are allocated (if possible) and the computation is scheduled.
- Request monitoring. We need to provide the ability to monitor the progress of the request, so that this information is available to the user and to the RM. The RM needs to be able to deal with failures and construct contingency plans.
- Information update. Upon request completion, we need to update the MDC and DDC to reflect that the data has been materialized, and to notify the user/application and return the output.

Issue: We need to define what happens when a transformation produces a whole range of data values. Should we register all the datasets, or throw them away? To decide this issue we need to enhance our understanding of how the application programs (the transformations in our model) behave.

Acknowledgments

We are grateful to our colleagues within the Earth Systems Grid, European Data Grid, GriPhyN, and Particle Physics Data Grid projects for numerous helpful discussions on the topics presented here.

Bibliography
1. Allcock, B., Bester, J., Bresnahan, J., Chervenak, A.L., Foster, I., Kesselman, C., Meder, S., Nefedova, V., Quesnel, D. and Tuecke, S. Secure, Efficient Data Transport and Replica Management for High-Performance Data-Intensive Computing. In Mass Storage Conference, 2001.
2. Baru, C., Moore, R., Rajasekar, A. and Wan, M. The SDSC Storage Resource Broker. In Proc. CASCON'98 Conference, 1998.
3. Chervenak, A., Foster, I., Kesselman, C., Salisbury, C. and Tuecke, S. The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Data Sets. J. Network and Computer Applications, 2001.
4. Deelman, E., Foster, I., Kesselman, C. and Livny, M. Representing Virtual Data: A Catalog Architecture for Location and Materialization Transparency, 2001. In preparation.
5. Foster, I., Kesselman, C. and Tuecke, S. The Anatomy of the Grid: Enabling Scalable Virtual Organizations. Intl. J. Supercomputer Applications, 2001 (to appear). http://www.globus.org/research/papers/anatomy.pdf.
6. Moore, R., Baru, C., Marciano, R., Rajasekar, A. and Wan, M. Data-Intensive Computing. In Foster, I. and Kesselman, C. (eds.), The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, 1999, 105-129.