Data Grid Web Services
Chip Watson, Jie Chen, Ying Chen, Bryan Hess, Walt Akers

A Three Tier Web Services Architecture
[Diagram: Web Browser; Application; Authenticated connections; XML to HTML servlet; Web Server (Portal) hosting Web Services; Grid Service; Grid resources, e.g. Condor; Local Backend Services; Storage system (batch, file, etc.); Batch system; Remote Web Server]

Why Web Services?
- Strong industry support & growing adoption
- Self describing interfaces & protocol
- Support in all languages
- Easy addition of additional input or output parameters
- Interface evolution w/o breaking what works

PPDG Architecture
[Diagram: Application; DAG; Catalog Services; Monitoring; Planner; DAG Executor; Info Services; Repl. Mgmt.; Policy/Security; Reliable Transfer Service; Compute Resource; Storage Resource; Network Resource]

Data Grid Web Services Architecture
[Diagram, Single Site: Web Services (Meta Data Catalog, Replica Catalog, Replication Service); File Client; HRM++ Service; File Server(s); HRM Listener; Storage Resource]

Components: Replica Catalog
- Get replicas: GFN -> SURLs
- Get best replica?
- Create replica
  - Input: GFN, SURL
  - Specify <meta data> for new GFN
- Remove replica
  - Input: GFN, SURL
- Make / delete directory (recursive)
- Directory listing
  - terse or verbose
  - optionally more than 1 level deep
  - optionally matching a pattern (regexp?)
- Create / delete link (soft) to another file or directory

Components: HRM Listener
This component serves as the link between the grid-unaware HRM and the replica system. The HRM / storage resource generates 2 possible types of events.
- Advice request: proposed deletion of file X. The listener responds with advice as a number in the range 0.0 (please don't) to 1.0 (OK). The listener could base this advice on interaction with the replica catalog, for example to discover whether this is the last disk-resident copy.
- State change notification: file X is added or deleted, or its cache state is changed. In this case the listener updates the replica catalog.

Components: Replication Service
This component acts as an agent for the client to make replicas and manipulate replica policy.
Web services:
- Copy a replica of GFN / SURL to site X
- Get status of replication operation
- Add / edit / remove a local replication policy (push, maybe pull)
- To implement a replication policy, it may register as a listener with the HRM

Components: HRM++ Service
HRM Web Services:
- File status (cached, pinned, permanent, size, owner, etc.)
- File status changes (e.g. stage a file, pin a file, make permanent)
- Mapping from SURL to TURL for file get, including protocol negotiation
- Space allocations for put, including protocol negotiation to yield TURL
Extended functions:
- Directory listings, search (like replica catalog)
- Third party file transfers, as reliable as possible, to/from another Data Grid Site, or to/from a site with a supported protocol (e.g. ftp site)

Technologies Employed
- Apache web server
- Tomcat servlet engine
- JAXM for SOAP Messages
- XML data format

Implementation
- Replica Catalog
  - SOAP servlet + MySQL back end
  - (future) global replication policy, client to replication service
- HRM++ Service
  - HRM: SOAP servlet wrapping JASMine
  - Extensions to HRM: reliable file transfer (wrap GridFTP, etc.), queuing directory listings, tree search
- Replication Service
  - SOAP servlet + MySQL for request persistence & queues
  - (future) listener for new files + policy for replication (push)
- HRM Listener
  - SOAP servlet, client to Replica Catalog

Status
Year-old, limited prototypes using raw XML:
- Replica catalog
  - Read-only listings, GFN -> SURL
  - Loaded with silo info (>100,000 files)
- Pre-HRM service
  - Read-only listings, SURL -> TURL (multi-protocol)
New SOAP components currently in development:
- Replica catalog
  - Full capabilities except ACLs and user-defined meta-data (deferred)
- HRM++ service
  - Recursive file transfer client <-> unmanaged storage (jparss)
  - 3rd party reliable file transfers

WSDL
Web Services Description Language (equivalent to CORBA IDL)
http://lqcd.jlab.org/grid/gridService_wsdl.xml

Data Grid File Manager Client Application
Capabilities (prototype):
- Browse contents of file system
  - Managed disk cache on data grid node
  - Unmanaged local or remote file system
  - Tertiary storage (eventually HRM)
- Move files between managed and unmanaged storage
  - Within a single data grid node
  - Between local file system and data grid node
  - 1Q02: Between data grid nodes (3rd party transfer)
- Status: displays whether a file is currently in disk cache
- Migrate from tape to disk (not released)

Standardization Activities
PPDG Activity: JLab is working with the SRB group to standardize web services (WSDL) for managing a data grid
- Common interface for JASMine and SRB
- Web services client to inter-operate between dissimilar back ends
- Extend to additional systems once operational
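
Illustrative sketch: HRM Listener and Replica Catalog
The interaction between the HRM Listener and the Replica Catalog described in the component slides can be sketched in a few lines of Python. This is only an in-memory illustration, not the SOAP servlet implementation: the class and method names (ReplicaCatalog, HrmListener, advise_deletion, on_state_change), the event strings, the srm:// SURLs, and the example.org hosts are all hypothetical. The all-or-nothing advice rule (0.0 for the last registered copy, 1.0 otherwise) is one assumed policy, and the check counts registered replicas rather than verifying that each one is disk resident.

```python
from collections import defaultdict


class ReplicaCatalog:
    """In-memory stand-in for the replica catalog (GFN -> set of SURLs)."""

    def __init__(self):
        self._replicas = defaultdict(set)

    def create_replica(self, gfn, surl):
        """Register a replica SURL under a global file name."""
        self._replicas[gfn].add(surl)

    def remove_replica(self, gfn, surl):
        """Unregister a replica; drop the GFN once no replicas remain."""
        self._replicas[gfn].discard(surl)
        if not self._replicas[gfn]:
            del self._replicas[gfn]

    def get_replicas(self, gfn):
        """Return the known SURLs for a GFN (empty list if unknown)."""
        return sorted(self._replicas.get(gfn, ()))


class HrmListener:
    """Links a grid-unaware storage manager to the replica catalog."""

    def __init__(self, catalog):
        self.catalog = catalog

    def advise_deletion(self, gfn, surl):
        """Return advice in [0.0, 1.0]: 0.0 = please don't, 1.0 = OK.

        Assumed policy: refuse only when this is the last registered copy.
        """
        others = [s for s in self.catalog.get_replicas(gfn) if s != surl]
        return 1.0 if others else 0.0

    def on_state_change(self, event, gfn, surl):
        """Mirror a storage event into the catalog (cache-state events omitted)."""
        if event == "added":
            self.catalog.create_replica(gfn, surl)
        elif event == "deleted":
            self.catalog.remove_replica(gfn, surl)
        else:
            raise ValueError(f"unsupported event: {event}")


# Usage: one copy registered, then a second copy at another site.
catalog = ReplicaCatalog()
listener = HrmListener(catalog)
gfn = "/grid/run1.dat"
site_a = "srm://site-a.example.org/cache/run1.dat"
site_b = "srm://site-b.example.org/cache/run1.dat"

listener.on_state_change("added", gfn, site_a)
advice_last_copy = listener.advise_deletion(gfn, site_a)

listener.on_state_change("added", gfn, site_b)
advice_with_other_copy = listener.advise_deletion(gfn, site_a)

listener.on_state_change("deleted", gfn, site_a)
remaining = catalog.get_replicas(gfn)
```

With only one registered copy the listener advises against deletion; once a second copy exists it returns 1.0, and the "deleted" notification leaves only the site B replica in the catalog. A graded score between 0.0 and 1.0 (for example, weighting by cache pressure) would fit the same interface.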