Java Globus Development Plan

Peter Lane, Jarek Gawor, Darcy Quesnel and Gregor von Laszewski
Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Illinois

Michael Russell
Department of Computer Science, University of Chicago, Chicago, Illinois

Contents

1 What is Java Globus?
2 Why Java?
3 Current Functionality Provided by Java Globus
  3.1 APIs
    3.1.1 GSI (org.globus.security)
    3.1.2 GRAM (org.globus.gram)
    3.1.3 MDS (org.globus.mds)
    3.1.4 GASS (org.globus.io.gass)
    3.1.5 GSIFTP (org.globus.io.ftp)
    3.1.6 UrlCopy (org.globus.io.urlcopy)
    3.1.7 RSL (org.globus.rsl)
    3.1.8 MDSML (org.globus.mdsml)
    3.1.9 MyProxy (org.globus.myproxy)
    3.1.10 GARA (org.globus.gara)
  3.2 Command-line Tools
    3.2.1 Supported in Globus
    3.2.2 Not supported in Globus
  3.3 GUI Components and Tools
    3.3.1 Configuration Wizard
    3.3.2 Grid Proxy Init
    3.3.3 MyProxy
    3.3.4 GARA Workbench
    3.3.5 GridDesktop
    3.3.6 MDS Components
  3.4 Other Features
4 Current Shortcomings of Java Globus
  4.1 APIs
    4.1.1 GSI
    4.1.2 DUROC
    4.1.3 GASS
    4.1.4 GridFTP
    4.1.5 GARA
  4.2 Command-line Tools
  4.3 GUI Components and Tools
  4.4 Other Shortcomings
5 Java Globus Development Plan
  5.1 General comments
  5.2 Code Review
  5.3 Main Development Items
    5.3.1 Short-term items (1-3 weeks)
    5.3.2 Medium-term items (1-3 months)
    5.3.3 Long-term items (3 or more months)
  5.4 Other Development Ideas
    5.4.1 Ideas that need some evaluation (to investigate feasibility)
    5.4.2 Student projects (summer-long projects)
6 Appendix
  6.1 Project Participants
    6.1.1 Project coordination
    6.1.2 Main developers
    6.1.3 Contributors
  6.2 Projects Using Java Globus
7 References

1 WHAT IS JAVA GLOBUS?

Java Globus (commonly known as the Java CoG Kit [1] [2] [3]) is an implementation of Globus protocols and functionality in Java. The components provided by Java Globus are not limited to those of the C Globus implementation (herein referred to simply as “Globus”).

2 WHY JAVA?

First, it should be noted that there is a strong and growing user community in the area of Java Grid application development. For example, the EU Datagrid effort recently defined Java, in addition to C, as one of their target implementation languages [4]. If nothing else, this should be acknowledged as justification for considering Java as a key component in the future evolution of Grid technology.

That is not to say that picking Java to develop Grid applications is an arbitrary decision. Java is a modern, object-oriented programming language that, as a result, makes software engineering of large-scale distributed systems much easier. Java also has the advantage of platform-independence because of its intermediate “bytecode” representation. This feature allows any pure Java application to run without recompilation on any system that supports a Java 2 Virtual Machine (see [5] for a list of platform ports). Since platform-independence is important when delivering applications in heterogeneous environments like Grids, Java has a big advantage for Grid developers and users.

The Java environment includes core libraries that implement common Internet protocols and functionality.
Among them are rich and easy-to-use networking libraries that are of particular use in the Grid environment.

Java is a safe language. It is type safe, performs array bounds checking, and supports sandboxing of running applications. Java also allows for explicit security: the Java environment offers cryptography APIs that provide authentication, data integrity, and confidentiality.

Java also has a number of technologies that would be worth exploring in relation to Grid computing. Some of these technologies, which could be layered either above or below Globus, include JAAS, JINI, JNDI, JSP, EJBs, and CORBA interoperability. For example, the Java Authentication and Authorization Service (JAAS) [6] enables fine-grained and extensible access control based on who signed the code and/or who runs the code.

A myth about Java is that it exhibits poor performance because it is an interpreted language. The reality is that, for example, Java socket and file I/O can be just as fast as in C++. Furthermore, JIT (just-in-time) technology helps dispel the myth that Java performs poorly when compared to compiled languages such as C or Fortran. With JIT technology, Java bytecode is compiled into native code just before execution, resulting in a significant performance gain. In fact, code running under IBM’s latest JVM outperforms the same code when compiled directly to native code (see [7] and [8]). These and other performance gains by the Java platform show Java to be a good choice for high-performance applications that are submitted to a server, for the servers they run on, and for the client applications that do the submitting.

One final note to show the wide acceptance of Java as a serious Internet development language: Java will be embraced as a full member of the .NET suite of tools, according to statements by Microsoft [9].

3 CURRENT FUNCTIONALITY PROVIDED BY JAVA GLOBUS

Java Globus currently provides most of the client-side functionality of Globus.
All of the libraries and tools are pure Java classes and can be executed on any operating system that supports Java 2 without any recompilation or modification. Java Globus also provides additional tools and libraries that are not currently distributed with Globus, such as MyProxy, GARA, and MDSML.

3.1 APIs

The following is a list of libraries and tools that are currently part of Java Globus. Each item is followed by a short, technical description of the library that highlights differences from and similarities to the C Globus implementation.

3.1.1 GSI (org.globus.security)

This library is a partial implementation of GSI. It is fully compatible with Globus GSI and can be used to write GSI-enabled clients and servers. It supports both host and subject authorization. It does not, however, implement the GAA or offer a GSS API interface. It does not support certificate revocation lists (CRLs) and does not check certificate extensions. One unique feature of the library is that it can manage multiple credentials at the same time in the same process. Any library that uses the Java GSI library can take advantage of this capability.

3.1.2 GRAM (org.globus.gram)

This library is a full implementation of the GRAM client API. It allows for submitting and canceling jobs, polling for job status, and sending signals to a job. The library enables a user to ‘ping’ a gatekeeper to verify whether the user can authenticate to it. In addition, this library allows for registering and un-registering callback listeners that listen for job status updates. The callbacks are implemented as Java events. Beyond the functionality of Globus, the Java GRAM API allows the caller to specify the type of delegation to perform, either full or limited.

3.1.3 MDS (org.globus.mds)

This library provides convenience APIs to access the MDS service. It allows for querying the MDS and adding, modifying or deleting MDS entries.
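Because the MDS is served over LDAP and this library builds on JNDI, the shape of a query can be illustrated with the standard JNDI classes alone. The following is a minimal sketch under stated assumptions, not the org.globus.mds API: the class name MdsQuerySketch, the host mds.example.org, the port 2135, and the object class used in the filter are all hypothetical, and the code only prepares the query without opening a connection.

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.directory.SearchControls;

/** Hypothetical sketch: prepares an LDAP query with plain JNDI; it never connects. */
final class MdsQuerySketch {

    /** Builds a JNDI environment that selects the JDK's LDAP provider for the given URL. */
    static Hashtable<String, Object> buildEnvironment(String providerUrl) {
        Hashtable<String, Object> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, providerUrl);
        return env;
    }

    /** Searches the whole subtree and returns only the named attributes. */
    static SearchControls buildControls(String... attributes) {
        SearchControls controls = new SearchControls();
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
        controls.setReturningAttributes(attributes);
        return controls;
    }

    /** Builds a filter for one object class; assumes the name needs no escaping. */
    static String objectClassFilter(String objectClass) {
        return "(objectclass=" + objectClass + ")";
    }

    public static void main(String[] args) {
        // Host, port, and object class are placeholders chosen for illustration only.
        Hashtable<String, Object> env = buildEnvironment("ldap://mds.example.org:2135");
        SearchControls controls = buildControls("cn");
        String filter = objectClassFilter("top");

        System.out.println("provider url: " + env.get(Context.PROVIDER_URL));
        System.out.println("search scope: " + controls.getSearchScope());
        System.out.println("filter:       " + filter);
    }
}
```

Against a reachable directory, the prepared values would typically be passed to `new InitialDirContext(env).search(baseDn, filter, controls)`, where the base DN depends on how the particular MDS server is configured.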
The library is based on the JNDI library and enables communication with the MDS over a SASL interface (secure MDS) from both JNDI and the Netscape Directory SDK.

3.1.4 GASS (org.globus.io.gass)

This library provides client and server GASS functionality. The Java GASS implementation is fully compatible with Globus GASS. It allows, for example, a Java GASS client to connect to and transfer a file from a Globus GASS server, or a Globus GASS client to connect to and transfer a file from a Java GASS server. The Java GASS client provides the file-access API, while the Java GASS server provides the ‘server-ez’ API. Java Globus does not support the cache management functionality at this point, nor does it follow the full client and server C API.

3.1.5 GSIFTP (org.globus.io.ftp)

This library provides a client API for accessing and transferring files from GSI-enabled FTP servers. It provides all the common FTP commands as methods, implements more advanced functionality such as recursive file transfers (transferring entire directories), and supports third-party transfers. It does not, however, support any other advanced GridFTP functionality at this point.

3.1.6 UrlCopy (org.globus.io.urlcopy)

This library provides a simple API for transferring a file from one location to another. The locations are specified as URLs, and any combination of the following protocols is supported: HTTP, HTTPS, FTP, GSIFTP, and FILE. Also, third-party transfers can be initiated between any FTP servers that support that feature.

3.1.7 RSL (org.globus.rsl)

This library provides a convenience API for creating, manipulating, and checking the validity of RSL expressions. It also handles an XML-based RSL representation.

3.1.8 MDSML (org.globus.mdsml)

This library provides an API that aids in the conversion of various schemas to and from the MDSMLv1 format. For example, an MDSML document can be translated into the LDAPv3 schema format, or the DSML format.
Work has already been approved to upgrade this library to conform to the MDSMLv2 specification, among other related activities. The C Globus distribution does not include a corresponding library.

3.1.9 MyProxy (org.globus.myproxy)

This library provides the MyProxy client API in Java. It is fully compatible with the C implementation of MyProxy. It allows for uploading Globus credentials to a MyProxy server, retrieving the stored credentials from the server, and destroying them. The C Globus distribution does not include a corresponding library.

3.1.10 GARA (org.globus.gara)

This library is an implementation of the GARA reservation API in Java. It allows for creating, canceling, and modifying various types of reservations, including network, CPU, and SGI graphics pipe reservations. It also allows for polling reservation status and for registering and un-registering callback listeners, just as in the GRAM library. The C Globus distribution does not include a corresponding library.

3.2 Command-line Tools

Java Globus contains a set of command-line scripts intended to provide client-side functionality similar to that of Globus. The command-line tools are Bourne shell scripts and Windows batch files that are thin wrappers around the Java classes that actually implement the functionality.

3.2.1 Supported in Globus

• globusrun
  o Does not support multi-requests (see GramMultiJobRequest below).
• globus-url-copy
• grid-proxy-init
  o Does not mask the entry of one’s passphrase, because the Java language standard defines no way to turn off echoing of input typed at the terminal. A GUI version called visual-grid-proxy-init (see below) is provided, which properly masks an entered passphrase.
• grid-proxy-info
• grid-proxy-destroy
• grid-cert-info
• globus-gass-server
• globus-gass-server-shutdown
• grid-info-search
• grid-change-pass-phrase

3.2.2 Not supported in Globus

• visual-grid-proxy-init
  o Same as grid-proxy-init, but allows graphical manipulation of proxy settings and provides a masked passphrase entry field (see grid-proxy-init above).
• myproxy
• visual-myproxy
• GramMultiJobRequest
  o Submits multiple GRAM job requests using a DUROC-compatible, multi-job RSL.
• mdsml-converter

3.3 GUI Components and Tools

Java Globus contains a set of graphical components that demonstrate the usability of Java Globus as a foundation for graphical Grid applications, and that provide a convenient interface to low-level Grid client tools.

3.3.1 Configuration Wizard

The Configuration Wizard provides the user with a uniform way of configuring Java Globus. It should be noted that the configuration process is rather trivial and could easily be done manually.

3.3.2 Grid Proxy Init

The Grid Proxy Init component provides a visual interface for creating Globus proxies (see visual-grid-proxy-init).

3.3.3 MyProxy

This component is used to manage proxies on MyProxy servers.

3.3.4 GARA Workbench

This application provides a simple user interface for managing network reservations. It has been shown as a demo at a past Supercomputing conference.

3.3.5 GridDesktop

The Grid Desktop is a next-generation Portal application that demonstrates the benefits of Java, the Grid, and web-based Portals. Currently, this component is implemented as an application. This component can also be used to demonstrate the integration of drag ‘n’ drop functionality into Grid applications.

3.3.6 MDS Components

These components are simple examples of how to build graphical components on top of the MDS library, and can be used to develop more sophisticated components.

3.4 Other Features

• Firewall support
  o Ability to set the port ranges for machines behind firewalls or NAT servers.
• Support for /dev/urandom device
  o Uses the /dev/urandom device, when available, for faster seed generation in the GSI library implementation.

4 CURRENT SHORTCOMINGS OF JAVA GLOBUS

It should be noted that the word “shortcomings” does not imply that Java Globus is in any way unusable. Here, “shortcomings” refers to specific features that one might expect or assume exist in Java Globus, given the C implementation, but that for whatever reason are not available.

4.1 APIs

In general, the Java Globus APIs do not conform to the C Globus APIs. Reasons for this include inherent language differences (object-oriented vs. procedural programming, events vs. callbacks, etc.), different user requirements, and limited personnel and resources. This section describes discrepancies in the Java APIs with respect to the C APIs that may be of particular interest.

4.1.1 GSI

The Java GSI library does not offer a GSS API interface and does not implement GAA API functionality. It does not support certificate revocation lists (CRLs) and does not check certificate extensions.

4.1.2 DUROC

Though originally there was a JNI-based implementation of the DUROC API, it is no longer supported, and neither are any of the other JNI versions of the APIs. A pure Java implementation of DUROC has not been developed for the following reasons:

1. DUROC communicates with sub-jobs via Nexus. Although there is a Java port of the Nexus library, the library was not kept up-to-date. Also, the Globus team discourages the use of Nexus in favor of Globus IO.
2. The effort needed to implement DUROC correctly is currently not worth the commitment of the Java Globus team’s limited resources.

Once DUROC is re-designed, a pure-Java implementation of the API will be provided.

4.1.3 GASS

GASS cache management is not implemented.

4.1.4 GridFTP

Basic GridFTP functionality is provided, but advanced functionality such as parallelism and striping is not.
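To make the term concrete, parallelism here means moving one file over several data streams at once, so each stream has to know which part of the file it carries. The following is a simplified sketch of that idea only: it divides a file of a given size into contiguous byte ranges, one per stream. It is not the GridFTP wire protocol and not Java Globus code; the class ParallelRangesSketch and its methods are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: splits a file into contiguous byte ranges, one per data stream. */
final class ParallelRangesSketch {

    /** A half-open byte range [start, end). */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }

        long length() {
            return end - start;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    /** Divides fileSize bytes as evenly as possible; the first ranges absorb any remainder. */
    static List<Range> split(long fileSize, int streams) {
        if (fileSize < 0 || streams < 1) {
            throw new IllegalArgumentException("fileSize must be >= 0 and streams >= 1");
        }
        List<Range> ranges = new ArrayList<>();
        long base = fileSize / streams;
        long extra = fileSize % streams;
        long offset = 0;
        for (int i = 0; i < streams; i++) {
            long length = base + (i < extra ? 1 : 0);
            ranges.add(new Range(offset, offset + length));
            offset += length;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Ten bytes over three streams: range lengths 4, 3 and 3.
        for (Range range : split(10, 3)) {
            System.out.println(range + " length=" + range.length());
        }
    }
}
```

Striping extends the same idea across several hosts rather than several streams between one pair of hosts; a real client would also need the server-side protocol support that this sketch leaves out.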
4.1.5 GARA

The Java GARA library currently provides support only for network reservations. However, support for other types of reservations can easily be added to the library.

4.2 Command-line Tools

• globus-job-*
  o These are not currently deemed a priority because GASS cache management is not implemented and the C versions of these tools are not widely used.
• grid-info-*
  o Equivalent tools are provided as part of the Netscape Directory SDK for Java distribution.
• globus-personal-gatekeeper
  o A pure-Java gatekeeper implementation does not exist at this time.

4.3 GUI Components and Tools

Due to a lack of resources, most of the GUI components cannot reasonably be assumed to conform to the JavaBeans component architecture. Though it would be nice to ensure compatibility with the JavaBeans standard, GUI development is not a high priority at the moment.

4.4 Other Shortcomings

• SSL/cryptography libraries
  o The SSL and cryptography libraries used to implement the GSI library are currently commercial libraries. Both libraries are from IAIK [10]. Each can be replaced with an alternative with some further work.

5 JAVA GLOBUS DEVELOPMENT PLAN

5.1 General comments

The development of Java Globus will be focused on general, reusable libraries and services. Simple, reusable graphical components and tools will only be developed for demonstration purposes. It is also important to note that all the items listed in the plan are either directly requested by Java Globus users, or provide common functionality reusable by these users. The libraries and services will be fully compatible (to the extent possible) with Grid protocols and interoperable with other Grid services.

5.2 Code Review

As part of a formal development plan for Java Globus, periodic code reviews will be scheduled as needed, following an initial review that is pending acceptance of the plan.
The following are the main points to consider during the code reviews:

• Consistency of the APIs
  o Naming conventions (among Java Globus APIs and with respect to the C APIs)
  o Identification of and improvements for deficiencies in existing libraries
• Coding standards
  o A document will be drawn up based on the Globus coding standards but with some variation as a result of differences in the languages and issues at hand.
• Documentation
  o Missing documentation will be filled in.
  o Current documentation will be reviewed for accuracy and completeness.

5.3 Main Development Items

The following is a list of items that we would like to implement and include as part of Java Globus. The main goal of this section is to present a list of ideas for consideration. After a finalized set of development items is determined, specific development details will be drawn up for each.

5.3.1 Short-term items (1-3 weeks)

5.3.2 Medium-term items (1-3 months)

• Finish GSI-LDAP implementation.
  o Provide better documentation and examples.
• More advanced RSL handling API
  o Create a class with methods for manipulating frequently-used attributes, such as getExecutable(), getArguments(), setDirectory(), etc.
• Loading of proxies from byte arrays
  o Add a feature to the GlobusProxy API to allow loading of proxies from byte arrays or InputStreams.
• Connection pooling
  o Develop a connection pooling class for managing multiple FTP-based connections.
• Improve overall and per-package documentation.
• Develop an advanced job management library: a simple component that can do all of the following:
  o Input data and executable staging
  o Job submission
  o Job output movement
  o Pre- and post-job submission operations

5.3.3 Long-term items (3 or more months)

• GridFTP
  o Provide full support for the GridFTP protocol. (This was supposed to have been done as a part of the official GridFTP campaign.)
• Replicas
  o Catalog
    ▪ It is mostly implemented; it just needs to be cleaned up and checked into the Java Globus CVS repository.
  o Management
• Implement a Java Gatekeeper (Gatekeeper + Jobmanager)
  o Explore signed jobs (applications)
  o Implement a Jar Jobmanager
  o Explore sandboxing

5.4 Other Development Ideas

5.4.1 Ideas that need some evaluation (to investigate feasibility)

• Re-evaluate and switch, if possible, to a free (for commercial use) SSL/JCE security package
  o Alternatives
    ▪ JSSE (SSL)
    ▪ Cryptix (JCE)
    ▪ IBM PKIX library (distributed with their JVM)
  Switching security packages might require some extra work such as implementing the PKCS10 library from scratch, but this should not be difficult to do.
• Provide Kerberos support
  o Test if the DSTC’s Kerberos (or Sun’s once out) implementation is compatible with Kerberos-enabled Globus services.
• Provide Smart Card/i-Button support
  o Add support for the PKCS11 standard to APIs and tools like grid-proxy-init, etc.

5.4.2 Student projects (summer-long projects)

• Implement Client-Side GSI-SSH
  o Provide an interactive client tool and a client library.
  o There aren’t many Java SSH clients and libraries to base an implementation on.
  o It might take a lot of time and effort to implement. Might not be worth pursuing.
• Implement GSI-RSYNC
  o The RSYNC protocol itself is quite complex, and there is no known implementation of it in Java.
  o It might take a lot of time and effort to implement. Might not be worth pursuing.
• Explore JINI Services for Grids
  o Study the feasibility of supplementing the current Information Services infrastructure with JINI.
• Explore JXTA services and frameworks for Grids
  o Study the feasibility of a language-independent peer-to-peer framework as part of the current Grid infrastructure.
• Implement or enhance various graphical components and tools
  o Create a Visual globusrun Component
    ▪ Job submission
    ▪ Job monitoring and management (displaying feedback, killing jobs, etc.)
  o Implement Job Creation Component
    ▪ Building RSLs
  o Enhance Replica Catalog GUI
  o Enhance Drag ‘N’ Drop Job Submission Application

6 APPENDIX

6.1 Project Participants

6.1.1 Project coordination

6.1.2 Main developers

• Jarek Gawor
• Peter Lane
• Darcy Quesnel
• Nell Rehn

6.1.3 Contributors

• Gregor von Laszewski
• Jason Novotny (LBL)
• Michael Russell (U of C)
• Benjamin Temko (Indiana University)

6.2 Projects Using Java Globus

A full list of projects can be found at [11].

7 REFERENCES

[1] G. v. Laszewski, I. Foster, J. Gawor, W. Smith, and S. Tuecke, "CoG Kits: A Bridge between Commodity Distributed Computing and High-Performance Grids," in ACM 2000 Java Grande Conference. San Francisco, CA, 2000, pp. 97--106, http://www.mcs.anl.gov/~laszewsk/papers/cog-final.pdf.
[2] G. v. Laszewski, I. Foster, J. Gawor, and P. Lane, "A Java Commodity Grid Kit," Concurrency and Computation: Practice and Experience, vol. 13, 2001.
[3] G. v. Laszewski, V. Getov, M. Philippsen, and I. Foster, "Multi-Paradigm Communications in Java for Grid Computing," Communications of the ACM, 2001, http://www.globus.org/cog/documentataion/papers/cacm.pdf.
[4] S. M. F. German Cancio, Tim Folkes, Francesco Giacomini, Wolfgang Hoschek, Brian L. Tierney, "The DataGrid Architecture."
[5] "Java Virtual Machine," 2001, http://java.sun.com/cgi-bin/java-ports.cgi.
[6] "JAAS: The Java Authentication and Authorization Service," 2001, http://java.sun.com/products/jaas/.
[7] "The Volano Report," 2000, http://www.volano.com/report001205/index.html.
[8] "The Volano Report," 2001, http://www.volano.com/report.html.
[9] J. Niccolai, "Microsoft to unveil .NET software for non-Microsoft platforms," InfoWorld, 2001, http://www.infoworld.com/articles/hn/xml/01/03/13/010313hnnonms.xml.
[10] "IAIK Security Libraries," 2001, http://jcewww.iaik.tu-graz.ac.at/.
[11] "CoG Projects: Projects using Java CoG," 2001, http://www.globus.org/cog/java/projects.html.